Melting and Casting in R:
One of the most useful aspects of R programming is reshaping data into the form an analysis needs. Melting and casting are two operations that do this efficiently. The functions used are melt() and cast(), both from the reshape package.
Melt Function in R:
The melt function takes data in wide format and stacks a set of columns into a single column of data. To use it we need to specify a data frame, the id variables (which are kept as they are) and the measured variables (the columns of data to be stacked). By default, every column that is not specified as an id variable is treated as a measured variable.
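The default for measured variables described above can be seen on a small example. This is only a sketch: the data frame wide and its columns subject, height and weight are made up for illustration and are not part of the ships data used in the rest of this tutorial.

```r
library(reshape)  # provides melt() and cast()

# Hypothetical wide data: one row per subject, one column per measurement
wide <- data.frame(
  subject = c("s1", "s2"),
  height  = c(170, 182),
  weight  = c(65, 80)
)

# Explicit form: name both the id variable and the measured variables
long_explicit <- melt(wide, id.vars = "subject",
                      measure.vars = c("height", "weight"))

# Default form: every column that is not an id variable is stacked
long_default <- melt(wide, id.vars = "subject")

print(long_default)
```

Both calls should stack the same two columns, so naming the measured variables is only needed when some non-id columns have to be left out.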
We will use the ships dataset from the MASS package to understand how the melt and cast functions work.
library(MASS)     # provides the ships data
library(reshape)  # provides melt() and cast()
print(head(ships, n = 10))
This prints the first 10 rows of the ships data:
   type year period service incidents
 1    A   60     60     127         0
 2    A   60     75      63         0
 3    A   65     60    1095         3
 4    A   65     75    1095         4
 5    A   70     60    1512         6
 6    A   70     75    3353        18
 7    A   75     60       0         0
 8    A   75     75    2244        11
 9    B   60     60   44882        39
10    B   60     75   17176        29
Now let's keep type and year constant (as id variables) and melt (stack) the other three variables: period, service and incidents.
shipdata <- head(ships, n = 10)
molten.ships <- melt(shipdata, id = c("type", "year"))
print(molten.ships)
As a result, the type and year columns are kept as they are. The names of the period, service and incidents columns are stacked under a column named variable, and their values are stacked under a column named value. The result of the melt function is shown below:
   type year  variable value
 1    A   60    period    60
 2    A   60    period    75
 3    A   65    period    60
 4    A   65    period    75
 5    A   70    period    60
 6    A   70    period    75
 7    A   75    period    60
 8    A   75    period    75
 9    B   60    period    60
10    B   60    period    75
11    A   60   service   127
12    A   60   service    63
13    A   65   service  1095
14    A   65   service  1095
15    A   70   service  1512
16    A   70   service  3353
17    A   75   service     0
18    A   75   service  2244
19    B   60   service 44882
20    B   60   service 17176
21    A   60 incidents     0
22    A   60 incidents     0
23    A   65 incidents     3
24    A   65 incidents     4
25    A   70 incidents     6
26    A   70 incidents    18
27    A   75 incidents     0
28    A   75 incidents    11
29    B   60 incidents    39
30    B   60 incidents    29
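A quick way to convince yourself that nothing was lost in the melt is to check the shape of the result. The sketch below repeats the steps above in a self-contained form. The variable n_measured is a helper introduced here for illustration.

```r
library(MASS)     # ships data
library(reshape)  # melt()

shipdata <- head(ships, n = 10)
molten.ships <- melt(shipdata, id.vars = c("type", "year"))

# Each of the 10 rows contributes one molten row per measured column
n_measured <- ncol(shipdata) - 2   # all columns except type and year
print(nrow(molten.ships) == nrow(shipdata) * n_measured)

# Every measured variable should appear once per original row
print(table(molten.ships$variable))
```

With 10 rows and 3 measured columns, the molten data has 30 rows, which matches the output listed above.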
Cast Function in R:
The cast function takes molten data and reshapes it back into wide format according to a formula. Aggregation occurs when the combination of variables in the cast formula does not identify individual observations. In that case cast reduces the multiple values to a single one using the aggregation function it is given. Here we pass sum, so the values in the value column are added up. An example of the cast function is shown below:
recasted.ship <- cast(molten.ships, type + year ~ variable, sum)
print(recasted.ship)
As a result, the cast function sums each variable for every combination of type and year, and the variables are cast back into columns. The result is shown below:
  type year period service incidents
1    A   60    135     190         0
2    A   65    135    2190         7
3    A   70    135    4865        24
4    A   75    135    2244        11
5    B   60    135   62058        68
For example, type A, year 60 has two periods, 60 and 75. The cast function sums these, and the result, 135, is recorded under the column named period.
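The same reshaping can also be written with the reshape2 package, where dcast() plays the role of cast() for data frames. The following is a minimal sketch of that alternative, not the method used above. The names molten and recast are chosen here for illustration, and the functions are called with the reshape2:: prefix so that they do not clash with the reshape versions.

```r
library(MASS)  # ships data

shipdata <- head(ships, n = 10)

# Same melt as before, using the reshape2 implementation
molten <- reshape2::melt(shipdata, id.vars = c("type", "year"))

# fun.aggregate = sum collapses the two periods within each type/year group
recast <- reshape2::dcast(molten, type + year ~ variable,
                          value.var = "value", fun.aggregate = sum)
print(recast)
```

If fun.aggregate is left out in this situation, dcast() falls back to length and prints a message, so the cells would hold counts rather than sums.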
Also see: Reshape from wide to long and long to wide