Melting and Casting in R

One of the most useful tasks in R programming is reshaping data into the form an analysis needs. Melting and casting are two operations that reshape data efficiently. The functions that perform them are melt() and cast(), both provided by the reshape package.

Melt Function in R:

The melt function takes data in wide format and stacks a set of columns into a single column of data. To use it, we specify a data frame, the id variables (which are kept as they are) and the measured variables (the columns of data to be stacked). By default, all columns that are not specified as id variables are treated as measured variables.
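As a minimal sketch of the default described above, the snippet below melts a tiny data frame twice: once naming the measured variables explicitly and once relying on the default. The data frame wide and its columns subject, height and weight are hypothetical and exist only for this illustration; melt() is assumed to come from the reshape package.

```r
library(reshape)  # melt() as used in this article

# Hypothetical wide data: one id column and two measured columns
wide <- data.frame(
  subject = c("a", "b"),
  height  = c(1.5, 1.8),
  weight  = c(50, 72)
)

# Explicit form: name both the id variable and the measured variables
long.explicit <- melt(wide, id.vars = "subject",
                      measure.vars = c("height", "weight"))

# Default form: every column not listed as an id variable is measured
long.default <- melt(wide, id.vars = "subject")

print(long.explicit)
```

Both calls should give the same long data frame, with one row per subject and measured column.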

We will use a built-in data set in R to understand how the melt and cast functions work.

library(MASS)     # provides the ships data set
library(reshape)  # provides melt() and cast()
print(head(ships, n = 10))

This prints the first 10 rows of the built-in ships data set:

   type year period service incidents
1     A   60     60     127         0
2     A   60     75      63         0
3     A   65     60    1095         3
4     A   65     75    1095         4
5     A   70     60    1512         6
6     A   70     75    3353        18
7     A   75     60       0         0
8     A   75     75    2244        11
9     B   60     60   44882        39
10    B   60     75   17176        29

Now let's keep type and year constant (as id variables) and melt (stack) the other three variables, namely period, service and incidents.

shipdata <- head(ships, n = 10)
molten.ships <- melt(shipdata, id = c("type", "year"))
print(molten.ships)

 

As a result, the type and year columns are kept as they are. The names of the period, service and incidents columns are stacked under the column named variable, and their values are stacked under the column named value. The result of the melt function is shown below:

   type year  variable value
1     A   60    period    60
2     A   60    period    75
3     A   65    period    60
4     A   65    period    75
5     A   70    period    60
6     A   70    period    75
7     A   75    period    60
8     A   75    period    75
9     B   60    period    60
10    B   60    period    75
11    A   60   service   127
12    A   60   service    63
13    A   65   service  1095
14    A   65   service  1095
15    A   70   service  1512
16    A   70   service  3353
17    A   75   service     0
18    A   75   service  2244
19    B   60   service 44882
20    B   60   service 17176
21    A   60 incidents     0
22    A   60 incidents     0
23    A   65 incidents     3
24    A   65 incidents     4
25    A   70 incidents     6
26    A   70 incidents    18
27    A   75 incidents     0
28    A   75 incidents    11
29    B   60 incidents    39
30    B   60 incidents    29
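To make the stacking concrete, the same long layout can be rebuilt by hand in base R. This is only an illustrative sketch of the idea, not a statement of how melt() works internally. The names measured, blocks and manual.molten are introduced here for illustration.

```r
library(MASS)  # provides the ships data set

shipdata <- head(ships, n = 10)
measured <- c("period", "service", "incidents")

# Build one block of rows per measured column, repeating the id columns
blocks <- lapply(measured, function(v) {
  data.frame(
    type     = shipdata$type,
    year     = shipdata$year,
    variable = v,
    value    = shipdata[[v]]
  )
})

# Stack the blocks on top of each other
manual.molten <- do.call(rbind, blocks)

# 10 rows x 3 measured columns gives 30 stacked rows and 4 columns
print(dim(manual.molten))
```

One difference from the melt() output in this sketch is that the variable column is a plain character vector here rather than a factor.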

Cast Function in R:

Aggregation occurs when the combination of variables in the cast formula does not identify individual observations. In that case the cast function reduces the multiple values to a single one using the aggregation function it is given. Here we pass sum, so the values in the value column are added up. An example of the cast function is shown below:

 

recasted.ship <- cast(molten.ships, type + year ~ variable, sum)
print(recasted.ship)

As a result, the cast function sums the values of each variable for each combination of type and year, and the variables are cast back into columns. The result is shown below:

  type year period service incidents
1    A   60    135     190         0
2    A   65    135    2190         7
3    A   70    135    4865        24
4    A   75    135    2244        11
5    B   60    135   62058        68

For example, type A, year 60 has two period values, 60 and 75. The cast function sums them and records the result, 135, under the period column.
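One way to see why aggregation is needed is to count how many molten values fall into each cell before summing them. The sketch below passes length instead of sum as the aggregation function. It assumes the reshape package is installed, and counts and sums are names introduced only for this illustration.

```r
library(MASS)     # provides the ships data set
library(reshape)  # provides melt() and cast()

shipdata <- head(ships, n = 10)
molten.ships <- melt(shipdata, id = c("type", "year"))

# Count the molten values behind each type/year/variable cell
counts <- cast(molten.ships, type + year ~ variable, length)
print(counts)

# Same formula with sum, as in the example above
sums <- cast(molten.ships, type + year ~ variable, sum)
print(sums)
```

In these 10 rows every type and year combination appears twice (once per period), so each cell holds two values that must be reduced to one.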

Also refer to Reshape from wide to long and long to wide.
