Apply Function in R – apply vs lapply vs sapply vs mapply vs tapply vs rapply vs vapply

The apply family comprises: apply, lapply, sapply, vapply, mapply, rapply, and tapply.

The apply family of functions belongs to the R base package and provides functions to manipulate slices of data from matrices, arrays, lists and data frames in a repetitive way. The apply functions are designed to avoid explicit loop constructs. They act on an input list, matrix or array and apply a named function with one or several optional arguments.

The function being applied could be:

  • an aggregating function, for example mean or sum (which return a single number);
  • a transforming or sub-setting function;
  • another vectorized function, which returns a more complex structure such as a list, vector, matrix or array.

The apply functions form the basis of more complex combinations and help to perform operations with very few lines of code.
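As a rough illustration of how these functions stand in for an explicit loop, the sketch below computes the same result both ways. The vector `values` and the helper `square` are made up for this example and do not come from the article.

```r
# Hypothetical input and helper, chosen only for illustration
values <- c(2, 4, 6)
square <- function(v) v^2

# Explicit loop: preallocate the result, then fill it element by element
loop_result <- numeric(length(values))
for (i in seq_along(values)) {
  loop_result[i] <- square(values[i])
}

# Same computation with sapply: no index bookkeeping needed
apply_result <- sapply(values, square)

identical(loop_result, apply_result)  # TRUE
```

Both versions produce 4, 16 and 36; the sapply call simply states what is applied to what.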

 

Let’s examine each of them in turn.

1. apply function in R:

Returns a vector or array or list of values obtained by applying a function to margins of an array or matrix.

Syntax for apply function in R:

apply(X, 1, sum)
  • The first argument, X, is a data frame or matrix.
  • The second argument, 1, indicates processing along rows; a value of 2 indicates processing along columns.
  • The third argument is an aggregate function such as sum or mean, or a user-defined function.
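To make the three arguments concrete, here is a minimal sketch on a small made-up matrix `m` (not part of the article's data), including a user-defined function as the third argument.

```r
# Small made-up matrix: 2 rows, 3 columns, filled column by column
m <- matrix(1:6, nrow = 2)

# Second argument 1: one result per row
row_totals <- apply(m, 1, sum)

# Second argument 2 with a user-defined function:
# the spread (max - min) of each column
col_spread <- apply(m, 2, function(col) max(col) - min(col))

row_totals  # 9 12
col_spread  # 1 1 1
```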

 

 

Example 1: Apply function in R:

Create data frame:

Age<-c(56,34,67,33,25,28)
Weight<-c(78,67,56,44,56,89)
Height<-c(165, 171,167,167,166,181)

BMI_df<-data.frame(Age,Weight,Height)
BMI_df

So the resulting data frame will be

  Age Weight Height
1  56     78    165
2  34     67    171
3  67     56    167
4  33     44    167
5  25     56    166
6  28     89    181


# row-wise sum of the data frame using the apply function in R
apply(BMI_df,1,sum)

The values in each row of the data frame have been summed, and the output of the apply function is

[1] 299 272 290 244 247 298

 

Example 2: apply function in R:

# column-wise sum of the data frame using the apply function in R
apply(BMI_df,2,sum)

The values in each column of the data frame have been summed, and the output of the apply function is

   Age Weight Height
   243    390   1017

 

Example 3: apply function in R:

# column-wise mean of the data frame using the apply function in R
apply(BMI_df,2,mean)

The mean of each column of the data frame has been computed, and the output of the apply function is

   Age Weight Height
  40.5   65.0  169.5
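For sums and means specifically, base R also provides rowSums and colMeans. The sketch below rebuilds the data frame from the example above and checks that they give the same numbers as apply; the variable names `same_rows` and `same_cols` are only for illustration.

```r
# Recreate the data frame from the example above
BMI_df <- data.frame(
  Age    = c(56, 34, 67, 33, 25, 28),
  Weight = c(78, 67, 56, 44, 56, 89),
  Height = c(165, 171, 167, 167, 166, 181)
)

# Row sums: compare values only, ignoring any names on the results
same_rows <- isTRUE(all.equal(unname(apply(BMI_df, 1, sum)),
                              unname(rowSums(BMI_df))))

# Column means: apply() with mean versus the dedicated helper
same_cols <- isTRUE(all.equal(apply(BMI_df, 2, mean), colMeans(BMI_df)))

same_rows  # TRUE
same_cols  # TRUE
```

apply remains the general tool, since it accepts any function, while the dedicated helpers cover only these common cases.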

 

2. lapply function in R:

The lapply function takes a list, vector or data frame as input and always returns a list as output.

We will use the same data frame to demonstrate the lapply function.

Example 1 for lapply function in R:
lapply(BMI_df, function(x) x/2)

The above lapply call divides the values in each column of the data frame by 2, and the
output is a list

$Age
[1] 28.0 17.0 33.5 16.5 12.5 14.0

$Weight
[1] 39.0 33.5 28.0 22.0 28.0 44.5

$Height
[1] 82.5 85.5 83.5 83.5 83.0 90.5

 

Example 2 for lapply function in R:
# lapply function in R
lapply(BMI_df, mean)

The above lapply call applies the mean function to each column of the data frame, and the output is a list

$Age
[1] 40.5

$Weight
[1] 65

$Height
[1] 169.5
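lapply works the same way on a plain list. The following sketch uses a made-up list `scores` (an assumption for illustration) whose elements have different lengths, and shows one way to flatten the result afterwards.

```r
# Made-up list whose elements have different lengths
scores <- list(a = c(1, 2, 3), b = c(10, 20), c = 5)

# lapply returns a list with one element per input element
means <- lapply(scores, mean)
is.list(means)   # TRUE
names(means)     # "a" "b" "c"

# unlist() flattens the list to a named numeric vector
flat_means <- unlist(means)
flat_means
```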

 

 

3. sapply function in R:

The sapply function takes a list, vector or data frame as input. It is similar to lapply, but it tries to simplify the result: it returns a vector or matrix when the results allow it, and a list otherwise.

We will use the same data frame to demonstrate the sapply function.

Example 1 for sapply function in R:
sapply(BMI_df, function(x) x/2)

The above sapply call divides the values in each column of the data frame by 2. Because every column yields a result of the same length, the
output is simplified to a matrix

      Age Weight Height
[1,] 28.0   39.0   82.5
[2,] 17.0   33.5   85.5
[3,] 33.5   28.0   83.5
[4,] 16.5   22.0   83.5
[5,] 12.5   28.0   83.0
[6,] 14.0   44.5   90.5

 

Example 2 for sapply function in R:
# sapply function in R
sapply(BMI_df, mean)

The above sapply call applies the mean function to each column of the data frame, and the output is a named vector

   Age Weight Height
  40.5   65.0  169.5

 

Example 3 of sapply function in R:

# sapply function in R
random <- c("This","is","random","vector")
sapply(random,nchar)

The above sapply call applies the nchar function to each string, and the output will be

  This     is random vector
     4      2      6      6
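The shape of what sapply returns depends on what the applied function returns. The sketch below reuses the same four words to show a case that simplifies to a vector and one that cannot be simplified; the variable names are illustrative only.

```r
words <- c("This", "is", "random", "vector")

# Every call returns a single integer, so the result is a named vector
lengths_vec <- sapply(words, nchar)

# Each call returns a different number of letters, so a list comes back
letters_list <- sapply(words, function(w) strsplit(w, "")[[1]])

# USE.NAMES = FALSE drops the names taken from the input strings
plain_vec <- sapply(words, nchar, USE.NAMES = FALSE)

is.integer(lengths_vec)  # TRUE
is.list(letters_list)    # TRUE
plain_vec                # 4 2 6 6
```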

4. mapply function in R:

mapply is a multivariate version of sapply. mapply applies FUN to the first elements of each (…) argument, the second elements, the third elements, and so on.

In other words, it is meant for cases where you have several data structures (e.g. vectors, lists) and want to apply a function to the first elements of each, then the second elements of each, and so on, coercing the result to a vector or array as in sapply.

This is multivariate in the sense that your function must accept multiple arguments.

Example of mapply function in R:

# mapply function in R
mapply(sum, 1:4, 1:4, 1:4)

mapply sums up all the first elements (1+1+1), sums up all the second elements (2+2+2), and so on, so the result will be

[1]  3  6  9 12

Example of mapply function in R:

# mapply function in R
mapply(rep,1:4,1:4)

It repeats the first element once, the second element twice, and so on. So the output will be

[[1]]
[1] 1

[[2]]
[1] 2 2

[[3]]
[1] 3 3 3

[[4]]
[1] 4 4 4 4
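Two optional arguments of mapply are often useful in practice: MoreArgs, for values that stay fixed across calls, and SIMPLIFY, which controls whether the result is collapsed as in sapply. A small sketch with made-up inputs:

```r
# Element-wise over two vectors: join the i-th name with the i-th number
labels <- mapply(function(name, n) paste(name, n, sep = "_"),
                 c("a", "b", "c"), 1:3)

# MoreArgs holds arguments that are the same for every call
scaled <- mapply(function(x, y, factor) (x + y) * factor,
                 1:3, 4:6, MoreArgs = list(factor = 10))

# SIMPLIFY = FALSE always returns a list, like lapply
as_list <- mapply(rep, 1:3, 3:1, SIMPLIFY = FALSE)

unname(labels)  # "a_1" "b_2" "c_3"
scaled          # 50 70 90
```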

 

5. tapply function in R:

Use tapply when you want to apply a function to subsets of a vector and the subsets are defined by some other vector, usually a factor.

Let's go back to the famous iris data. Species is a factor with 3 levels, namely setosa, versicolor and virginica. If we want to find the mean sepal length of each of these 3 species (subsets), we can use the tapply function.

# tapply function in R
# mean sepal length by species
tapply(iris$Sepal.Length, iris$Species, mean)

The first argument of tapply is the vector on which the function is computed. The second argument is the vector that defines the groups, and the third argument is the function, here mean. So the output will be

    setosa versicolor  virginica
     5.006      5.936      6.588

In other words, the mean of all sepal lengths where Species is "setosa" is 5.006, the mean of all sepal lengths where Species is "versicolor" is 5.936, and so on.
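One way to see what tapply does is to perform the two steps by hand: split the vector by the grouping factor, then apply the function to each piece. This is only a sketch of an equivalent computation, not a description of how tapply is implemented; `by_species` and `group_means` are illustrative names.

```r
# iris ships with base R (datasets package)
# Step 1: split the sepal lengths into one vector per species
by_species <- split(iris$Sepal.Length, iris$Species)

# Step 2: apply mean to each piece
group_means <- sapply(by_species, mean)

group_means
```

The values match those returned by tapply(iris$Sepal.Length, iris$Species, mean).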

 

6. rapply function in R:

The rapply function in R is a recursive apply: as the name suggests, it applies a function to all elements of a list recursively.

# rapply function in R
x <- list(1,2,3,4)
rapply(x, function(x) x^2, classes = "numeric")
  • The first argument of rapply is the list, here x.
  • The second argument is the function to be applied over the list.
  • The last argument, classes, gives the classes of elements to which the function should be applied.

So the output will be

[1]  1  4  9 16

To understand the power of the rapply function, let's create a list that contains a few sublists

# rapply function in R
x <- list(3,list(4,5),6,list(7,list(8,9)))
str(x)
rapply(x, function(x) x^2, classes = "numeric")

The function is applied to the elements of the sublists as well, and the output will be

[1]  9 16 25 36 49 64 81
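The classes argument matters most when a list mixes types, and the how argument controls whether the nesting is kept. The sketch below uses a made-up list `nested` (an assumption for illustration) containing both numbers and strings.

```r
# Made-up nested list mixing numbers and strings
nested <- list(1, "a", list(2, "b", list(3)))

# Default how = "unlist": only numeric leaves are processed,
# the strings are dropped from the flattened result
flat <- rapply(nested, function(v) v * 10, classes = "numeric")
flat  # 10 20 30

# how = "replace": the nesting is preserved and
# non-matching leaves are left untouched
replaced <- rapply(nested, function(v) v * 10,
                   classes = "numeric", how = "replace")
str(replaced)
```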

 

7. vapply function in R:

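The vapply function in R is similar to sapply, but has a pre-specified type of return value, so it can be safer (and sometimes faster) to use.

# vapply function in R
vapply(1:5, sqrt, 1i)

Here the third argument, 1i, is the template for each result: a single complex number. The square roots are therefore returned as complex values, and the output will be

[1] 1.000000+0i 1.414214+0i 1.732051+0i 2.000000+0i 2.236068+0i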
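The safety of vapply comes from checking every result against the template. A minimal sketch with made-up inputs, showing one call that satisfies the template, one that violates it, and one where the template has length 2:

```r
words <- c("apply", "lapply", "vapply")

# Each call must return exactly one integer
n_chars <- vapply(words, nchar, integer(1))

# A function returning two values violates a length-1 template,
# so vapply stops with an error (caught here for demonstration)
bad <- tryCatch(
  vapply(1:3, function(i) c(i, i^2), numeric(1)),
  error = function(e) "vapply raised an error"
)

# With a matching length-2 template the same function yields a 2 x 3 matrix
ok <- vapply(1:3, function(i) c(i, i^2), numeric(2))

unname(n_chars)  # 5 6 6
bad              # "vapply raised an error"
dim(ok)          # 2 3
```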

 


Author

  • Sridhar Venkatachalam

    Close to 10 years of experience in data science and machine learning. Has worked extensively with programming languages such as R, Python (Pandas), SAS and PySpark.