The apply family comprises apply, lapply, sapply, vapply, mapply, rapply, and tapply. These functions belong to the R base package and manipulate slices of data from matrices, arrays, lists and data frames in a repetitive way. They are designed to avoid the explicit use of loop constructs: each acts on an input list, matrix or array and applies a named function to it, with one or several optional arguments.
The function being applied can be:
- an aggregating function, such as mean or sum, which returns a single number (a scalar);
- a transforming or subsetting function;
- another vectorized function that returns a more complex structure such as a list, vector, matrix or array.
The apply functions form the basis of more complex combinations and help to perform operations with very few lines of code.
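To make the "no explicit loop" idea concrete, here is a minimal sketch that computes the same result twice, once with a for loop and once with sapply. The list scores and its contents are hypothetical and used only for illustration.

```r
# Hypothetical input: a small named list of numeric vectors
scores <- list(a = c(1, 2, 3), b = c(10, 20), c = 5)

# Explicit loop: pre-allocate the result, then fill it element by element
loop_means <- numeric(length(scores))
names(loop_means) <- names(scores)
for (i in seq_along(scores)) {
  loop_means[i] <- mean(scores[[i]])
}

# The same computation expressed as a single sapply call
apply_means <- sapply(scores, mean)

print(loop_means)
print(apply_means)
```

Both vectors hold one mean per list element; the sapply version simply leaves the iteration and the bookkeeping to R.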
Let’s examine each of them.
1. apply function in R:
Returns a vector or array or list of values obtained by applying a function to margins of an array or matrix.
Syntax for apply function in R:
apply(X, MARGIN, FUN, ...)
- The first argument, X, is an array or matrix; a data frame is coerced to a matrix first.
- The second argument, MARGIN, selects the direction: 1 processes along the rows, 2 processes along the columns.
- The third argument, FUN, is the function to apply: an aggregate function such as sum or mean, or a user-defined function.
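The three arguments can be tried out on a tiny matrix before moving to a data frame. The sketch below assumes a made-up 2 x 3 matrix m and uses a user-defined function for the third argument; the second argument is the one R's documentation calls MARGIN. It also shows that any further arguments are passed on to the applied function.

```r
# Hypothetical 2 x 3 matrix, filled column by column:
# row 1 is (1, 3, 5) and row 2 is (2, 4, 6)
m <- matrix(1:6, nrow = 2)

# Second argument 1: the function is called once per row
row_ranges <- apply(m, 1, function(r) max(r) - min(r))

# Second argument 2: the function is called once per column
col_ranges <- apply(m, 2, function(cl) max(cl) - min(cl))

# Arguments after the function are forwarded to it (here na.rm = TRUE)
m_na <- m
m_na[1, 1] <- NA
col_means <- apply(m_na, 2, mean, na.rm = TRUE)

print(row_ranges)
print(col_ranges)
print(col_means)
```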
Example 1: Apply function in R:
Create data frame:
Age <- c(56, 34, 67, 33, 25, 28)
Weight <- c(78, 67, 56, 44, 56, 89)
Height <- c(165, 171, 167, 167, 166, 181)
BMI_df <- data.frame(Age, Weight, Height)
BMI_df
So the resultant data frame will be
  Age Weight Height
1  56     78    165
2  34     67    171
3  67     56    167
4  33     44    167
5  25     56    166
6  28     89    181
# row-wise sum of the data frame using apply
apply(BMI_df, 1, sum)
Each row of the data frame has been summed, and the output of the apply function is
[1] 299 272 290 244 247 298
Example 2: apply function in R:
# column-wise sum of the data frame using apply
apply(BMI_df, 2, sum)
Each column of the data frame has been summed, and the output of the apply function is
Age Weight Height
243 390 1017
Example 3: apply function in R:
# column-wise mean of the data frame using apply
apply(BMI_df, 2, mean)
The mean of each column of the data frame has been computed, and the output of the apply function is
Age Weight Height
40.5 65.0 169.5
2. lapply function in R:
The lapply function takes a list, vector or data frame as input and always returns a list as output.
We will use the same data frame for the lapply examples.
Example 1 for lapply function in R:
lapply(BMI_df, function(x) x / 2)
The lapply call above divides every value in the data frame by 2, and the output is a list with one element per column:
$Age
[1] 28.0 17.0 33.5 16.5 12.5 14.0
$Weight
[1] 39.0 33.5 28.0 22.0 28.0 44.5
$Height
[1] 82.5 85.5 83.5 83.5 83.0 90.5
Example 2 for lapply function in R:
# lapply function in R
lapply(BMI_df, mean)
The lapply call above applies the mean function to each column of the data frame, and the output is again a list:
$Age
[1] 40.5
$Weight
[1] 65
$Height
[1] 169.5
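lapply is not limited to data frames. As a rough illustration of the list-in, list-out behaviour, the sketch below uses a made-up list whose elements have different lengths (measurements is a hypothetical name, not part of the examples above).

```r
# Hypothetical list whose elements have different lengths
measurements <- list(first = c(2, 4, 6), second = c(10, 20), third = 7)

# lapply returns a list with one element per input element,
# whatever the length of each individual result
doubled <- lapply(measurements, function(v) v * 2)
lengths_list <- lapply(measurements, length)

str(doubled)

# unlist() flattens a list of single values into a named vector when needed
print(unlist(lengths_list))
```

Because the results here have different lengths, keeping them in a list is the natural choice; unlist is only useful once every element is a single value.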
3. sapply function in R:
The sapply function takes a list, vector or data frame as input. It is similar to lapply, but it simplifies the result where possible: to a vector when every call returns a single value, to a matrix when every call returns a vector of the same length, and otherwise it returns a list.
We will use the same data frame for the sapply examples.
Example 1 for sapply function in R:
sapply(BMI_df, function(x) x / 2)
The sapply call above divides every value in the data frame by 2. Each column yields a vector of the same length, so the output is a matrix:
      Age Weight Height
[1,] 28.0   39.0   82.5
[2,] 17.0   33.5   85.5
[3,] 33.5   28.0   83.5
[4,] 16.5   22.0   83.5
[5,] 12.5   28.0   83.0
[6,] 14.0   44.5   90.5
Example 2 for sapply function in R:
# sapply function in R
sapply(BMI_df, mean)
The sapply call above applies the mean function to each column of the data frame, and the output is a named vector:
Age Weight Height
40.5 65.0 169.5
Example 3 for sapply function in R:
# sapply function in R
random <- c("This", "is", "random", "vector")
sapply(random, nchar)
The sapply call above applies the nchar function to each string, and the output will be
This is random vector
4 2 6 6
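The shape of the sapply result depends on what the applied function returns. The following sketch, built on hypothetical inputs, shows the three cases side by side: a vector, a matrix, and a list when no simplification is possible.

```r
# Hypothetical input list
nums <- list(a = 1:3, b = 4:6)

# One value per element: the result is a named vector
v <- sapply(nums, sum)

# A vector of the same length per element: the result is a matrix
# with one column per input element
m <- sapply(nums, range)

# Results of different lengths cannot be simplified: the result stays a list
l <- sapply(list(a = 1:3, b = 1:5), function(x) x[x > 2])

print(v)
print(m)
str(l)
```

When the shape of the result matters to later code, this flexibility can be a source of surprises, which is the motivation for vapply.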
4. mapply function in R:
mapply is a multivariate version of sapply. It applies FUN to the first elements of each ... argument, then to the second elements, the third elements, and so on.
Use it when you have several data structures (e.g. vectors, lists) and want to apply a function to the 1st elements of each, then the 2nd elements of each, and so on, with the result simplified to a vector or array as in sapply.
It is multivariate in the sense that your function must accept multiple arguments.
Example 1 of mapply function in R:
# mapply function in R
mapply(sum, 1:4, 1:4, 1:4)
mapply sums the first elements (1+1+1), then the second elements (2+2+2), and so on, so the result will be
[1]  3  6  9 12
Example 2 of mapply function in R:
# mapply function in R
mapply(rep, 1:4, 1:4)
It repeats the first element once, the second element twice, and so on. The results have different lengths, so the output is a list:
[[1]]
[1] 1
[[2]]
[1] 2 2
[[3]]
[1] 3 3 3
[[4]]
[1] 4 4 4 4
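mapply is most often used with a function of several named arguments. The sketch below assumes made-up vectors widths and heights and pairs them element by element; it also uses MoreArgs, the mapply argument for values that stay fixed across all calls.

```r
# Hypothetical paired inputs: a width and a height for each of three rectangles
widths  <- c(2, 3, 4)
heights <- c(5, 6, 7)

# Each call receives one element from each vector: (2, 5), (3, 6), (4, 7)
areas <- mapply(function(w, h) w * h, widths, heights)

# MoreArgs holds arguments that are the same for every call (here sep = "-")
labels <- mapply(paste, c("a", "b", "c"), 1:3, MoreArgs = list(sep = "-"))

print(areas)
print(labels)
```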
5. tapply function in R:
Use tapply when you want to apply a function to subsets of a vector, where the subsets are defined by some other vector, usually a factor.
Let's go back to the famous iris data. Species is a factor with 3 levels, namely setosa, versicolor and virginica. To find the mean sepal length of each of these 3 species (subsets), we can use the tapply function.
# tapply function in R
# mean sepal length by species
tapply(iris$Sepal.Length, iris$Species, mean)
The first argument of tapply is the vector on which to perform the function, the second argument is the vector that defines the groups, and the third argument is the function, here mean. So the output will be
    setosa versicolor  virginica
     5.006      5.936      6.588
In other words, the mean of all sepal lengths where Species is "setosa" is 5.006, the mean where Species is "versicolor" is 5.936, and so on.
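The grouping logic is easier to check by hand on a very small input. This sketch uses hypothetical vectors values and groups rather than iris, so each group result can be verified mentally.

```r
# Hypothetical data: six measurements belonging to two groups
values <- c(10, 20, 30, 1, 2, 3)
groups <- factor(c("high", "high", "high", "low", "low", "low"))

# One result per factor level, named after the level
group_means <- tapply(values, groups, mean)
group_max   <- tapply(values, groups, max)

print(group_means)
print(group_max)
```

The values are split according to the factor levels, the function is applied to each piece, and the results are returned with the level names attached.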
6. rapply function in R:
rapply is recursive apply: as the name suggests, it applies a function to all elements of a list recursively.
# rapply function in R
x <- list(1, 2, 3, 4)
rapply(x, function(x) x^2, classes = "numeric")
- The first argument of rapply is the list, here x.
- The second argument is the function to apply to the elements of the list.
- The last argument, classes, gives the classes of elements to which the function should be applied.
So the output will be
[1]  1  4  9 16
To understand the power of the rapply function, let's create a list that contains a few sublists.
# rapply function in R
x <- list(3, list(4, 5), 6, list(7, list(8, 9)))
str(x)
rapply(x, function(x) x^2, classes = "numeric")
str(x) prints the nested structure of the list. The function is then applied even within the sublists, and the rapply output will be
[1] 9 16 25 36 49 64 81
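The class filter matters most when a list mixes types. The sketch below assumes a made-up nested list containing both numbers and strings, and contrasts the default flattened result with how = "replace", an rapply option that keeps the original nesting.

```r
# Hypothetical nested list mixing numbers and text
nested <- list(1, "a", list(2, "b", list(3)))

# Default how = "unlist": elements of other classes are dropped
# and the result is flattened into a vector
squares <- rapply(nested, function(v) v^2, classes = "numeric")

# how = "replace": the nesting is kept and non-matching elements
# (the strings) are left untouched
replaced <- rapply(nested, function(v) v^2, classes = "numeric", how = "replace")

print(squares)
str(replaced)
```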
7. vapply function in R:
vapply function in R is similar to sapply, but has a pre-specified type of return value, so it can be safer (and sometimes faster) to use.
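# vapply function in R
vapply(1:5, sqrt, 1i)
The third argument, 1i, is a template declaring that every result must be a single complex number, so the output will be a complex vector holding the square roots of 1 to 5.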
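The safety benefit shows up when a result does not match the declared template. The following sketch uses hypothetical inputs and tryCatch to capture the error that vapply raises; it also compares the behaviour of sapply and vapply on an empty input.

```r
# Hypothetical input
words <- c("apply", "lapply", "vapply")

# Template integer(1): each call must return exactly one integer
lens <- vapply(words, nchar, integer(1))

# A function whose result has the wrong length raises an error
# instead of silently changing the shape of the output
res <- tryCatch(
  vapply(1:3, function(i) seq_len(i), numeric(1)),
  error = function(e) "vapply rejected the result"
)

# On empty input, sapply falls back to an empty list,
# while vapply still returns the declared type
empty_s <- sapply(list(), length)
empty_v <- vapply(list(), length, integer(1))

print(lens)
print(res)
str(empty_s)
str(empty_v)
```

Declaring the expected type up front means downstream code can rely on the result being, for example, an integer vector, even in edge cases.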