by() Function in R

The by() function in R applies a function to specified subsets of a data frame. The first argument is the data, the second is the factor (or list of factors) that defines the subsets, and the third is the function to apply to each subset.

Syntax of by() function in R: 

by(data, data$byvar, FUN)

 

data: an R object, normally a data frame, possibly a matrix.
data$byvar: a factor or a list of factors that defines the subsets to which the function is applied.
FUN: a function to be applied to each subset of data.
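
As a minimal sketch of how the three arguments fit together, the call below passes a data frame as data, so FUN receives each subset as a data frame rather than as a vector. The use of colMeans and the names numeric_cols and species_means are illustrative assumptions, not part of the syntax above.

```r
# Sketch: by() with a data frame as the first argument.
# FUN (here colMeans) is called once per species and receives
# the matching rows as a data frame.
numeric_cols <- iris[, c("Sepal.Length", "Sepal.Width",
                         "Petal.Length", "Petal.Width")]
species_means <- by(numeric_cols, iris$Species, colMeans)

# The result has class "by"; each element holds the output of FUN
# for one level of the grouping factor.
class(species_means)
species_means[["setosa"]]
```

Because colMeans returns one value per column, each group in the result holds a named vector of four means instead of a single number.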

Example of by() function in R:

Let's use the iris and mtcars data sets to demonstrate the by() function in R. To find the mean of Sepal.Length for each species, we can use by().

# by() function in R with mean

mydata <- iris
by(mydata$Sepal.Length,list(mydata$Species),mean)

In the above example, the mean of Sepal.Length is calculated for each distinct Species, so the output will be

: setosa
[1] 5.006
-----------------------------------------------------
: versicolor
[1] 5.936
-----------------------------------------------------
: virginica
[1] 6.588
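
The printed output is convenient to read, but the values can also be stored and reused. The following is a small sketch, assuming the result is saved under the hypothetical name sepal_means; it also shows tapply(), a base R function that computes the same group means.

```r
# Sketch: store the by() result and pull individual group means out of it.
sepal_means <- by(iris$Sepal.Length, list(iris$Species), mean)

# Index by group label to get a single value.
sepal_means[["versicolor"]]

# tapply() computes the same group means and returns a plain named array,
# which is often easier to reuse in later calculations.
tapply(iris$Sepal.Length, iris$Species, mean)
```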

 

by() function in R with more than one list:

Let's use the mtcars data set to demonstrate one more example.

# by() function in R with more than one list

mydata <- mtcars
by(mydata$hp,list(mydata$gear,mydata$cyl),mean)

In the above example, the mean of hp is calculated for each distinct combination of gear and cyl, so the output will be

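: 3
: 4
[1] 97
-----------------------------------------------------
: 4
: 4
[1] 76
-----------------------------------------------------
: 5
: 4
[1] 102
-----------------------------------------------------
: 3
: 6
[1] 107.5
-----------------------------------------------------
: 4
: 6
[1] 116.5
-----------------------------------------------------
: 5
: 6
[1] 175
-----------------------------------------------------
: 3
: 8
[1] 194.1667
-----------------------------------------------------
: 4
: 8
[1] NA
-----------------------------------------------------
: 5
: 8
[1] 299.5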

 

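In each block the first label is gear and the second is cyl, so the mean of hp for gear = 5 and cyl = 8 is 299.5, and so on. The NA for gear = 4 and cyl = 8 appears because mtcars contains no cars with that combination.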
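
One way to check where the NA in the output above comes from is to count the rows in each group. The sketch below uses table() for the counts; the names combo_counts and hp_means and the gear/cyl labels given to the list are illustrative choices.

```r
# Sketch: count the cars in each gear/cyl combination of mtcars.
# A count of 0 means by() had no rows to average for that cell.
combo_counts <- table(gear = mtcars$gear, cyl = mtcars$cyl)
combo_counts

# Group means for the same combinations; empty cells come back as NA.
hp_means <- by(mtcars$hp, list(gear = mtcars$gear, cyl = mtcars$cyl), mean)

# unclass() drops the "by" class so the result can be indexed like a matrix.
is.na(unclass(hp_means)["4", "8"])
```

Naming the list elements (gear, cyl) also makes by() print those names in front of each group label, which can make output with more than one grouping factor easier to read.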

 


Author

  • Sridhar Venkatachalam

    With close to 10 years of experience in data science and machine learning, he has worked extensively with programming languages such as R, Python (Pandas), SAS, and PySpark.