Aggregate() Function in R

The aggregate() function in R splits the data into subsets, computes a summary statistic for each subset, and returns the result in grouped form. It is similar to GROUP BY in SQL.

Syntax for Aggregate() Function in R:

aggregate(x, by, FUN, simplify = TRUE)


x          an R object, usually a data frame
by         a list of grouping elements that define the subsets
FUN        a function that computes the summary statistic; it is applied to every subset
simplify   logical; whether the results should be simplified to a vector or matrix where possible
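To make the roles of the arguments concrete, here is a minimal sketch on a tiny data frame. The data frame `scores` and its columns `team` and `score` are made up for illustration and are not part of the original example.

```r
# Hypothetical toy data: two teams with two scores each
scores <- data.frame(
  team  = c("a", "a", "b", "b"),
  score = c(10, 20, 30, 50)
)

# x:   the column(s) to summarise
# by:  a list of grouping vectors
# FUN: the summary function applied to each subset
team_means <- aggregate(x = scores["score"],
                        by = list(team = scores$team),
                        FUN = mean)
print(team_means)
```

Each row of `team_means` corresponds to one distinct value of the grouping vector, much like one row per group in a SQL GROUP BY query.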

Example for aggregate() function in R:

Let’s use the iris data set, which ships with R, to demonstrate a simple example of the aggregate() function. Suppose we want to find the mean of each measurement (Sepal.Length, Sepal.Width, Petal.Length, Petal.Width) for each distinct species. We can use the aggregate() function:

# Aggregate function in R with mean summary statistics
agg_mean <- aggregate(iris[, 1:4], by = list(iris$Species), FUN = mean, na.rm = TRUE)
agg_mean  # print the result

The above code takes the first 4 columns of the iris data set, groups the rows by Species, and computes the mean for each group, so the output will be

     Group.1 Sepal.Length Sepal.Width Petal.Length Petal.Width
1     setosa        5.006       3.428        1.462       0.246
2 versicolor        5.936       2.770        4.260       1.326
3  virginica        6.588       2.974        5.552       2.026

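Note: When using the aggregate() function in this form, the by variables must be passed in a list, even if there is only one. Because the list element here is unnamed, the grouping column is labelled Group.1.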
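As a small sketch of what the list requirement allows: naming the list element gives the grouping column a readable name in place of Group.1. The second call uses the formula interface of aggregate(), an alternative form that is not used elsewhere in this article, where no list is needed. The variable names `agg_named` and `agg_formula` are chosen here for illustration.

```r
# Naming the list element labels the grouping column "Species"
agg_named <- aggregate(iris[, 1:4], by = list(Species = iris$Species), FUN = mean)
names(agg_named)

# Formula interface: "." stands for every column other than Species
agg_formula <- aggregate(. ~ Species, data = iris, FUN = mean)
names(agg_formula)
```

Both calls should give the same group means as the example above; only the label of the first column differs from Group.1.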

Example for aggregate() function in R with sum: 

Let’s use the aggregate() function in R to compute the sum of each measurement, grouped by species.

# Aggregate function in R with sum summary statistics
agg_sum <- aggregate(iris[, 1:4], by = list(iris$Species), FUN = sum, na.rm = TRUE)
agg_sum  # print the result

When we execute the above code, the output will be

     Group.1 Sepal.Length Sepal.Width Petal.Length Petal.Width
1     setosa        250.3       171.4         73.1        12.3
2 versicolor        296.8       138.5        213.0        66.3
3  virginica        329.4       148.7        277.6       101.3
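The examples above use the built-in functions mean and sum, but FUN can also be a function you write yourself. The following is a minimal sketch, assuming we want the spread (maximum minus minimum) of each measurement per species. The anonymous function and the name `agg_spread` are illustrative choices, not part of the original examples.

```r
# FUN can be a user-defined function: here, max minus min within each group
agg_spread <- aggregate(iris[, 1:4],
                        by = list(Species = iris$Species),
                        FUN = function(v) max(v) - min(v))
print(agg_spread)
```

The result has the same shape as before, with one row per species and one column per measurement.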
