# Groupby function in R using Dplyr – group_by

The group_by() function from the dplyr package groups a data frame by one or more columns. Combined with a summary function, it computes per-group statistics such as the mean, sum, count, maximum and minimum.
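Grouping by more than one column works the same way as grouping by one. The sketch below is only an illustration: it uses the built-in mtcars data set (not used elsewhere in this tutorial), and the output column name `mean_mpg` is an arbitrary choice. The `.groups` argument assumes dplyr 1.0.0 or later.

```r
library(dplyr)

# Group by two columns (cyl and gear), then average mpg within each pair.
# .groups = "drop" returns an ungrouped result (assumes dplyr >= 1.0.0).
mpg_by_cyl_gear <- mtcars %>%
  group_by(cyl, gear) %>%
  summarise(mean_mpg = mean(mpg, na.rm = TRUE), .groups = "drop")

mpg_by_cyl_gear
```

The result has one row for every combination of cyl and gear that occurs in the data.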

• A group by can be done with dplyr's group_by() and the pipe operator (%>%), with summarise_at(), or without dplyr using the base R aggregate() function. An example of each is shown below.
• This tutorial focuses on the dplyr pipe operator (%>%), with one example each for groupby sum, mean, count, maximum and minimum.
• Groupby mean in R using dplyr pipe operator.
• Groupby count in R using dplyr pipe operator.
• Groupby minimum and Groupby maximum in R using dplyr pipe operator.
• Groupby sum in R using dplyr pipe operator.
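As a rough preview of the operations listed above, the following sketch computes all of them in a single summarise() call on the iris data. The output column names (`n`, `mean_len` and so on) are illustrative choices, not names used by dplyr.

```r
library(dplyr)

# One summarise() call can return several per-group statistics at once.
species_summary <- iris %>%
  group_by(Species) %>%
  summarise(
    n        = n(),                               # rows per group
    mean_len = mean(Sepal.Length, na.rm = TRUE),  # groupby mean
    sum_len  = sum(Sepal.Length, na.rm = TRUE),   # groupby sum
    min_len  = min(Sepal.Length, na.rm = TRUE),   # groupby minimum
    max_len  = max(Sepal.Length, na.rm = TRUE)    # groupby maximum
  )

species_summary
```

The sections below compute each statistic separately.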

[Figure: pictorial example of a groupby sum in dplyr]

#### Groupby function in R with dplyr using summarize_at() function:

We will use the built-in iris data set to demonstrate the group_by() function.

```r
library(dplyr)
mydata2 <- iris

# Group by Species and take the mean of Sepal.Length
# (a ~ lambda replaces funs(), which is deprecated since dplyr 0.8.0)
summarise_at(group_by(mydata2, Species), vars(Sepal.Length), ~ mean(., na.rm = TRUE))
```
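In recent dplyr releases, summarise_at() is marked as superseded in favour of across(). A minimal sketch of the same grouped mean, assuming dplyr 1.0.0 or later is installed (the name `mean_by_species` is only for illustration):

```r
library(dplyr)

# Same grouped mean written with across() (assumes dplyr >= 1.0.0).
mean_by_species <- iris %>%
  group_by(Species) %>%
  summarise(across(Sepal.Length, ~ mean(.x, na.rm = TRUE)))

mean_by_species
```

Both forms keep the original column name, Sepal.Length, in the result.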

The mean of Sepal.Length is computed for each value of the Species variable.

#### Groupby function in R with dplyr pipe operator %>%:

```r
library(dplyr)
mydata2 <- iris

# Group by Species and sum Sepal.Length using the pipe operator
# (a ~ lambda replaces the deprecated funs())
mydata2 %>% group_by(Species) %>% summarise_at(vars(Sepal.Length), ~ sum(., na.rm = TRUE))
```
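The pipe only changes how the call is written: `x %>% f(y)` is evaluated as `f(x, y)`. The sketch below, with the illustrative names `nested` and `piped`, builds the grouped sum both ways so the two results can be compared.

```r
library(dplyr)

# Nested form: the grouped data frame is passed as the first argument.
nested <- summarise_at(group_by(iris, Species), vars(Sepal.Length),
                       ~ sum(., na.rm = TRUE))

# Piped form: each step receives the previous result as its first argument.
piped <- iris %>%
  group_by(Species) %>%
  summarise_at(vars(Sepal.Length), ~ sum(., na.rm = TRUE))

# Compare the two results.
all.equal(nested, piped)
```

The piped form reads left to right in the order the steps are applied, which is why it is preferred in the rest of this tutorial.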

The sum of Sepal.Length is computed for each Species with the help of the pipe operator (%>%) in the dplyr package. As a result we get the sum of all the Sepal.Length values for each species: 250.3 for setosa, 296.8 for versicolor and 329.4 for virginica.

#### Groupby in R without dplyr using aggregate function:

In this example we use the base R aggregate() function to perform the group by operation.

```r
mydata2 <- iris

# Group by Species and sum Sepal.Length using base R's aggregate()
aggregate(mydata2$Sepal.Length, by = list(Species = mydata2$Species), FUN = sum)
```
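aggregate() also has a formula interface, which some readers may find easier to read. A short sketch of the same grouped sum (the name `sum_by_species` is only for illustration):

```r
# Formula interface: "Sepal.Length by Species", summed within each group.
sum_by_species <- aggregate(Sepal.Length ~ Species, data = iris, FUN = sum)

sum_by_species
```

With the formula interface the result column keeps the name Sepal.Length, whereas the `by = list(...)` form names it `x`.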

The sum of Sepal.Length is computed for each Species with the help of the aggregate() function in R.

#### Groupby mean in R with dplyr pipe operator %>%:

```r
library(dplyr)
mydata2 <- iris

# Group by Species and take the mean of Sepal.Length using the pipe operator
# (a ~ lambda replaces the deprecated funs())
mydata2 %>% group_by(Species) %>% summarise_at(vars(Sepal.Length), ~ mean(., na.rm = TRUE))
```
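vars() can select more than one column, so the same pattern extends to several grouped means at once. A small sketch, assuming we also want the mean of Sepal.Width (the name `sepal_means` is only for illustration):

```r
library(dplyr)

# Mean of two columns per species; each result keeps its column name.
sepal_means <- iris %>%
  group_by(Species) %>%
  summarise_at(vars(Sepal.Length, Sepal.Width), ~ mean(., na.rm = TRUE))

sepal_means
```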

The mean of Sepal.Length is computed for each Species with the help of the pipe operator (%>%) in the dplyr package. As a result we get the mean Sepal.Length of each species: 5.006 for setosa, 5.936 for versicolor and 6.588 for virginica.

#### Groupby count in R with dplyr pipe operator %>%:

```r
library(dplyr)
mydata2 <- iris

# Group by Species and count the Sepal.Length observations using the pipe operator
# (the function is passed directly instead of through the deprecated funs())
mydata2 %>% group_by(Species) %>% summarise_at(vars(Sepal.Length), length)
```
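dplyr also provides dedicated helpers for counting rows per group. The sketch below shows two common alternatives; the names `counts_a` and `counts_b` are only for illustration.

```r
library(dplyr)

# count() groups and counts in one step; the count column is named n.
counts_a <- iris %>% count(Species)

# n() returns the number of rows in the current group.
counts_b <- iris %>%
  group_by(Species) %>%
  summarise(n = n())

counts_a
counts_b
```

Like length(), both approaches count every row, including rows where Sepal.Length is missing. To count only non-missing values, one option is `summarise(n = sum(!is.na(Sepal.Length)))`.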

The count of the Sepal.Length column is computed for each Species with the help of the pipe operator (%>%) in the dplyr package. As a result we get the number of Sepal.Length observations for each species, which is 50 for each of the three species.

#### Groupby max in R with dplyr pipe operator %>%:

```r
library(dplyr)
mydata2 <- iris

# Group by Species and take the maximum of Sepal.Length using the pipe operator
# (a ~ lambda replaces the deprecated funs())
mydata2 %>% group_by(Species) %>% summarise_at(vars(Sepal.Length), ~ max(., na.rm = TRUE))
```

The maximum of the Sepal.Length column is computed for each Species with the help of the pipe operator (%>%) in the dplyr package. As a result we get the maximum value of Sepal.Length for each species: 5.8 for setosa, 7.0 for versicolor and 7.9 for virginica.

#### Groupby min in R with dplyr pipe operator %>%:

```r
library(dplyr)
mydata2 <- iris

# Group by Species and take the minimum of Sepal.Length using the pipe operator
# (a ~ lambda replaces the deprecated funs())
mydata2 %>% group_by(Species) %>% summarise_at(vars(Sepal.Length), ~ min(., na.rm = TRUE))
```
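The minimum and maximum can also be computed together by passing a named list of functions. This sketch assumes dplyr 1.0.0 or later for across(); the list names `min` and `max` become suffixes of the output columns, and `range_by_species` is an illustrative name.

```r
library(dplyr)

# A named list of lambdas gives one output column per function,
# named Sepal.Length_min and Sepal.Length_max by default.
range_by_species <- iris %>%
  group_by(Species) %>%
  summarise(across(
    Sepal.Length,
    list(min = ~ min(.x, na.rm = TRUE), max = ~ max(.x, na.rm = TRUE))
  ))

range_by_species
```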

The minimum of the Sepal.Length column is computed for each Species with the help of the pipe operator (%>%) in the dplyr package. As a result we get the minimum value of Sepal.Length for each species: 4.3 for setosa, 4.9 for versicolor and 4.9 for virginica.

For further understanding of the group_by() function in R using dplyr, one can refer to the dplyr documentation.