Median function in R – median()

The median() function in R calculates the sample median. The median is the middle value when the data is sorted in ascending order. The median of a group can also be calculated by passing median() to the aggregate() function. With median() we can also find the row wise median using the dplyr package, as well as the column wise median. Let's see an example of each.

• Median of the elements of a vector, including vectors with NA values
• Median of a particular column of the dataframe in R
• Column wise median of the dataframe using median() function
• Median of the group in R dataframe using aggregate() and dplyr package
• Row wise median of the dataframe in R using median() function

Syntax for median function in R:

median(x, na.rm = FALSE, …)
• x – numeric vector
• na.rm – whether NA values should be removed before the median is computed; if FALSE and NAs are present, NA is returned

Example of the median function in R with an odd number of observations:

```
# R median function with 7 (odd) observations

x <- c(1.234, 2.342, 3.4, -4.562, 5.671, 12.345, -14.567)
median(x)
```

There are 7 observations in the above example. When arranged in ascending order, the 4th value is the median, so the output will be

[1] 2.342

Example of the median function in R with an even number of observations:

```
# R median function with 6 (even) observations

x <- c(1.234, 2.342, -4.562, 5.671, 12.345, -14.567)
median(x)
```

There are 6 observations in the above example, so the median is the average of the 3rd and 4th values when arranged in ascending order. The output will be (1.234 + 2.342)/2

output:
[1] 1.788
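
The odd and even cases above can be reproduced by hand. The following is a minimal sketch; `manual_median` is a hypothetical helper written only for illustration (it is not part of base R), and it simply sorts the values and picks the middle one or averages the two middle ones.

```r
# Hypothetical helper for illustration only: compute the median by hand
manual_median <- function(x) {
  s <- sort(x)   # ascending order (sort() drops NA values by default)
  n <- length(s)
  if (n %% 2 == 1) {
    s[(n + 1) / 2]                 # odd count: the single middle value
  } else {
    mean(s[c(n / 2, n / 2 + 1)])   # even count: average of the two middle values
  }
}

odd_x  <- c(1.234, 2.342, 3.4, -4.562, 5.671, 12.345, -14.567)
even_x <- c(1.234, 2.342, -4.562, 5.671, 12.345, -14.567)

manual_median(odd_x)    # 2.342, same as median(odd_x)
manual_median(even_x)   # 1.788, same as median(even_x)
```

In practice median() should be preferred; the helper only makes the sorting and middle-value steps explicit.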

Example of the median function with NA:

The median() function returns NA if NAs are present in the vector, so they have to be handled by using na.rm=TRUE in the median() function.

```
# R median function for an input vector which has NA.

x <- c(1.234, 2.342, -4.562, 5.671, 12.345, -14.567, NA)
median(x, na.rm = TRUE)
```
[1] 1.788
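
To make the effect of na.rm visible, here is a short sketch comparing the default call with the NA-aware call on the same vector. The variable names are only illustrative.

```r
x <- c(1.234, 2.342, -4.562, 5.671, 12.345, -14.567, NA)

with_default <- median(x)                 # NA, because na.rm defaults to FALSE
with_na_rm   <- median(x, na.rm = TRUE)   # 1.788
filtered     <- median(x[!is.na(x)])      # same result, dropping NA by hand

with_default
with_na_rm
filtered
```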

Example of the median() function in an R dataframe:

Let's create a data frame to demonstrate the median() function in R.

```
### create the dataframe
my_basket <- data.frame(
  ITEM_GROUP = c("Fruit","Fruit","Fruit","Fruit","Fruit","Vegetable","Vegetable","Vegetable","Vegetable","Dairy","Dairy","Dairy","Dairy","Dairy"),
  Price = c(100,80,80,90,65,70,60,70,25,60,40,35,50,120),
  MRP = c(101,85,85,96,67,73,65,71,33,64,45,36,54,123))
```

The resultant dataframe has 14 rows and three columns: ITEM_GROUP, Price and MRP.

Median of a column in an R data frame using the median() function:

The median() function takes the column as its argument and finds the median of that particular column.

```
# median() function in R : median of a column in data frame

median(my_basket$Price)
```

So the resultant median of the “Price” column will be

output:

[1] 67.5

Column wise median using the median() function:

The median() function is applied to the required columns through the mapply() function, so that it calculates the median of each required column as shown below.

```
# median() function in R : median of multiple columns in data frame

mapply(median, my_basket[, c("Price", "MRP")])
```

So the resultant medians of the “Price” and “MRP” columns will be 67.5 and 69 respectively.
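
As a variation on the column wise median, the numeric columns can be picked automatically instead of being named. This is a sketch using base R's sapply(); the data frame is rebuilt here (with rep() as a shorthand) so the block runs on its own, and `numeric_cols` and `col_medians` are illustrative names.

```r
# Same data as my_basket above, rebuilt so this block is self-contained
my_basket <- data.frame(
  ITEM_GROUP = rep(c("Fruit", "Vegetable", "Dairy"), times = c(5, 4, 5)),
  Price = c(100, 80, 80, 90, 65, 70, 60, 70, 25, 60, 40, 35, 50, 120),
  MRP   = c(101, 85, 85, 96, 67, 73, 65, 71, 33, 64, 45, 36, 54, 123)
)

# Flag the numeric columns, then take the median of each one
numeric_cols <- sapply(my_basket, is.numeric)
col_medians  <- sapply(my_basket[numeric_cols], median)

col_medians
# Price   MRP
#  67.5  69.0
```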

Median of the column by group using the median() function

The aggregate() function along with the median() function calculates the median of a group. Here the median of the “Price” column for each “ITEM_GROUP” is calculated.

```
##### Median of the column by group
aggregate(x = my_basket$Price,
          by = list(my_basket$ITEM_GROUP),
          FUN = median)
```

ITEM_GROUP has three groups: “Dairy”, “Fruit” and “Vegetable”. The median of Price is calculated for each group: 50 for Dairy, 80 for Fruit and 65 for Vegetable.
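
The same group medians can also be obtained with base R's tapply(), which returns a named result instead of a data frame. This is only an alternative sketch, not the approach the article relies on; the data frame is rebuilt so the block is self-contained.

```r
# Same data as my_basket above, rebuilt so this block is self-contained
my_basket <- data.frame(
  ITEM_GROUP = rep(c("Fruit", "Vegetable", "Dairy"), times = c(5, 4, 5)),
  Price = c(100, 80, 80, 90, 65, 70, 60, 70, 25, 60, 40, 35, 50, 120),
  MRP   = c(101, 85, 85, 96, 67, 73, 65, 71, 33, 64, 45, 36, 54, 123)
)

# Split Price by ITEM_GROUP and apply median() to each piece
group_medians <- tapply(my_basket$Price, my_basket$ITEM_GROUP, median)

group_medians
#     Dairy     Fruit Vegetable
#        50        80        65
```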

Median of the column by group, populated using the median() function:

The group_by() function along with the median() function calculates the median of a group. Here the median of the “Price” column for each “ITEM_GROUP” is calculated and populated across the rows as shown below.

```
#### median of the column by group and populate it using dplyr

library(dplyr)

my_basket %>%
  group_by(ITEM_GROUP) %>%
  mutate(median_by_group = median(Price))
```

ITEM_GROUP has three groups: “Dairy”, “Fruit” and “Vegetable”. The median of Price for each group is calculated and populated in a new median_by_group column, so every row carries the median of its own group.
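
For readers without dplyr, a comparable result can be sketched in base R with ave(), which returns one value per row rather than one per group. This is an assumed alternative, not the article's method; the data frame is rebuilt so the block runs on its own.

```r
# Same data as my_basket above, rebuilt so this block is self-contained
my_basket <- data.frame(
  ITEM_GROUP = rep(c("Fruit", "Vegetable", "Dairy"), times = c(5, 4, 5)),
  Price = c(100, 80, 80, 90, 65, 70, 60, 70, 25, 60, 40, 35, 50, 120),
  MRP   = c(101, 85, 85, 96, 67, 73, 65, 71, 33, 64, 45, 36, 54, 123)
)

# ave() applies median() within each group and repeats it for every row of that group
my_basket$median_by_group <- ave(my_basket$Price, my_basket$ITEM_GROUP, FUN = median)

head(my_basket)
```

Fruit rows receive 80, Vegetable rows 65 and Dairy rows 50.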

Row wise median using the median() function along with dplyr

The row wise median is calculated with the help of the rowwise() function of the dplyr package and the median() function as shown below.

```
## row wise median using dplyr
library(dplyr)

my_basket %>%
  rowwise() %>%
  mutate(
    Median_price = median(c(Price, MRP))
  )
```

The row wise median of “Price” and “MRP” is calculated and populated for each row in a new Median_price column.
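
A base R sketch of the same row wise calculation uses apply() over the rows of the two numeric columns. This is an assumed alternative to the dplyr pipeline; the data frame is rebuilt so the block is self-contained, and `row_medians` is an illustrative name.

```r
# Same data as my_basket above, rebuilt so this block is self-contained
my_basket <- data.frame(
  ITEM_GROUP = rep(c("Fruit", "Vegetable", "Dairy"), times = c(5, 4, 5)),
  Price = c(100, 80, 80, 90, 65, 70, 60, 70, 25, 60, 40, 35, 50, 120),
  MRP   = c(101, 85, 85, 96, 67, 73, 65, 71, 33, 64, 45, 36, 54, 123)
)

# MARGIN = 1 applies median() to each row of the Price/MRP matrix
row_medians <- apply(as.matrix(my_basket[, c("Price", "MRP")]), 1, median)
my_basket$Median_price <- as.numeric(row_medians)

head(my_basket)
```

With only two columns, each row's median is simply the average of Price and MRP, for example 100.5 for the first row.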

For a further understanding of the median() function in R with dplyr, refer to the dplyr documentation.

Author

• Close to 10 years of experience in data science and machine learning. Has worked extensively with programming languages such as R, Python (Pandas), SAS and PySpark.