Summary or Descriptive statistics in R

Descriptive statistics of a data frame in R can be calculated in three different ways. Let’s see how to calculate summary statistics for each column of a data frame in R, with an example for each method.

  • Descriptive statistics with the summary() function in base R
  • Summary statistics with the stat.desc() function from the “pastecs” package
  • Descriptive statistics with the describe() function from the “Hmisc” package

Let’s first create the dataframe.

### Create Data Frame
df1 = data.frame(Name = c('George','Andrea', 'Micheal','Maggie','Ravi','Xien','Jalpa'), 
                 Grade_score=c(4,6,2,9,5,7,8),
                 Mathematics1_score=c(45,78,44,89,66,49,72),
                 Science_score=c(56,52,45,88,33,90,47))
df1

The resulting data frame is:

[Figure: printed output of df1]

 

Descriptive statistics in R (Method 1):

The built-in summary() function calculates the following for each numeric column:

  • minimum value
  • 1st quartile
  • median value
  • mean value
  • 3rd quartile
  • maximum value

For a non-numeric column such as Name, it reports counts or basic type information instead. The call is shown below.

# Summary statistics of dataframe in R

summary(df1)

The summary statistics are:

[Figure: output of summary(df1)]
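
Since Name is not numeric, it can be handy to restrict the summary to the numeric columns, or to pull a single statistic out of the result. The following is a minimal sketch using base R only; the names num_cols and grade_summary are illustrative and not part of the original example.

```r
# Recreate the example data frame so this block runs on its own
df1 <- data.frame(
  Name = c('George', 'Andrea', 'Micheal', 'Maggie', 'Ravi', 'Xien', 'Jalpa'),
  Grade_score = c(4, 6, 2, 9, 5, 7, 8),
  Mathematics1_score = c(45, 78, 44, 89, 66, 49, 72),
  Science_score = c(56, 52, 45, 88, 33, 90, 47)
)

# Logical vector marking which columns are numeric (illustrative name)
num_cols <- sapply(df1, is.numeric)

# Summary of the numeric columns only
summary(df1[, num_cols])

# summary() on a single column returns a named vector,
# so individual statistics can be extracted by name
grade_summary <- summary(df1$Grade_score)
grade_summary[["Median"]]
grade_summary[["1st Qu."]]
```

Extracting values by name like this avoids retyping numbers from the printed table when they are needed later in a script.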

 

Summary / Descriptive statistics in R (Method 2):

The stat.desc() function from the pastecs package does a bit more than the simple summary() function. It also calculates

  • number of missing values (nbr.na) and null values, i.e. zeros (nbr.null), of each column
  • number of non-missing values (nbr.val) of each column
  • sum, range, variance, standard deviation, etc. for each column

# descriptive statistics of dataframe in R

install.packages("pastecs")  
library(pastecs)
stat.desc(df1)

The summary statistics are:

[Figure: output of stat.desc(df1)]
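
To make the meaning of these figures concrete, a few of them can be reproduced with base R alone. This is only a rough sketch and not how pastecs computes them internally; basic_stats, num_df and stats_table are hypothetical names introduced for illustration.

```r
# Recreate the example data frame so this block runs on its own
df1 <- data.frame(
  Name = c('George', 'Andrea', 'Micheal', 'Maggie', 'Ravi', 'Xien', 'Jalpa'),
  Grade_score = c(4, 6, 2, 9, 5, 7, 8),
  Mathematics1_score = c(45, 78, 44, 89, 66, 49, 72),
  Science_score = c(56, 52, 45, 88, 33, 90, 47)
)

# Drop the non-numeric Name column before computing statistics
num_df <- df1[sapply(df1, is.numeric)]

# Hypothetical helper: a handful of the statistics listed above
basic_stats <- function(x) {
  c(n_values  = sum(!is.na(x)),                 # non-missing values
    n_null    = sum(x == 0, na.rm = TRUE),      # zero-valued entries
    n_missing = sum(is.na(x)),                  # missing values
    sum       = sum(x, na.rm = TRUE),
    range     = diff(range(x, na.rm = TRUE)),   # max minus min
    variance  = var(x, na.rm = TRUE),
    std_dev   = sd(x, na.rm = TRUE))
}

# One column of results per numeric column of the data frame
stats_table <- sapply(num_df, basic_stats)
round(stats_table, 2)
```

Comparing this table with the stat.desc() output is a quick way to check which row corresponds to which statistic.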

 

Summary statistics in R (Method 3):

The describe() function from the Hmisc package reports, for each column, the number of distinct values, the frequency of each value, and the proportion of that value within the column, as shown below.

# Summary statistics of dataframe in R 

install.packages("Hmisc")
library(Hmisc)
describe(df1)

The summary statistics are:

[Figure: output of describe(df1)]
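
The distinct count, frequencies and proportions can also be worked out by hand for one column, which helps when reading the describe() output. The sketch below uses base R only and is an approximation of those three quantities, not a reimplementation of Hmisc; the variable names are illustrative.

```r
# Recreate the example data frame so this block runs on its own
df1 <- data.frame(
  Name = c('George', 'Andrea', 'Micheal', 'Maggie', 'Ravi', 'Xien', 'Jalpa'),
  Grade_score = c(4, 6, 2, 9, 5, 7, 8),
  Mathematics1_score = c(45, 78, 44, 89, 66, 49, 72),
  Science_score = c(56, 52, 45, 88, 33, 90, 47)
)

scores <- df1$Grade_score

# Number of distinct values in the column
n_distinct <- length(unique(scores))

# Frequency of each value
freq <- table(scores)

# Proportion of each value within the column
prop <- prop.table(freq)

n_distinct
freq
round(prop, 3)
```

In this small data set every grade occurs exactly once, so all frequencies are 1 and all proportions are equal; with repeated values the table would show higher counts for those entries.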