groupby function in pandas – Group a dataframe in python pandas

The groupby function in pandas: in this section we will learn how to group a DataFrame in Python pandas and apply aggregate functions to each group. We will find the mean of a group, the sum of a group, and the descriptive statistics (including the count) of a group.

We will be working on:

  • getting the mean score of a group using the groupby function in Python
  • getting the sum of scores of a group using the groupby function in Python
  • getting descriptive statistics of a group using the pandas groupby function

 

Create a DataFrame:

import pandas as pd
import numpy as np

#Create a DataFrame
d = {
    'Name':['Alisa','Bobby','Cathrine','Alisa','Bobby','Cathrine',
            'Alisa','Bobby','Cathrine','Alisa','Bobby','Cathrine'],
    'Exam':['Semester 1','Semester 1','Semester 1','Semester 1','Semester 1','Semester 1',
            'Semester 2','Semester 2','Semester 2','Semester 2','Semester 2','Semester 2'],
    
    'Subject':['Mathematics','Mathematics','Mathematics','Science','Science','Science',
               'Mathematics','Mathematics','Mathematics','Science','Science','Science'],
    'Score':[62,47,55,74,31,77,85,63,42,67,89,81]}

df = pd.DataFrame(d,columns=['Name','Exam','Subject','Score'])
print(df)

The resulting DataFrame has 12 rows and four columns: Name, Exam, Subject and Score.

 

Get mean score of a group using groupby function in pandas:

Now let's group by the name of the student and find each student's average score:

# mean score of Students

df['Score'].groupby([df['Name']]).mean()

The result is a Series indexed by Name: Alisa 72.00, Bobby 57.50 and Cathrine 63.75.
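As a minimal sketch, the same grouping can be written in two ways: grouping the Score Series by the Name Series, as above, or grouping the DataFrame by the column label and then selecting Score. The variable names by_series and by_frame are illustrative, and the DataFrame is rebuilt compactly with only the two columns this example needs.

```python
import pandas as pd

# Same names and scores as the tutorial DataFrame, written compactly.
df = pd.DataFrame({
    'Name': ['Alisa', 'Bobby', 'Cathrine'] * 4,
    'Score': [62, 47, 55, 74, 31, 77, 85, 63, 42, 67, 89, 81],
})

# Form used in the tutorial: group a Series by another Series.
by_series = df['Score'].groupby(df['Name']).mean()

# Commonly seen alternative: group the DataFrame by column label,
# then select the column to aggregate.
by_frame = df.groupby('Name')['Score'].mean()

print(by_frame)
print(by_series.equals(by_frame))
```

Both calls return a Series of mean scores indexed by Name. The second form is often easier to read when there are several grouping columns.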

 

Get sum of score of a group using groupby function in pandas:

Now let's group by both the name of the student and the exam, and find the sum of scores within each group:

# sum of score group by Name and Exam

df['Score'].groupby([df['Name'],df['Exam']]).sum()

The result is a Series with a two-level index (Name, Exam). Alisa totals 136 in Semester 1 and 152 in Semester 2, Bobby 78 and 152, and Cathrine 132 and 123.
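Grouping by two keys produces a Series with a two-level index, which can be awkward to read or export. The sketch below shows two optional ways to reshape it, assuming the same data as above: unstack moves the Exam level into columns, and reset_index turns both levels back into ordinary columns. The names sums, wide and flat are illustrative.

```python
import pandas as pd

# Same names, exams and scores as the tutorial DataFrame.
df = pd.DataFrame({
    'Name': ['Alisa', 'Bobby', 'Cathrine'] * 4,
    'Exam': ['Semester 1'] * 6 + ['Semester 2'] * 6,
    'Score': [62, 47, 55, 74, 31, 77, 85, 63, 42, 67, 89, 81],
})

# Sum of scores per (Name, Exam) pair; the index has two levels.
sums = df.groupby(['Name', 'Exam'])['Score'].sum()

# One row per student, one column per exam.
wide = sums.unstack('Exam')

# One row per (Name, Exam) pair, with the keys as regular columns.
flat = sums.reset_index()

print(wide)
print(flat)
```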

 

Group the entire dataframe by Subject and Exam:

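Now let's group the entire DataFrame by subject and exam and then find the sum of the students' scores. Passing numeric_only=True restricts the sum to the numeric Score column, so the Name strings are left out of the result.

# group the entire dataframe by Subject and Exam

df.groupby(['Subject', 'Exam']).sum(numeric_only=True)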

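The result has one row per (Subject, Exam) pair. The Score totals are 164 for Mathematics in Semester 1, 190 for Mathematics in Semester 2, 182 for Science in Semester 1 and 237 for Science in Semester 2.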
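When the whole DataFrame is grouped, every non-key column is aggregated, including the string column Name. Depending on the pandas version, summing a string column may concatenate the names or drop the column. The sketch below, which assumes a pandas version where sum accepts numeric_only, shows two ways to keep only the Score totals. The names totals and totals_selected are illustrative.

```python
import pandas as pd

# Same data as the tutorial DataFrame, written compactly.
df = pd.DataFrame({
    'Name': ['Alisa', 'Bobby', 'Cathrine'] * 4,
    'Exam': ['Semester 1'] * 6 + ['Semester 2'] * 6,
    'Subject': (['Mathematics'] * 3 + ['Science'] * 3) * 2,
    'Score': [62, 47, 55, 74, 31, 77, 85, 63, 42, 67, 89, 81],
})

# Option 1: aggregate only numeric columns.
totals = df.groupby(['Subject', 'Exam']).sum(numeric_only=True)

# Option 2: select the columns to aggregate explicitly.
totals_selected = df.groupby(['Subject', 'Exam'])[['Score']].sum()

print(totals)
```

Selecting the columns explicitly (option 2) makes the intent visible in the code and does not rely on version-specific defaults.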

 

Descriptive statistics of the group:

Now let's group by subject and find the descriptive statistics of each group, as shown below:


# descriptive statistics by group - subject

df['Score'].groupby(df['Subject']).describe()

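The result reports the count, mean, standard deviation, minimum, quartiles and maximum of Score for each subject. Mathematics has 6 scores with a mean of 59.0 (minimum 42, maximum 85), and Science has 6 scores with a mean of about 69.83 (minimum 31, maximum 89).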
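If only a few of these statistics are needed, such as the count, mean and sum mentioned at the start of this section, agg can compute them in one call. This is a small sketch on the same data, and the names stats and desc are illustrative.

```python
import pandas as pd

# Same subjects and scores as the tutorial DataFrame.
df = pd.DataFrame({
    'Subject': (['Mathematics'] * 3 + ['Science'] * 3) * 2,
    'Score': [62, 47, 55, 74, 31, 77, 85, 63, 42, 67, 89, 81],
})

# Pick specific aggregates: one column per function name.
stats = df.groupby('Subject')['Score'].agg(['count', 'mean', 'sum'])

# Full descriptive summary, for comparison.
desc = df.groupby('Subject')['Score'].describe()

print(stats)
print(desc)
```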

 


Author

  • Sridhar Venkatachalam

    Close to 10 years of experience in data science and machine learning. Has worked extensively with programming languages such as R, Python (pandas), SAS and PySpark.