Binning or Bucketing of a column in pandas python

Bucketing or binning splits a continuous variable into discrete chunks. Let's see how to bucket or bin a column of a dataframe in pandas python.

First let’s create a dataframe.

import pandas as pd

# Create a DataFrame of names and scores
data = {
    'Name': ['George', 'Andrea', 'micheal', 'maggie', 'Ravi', 'Xien', 'Jalpa', 'Tyieren'],
    'Score': [63, 48, 56, 75, 32, 77, 85, 22]
}

df1 = pd.DataFrame(data, columns=['Name', 'Score'])
print(df1)

The resulting dataframe is:

[Figure: Binning of a column in pandas python 1]

 

Binning or bucketing in pandas python with range values:

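Binning with predefined bin edges produces a new column that holds the bin range each score falls into, as shown below. By default pd.cut builds right-closed intervals, so with the edges used here a score of 75 falls in (50, 75] and a score of 0 would fall outside every bin.

# Binning or bucketing with ranges
bins = [0, 25, 50, 75, 100]
df1['binned'] = pd.cut(df1['Score'], bins)
print(df1)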

The result is:

[Figure: Binning of a column in pandas python 2]
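How values that sit exactly on a bin edge are handled is easy to get wrong, so here is a minimal sketch of the three common settings of pd.cut. The `scores` series is made-up illustration data (not the dataframe above), chosen so that every value lies on an edge.

```python
import pandas as pd

# Hypothetical scores that all sit exactly on a bin edge
scores = pd.Series([0, 25, 50, 75, 100])
bins = [0, 25, 50, 75, 100]

# Default: right-closed intervals (0, 25], (25, 50], ...
# A score of 0 is outside the first interval and becomes NaN
default_bins = pd.cut(scores, bins)

# include_lowest=True also closes the first interval on the left,
# so a score of 0 lands in the first bin
with_lowest = pd.cut(scores, bins, include_lowest=True)

# right=False: left-closed intervals [0, 25), [25, 50), ...
# A score of 100 is outside the last interval and becomes NaN
left_closed = pd.cut(scores, bins, right=False)

print(pd.DataFrame({
    'Score': scores,
    'default': default_bins,
    'include_lowest': with_lowest,
    'right=False': left_closed,
}))
```

If the scores can legitimately be 0, include_lowest=True is probably the simplest way to keep them from turning into missing values.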

 

Binning or bucketing in pandas python with labels:

We will assign a custom label to each bin, so the labels appear in the column instead of the bin ranges, as shown below. The number of labels must be one fewer than the number of bin edges.

# Binning or bucketing with labels
bins = [0, 25, 50, 75, 100]
labels = [1, 2, 3, 4]
df1['binned'] = pd.cut(df1['Score'], bins, labels=labels)
print(df1)

The result is:

[Figure: Binning of a column in pandas python 3]
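Labels do not have to be numbers. The sketch below reuses the same scores with text labels, counts how many rows fall into each bin, and also shows labels=False, which returns the integer position of each bin instead of a label. The label names and the `grade` and `bin_index` column names are illustrative assumptions, not part of the original example.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['George', 'Andrea', 'micheal', 'maggie', 'Ravi', 'Xien', 'Jalpa', 'Tyieren'],
    'Score': [63, 48, 56, 75, 32, 77, 85, 22],
})

bins = [0, 25, 50, 75, 100]
# Hypothetical text labels, one per bin (4 bins from 5 edges)
labels = ['low', 'medium', 'high', 'very high']

# The result is a categorical column holding the label of each bin
df['grade'] = pd.cut(df['Score'], bins, labels=labels)

# Number of rows per bin
counts = df['grade'].value_counts()

# labels=False returns the integer position of the bin (0 for the first bin)
df['bin_index'] = pd.cut(df['Score'], bins, labels=False)

print(df)
print(counts)
```

Text labels tend to read better in reports, while the integer positions from labels=False can be convenient when the bins feed into further numeric processing.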

 


Author

  • Sridhar Venkatachalam

    With close to 10 years of experience in data science and machine learning, he has worked extensively with programming languages such as R, Python (Pandas), SAS, and PySpark.