Delete or Drop the duplicate row of a dataframe in python pandas

In this tutorial we will learn how to delete or drop the duplicate rows of a dataframe in python pandas, with examples using the drop_duplicates() function. Let's learn how to

  • Drop the duplicate rows
  • Drop the duplicate by a column name

Create dataframe:

import pandas as pd

# Create a DataFrame that contains some repeated rows
d = {
    'Name':['Alisa','Bobby','jodha','jack','raghu','Cathrine',
            'Alisa','Bobby','kumar','Alisa','Alex','Cathrine'],
    'Age':[26,24,23,22,23,24,26,24,22,23,24,24],
    'Score':[85,63,55,74,31,77,85,63,42,62,89,77]}

df = pd.DataFrame(d,columns=['Name','Age','Score'])
df

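The resulting dataframe has 12 rows and three columns (Name, Age, Score). Rows 6, 7 and 11 are exact duplicates of rows 0, 1 and 5, and the name Alisa appears a third time in row 9 with a different Age and Score.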
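Before dropping anything, it can help to see which rows pandas treats as duplicates. The sketch below rebuilds the same dataframe and uses duplicated(), which returns a boolean mask; the variable names dup_mask, dup_count and dup_rows are only illustrative.

```python
import pandas as pd

d = {
    'Name': ['Alisa', 'Bobby', 'jodha', 'jack', 'raghu', 'Cathrine',
             'Alisa', 'Bobby', 'kumar', 'Alisa', 'Alex', 'Cathrine'],
    'Age': [26, 24, 23, 22, 23, 24, 26, 24, 22, 23, 24, 24],
    'Score': [85, 63, 55, 74, 31, 77, 85, 63, 42, 62, 89, 77],
}
df = pd.DataFrame(d, columns=['Name', 'Age', 'Score'])

# True for every row that repeats an earlier row in all columns
dup_mask = df.duplicated()

# Number of rows that a plain drop_duplicates() would remove
dup_count = int(dup_mask.sum())

# The flagged rows themselves (index labels 6, 7 and 11)
dup_rows = df[dup_mask]

print(dup_count)
print(dup_rows)
```

Row 9 (Alisa, 23, 62) is not flagged, because duplicated() compares all columns by default and only the name repeats there.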

 

 

Drop the duplicate rows:

Now let's simply drop the duplicate rows in pandas as shown below

# drop duplicate rows

df.drop_duplicates()

In the above example the first occurrence of each duplicate row is kept and the subsequent occurrences are deleted. Rows 6, 7 and 11 are removed, so the output has 9 rows, with index labels 0, 1, 2, 3, 4, 5, 8, 9 and 10.
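Note that drop_duplicates() returns a new dataframe and leaves df itself unchanged, so the result has to be assigned to a name to be kept. A minimal sketch, where deduped and deduped_renumbered are illustrative names and the call to reset_index(drop=True) is an optional extra step for renumbering the rows:

```python
import pandas as pd

d = {
    'Name': ['Alisa', 'Bobby', 'jodha', 'jack', 'raghu', 'Cathrine',
             'Alisa', 'Bobby', 'kumar', 'Alisa', 'Alex', 'Cathrine'],
    'Age': [26, 24, 23, 22, 23, 24, 26, 24, 22, 23, 24, 24],
    'Score': [85, 63, 55, 74, 31, 77, 85, 63, 42, 62, 89, 77],
}
df = pd.DataFrame(d, columns=['Name', 'Age', 'Score'])

# Keep the result: the original df still has all 12 rows afterwards
deduped = df.drop_duplicates()

# The surviving rows keep their original index labels, so there are gaps
print(list(deduped.index))

# Optionally renumber the rows 0..8 and discard the old labels
deduped_renumbered = deduped.reset_index(drop=True)

print(len(df), len(deduped))
```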

 

Drop the duplicate by retaining last occurrence:

# drop duplicate rows, keeping the last occurrence

df.drop_duplicates(keep='last')

In the above example the keep='last' argument keeps the last occurrence of each duplicate row and deletes the earlier ones. Rows 0, 1 and 5 are removed this time, so the output has 9 rows, with index labels 2, 3, 4, 6, 7, 8, 9, 10 and 11.
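The effect of the keep argument is easiest to see by comparing the index labels that survive. The sketch below also tries keep=False, a third accepted value not covered above, which drops every row that has a duplicate. The names first_idx, last_idx and none_idx are illustrative.

```python
import pandas as pd

d = {
    'Name': ['Alisa', 'Bobby', 'jodha', 'jack', 'raghu', 'Cathrine',
             'Alisa', 'Bobby', 'kumar', 'Alisa', 'Alex', 'Cathrine'],
    'Age': [26, 24, 23, 22, 23, 24, 26, 24, 22, 23, 24, 24],
    'Score': [85, 63, 55, 74, 31, 77, 85, 63, 42, 62, 89, 77],
}
df = pd.DataFrame(d, columns=['Name', 'Age', 'Score'])

# keep='first' (the default): the earliest copy of each duplicate survives
first_idx = list(df.drop_duplicates(keep='first').index)

# keep='last': the latest copy of each duplicate survives
last_idx = list(df.drop_duplicates(keep='last').index)

# keep=False: every row that has a duplicate is dropped, none survive
none_idx = list(df.drop_duplicates(keep=False).index)

print(first_idx)
print(last_idx)
print(none_idx)
```

Because whole-row duplicates are identical in every column, keep='first' and keep='last' return the same set of values here; only the index labels and the row order differ.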

 

Drop the duplicate by column:

Now let's drop duplicate rows based on a single column. Only that column is compared, so rows are dropped until each value in the column appears exactly once, as shown below


# drop duplicates by a column name, keeping the last row for each Name

df.drop_duplicates(subset=['Name'], keep='last')

In the above example rows are deleted so that the Name column contains only unique values. Because keep='last' is passed, the last row for each name is retained: row 9 for Alisa, row 7 for Bobby and row 11 for Cathrine. The result has 8 rows.
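When only some columns are compared, the choice of keep changes which values survive, not just which index labels. The following sketch shows this for the Alisa rows and also passes two columns to subset; the variable names are illustrative.

```python
import pandas as pd

d = {
    'Name': ['Alisa', 'Bobby', 'jodha', 'jack', 'raghu', 'Cathrine',
             'Alisa', 'Bobby', 'kumar', 'Alisa', 'Alex', 'Cathrine'],
    'Age': [26, 24, 23, 22, 23, 24, 26, 24, 22, 23, 24, 24],
    'Score': [85, 63, 55, 74, 31, 77, 85, 63, 42, 62, 89, 77],
}
df = pd.DataFrame(d, columns=['Name', 'Age', 'Score'])

# One row per Name, keeping the last row seen for each name
by_name_last = df.drop_duplicates(subset=['Name'], keep='last')

# One row per Name, keeping the first row seen for each name
by_name_first = df.drop_duplicates(subset=['Name'], keep='first')

# Alisa has rows (26, 85), (26, 85) and (23, 62): the kept Score differs
alisa_last = by_name_last[by_name_last['Name'] == 'Alisa'].iloc[0]
alisa_first = by_name_first[by_name_first['Name'] == 'Alisa'].iloc[0]
print(alisa_last['Score'], alisa_first['Score'])

# Rows count as duplicates only if both Name and Age match
by_name_age = df.drop_duplicates(subset=['Name', 'Age'])
print(len(by_name_last), len(by_name_age))
```

With subset=['Name', 'Age'] the third Alisa row (age 23) is kept alongside the first one (age 26), so 9 rows remain instead of 8.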

 
