Intersection of two dataframes in pandas (Python)

The intersection of two dataframes in pandas is carried out using the merge() function. Called with how='inner', merge() keeps only the rows whose values in the shared columns are present in both dataframes. An example makes this clear.

Intersection of two dataframes in pandas Python:

pandas has no dedicated intersection function for dataframes, so the intersection is achieved in a roundabout way using the merge() function. Let's see an example.

First, let's create two dataframes.

import pandas as pd

# Column data for the two DataFrames
data1 = {
    'Subject': ['semester1', 'semester2', 'semester3', 'semester4',
                'semester1', 'semester2', 'semester3'],
    'Score': [62, 47, 55, 74, 31, 77, 85]}

data2 = {
    'Subject': ['semester1', 'semester2', 'semester3', 'semester4'],
    'Score': [90, 47, 85, 74]}

# Build the DataFrames with an explicit column order
df1 = pd.DataFrame(data1, columns=['Subject', 'Score'])
df2 = pd.DataFrame(data2, columns=['Subject', 'Score'])

print(df1)
print(df2)

df1 will be

     Subject  Score
0  semester1     62
1  semester2     47
2  semester3     55
3  semester4     74
4  semester1     31
5  semester2     77
6  semester3     85

df2 will be

     Subject  Score
0  semester1     90
1  semester2     47
2  semester3     85
3  semester4     74

 

Intersection of dataframes in pandas:

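The merge() function in pandas can be used to create the intersection of two dataframes by passing how='inner', as shown below. Because no join columns are named, merge() joins on every column the two dataframes have in common, here both Subject and Score.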


intersected_df = pd.merge(df1, df2, how='inner')
print(intersected_df)

So the intersected dataframe contains only the rows whose Subject and Score both appear in df1 and df2:

     Subject  Score
0  semester2     47
1  semester4     74
2  semester3     85
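The merge above relies on the default behaviour of joining on all shared columns. As a minimal sketch, the same intersection can be written with the key columns named explicitly through the on argument, which makes the intent easier to read. The variable names explicit_df, unique_df and subject_df are illustrative only and are not part of the original example. The sketch also shows two variations: dropping duplicate rows first for a set-style intersection, and joining on Subject alone, which answers a different question.

```python
import pandas as pd

# Same data as in the example above
df1 = pd.DataFrame({
    'Subject': ['semester1', 'semester2', 'semester3', 'semester4',
                'semester1', 'semester2', 'semester3'],
    'Score': [62, 47, 55, 74, 31, 77, 85]})
df2 = pd.DataFrame({
    'Subject': ['semester1', 'semester2', 'semester3', 'semester4'],
    'Score': [90, 47, 85, 74]})

# Name the key columns explicitly instead of relying on the default
# (all columns that the two dataframes have in common)
explicit_df = pd.merge(df1, df2, how='inner', on=['Subject', 'Score'])

# Set-style intersection: drop repeated rows on both sides first, so a row
# that occurs several times in the inputs appears only once in the result
unique_df = pd.merge(df1.drop_duplicates(), df2.drop_duplicates(),
                     how='inner', on=['Subject', 'Score'])

# Joining on 'Subject' alone answers a different question: which subjects
# occur in both dataframes, whatever the score
subject_df = pd.merge(df1, df2, how='inner', on='Subject',
                      suffixes=('_df1', '_df2'))

print(explicit_df)
print(unique_df)
print(subject_df)
```

With this data neither dataframe contains repeated rows, so explicit_df and unique_df hold the same three rows. The drop_duplicates() step only matters when the inputs can contain the same row more than once, in which case a plain inner merge would repeat the matching rows.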

 


Author

  • Sridhar Venkatachalam

    Close to 10 years of experience in data science and machine learning; has worked extensively with programming languages such as R, Python (Pandas), SAS, and Pyspark.