How to join (merge) data frames (inner, outer, right, left join) in pandas python

We can merge two data frames in pandas python by using the merge() function. The different arguments to merge() allow you to perform natural join,  left join, right join, and full outer join in pandas.

  • left− Dataframe1.
  • right– Dataframe2.
  • on− Columns (names) to join on. Must be found in both the left and right DataFrame objects.
  • how – type of join needs to be performed – ‘left’, ‘right’, ‘outer’, ‘inner’, Default is inner join

The data frames must have same column names on which the merging happens. Merge() Function in pandas is similar to database join operation in SQL.

UNDERSTANDING THE DIFFERENT TYPES OF MERGE:

  • Natural join: To keep only rows that match from the data frames, specify the argument how=‘inner’.
  • Full outer join:To keep all rows from both data frames, specify how=‘outer’.
  • Left outer join:To include all the rows of your data frame x and only those from y that match, specify how=‘left’.
  • Right outer join:To include all the rows of your data frame y and only those from x that match, specify how=‘right’.

join or merge in python pandas 1

 

Lets try different Merge or join operation with an example:

Create dataframe:

import pandas as pd
import numpy as np

# data frame 1
d1 = {'Customer_id':pd.Series([1,2,3,4,5,6]),
  'Product':pd.Series(['Oven','Oven','Oven','Television','Television','Television'])}
df1 = pd.DataFrame(d1)


# data frame 2
d2 = {'Customer_id':pd.Series([2,4,6]),
    'State':pd.Series(['California','California','Texas'])}
df2 = pd.DataFrame(d2)

so we will get following two data frames

df1:

  Customer_id   Product
0   1        Oven
1   2        Oven
2   3        Oven
3   4       Television
4   5       Television
5   6       Television

df2:

  Customer_id   State
0   2        California
1   4        California
2   6        Texas

 

 

Inner join pandas: Return only the rows in which the left table have matching keys in the right table

#inner join in python pandas
print pd.merge(df1, df2, on='Customer_id', how='inner')

the resultant data frame df will be

    Customer_id       Product           State

0            2                    Oven             California

1            4                    Television     California

2            6                    Television     Texas

 

 

Outer join pandas: Returns all rows from both tables, join records from the left which have matching keys in the right table.

# outer join in python pandas
print pd.merge(df1, df2, on='Customer_id', how='outer')

the resultant data frame df will be

    Customer_id        Product           State

0            1                   Oven               NaN

1            2                    Oven              California

2            3                     Oven              NaN

3            4                 Television          California

4            5                 Television          NaN

5            6                 Television          Texas

 

 

Left outer join pandas: Return all rows from the left table, and any rows with matching keys from the right table.

# left join in python
print pd.merge(df1, df2, on='Customer_id', how='left')

the resultant data frame df will be

  Customer_id      Product           State

0            1               Oven              NaN

1            2               Oven              California

2            3               Oven               NaN

3            4            Television          California

4            5            Television          NaN

5            6            Television          Texas

 

 

Right outer join pandas: Return all rows from the right table, and any rows with matching keys from the left table.

# right join in python pandas

print pd.merge(df1, df2, on='Customer_id', how='right')

the resultant data frame df will be

     Customer_id         Product          State

0            2                       Oven             California

1            4                   Television         California

2            6                    Television        Texas

 

previous How to join (merge) data frames (inner, outer, left, right) in pandas python                                                                                                                next How to join (merge) data frames (inner, outer, left, right) in pandas python