Union and Union all in a Pandas DataFrame in Python

In pandas, a union all of two DataFrames is done with the concat() function, which stacks the rows of both and keeps duplicates. A union is the same operation with duplicate rows removed, so it is done with concat() followed by drop_duplicates(). The examples below show both, with and without resetting the index.

Union and union all in Pandas dataframe Python:

A union all of two DataFrames in pandas is done with the concat() function. Let's see this with an example. First, let's create two DataFrames.

import pandas as pd

# Raw data for the two DataFrames
data1 = {
    'Subject': ['semester1', 'semester2', 'semester3', 'semester4',
                'semester1', 'semester2', 'semester3'],
    'Score': [62, 47, 55, 74, 31, 77, 85]}

data2 = {
    'Subject': ['semester1', 'semester2', 'semester3', 'semester4'],
    'Score': [90, 47, 85, 74]}

# Create the DataFrames with an explicit column order
df1 = pd.DataFrame(data1, columns=['Subject', 'Score'])
df2 = pd.DataFrame(data2, columns=['Subject', 'Score'])

print(df1)
print(df2)

df1 will be

     Subject  Score
0  semester1     62
1  semester2     47
2  semester3     55
3  semester4     74
4  semester1     31
5  semester2     77
6  semester3     85

df2 will be

     Subject  Score
0  semester1     90
1  semester2     47
2  semester3     85
3  semester4     74

 

Union all of dataframes in pandas:

[Figure: UNION ALL]

The concat() function in pandas stacks the rows of the two DataFrames, which gives their union all.

# Union all in pandas
df_union_all = pd.concat([df1, df2])
print(df_union_all)

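The union all of df1 and df2 keeps the duplicate rows, and every row keeps its original index label. The resulting DataFrame will be

     Subject  Score
0  semester1     62
1  semester2     47
2  semester3     55
3  semester4     74
4  semester1     31
5  semester2     77
6  semester3     85
0  semester1     90
1  semester2     47
2  semester3     85
3  semester4     74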
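As a quick sanity check on the union all result, the sketch below rebuilds df1 and df2 from this article and confirms that concat() keeps every row, duplicates included. The variable name n_repeated is only illustrative.

```python
import pandas as pd

# Same data as df1 and df2 in this article
df1 = pd.DataFrame({
    'Subject': ['semester1', 'semester2', 'semester3', 'semester4',
                'semester1', 'semester2', 'semester3'],
    'Score': [62, 47, 55, 74, 31, 77, 85]})
df2 = pd.DataFrame({
    'Subject': ['semester1', 'semester2', 'semester3', 'semester4'],
    'Score': [90, 47, 85, 74]})

df_union_all = pd.concat([df1, df2])

# Union all keeps every row, so the row counts simply add up
print(len(df1), len(df2), len(df_union_all))

# Count rows that repeat an earlier row in both Subject and Score
n_repeated = df_union_all.duplicated().sum()
print(n_repeated)
```

With this data the combined frame has 11 rows, 3 of which repeat an earlier row. Those 3 are the rows that a union would drop.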

 

Union all of dataframes in pandas with reindexing:

Passing ignore_index=True to concat() creates the union all of the two DataFrames and replaces the original index labels with a new 0-based index.

# Union all with reindexing in pandas
df_union_all = pd.concat([df1, df2], ignore_index=True)
print(df_union_all)

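The union all of df1 and df2 keeps the duplicate rows, and the index now runs from 0 to 10. The resulting DataFrame will be

      Subject  Score
0   semester1     62
1   semester2     47
2   semester3     55
3   semester4     74
4   semester1     31
5   semester2     77
6   semester3     85
7   semester1     90
8   semester2     47
9   semester3     85
10  semester4     74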
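To see why reindexing matters, the sketch below compares the index of the two union all results. It assumes the same df1 and df2 as above. The names stacked and reindexed are illustrative.

```python
import pandas as pd

# Same data as df1 and df2 in this article
df1 = pd.DataFrame({
    'Subject': ['semester1', 'semester2', 'semester3', 'semester4',
                'semester1', 'semester2', 'semester3'],
    'Score': [62, 47, 55, 74, 31, 77, 85]})
df2 = pd.DataFrame({
    'Subject': ['semester1', 'semester2', 'semester3', 'semester4'],
    'Score': [90, 47, 85, 74]})

stacked = pd.concat([df1, df2])                       # keeps original labels
reindexed = pd.concat([df1, df2], ignore_index=True)  # fresh 0-based labels

# Without ignore_index the labels 0 to 3 appear twice
print(stacked.index.is_unique)
print(reindexed.index.is_unique)

# A label lookup on the repeated index returns one row from each source frame
rows_labelled_0 = stacked.loc[[0]]
print(rows_labelled_0)
```

A repeated index is not an error, but label-based lookups then return several rows, which is easy to overlook in later steps.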

 

 

Union of dataframes in pandas:

[Figure: UNION]

concat() followed by drop_duplicates() combines the two DataFrames and removes duplicate rows, which gives their union.

# Union in pandas
df_union = pd.concat([df1, df2]).drop_duplicates()
print(df_union)

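The union of df1 and df2 is created by removing duplicate rows, and the remaining rows keep their original index labels. The resulting DataFrame will be

     Subject  Score
0  semester1     62
1  semester2     47
2  semester3     55
3  semester4     74
4  semester1     31
5  semester2     77
6  semester3     85
0  semester1     90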
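By default drop_duplicates() compares whole rows, so a row from df2 is dropped only when both Subject and Score match a row seen earlier. The sketch below contrasts that default with the subset parameter of drop_duplicates(), using the same df1 and df2. The name by_subject is illustrative, and deduplicating on Subject alone is shown only for contrast, not as part of the union.

```python
import pandas as pd

# Same data as df1 and df2 in this article
df1 = pd.DataFrame({
    'Subject': ['semester1', 'semester2', 'semester3', 'semester4',
                'semester1', 'semester2', 'semester3'],
    'Score': [62, 47, 55, 74, 31, 77, 85]})
df2 = pd.DataFrame({
    'Subject': ['semester1', 'semester2', 'semester3', 'semester4'],
    'Score': [90, 47, 85, 74]})

# Union: rows count as duplicates only if every column matches
df_union = pd.concat([df1, df2]).drop_duplicates()
print(len(df_union))

# For contrast: treat rows as duplicates when Subject alone matches,
# keeping the first occurrence of each Subject
by_subject = pd.concat([df1, df2]).drop_duplicates(subset=['Subject'])
print(len(by_subject))
```

The row (semester1, 90) from df2 survives the union because no row in df1 has that exact Subject and Score pair.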

 

Union of dataframes in pandas with reindexing:

concat() followed by drop_duplicates() creates the union of the two DataFrames without duplicates. Passing ignore_index=True to concat() also replaces the original index labels with a new 0-based index before the duplicates are dropped.

# Union with reindexing in pandas
df_union = pd.concat([df1, df2], ignore_index=True).drop_duplicates()
print(df_union)

The union of df1 and df2 is created by removing duplicate rows, and the index is also changed. The resulting DataFrame will be

     Subject  Score
0  semester1     62
1  semester2     47
2  semester3     55
3  semester4     74
4  semester1     31
5  semester2     77
6  semester3     85
7  semester1     90
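One caveat worth checking: ignore_index=True renumbers the rows before the duplicates are dropped, so the final index is contiguous only when the dropped rows happen to sit at the end, as they do here. The sketch below concatenates the same two DataFrames in the opposite order to show gaps appearing, then resets the index after dropping. The names gappy and clean are illustrative.

```python
import pandas as pd

# Same data as df1 and df2 in this article
df1 = pd.DataFrame({
    'Subject': ['semester1', 'semester2', 'semester3', 'semester4',
                'semester1', 'semester2', 'semester3'],
    'Score': [62, 47, 55, 74, 31, 77, 85]})
df2 = pd.DataFrame({
    'Subject': ['semester1', 'semester2', 'semester3', 'semester4'],
    'Score': [90, 47, 85, 74]})

# df2 first: the duplicates now fall in the middle of the combined frame
gappy = pd.concat([df2, df1], ignore_index=True).drop_duplicates()
print(gappy.index.tolist())

# Resetting the index after dropping gives contiguous labels
clean = pd.concat([df2, df1]).drop_duplicates().reset_index(drop=True)
print(clean.index.tolist())
```

Recent pandas versions (1.0 and later) also accept ignore_index=True directly on drop_duplicates(), which has the same effect as the reset_index(drop=True) call used here.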

 


Author

  • Sridhar Venkatachalam

    Close to 10 years of experience in data science and machine learning. Has worked extensively with programming languages such as R, Python (Pandas), SAS and PySpark.