Union and union_all Function in R using dplyr (union of data frames):

Union and union_all Function in R: The union of two data frames in R can be obtained with the union() and union_all() functions from the dplyr package. The union of data frames can also be accomplished with base R functions such as merge() and rbind().

 

Union function in R:

The union() function in R combines all rows from both tables and removes duplicate records from the combined dataset.

union_all function in R:

The union_all() function in R combines all rows from both tables without removing duplicate records from the combined dataset.

 

Union Function in R example: First, let's create two data frames

#Create two data frames

df1 = data.frame(CustomerId = c(1:6), Product = c(rep("Oven", 3), rep("Television", 3)))
df2 = data.frame(CustomerId = c(4:7), Product = c(rep("Television", 2), rep("Air conditioner", 2)))

df1 will be

  CustomerId    Product
1          1       Oven
2          2       Oven
3          3       Oven
4          4 Television
5          5 Television
6          6 Television

df2 will be

  CustomerId         Product
1          4      Television
2          5      Television
3          6 Air conditioner
4          7 Air conditioner

 

Union Function in R: The union() function combines all rows from both tables and removes duplicate records from the combined dataset, so the resulting data frame has no duplicate rows. Once dplyr is loaded, calling union() on two data frames uses the dplyr method.

library(dplyr)

#  union two dataframes  without duplicates
union(df1,df2)

The result has 8 rows: the two records present in both data frames (CustomerId 4 and 5, Television) are kept only once.
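As a quick check on the union() result, the sketch below rebuilds the example data and counts the rows. It assumes dplyr is installed; the name df_u is chosen only for illustration, and dplyr::union() is written with its namespace simply to make clear which union() is meant.

```r
library(dplyr)

# Same example data as above
df1 <- data.frame(CustomerId = 1:6,
                  Product = c(rep("Oven", 3), rep("Television", 3)))
df2 <- data.frame(CustomerId = 4:7,
                  Product = c(rep("Television", 2), rep("Air conditioner", 2)))

# Union of the two data frames, duplicates removed
df_u <- dplyr::union(df1, df2)

nrow(df_u)           # 8: 6 + 4 rows minus the 2 rows shared by both
anyDuplicated(df_u)  # 0 means no duplicated rows remain
```

The order of the rows in the result may differ between dplyr versions, so it is safer to rely on the row count and contents than on row positions.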

 

union_all Function in R example: The union_all() function combines all rows from both tables without removing duplicate records from the combined dataset, so the resulting data frame can contain duplicate rows.

library(dplyr)

#  union two dataframes  with duplicates
union_all(df1,df2)

The result has all 10 rows: the records for CustomerId 4 and 5 (Television) appear twice, once from each data frame.
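To see which records union_all() keeps twice, a minimal sketch using base R's duplicated() is shown below. It assumes dplyr is installed; df_ua and dup_rows are illustrative names, not part of the original example.

```r
library(dplyr)

# Same example data as above
df1 <- data.frame(CustomerId = 1:6,
                  Product = c(rep("Oven", 3), rep("Television", 3)))
df2 <- data.frame(CustomerId = 4:7,
                  Product = c(rep("Television", 2), rep("Air conditioner", 2)))

# Union all: every row from both inputs is kept
df_ua <- union_all(df1, df2)
nrow(df_ua)  # 10

# Rows that repeat an earlier row, i.e. the records shared by df1 and df2
dup_rows <- df_ua[duplicated(df_ua), ]
dup_rows
```

Note that CustomerId 6 is not a duplicate: it is paired with Television in df1 and with Air conditioner in df2, so the two rows differ.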

 

 

Other methods for union all of the dataframe in R : rbind()

A union all of data frames can also be done in base R with the rbind() function. Row bind (rbind) stacks the rows of all the data frames, keeping duplicates, as shown below

 # union all in R - union all of data frames by rbind() in R 

df_unionall = rbind(df1,df2)  
df_unionall
 

The resulting data frame has all 10 rows, with the duplicate records retained.
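One point worth knowing when using rbind() this way: for data frames, columns are matched by name, and mismatched names raise an error. The sketch below illustrates this with the example data; df2_reordered, df3 and res are hypothetical names introduced only for this illustration.

```r
# Same example data as above
df1 <- data.frame(CustomerId = 1:6,
                  Product = c(rep("Oven", 3), rep("Television", 3)))
df2 <- data.frame(CustomerId = 4:7,
                  Product = c(rep("Television", 2), rep("Air conditioner", 2)))

# Columns in a different order are still matched by name
df2_reordered <- df2[, c("Product", "CustomerId")]
df_unionall <- rbind(df1, df2_reordered)
names(df_unionall)  # "CustomerId" "Product"
nrow(df_unionall)   # 10

# A data frame with different column names cannot be row bound
df3 <- data.frame(Id = 8, Product = "Fridge")
res <- tryCatch(rbind(df1, df3), error = function(e) "error")
res  # "error"
```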

 

 

Other methods for union of the dataframe in R : merge()

The merge() function takes the two data frames as arguments, together with the option all=TRUE, as shown below. Because both data frames have the same columns, this full outer join on all columns gives the union of the data frames

# union in R - union of data frames in R

df_union1 = merge(df1,df2,all=TRUE)
df_union1

The resulting data frame has 8 rows, with the records shared by both data frames appearing only once.
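By default merge() joins on the columns the two data frames have in common, so this approach assumes both share exactly the same columns. A rough way to confirm that the merge() result holds the same rows as a row bind followed by unique() is sketched below; the helper row_keys is hypothetical and simply turns each row into a text key so the two results can be compared regardless of row order.

```r
# Same example data as above
df1 <- data.frame(CustomerId = 1:6,
                  Product = c(rep("Oven", 3), rep("Television", 3)))
df2 <- data.frame(CustomerId = 4:7,
                  Product = c(rep("Television", 2), rep("Air conditioner", 2)))

# Union via a full outer join on all shared columns
df_union1 <- merge(df1, df2, all = TRUE)

# Union via row bind followed by duplicate removal
df_union2 <- unique(rbind(df1, df2))

# Hypothetical helper: one sorted text key per row
row_keys <- function(d) sort(paste(d$CustomerId, d$Product))

nrow(df_union1)                                        # 8
identical(row_keys(df_union1), row_keys(df_union2))    # TRUE
```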

 

Other methods for union of the dataframe in R : rbind() with unique()

There is an indirect way to get the union of two data frames in base R. It can be done in two steps.

  •    First, row bind (rbind) all the data frames
  •   Then remove the duplicates

These two steps have to be done in sequence, as shown in the example below.

 
# Step 1: row bind the data frames

df_cat = rbind(df1,df2)

# Step 2: remove the duplicate rows

df_union = unique(df_cat)
df_union

The final output has 8 rows, the same records that union(df1,df2) returns.
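The two steps can be wrapped in a small function so they work for any number of data frames. This is only a sketch: union_base is a hypothetical helper name, not a function from base R or dplyr, and it assumes all inputs have the same column names.

```r
# Same example data as above
df1 <- data.frame(CustomerId = 1:6,
                  Product = c(rep("Oven", 3), rep("Television", 3)))
df2 <- data.frame(CustomerId = 4:7,
                  Product = c(rep("Television", 2), rep("Air conditioner", 2)))

# Hypothetical helper: union of any number of data frames in base R
union_base <- function(...) {
  combined <- do.call(rbind, list(...))  # step 1: row bind all inputs
  out <- unique(combined)                # step 2: drop duplicate rows
  rownames(out) <- NULL                  # renumber rows after removal
  out
}

df_union <- union_base(df1, df2)
nrow(df_union)  # 8
```

Resetting the row names is optional; without it, unique() keeps the original row numbers of the rows that survive, which leaves gaps in the numbering.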

 

For further details on the union of data frames with the dplyr package in R, refer to the dplyr documentation.



Author

  • Sridhar Venkatachalam

    With close to 10 years of experience in data science and machine learning, he has worked extensively with programming languages such as R, Python (Pandas), SAS and PySpark.