Union in R

This chapter shows how to perform a union in R on vectors and data frames. The base union() function returns the union of two vectors, and with the dplyr package loaded, union() also works on two data frames. The union of two data frames can also be accomplished with other roundabout methods, which are discussed below.

union() function in R – union of vectors:

# union in R - Union of two vectors in R

x <- c(1:4)
y <- c(2:7)
union(x,y)

Executing the above code outputs the union of the two vectors:

[1] 1 2 3 4 5 6 7
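The vector case can be checked with a short sketch. It uses only base R, and the vectors a and b are illustrative values that are not part of the example above. It suggests that union() drops duplicates both within and across its inputs, matching unique() applied to the concatenated vectors.

```r
# Illustrative vectors (not from the example above); base R only.
a <- c(3, 1, 3, 2)
b <- c(2, 5, 5)

# Duplicates within each vector and across both are dropped
u <- union(a, b)
u

# union() matches concatenating the vectors and keeping unique values
same_as_unique <- identical(u, unique(c(a, b)))
same_as_unique

# union() also works on character vectors
products <- union(c("Oven", "Television"), c("Television", "Air conditioner"))
products
```

The values come back in order of first appearance, not sorted.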

union of data frames in R : 

The union of two data frames can be achieved with the merge() function. Let's see this with an example. First, let's create two data frames.

# Create two data frames

df1 <- data.frame(CustomerId = 1:6, Product = c(rep("Oven", 3), rep("Television", 3)))
df2 <- data.frame(CustomerId = 4:7, Product = c(rep("Television", 2), rep("Air conditioner", 2)))

df1 will be

  CustomerId    Product
1          1       Oven
2          2       Oven
3          3       Oven
4          4 Television
5          5 Television
6          6 Television

df2 will be

  CustomerId         Product
1          4      Television
2          5      Television
3          6 Air conditioner
4          7 Air conditioner

Example 1 : union of two dataframes using merge()

The merge() function takes the two data frames as arguments and, with the option all=TRUE, returns their union, as shown below

# union in R - union of data frames in R

df_union1<-merge(df1,df2,all=TRUE)
df_union1

The resulting data frame contains the eight distinct rows found across df1 and df2; the two rows they share (CustomerId 4 and 5, both Television) appear only once.

Thus we have applied union in R to data frames.
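By default, merge() joins on the columns whose names appear in both data frames, which here is every column. That is why all=TRUE behaves like a union in this example. The sketch below spells the join columns out with the by argument and checks the result; it uses base R only and rebuilds df1 and df2 so it runs on its own.

```r
# Rebuild the two example data frames so the sketch is self-contained
df1 <- data.frame(CustomerId = 1:6,
                  Product = c(rep("Oven", 3), rep("Television", 3)))
df2 <- data.frame(CustomerId = 4:7,
                  Product = c(rep("Television", 2), rep("Air conditioner", 2)))

# Naming every shared column in `by` makes the full outer join explicit
df_union1 <- merge(df1, df2, by = c("CustomerId", "Product"), all = TRUE)

# Number of distinct rows across both inputs
n_rows <- nrow(df_union1)
n_rows

# TRUE when no row is repeated in the result
no_dups <- anyDuplicated(df_union1) == 0
no_dups
```

If the two data frames shared only some of their columns, the same call would be a full outer join rather than a union, so this approach assumes identical column sets.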

 

Example 2 : union of two dataframes using the union() function from dplyr

The union() function from dplyr combines all rows from both data frames and removes duplicate records from the combined result, so the resulting data frame has no duplicates.

library(dplyr)

#  union two dataframes  without duplicates
union(df1,df2)

The resulting data frame has eight rows, because the rows common to df1 and df2 are kept only once.
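dplyr also provides union_all(), which stacks the rows without removing duplicates. The sketch below compares the two on the example data. It assumes the dplyr package is installed, and it rebuilds df1 and df2 so it runs on its own.

```r
library(dplyr)  # assumption: dplyr is installed

# Rebuild the two example data frames so the sketch is self-contained
df1 <- data.frame(CustomerId = 1:6,
                  Product = c(rep("Oven", 3), rep("Television", 3)))
df2 <- data.frame(CustomerId = 4:7,
                  Product = c(rep("Television", 2), rep("Air conditioner", 2)))

# union_all() keeps every row, including the ones present in both inputs
all_rows <- union_all(df1, df2)
nrow(all_rows)

# union() removes the duplicate rows
distinct_rows <- union(df1, df2)
nrow(distinct_rows)
```

The difference in row counts equals the number of rows the two data frames have in common.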

 

 

Example 3 : union of data frames using rbind() and unique()

There is also an indirect way to take the union of two data frames in R. It can be done in two steps.

  •   First, row bind (rbind) all the data frames
  •   Then remove the duplicates

These two steps have to be done in sequence, as explained in the example below.

Row bind these two data frames as shown below

#row bind the data frames.

df_cat<-rbind(df1,df2)
df_cat

so the resultant df_cat data frame will be

   CustomerId         Product
1           1            Oven
2           2            Oven
3           3            Oven
4           4      Television
5           5      Television
6           6      Television
7           4      Television
8           5      Television
9           6 Air conditioner
10          7 Air conditioner

Retrieve only unique rows from the above df_cat data frame as shown below.

#unique function.

df_union <- unique(df_cat)
df_union

So the final output will be

   CustomerId         Product
1           1            Oven
2           2            Oven
3           3            Oven
4           4      Television
5           5      Television
6           6      Television
9           6 Air conditioner
10          7 Air conditioner

Thus we have learned how to apply union of data frames indirectly.
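The two steps can be wrapped in a small helper that accepts any number of data frames. This is a minimal sketch: union_df is a hypothetical name rather than a base R function, df3 is illustrative extra data, and the helper assumes all inputs have the same columns. It also renumbers the rows, since unique() keeps the original row names (note 9 and 10 in the output above).

```r
# union_df is a hypothetical helper name, not part of base R.
# Assumes every data frame passed in has the same columns.
union_df <- function(...) {
  combined <- do.call(rbind, list(...))  # step 1: row bind all data frames
  result <- unique(combined)             # step 2: remove duplicate rows
  rownames(result) <- NULL               # renumber the rows from 1
  result
}

df1 <- data.frame(CustomerId = 1:6,
                  Product = c(rep("Oven", 3), rep("Television", 3)))
df2 <- data.frame(CustomerId = 4:7,
                  Product = c(rep("Television", 2), rep("Air conditioner", 2)))
# Illustrative third data frame (not from the example above)
df3 <- data.frame(CustomerId = 7:8,
                  Product = c("Air conditioner", "Oven"))

df_union <- union_df(df1, df2, df3)
df_union
```

Resetting the row names is optional; leave that line out if the original row positions are needed to trace where a row came from.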

 

For more on the union of data frames using the dplyr package in R, refer to the dplyr documentation.



Author

  • Sridhar Venkatachalam

    Close to 10 years of experience in data science and machine learning. Has worked extensively with programming languages such as R, Python (Pandas), SAS, and PySpark.