Set difference of dataframes in R

The set difference of two dataframes in R can be computed with functions such as setdiff() and anti_join(), both available in the dplyr package. In this tutorial we will look at how to compute the set difference of two dataframes, with an example.

Let’s first create two dataframes.

# create dataframe 1
df1 <- data.frame(State = c('Arizona', 'Georgia', 'Newyork', 'Indiana', 'seattle', 'washington', 'Texas'),
                  Score = c(62, 47, 55, 74, 31, 77, 85))
df1


# create dataframe 2
df2 <- data.frame(State = c('Arizona', 'Georgia', 'California', 'Florida'),
                  Score = c(62, 47, 85, 12))
df2


Set difference of two dataframes – (Method 1)

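The setdiff() function of the dplyr package returns the rows of the first dataframe that do not appear in the second. A row counts as a match only when every column has the same value.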

# method 1
library(dplyr)
setdiff(df1, df2)

So the set difference will be the five rows of df1 that do not appear in df2: Newyork (55), Indiana (74), seattle (31), washington (77) and Texas (85).
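Set difference is not symmetric, so the order of the arguments matters. The snippet below is a minimal sketch of that point, assuming dplyr is installed. It recreates df1 and df2 so that it runs on its own, and the names only_in_df1 and only_in_df2 are illustrative, not part of the original tutorial.

```r
library(dplyr)

# Recreate the two example dataframes so the snippet is self-contained
df1 <- data.frame(State = c('Arizona', 'Georgia', 'Newyork', 'Indiana', 'seattle', 'washington', 'Texas'),
                  Score = c(62, 47, 55, 74, 31, 77, 85))
df2 <- data.frame(State = c('Arizona', 'Georgia', 'California', 'Florida'),
                  Score = c(62, 47, 85, 12))

# Rows of df1 that are not in df2
only_in_df1 <- dplyr::setdiff(df1, df2)

# Arguments swapped: rows of df2 that are not in df1
only_in_df2 <- dplyr::setdiff(df2, df1)

only_in_df1$State   # contains Newyork, Indiana, seattle, washington, Texas
only_in_df2$State   # contains California, Florida
```

Swapping the arguments gives a different answer because each call only reports rows from its first argument.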

 

Set difference of two dataframes – (Method 2)

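The anti_join() function of the dplyr package keeps the rows of the first dataframe that have no match in the second. When no by argument is given, it matches on all columns the two dataframes have in common (here State and Score), so the result is the row-wise set difference.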

# method 2
anti_join(df1, df2)

So the set difference will be the rows of df1 with no matching row in df2: Newyork (55), Indiana (74), seattle (31), washington (77) and Texas (85).
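The two methods can give different results once anti_join() is told to match on a subset of columns. The sketch below illustrates this, assuming dplyr is installed. It uses a hypothetical dataframe df3 that is not part of the tutorial data, in which Arizona appears with a different Score. The names by_both and by_state are also illustrative.

```r
library(dplyr)

# Example dataframe from the tutorial, recreated so the snippet is self-contained
df1 <- data.frame(State = c('Arizona', 'Georgia', 'Newyork', 'Indiana', 'seattle', 'washington', 'Texas'),
                  Score = c(62, 47, 55, 74, 31, 77, 85))

# Hypothetical dataframe (assumption for illustration): Arizona has a different Score
df3 <- data.frame(State = c('Arizona', 'Georgia'),
                  Score = c(99, 47))

# Match on both columns: behaves like a row-wise set difference
by_both <- anti_join(df1, df3, by = c("State", "Score"))

# Match on State only: Arizona is dropped even though its Score differs
by_state <- anti_join(df1, df3, by = "State")

nrow(by_both)    # 6: only the Georgia row is removed
nrow(by_state)   # 5: the Arizona and Georgia rows are removed
```

Spelling out by also avoids the message dplyr prints when it has to guess the join columns.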
