clarify reason for coercing key types; formatting
zephryl

You can do joins as well using Hadley Wickham's awesome dplyr package.

library(dplyr)

#make sure that CustomerId cols are both the same type
#they aren’t in the provided data (one is integer and one is double)
df1$CustomerId <- as.double(df1$CustomerId)
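
To make the type mismatch concrete, here is a minimal sketch that checks the key types before and after the coercion. The contents of df1 and df2 are an assumption, modeled on what the question's example appears to use (an integer CustomerId built with 1:6 and a double CustomerId built with c(2, 4, 6)); the helper names key_types_before and key_types_after are made up for illustration.

```r
# Hypothetical data, assumed to resemble the data frames in the question
df1 <- data.frame(CustomerId = 1:6,
                  Product = c(rep("Toaster", 3), rep("Radio", 3)))
df2 <- data.frame(CustomerId = c(2, 4, 6),
                  State = c(rep("Alabama", 2), "Ohio"))

# 1:6 creates an integer vector, while c(2, 4, 6) creates a double vector
key_types_before <- c(df1 = typeof(df1$CustomerId),
                      df2 = typeof(df2$CustomerId))

# Coerce the integer key so both columns share one type
df1$CustomerId <- as.double(df1$CustomerId)
key_types_after <- c(df1 = typeof(df1$CustomerId),
                     df2 = typeof(df2$CustomerId))

print(key_types_before)
print(key_types_after)
```

Whether dplyr actually complains about the mismatch may depend on the installed version, so treat the coercion as a precaution.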

Mutating joins: add columns to df1 using matches in df2

#inner
inner_join(df1, df2)

#left outer
left_join(df1, df2)

#right outer
right_join(df1, df2)

#alternate right outer
left_join(df2, df1)

#full join
full_join(df1, df2)
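
As a rough illustration of how the four mutating joins differ, the sketch below counts the rows each one returns. The data is hypothetical: it is assumed to resemble the question's data frames, with one extra unmatched row (CustomerId 7) added to df2 so that the right and full joins give different results. It also passes by = "CustomerId" explicitly; without by, dplyr joins on all columns the two tables share and prints a message naming them.

```r
library(dplyr)

# Hypothetical data; CustomerId 7 in df2 is an added, unmatched row
df1 <- data.frame(CustomerId = as.double(1:6),
                  Product = c(rep("Toaster", 3), rep("Radio", 3)))
df2 <- data.frame(CustomerId = c(2, 4, 6, 7),
                  State = c("Alabama", "Alabama", "Ohio", "Texas"))

inner <- inner_join(df1, df2, by = "CustomerId")  # only ids in both: 2, 4, 6
left  <- left_join(df1, df2, by = "CustomerId")   # every row of df1
right <- right_join(df1, df2, by = "CustomerId")  # every row of df2
full  <- full_join(df1, df2, by = "CustomerId")   # every id from either table

# Collect the row counts of each result for comparison
join_rows <- c(inner = nrow(inner), left = nrow(left),
               right = nrow(right), full = nrow(full))
print(join_rows)
```

Rows without a match on the other side get NA in the columns that come from the other table.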

Filtering joins: filter out rows in df1, don't modify columns

#keep only observations in df1 that match in df2.
semi_join(df1, df2)

#drop all observations in df1 that match in df2.
anti_join(df1, df2)
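
For the filtering joins, a small sketch along the same lines shows that only rows of df1 are kept or dropped and that no columns from df2 are added. The data is again hypothetical, assumed to resemble the data frames in the question.

```r
library(dplyr)

# Hypothetical data, assumed to resemble the data frames in the question
df1 <- data.frame(CustomerId = as.double(1:6),
                  Product = c(rep("Toaster", 3), rep("Radio", 3)))
df2 <- data.frame(CustomerId = c(2, 4, 6),
                  State = c(rep("Alabama", 2), "Ohio"))

# Rows of df1 whose CustomerId also appears in df2
semi <- semi_join(df1, df2, by = "CustomerId")

# Rows of df1 whose CustomerId does not appear in df2
anti <- anti_join(df1, df2, by = "CustomerId")

print(semi)
print(anti)
print(names(semi))  # same columns as df1; State is not added
```

Together the two results partition df1: every row of df1 lands in exactly one of them.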
update answer for completeness
Andrew Barr

You can do joins as well using Hadley Wickham's awesome dplyr package.

library(dplyr)

#make sure that both CustomerId cols are numeric
#they are not when created with the code provided in the question, and dplyr will complain
df1$CustomerId <- as.numeric(df1$CustomerId)
df2$CustomerId <- as.numeric(df2$CustomerId)

Mutating joins: add columns to df1 using matches in df2

#inner
inner_join(df1, df2)

#left outer
left_join(df1, df2)

#right outer
right_join(df1, df2)

#alternate right outer
left_join(df2, df1)

#full join
full_join(df1, df2)

Filtering joins: filter out rows in df1, don't modify columns

semi_join(df1, df2) #keep only observations in df1 that match in df2.
anti_join(df1, df2) #drops all observations in df1 that match in df2.
dplyr is no longer a new package, so I deleted the word "new"
Andrew Barr

You can do joins as well using Hadley Wickham's awesome new dplyr package.

Here is how you can do most of the joins in the original question with dplyr:

library(dplyr)

#make sure that both CustomerId cols are numeric
#they are not when created with the code provided in the question, and dplyr will complain
df1$CustomerId <- as.numeric(df1$CustomerId)
df2$CustomerId <- as.numeric(df2$CustomerId)


#inner
inner_join(df1, df2)

#left outer
left_join(df1, df2)

#right outer (just reverse argument order)
left_join(df2, df1)

You can do joins as well using Hadley Wickham's awesome dplyr package.

Here is how you can do most of the joins in the original question with dplyr:

library(dplyr)

#make sure that both CustomerId cols are numeric
#they are not when created with the code provided in the question, and dplyr will complain
df1$CustomerId <- as.numeric(df1$CustomerId)
df2$CustomerId <- as.numeric(df2$CustomerId)


#inner
inner_join(df1, df2)

#left outer
left_join(df1, df2)

#right outer (just reverse argument order)
left_join(df2, df1)
Andrew Barr