
Performing a dplyr full_join without a common variable to blend data frames

Tags: r, dplyr

I am trying to use dplyr's full_join() to do the equivalent of a basic merge() on two data frames that share no common variables, so there is nothing to pass to the "by=" argument. The result should combine the two data frames and return every possible combination of their rows.

However, full_join() currently requires a common variable, and I cannot find another dplyr function that does this. How can I perform this operation using only functions from the dplyr library?

df_a = data.frame(department=c(1,2,3,4))
df_b = data.frame(period=c(2014,2015,2016,2017))

#This works as desired
big_df = merge(df_a,df_b)

#I'd like to perform the following in a much bigger operation:
big_df = dplyr::full_join(df_a,df_b)

#Error: No common variables. Please specify `by` param.
Dale Kube asked Jun 29 '17 19:06


1 Answer

You can use crossing() from the tidyr package (a tidyverse companion to dplyr, not part of dplyr itself):

library(tidyr)
crossing(df_a, df_b)

   department period
1           1   2014
2           1   2015
3           1   2016
4           1   2017
5           2   2014
6           2   2015
7           2   2016
8           2   2017
9           3   2014
10          3   2015
11          3   2016
12          3   2017
13          4   2014
14          4   2015
15          4   2016
16          4   2017
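
For a solution that stays within dplyr, newer releases provide cross_join(). The following is a minimal sketch, assuming dplyr 1.1.0 or later is installed (the version in which cross_join() was introduced); it reuses the data frames from the question.

```r
library(dplyr)  # assumption: dplyr >= 1.1.0, which provides cross_join()

# Same example data as in the question
df_a <- data.frame(department = c(1, 2, 3, 4))
df_b <- data.frame(period = c(2014, 2015, 2016, 2017))

# Pair every row of df_a with every row of df_b (4 x 4 = 16 rows);
# no `by` argument is needed because no keys are matched
big_df <- cross_join(df_a, df_b)

nrow(big_df)
```

One difference worth keeping in mind: crossing() de-duplicates and sorts its inputs, whereas cross_join() keeps every row in its original order, so the two can differ when the inputs contain repeated or unsorted values.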
Pierre Lapointe answered Sep 27 '22 21:09