I am trying to display multiple dataframes next to each other to compare certain entries. However, they have a different number of rows and I want each data frame to be in the exact same order.
I tried to use cbind
which did not work because of the different number of rows. I used merge
to bind two dfs together and then merge them again, however they change order when I do that and it seems inefficient to merge two dfs when I have more than 5 in total.
Examp:
df <- data.frame(v=1:5, x=sample(LETTERS[1:5],5))
df
v x
1 1 E
2 2 B
3 3 D
4 4 C
5 5 A
df2 <- data.frame(m=7:10, n=sample(LETTERS[6:9],4))
df2
m n
1 7 G
2 8 I
3 9 F
4 10 H
Then I ordered df2
df2 <- df2[order(df2$m, decreasing = TRUE),]
df2
m n
4 10 F
3 9 I
2 8 H
1 7 G
Expected output:
v x m n
1 1 E 10 F
2 2 B 9 I
3 3 D 8 H
4 4 C 7 G
5 5 A NA NA
As I said, I have more than two dfs and the order of the dfs should be remained. Any help will be greatly appreciated!
Use the full_join Function to Merge Two R Data Frames With Different Number of Rows. full_join is part of the dplyr package, and it can be used to merge two data frames with a different number of rows.
The bind_rows() method is used to combine data frames with different columns. The column names are number may be different in the input data frames. Missing columns of the corresponding data frames are filled with NA.
cbind() and rbind() both create matrices by combining several vectors of the same length. cbind() combines vectors as columns, while rbind() combines them as rows. Let's use these functions to create a matrix with the numbers 1 through 30.
The data frame method will be used if at least one argument is a data frame and the rest are vectors or matrices. L et’s create a second data frame and use the cbind () function to merge the second data frame to the first data frame.
For example, define a data frame using three columns and add the two columns to that data frame using the cbind () function, and see the output. The cbind () function, short for column bind, is a merge function that can combine two data frames with the same number of multiple rows into a single data frame.
It can be done using the merge () method. Below are some examples that depict how to merge data frames of different lengths using the above method: Below is a program to merge two student data frames of different lengths. Here is another program to merge one data frame of length 4 and another dataframe of length 9.
We can join vectors by columns using cbind and it does not matter whether these vectors are of same length or not. If the vectors are of same length then all the values of both the vectors are printed but if the length of these vectors are different then the values of the smaller vector gets repeated.
Base R approach :
Put the dataframes in a list, get the dataframe with maximum number of rows, append NA
's to data which have less number of rows and cbind
.
list_df <- list(df, df2)
n_r <- seq_len(max(sapply(list_df, nrow)))
result <- do.call(cbind, lapply(list_df, `[`, n_r, ))
result
# v x m n
#1 1 C 10 F
#2 2 B 9 H
#3 3 E 8 G
#4 4 D 7 I
#5 5 A NA <NA>
Edit: In case there are multiple df
. Do this
purrr::reduce
to join all these togetherdf
in .init
argument.df2 <- data.frame(m=7:10, n=sample(LETTERS[6:9],4))
df <- data.frame(v=1:5, x=sample(LETTERS[1:5],5))
df3 <- data.frame(bb = 101:110, cc = sample(letters, 10))
reduce(list(df2, df3), .init = df %>% mutate(id = row_number()) , ~full_join(.x, .y %>% mutate(id = row_number()), by = "id" )) %>%
select(-id)
v x m n bb cc
1 1 A 10 I 101 u
2 2 C 9 H 102 v
3 3 D 8 G 103 n
4 4 E 7 F 104 w
5 5 B NA <NA> 105 s
6 NA <NA> NA <NA> 106 y
7 NA <NA> NA <NA> 107 g
8 NA <NA> NA <NA> 108 i
9 NA <NA> NA <NA> 109 p
10 NA <NA> NA <NA> 110 h
Earlier Answer: Create a dummy column id
in both df
s and use full_join
full_join(df %>% mutate(id = row_number()), df2 %>% mutate(id = row_number()), by = "id") %>%
select(-id)
v x m n
1 1 A 10 I
2 2 C 9 H
3 3 D 8 G
4 4 E 7 F
5 5 B NA <NA>
Results are different from as expected becuase of different random number seed
Or in BaseR
merge(transform(df, id = seq_len(nrow(df))), transform(df2, id = seq_len(nrow(df2))), all = T)
id v x m n
1 1 1 A 10 I
2 2 2 C 9 H
3 3 3 D 8 G
4 4 4 E 7 F
5 5 5 B NA <NA>
Remove extra column simply by subsetting []
merge(transform(df, id = seq_len(nrow(df))), transform(df2, id = seq_len(nrow(df2))), all = T)[-1]
v x m n
1 1 A 10 I
2 2 C 9 H
3 3 D 8 G
4 4 E 7 F
5 5 B NA <NA>
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With