
Binding dataframes of different length (no cbind, no merge)

Tags: merge, r, cbind

I am trying to display multiple data frames next to each other to compare certain entries. However, they have different numbers of rows, and I want each data frame to keep its exact row order. I tried cbind, which did not work because of the different numbers of rows. I also used merge to bind two data frames and then merged the result again, but the rows change order when I do that, and merging two at a time seems inefficient when I have more than 5 data frames in total.

Example:

df <-  data.frame(v=1:5, x=sample(LETTERS[1:5],5))
df 
  v x
1 1 E
2 2 B
3 3 D
4 4 C
5 5 A

df2 <- data.frame(m=7:10, n=sample(LETTERS[6:9],4))
df2
   m n
1  7 G
2  8 I
3  9 F
4 10 H

Then I ordered df2 by m in decreasing order:

df2 <- df2[order(df2$m, decreasing = TRUE),]
df2
   m n
4 10 H
3  9 F
2  8 I
1  7 G

Expected output:

  v x  m    n
1 1 E 10    H
2 2 B  9    F
3 3 D  8    I
4 4 C  7    G
5 5 A NA <NA>

As I said, I have more than two data frames, and the row order of each one should be preserved. Any help will be greatly appreciated!


asked Apr 22 '21 06:04

Linda Espey


2 Answers

Base R approach:

Put the data frames in a list, find the maximum number of rows, index every data frame with that full row sequence (rows that do not exist come back filled with NA), and cbind the results.

list_df <- list(df, df2)
n_r <- seq_len(max(sapply(list_df, nrow)))
result <- do.call(cbind, lapply(list_df, `[`, n_r, ))
result

#  v x  m    n
#1 1 C 10    F
#2 2 B  9    H
#3 3 E  8    G
#4 4 D  7    I
#5 5 A NA <NA>
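
The same idea can be wrapped in a small function so it works for any number of data frames. This is only a sketch: cbind_fill is a hypothetical name, not a base R function, and the inputs a and b are fixed illustration data rather than the sampled data from the question.

```r
# Hypothetical helper (not part of base R): pad every data frame with NA rows
# up to the length of the longest one, then bind the columns side by side.
cbind_fill <- function(...) {
  dfs <- list(...)
  idx <- seq_len(max(vapply(dfs, nrow, integer(1))))
  padded <- lapply(dfs, function(d) {
    out <- d[idx, , drop = FALSE]  # out-of-range row indices produce NA rows
    rownames(out) <- NULL          # reset row names so all pieces line up
    out
  })
  do.call(cbind, padded)
}

# Illustration data (assumed, fixed letters instead of sample())
a <- data.frame(v = 1:5, x = LETTERS[1:5])
b <- data.frame(m = 10:7, n = LETTERS[6:9])
res <- cbind_fill(a, b)
res
```

Because each data frame is only indexed, never sorted or joined, the row order of every input is kept as it was passed in.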


answered Oct 28 '22 07:10

Ronak Shah

Edit: If there are more than two data frames, do this:

Create a list of all the data frames except one, say the first.
Use purrr::reduce to join them all together.
Pass the first data frame, with its id column added, as the .init argument.

library(dplyr)
library(purrr)

df  <- data.frame(v = 1:5, x = sample(LETTERS[1:5], 5))
df2 <- data.frame(m = 7:10, n = sample(LETTERS[6:9], 4))
df2 <- df2[order(df2$m, decreasing = TRUE), ]
df3 <- data.frame(bb = 101:110, cc = sample(letters, 10))

reduce(list(df2, df3), .init = df %>% mutate(id = row_number()),
       ~ full_join(.x, .y %>% mutate(id = row_number()), by = "id")) %>%
  select(-id)

    v    x  m    n  bb cc
1   1    A 10    I 101  u
2   2    C  9    H 102  v
3   3    D  8    G 103  n
4   4    E  7    F 104  w
5   5    B NA <NA> 105  s
6  NA <NA> NA <NA> 106  y
7  NA <NA> NA <NA> 107  g
8  NA <NA> NA <NA> 108  i
9  NA <NA> NA <NA> 109  p
10 NA <NA> NA <NA> 110  h

Earlier answer: create a helper column id holding the row number in both data frames and use full_join:

full_join(df %>% mutate(id = row_number()), df2 %>% mutate(id = row_number()), by = "id") %>%
  select(-id)

  v x  m    n
1 1 A 10    I
2 2 C  9    H
3 3 D  8    G
4 4 E  7    F
5 5 B NA <NA>

The letters differ from the expected output because sample() was run with a different random seed.

Or in base R:

merge(transform(df, id = seq_len(nrow(df))), transform(df2, id = seq_len(nrow(df2))), by = "id", all = TRUE)
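
To make the sampled letters repeatable between runs, the seed can be fixed before calling sample(). A minimal sketch, assuming an arbitrary seed of 42 (the seed used in the question is not known, so this will not reproduce the question's letters):

```r
set.seed(42)  # arbitrary seed, chosen only for illustration
df  <- data.frame(v = 1:5, x = sample(LETTERS[1:5], 5))
df2 <- data.frame(m = 7:10, n = sample(LETTERS[6:9], 4))

# Resetting the same seed and repeating the call gives the same data frame
set.seed(42)
df_again <- data.frame(v = 1:5, x = sample(LETTERS[1:5], 5))
same <- identical(df, df_again)
same
```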

  id v x  m    n
1  1 1 A 10    I
2  2 2 C  9    H
3  3 3 D  8    G
4  4 4 E  7    F
5  5 5 B NA <NA>

Remove the extra id column by subsetting with [-1]:

merge(transform(df, id = seq_len(nrow(df))), transform(df2, id = seq_len(nrow(df2))), by = "id", all = TRUE)[-1]

  v x  m    n
1 1 A 10    I
2 2 C  9    H
3 3 D  8    G
4 4 E  7    F
5 5 B NA <NA>
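
The base R merge can be extended to more than two data frames with Reduce. This is a sketch under stated assumptions: add_id is a hypothetical helper, the data frames use fixed illustration values, and none of them already has a column called id.

```r
# Hypothetical helper: add a row-number key to a data frame
add_id <- function(d) transform(d, id = seq_len(nrow(d)))

# Illustration data (assumed, fixed letters instead of sample())
df  <- data.frame(v = 1:5, x = LETTERS[1:5])
df2 <- data.frame(m = 10:7, n = LETTERS[6:9])
df3 <- data.frame(bb = 101:110, cc = letters[1:10])

# Merge pairwise on the key, keeping unmatched rows from both sides
merged <- Reduce(function(a, b) merge(a, b, by = "id", all = TRUE),
                 lapply(list(df, df2, df3), add_id))

# Order by the key explicitly, then drop it
result <- merged[order(merged$id), names(merged) != "id"]
result
```

Since the key is just the row number within each data frame, sorting on it restores the original row order of every input.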

answered Oct 28 '22 08:10

AnilGoyal
