Join vectors into dataframe by matching values

Tags: merge, dataframe, r, dplyr

I'm trying to compare multiple vectors to see where there are matching values between them. I'd like to combine the vectors into a table where every column either has the same value (for matches) or NA (for no match).

For example:

list1 <- c("a", "b", "c", "d")
list2 <- c("a", "c", "d")
list3 <- c("a", "b", "c", "e", "f")

Should become:

a   a   a
b   NA  b
c   c   c
d   d   NA
NA  NA  e
NA  NA  f

I've tried converting the vectors to data frames and using merge, the join functions from dplyr, cbind, and cbind.fill, but each of those either returns a single column or doesn't line up matching values on the same row.

What's the best way to get this result with R?

asked Aug 29 '17 19:08

Evan

2 Answers

A Base R solution:

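# Pair each vector with a key column named "col" so merge() has something to join on
df1 <- data.frame(col = list1, list1)
df2 <- data.frame(col = list2, list2)
df3 <- data.frame(col = list3, list3)

# Full outer join (all = TRUE) of the three data frames on the shared "col" column
Reduce(function(x, y) merge(x, y, all = TRUE), list(df1, df2, df3))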

#   col list1 list2 list3
# 1   a     a     a     a
# 2   b     b  <NA>     b
# 3   c     c     c     c
# 4   d     d     d  <NA>
# 5   e  <NA>  <NA>     e
# 6   f  <NA>  <NA>     f

Dropping the key column with [,-1] gives the requested table:

> Reduce(function(x, y) merge(x, y, all=TRUE), list(df1, df2, df3))[,-1]
  list1 list2 list3
1     a     a     a
2     b  <NA>     b
3     c     c     c
4     d     d  <NA>
5  <NA>  <NA>     e
6  <NA>  <NA>     f

Or with dplyr + purrr:

library(dplyr)
library(purrr)

list(list1, list2, list3) %>%
  map(~ data.frame(col = ., ., stringsAsFactors = FALSE)) %>%
  reduce(full_join, by = "col") %>%
  select(-col) %>%
  setNames(paste0("list", 1:3))

Data:

list1 <- c("a", "b", "c", "d")
list2 <- c("a", "c", "d")
list3 <- c("a", "b", "c", "e", "f")
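The base R approach could probably be generalized to any named list of vectors. The following is only a minimal sketch: the helper name align_by_merge and the key column name "key" are hypothetical, not part of the answer above, and it assumes no vector contains duplicate values (merge would otherwise repeat rows for the duplicated key).

```r
# Hypothetical helper: align a named list of vectors with a full outer merge.
# Assumes every vector is free of duplicate values.
align_by_merge <- function(vectors) {
  # Build one two-column data frame per vector: a shared "key" column
  # plus a copy of the values, named after the vector.
  frames <- Map(
    function(values, name) {
      out <- data.frame(key = values, value = values, stringsAsFactors = FALSE)
      names(out)[2] <- name
      out
    },
    vectors, names(vectors)
  )

  # Full outer join of all frames on "key"; merge() sorts rows by key.
  merged <- Reduce(function(x, y) merge(x, y, by = "key", all = TRUE), frames)

  # Drop the key column, keeping a data frame even if one column remains.
  merged[, -1, drop = FALSE]
}

vectors <- list(
  list1 = c("a", "b", "c", "d"),
  list2 = c("a", "c", "d"),
  list3 = c("a", "b", "c", "e", "f")
)

aligned <- align_by_merge(vectors)
print(aligned)
```

With the example data this should reproduce the table shown under the base R solution. Because merge sorts by the key by default, rows come out in sorted order rather than in order of first appearance.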

answered Nov 02 '22 07:11

acylam

You can use unlist and unique to collect every distinct value, then use match to look each value up in every vector. When a value is absent, match returns NA, and indexing a vector with NA yields NA, which is the result you want:

list1 <- c("a", "b", "c", "d")
list2 <- c("a", "c", "d")
list3 <- c("a", "b", "c", "e", "f")
list_of_lists <- list(
  list1 = list1,
  list2 = list2,
  list3 = list3
)

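# Every distinct value across all vectors, in order of first appearance
all_values <- unique(unlist(list_of_lists))

# For each vector, keep a value where it occurs and NA where it does not.
# FUN.VALUE = all_values acts as a template: a character vector of the same length.
fleshed_out <- vapply(
  list_of_lists,
  FUN.VALUE = all_values,
  FUN       = function(x) {
    x[match(all_values, x)]
  }
)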

fleshed_out
#    list1 list2 list3
# [1,] "a"   "a"   "a"
# [2,] "b"   NA    "b"
# [3,] "c"   "c"   "c"
# [4,] "d"   "d"   NA
# [5,] NA    NA    "e"
# [6,] NA    NA    "f"
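Since vapply returns a character matrix here and the question asks for a data frame, the same match idea can be wrapped so that it returns one. This is a minimal sketch under assumptions: the helper name align_by_match is hypothetical, and because match only reports the first occurrence, a value repeated within a vector would appear on a single row.

```r
# Hypothetical helper: align a named list of vectors using match().
align_by_match <- function(vectors) {
  # Every distinct value, in order of first appearance across the vectors.
  all_values <- unique(unlist(vectors, use.names = FALSE))

  # For each vector, keep the value where it occurs and NA where it does not.
  aligned <- lapply(vectors, function(x) x[match(all_values, x)])

  # Equal-length named vectors convert directly into data frame columns.
  as.data.frame(aligned, stringsAsFactors = FALSE)
}

vectors <- list(
  list1 = c("a", "b", "c", "d"),
  list2 = c("a", "c", "d"),
  list3 = c("a", "b", "c", "e", "f")
)

result <- align_by_match(vectors)
print(result)
```

Unlike the merge-based approach, this keeps rows in order of first appearance instead of sorting them, and it avoids building an intermediate key column.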

answered Nov 02 '22 06:11

Nathan Werth
