R: pairwise extraction of common elements between multiple character lists

Tags:

r

I have several lists with gene names like this:

List1:

XLOC_012482 
XLOC_019357 
XLOC_014642 
XLOC_010021 
XLOC_013282

List2:

XLOC_012482 
XLOC_019357 
XLOC_004860 
XLOC_004022 
XLOC_002278

List3:

XLOC_004860 
XLOC_004022 
XLOC_006292 
XLOC_006616 
XLOC_013802

And I want to extract the common elements between all pairs of lists. I tried using intersect but I could not use it on characters, and I also don't know how to perform this on all pairwise combinations.


asked Jun 28 '16 20:06

Jon

2 Answers

You can put your lists into a single list li and then use combn on the list with intersect as the function parameter:

combn(li, 2, function(x) intersect(x[[1]], x[[2]]), simplify = FALSE)
# [[1]]
# [1] "XLOC_012482" "XLOC_019357"
# 
# [[2]]
# character(0)
# 
# [[3]]
# [1] "XLOC_004860" "XLOC_004022"

Data:

li <- list(c("XLOC_012482", "XLOC_019357", "XLOC_014642", "XLOC_010021", 
"XLOC_013282"), c("XLOC_012482", "XLOC_019357", "XLOC_004860", 
"XLOC_004022", "XLOC_002278"), c("XLOC_004860", "XLOC_004022", 
"XLOC_006292", "XLOC_006616", "XLOC_013802"))
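With three or more lists it can be hard to tell which result belongs to which pair. A minimal sketch, assuming the list elements are named List1 to List3 as in the question (the variables `pairs` and `common` and the `_vs_` label are only illustrative), is to run combn over the names and label each intersection:

```r
# Same data as above, with names added so each pair can be labelled
li <- list(
  List1 = c("XLOC_012482", "XLOC_019357", "XLOC_014642", "XLOC_010021", "XLOC_013282"),
  List2 = c("XLOC_012482", "XLOC_019357", "XLOC_004860", "XLOC_004022", "XLOC_002278"),
  List3 = c("XLOC_004860", "XLOC_004022", "XLOC_006292", "XLOC_006616", "XLOC_013802")
)

# All pairs of list names, each as a character vector of length 2
pairs <- combn(names(li), 2, simplify = FALSE)

# Intersect the two vectors of each pair
common <- lapply(pairs, function(p) intersect(li[[p[1]]], li[[p[2]]]))

# Label each result with the pair it came from, e.g. "List1_vs_List2"
names(common) <- vapply(pairs, paste, character(1), collapse = "_vs_")

common
```

The result is a named list, so `common$List2_vs_List3` returns the genes shared by List2 and List3.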

answered Nov 03 '22 07:11

Psidom

table is also helpful here, although it counts occurrences across all lists at once rather than per pair (I use the same li list as in @Psidom's answer):

tb <- table(unlist(li))

This gives each gene name along with its count across all lists:

# XLOC_002278 XLOC_004022 XLOC_004860 XLOC_006292 XLOC_006616 XLOC_010021 XLOC_012482 
#           1           2           2           1           1           1           2 
# XLOC_013282 XLOC_013802 XLOC_014642 XLOC_019357 
#          1           1           1           2

To keep only the names that occur more than once:

tb[tb>1]

# XLOC_004022 XLOC_004860 XLOC_012482 XLOC_019357 
#          2           2           2           2
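One caveat: a count above 1 only means "present in more than one list" if no single list repeats a gene. A sketch that guards against this by de-duplicating each vector first (the `lapply(li, unique)` step and the `shared` variable are my additions, not part of the approach above) and returns just the names:

```r
li <- list(
  c("XLOC_012482", "XLOC_019357", "XLOC_014642", "XLOC_010021", "XLOC_013282"),
  c("XLOC_012482", "XLOC_019357", "XLOC_004860", "XLOC_004022", "XLOC_002278"),
  c("XLOC_004860", "XLOC_004022", "XLOC_006292", "XLOC_006616", "XLOC_013802")
)

# Drop repeats within each list so every list contributes at most 1 per gene
tb <- table(unlist(lapply(li, unique)))

# Names of the genes that appear in at least two lists
shared <- names(tb)[tb > 1]
shared
```

This still does not say which lists share a given gene; for that, the pairwise combn approach is the better fit.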

answered Nov 03 '22 08:11

989
