R finding rows of a data frame where certain columns match those of another [duplicate]

Tags: dataframe, r, subset

I have an R question that I'm not even sure how to word in one sentence, and I couldn't find an answer for it yet.

I have two data frames that I would like to 'intersect' and find all rows where column values match in two columns. I've tried connecting two intersect() and which() statements with &&, but neither has given me what I want yet.

Here's what I mean. Let's say I have two data frames:

> testData
               Email     Manual Campaign Bounced Opened Clicked ClickThru Unsubscribed
1 stack@overflow.com EIFLS0LS        1       0      0       0         0            0
2 stack@exchange.com EIFLS0LS        1       0      0       0         0            0
3     data@frame.com EIFLS0LS        1       0      0       0         0            0
4    block@quote.com EIFLS0LS        1       0      0       0         0            0
5          ht@ml.com EIFLS0LS        1       0      0       0         0            0
6     tele@phone.com EIFLS0LS        1       0      0       0         0            0

> testBounced
               Email Campaign
1 stack@overflow.com        1
2 stack@overflow.com        2
3     data@frame.com        2
4    block@quote.com        1
5          ht@ml.com        1
6        lap@top.com        1

As you can see, there are some values in the column Email that intersect, and some from the column Campaign that intersect. I want all of the rows from testData in which BOTH columns match.

i.e.:

               Email     Manual Campaign Bounced Opened Clicked ClickThru Unsubscribed
1 stack@overflow.com EIFLS0LS        1       0      0       0         0            0
2    block@quote.com EIFLS0LS        1       0      0       0         0            0
3          ht@ml.com EIFLS0LS        1       0      0       0         0            0

EDIT:

My goal in finding these rows is to be able to update a column (Bounced) in the original data frame. So the final output that I would like is:

> testData
               Email     Manual Campaign Bounced Opened Clicked ClickThru Unsubscribed
1 stack@overflow.com EIFLS0LS        1       1      0       0         0            0
2 stack@exchange.com EIFLS0LS        1       0      0       0         0            0
3     data@frame.com EIFLS0LS        1       0      0       0         0            0
4    block@quote.com EIFLS0LS        1       1      0       0         0            0
5          ht@ml.com EIFLS0LS        1       1      0       0         0            0
6     tele@phone.com EIFLS0LS        1       0      0       0         0            0

My apologies if this is a duplicate, and thanks in advance for your help!

EDIT2:

I ended up just using a for loop. It works, but it doesn't feel efficient. The dataset was small enough to do it quickly, though. If anyone has a quick, R-style way to do it, I'd be happy to see it!

902

asked Jul 26 '13 18:07

so13eit

1 Answer

You want the function merge.

merge is commonly used to join two tables on a single common column, but the by argument accepts multiple columns:

merge(testData, testBounced, by=c("Email", "Campaign"))
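
For anyone who wants to run this, here is a minimal reproducible sketch. The data frames are rebuilt by hand from the printed output in the question, so the column types (character for Email and Manual, numeric for everything else) are an assumption, and `matched` is just an illustrative name.

```r
# Rebuild the two example data frames from the question.
# Column types are assumed; the question only shows printed output.
testData <- data.frame(
  Email = c("stack@overflow.com", "stack@exchange.com", "data@frame.com",
            "block@quote.com", "ht@ml.com", "tele@phone.com"),
  Manual = "EIFLS0LS",
  Campaign = 1,
  Bounced = 0, Opened = 0, Clicked = 0, ClickThru = 0, Unsubscribed = 0,
  stringsAsFactors = FALSE
)
testBounced <- data.frame(
  Email = c("stack@overflow.com", "stack@overflow.com", "data@frame.com",
            "block@quote.com", "ht@ml.com", "lap@top.com"),
  Campaign = c(1, 2, 2, 1, 1, 1),
  stringsAsFactors = FALSE
)

# Inner join on both key columns: only (Email, Campaign) pairs that occur
# in both data frames are kept.
matched <- merge(testData, testBounced, by = c("Email", "Campaign"))
print(matched)
```

With this data the result should have three rows (stack@overflow.com, block@quote.com and ht@ml.com, all for campaign 1). Note that merge moves the key columns to the front and, by default, sorts the rows by those keys, so neither the row order nor the column order of testData is preserved.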

All pairs of Email and Campaign that don't match will be discarded by default. That's controllable by the arguments all.x and all.y, which default to FALSE.
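
As a rough illustration of all.x, the sketch below keeps every row of testData and marks which ones found a partner. To keep it short, testData is trimmed to three of its columns, and the `InBounced` flag column is a hypothetical helper added only so that matched and unmatched rows can be told apart after the join.

```r
# Trimmed version of the question's data (only the columns needed here).
testData <- data.frame(
  Email = c("stack@overflow.com", "stack@exchange.com", "data@frame.com",
            "block@quote.com", "ht@ml.com", "tele@phone.com"),
  Campaign = 1,
  Bounced = 0,
  stringsAsFactors = FALSE
)
testBounced <- data.frame(
  Email = c("stack@overflow.com", "stack@overflow.com", "data@frame.com",
            "block@quote.com", "ht@ml.com", "lap@top.com"),
  Campaign = c(1, 2, 2, 1, 1, 1),
  stringsAsFactors = FALSE
)

# Hypothetical flag column: TRUE for every row that exists in testBounced.
testBounced$InBounced <- TRUE

# all.x = TRUE keeps every row of testData (a left join); rows without a
# matching (Email, Campaign) pair get NA in the columns coming from testBounced.
left <- merge(testData, testBounced, by = c("Email", "Campaign"), all.x = TRUE)

# Turn the NA placeholders into FALSE so the flag is a clean logical.
left$InBounced <- !is.na(left$InBounced)
print(left)
```

This keeps all six rows only because each (Email, Campaign) pair appears at most once in testBounced; if a pair were duplicated there, the corresponding testData row would be duplicated in the result as well.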

The default argument for by is intersect(names(x), names(y)), so you technically don't need to specify the columns in this case, but it's good for clarity.
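
To get at the goal stated in the question's EDIT (setting Bounced to 1 in testData itself), one possible approach, which is not the merge call above but a sketch built on the same shared key columns, is to paste the keys into a single string per row and test membership with %in%. The helper `make_key` is a hypothetical name, testData is trimmed to three columns for brevity, and the separator is assumed never to occur in the data.

```r
# Trimmed version of the question's data (only the columns needed here).
testData <- data.frame(
  Email = c("stack@overflow.com", "stack@exchange.com", "data@frame.com",
            "block@quote.com", "ht@ml.com", "tele@phone.com"),
  Campaign = 1,
  Bounced = 0,
  stringsAsFactors = FALSE
)
testBounced <- data.frame(
  Email = c("stack@overflow.com", "stack@overflow.com", "data@frame.com",
            "block@quote.com", "ht@ml.com", "lap@top.com"),
  Campaign = c(1, 2, 2, 1, 1, 1),
  stringsAsFactors = FALSE
)

# The column names shared by both data frames; this is what merge would use
# for `by` if it were left unspecified.
keys <- intersect(names(testData), names(testBounced))
print(keys)

# Hypothetical helper: one composite key string per row.
# Assumes the separator "\r" never appears in the key columns.
make_key <- function(df) do.call(paste, c(df[keys], sep = "\r"))

# Flag rows of testData whose (Email, Campaign) pair occurs in testBounced,
# leaving the row and column order of testData untouched.
testData$Bounced[make_key(testData) %in% make_key(testBounced)] <- 1
print(testData)
```

Unlike merge, this modifies testData in place, so its rows stay in their original order and no extra columns are added.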

160

answered Sep 23 '22 03:09

Señor O
