How to join data frames based on condition between 2 columns

Tags: merge, dataframe, r

I am stuck with a project where I need to merge two data frames. They look something like this:

Data1
Traffic Source    Registrations    Hour    Minute
organic           1                6        13
social            1                8        54

Data2
Email                     Hour2   Minute2
test@domain.com           6         13
test2@domain2.com         8         55

I have the following line of code to merge the 2 data frames:

merge.df <- merge(Data1, Data2, by.x = c( "Hour", "Minute"),
           by.y = c( "Hour2", "Minute2"))

This would work great if the time variables (hours and minutes) weren't slightly off between the two data sets. Is there a way to make the column "Minute" match "Minute2" when the two differ by at most one minute?

I thought I could create 2 new columns for data set one:

Data1
Traffic Source    Registrations   Hour   Minute    Minute_plus1   Minute_minus1
organic           1               6        13      14              12
social            1               8        54      55              53

Is it possible to merge the 2 data frames if "Minute2" matches any of "Minute", "Minute_plus1", or "Minute_minus1"? Or is there a more efficient way to accomplish this merge?

asked Apr 28 '15 18:04

heyydrien

1 Answer

For stuff like this I usually turn to SQL:

library(sqldf)
x = sqldf("
  SELECT *
  FROM Data1 d1 JOIN Data2 d2
  ON d1.Hour = d2.Hour2
  AND ABS(d1.Minute - d2.Minute2) <= 1
")

Depending on the size of your data, you could also just join on Hour and then filter. Using dplyr:

library(dplyr)
x = Data1 %>%
  left_join(Data2, by = c("Hour" = "Hour2")) %>%
  filter(abs(Minute - Minute2) <= 1)

though you could do the same thing with base functions.
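
For reference, here is a minimal sketch of the base version of the same join-then-filter idea. The sample data mirrors the question, except that the column is named Traffic_Source (an assumption, to get a syntactic name). The second part is an optional variant, not part of the original approach: it compares minutes since midnight, so that pairs straddling an hour boundary (say 6:59 and 7:00) would also match. The helper columns t and t2 are hypothetical names introduced for illustration.

```r
# Sample data mirroring the question.
Data1 <- data.frame(
  Traffic_Source = c("organic", "social"),
  Registrations  = c(1, 1),
  Hour           = c(6, 8),
  Minute         = c(13, 54),
  stringsAsFactors = FALSE
)
Data2 <- data.frame(
  Email   = c("test@domain.com", "test2@domain2.com"),
  Hour2   = c(6, 8),
  Minute2 = c(13, 55),
  stringsAsFactors = FALSE
)

# Join on the hour only, then keep pairs whose minutes differ by at most 1.
joined <- merge(Data1, Data2, by.x = "Hour", by.y = "Hour2")
result <- joined[abs(joined$Minute - joined$Minute2) <= 1, ]
result

# Variant: compare minutes since midnight instead of Hour and Minute separately.
# merge(..., by = NULL) builds the full cross product, so this only suits
# small data. It does not handle wrap-around at midnight (23:59 vs 0:00).
Data1$t  <- Data1$Hour  * 60 + Data1$Minute
Data2$t2 <- Data2$Hour2 * 60 + Data2$Minute2
pairs   <- merge(Data1, Data2, by = NULL)
result2 <- pairs[abs(pairs$t - pairs$t2) <= 1, ]
result2
```

On larger data the first form is usually the safer choice, since joining on Hour keeps the intermediate table much smaller than a full cross product.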

answered Oct 17 '22 18:10

Gregor Thomas
