R merge based on condition other than equality

Tags: merge, dataframe, r

I have a dataframe that looks something like:

date            minutes_since_midnight   value
2015-01-01      50                       2
2015-01-01      60                       1.5
2015-01-02      45                       3.3
2015-01-03      99                       5.5

and another dataframe looking something like this

date        minutes_since_midnight   other_value
2015-01-01  55                       12
2015-01-01  80                       33
2015-01-02  45                       88

I want to add a column to the first data frame holding a boolean that says whether the second data frame contains at least one row with the same date and a minutes_since_midnight less than or equal to the minutes_since_midnight of the row in the first data frame. For the example data above, the result would be:

date        minutes_since_midnight    value  has_other_value
2015-01-01  50                        2      False
2015-01-01  60                        1.5    True
2015-01-02  45                        3.3    True
2015-01-03  99                        5.5    False
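
Put as code, the rule can be sketched directly in base R. This is only a row-by-row reference check, assuming the two tables are stored as data frames named df1 and df2 (names the question does not give) with date kept as character strings. It makes one pass over df2 per row of df1, so it is meant for checking results rather than for large data.

```r
# Example data from the tables above; the names df1 and df2 are assumed.
df1 <- data.frame(
  date = c("2015-01-01", "2015-01-01", "2015-01-02", "2015-01-03"),
  minutes_since_midnight = c(50, 60, 45, 99),
  value = c(2, 1.5, 3.3, 5.5),
  stringsAsFactors = FALSE
)
df2 <- data.frame(
  date = c("2015-01-01", "2015-01-01", "2015-01-02"),
  minutes_since_midnight = c(55, 80, 45),
  other_value = c(12, 33, 88),
  stringsAsFactors = FALSE
)

# For each row of df1: does df2 have any row with the same date and a
# minutes_since_midnight that is <= this row's minutes_since_midnight?
df1$has_other_value <- mapply(
  function(d, m) any(df2$date == d & df2$minutes_since_midnight <= m),
  df1$date, df1$minutes_since_midnight,
  USE.NAMES = FALSE
)

print(df1)
```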

How can I do this?

Hope this makes sense,

Thanks in advance

user555265 asked Apr 20 '15 12:04



2 Answers

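I would probably join the data.frames along the lines of the other answer, then create the variable and drop unneeded columns. But here's an option using the dplyr package to perform the steps as you describe them. It relies on the fact that a matching row exists exactly when the smallest minutes_since_midnight for that date in df2 is less than or equal to the row's own value: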

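library(dplyr)

# Earliest minutes_since_midnight in df2 for each date
min_by_date <- df2 %>%
  group_by(date) %>%
  summarise(minMins = min(minutes_since_midnight))

# left_join() keeps df1's row order, so the comparison lines up row by row
df1$has_other_value <-
  left_join(df1, min_by_date, by = "date")$minMins <= df1$minutes_since_midnight

# Dates that are absent from df2 give NA; treat those as FALSE
df1$has_other_value[is.na(df1$has_other_value)] <- FALSE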

Result:

        date minutes_since_midnight value has_other_value
1 2015-01-01                     50   2.0           FALSE
2 2015-01-01                     60   1.5            TRUE
3 2015-01-02                     45   3.3            TRUE
4 2015-01-03                     99   5.5           FALSE
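
The "join, then create the variable and drop unneeded columns" route mentioned at the top of this answer could look roughly like the following single pipeline. It is a sketch, assuming dplyr is installed and that df1 and df2 hold the question's example data; earliest and result are names chosen here for illustration.

```r
library(dplyr)

# Example data from the question; the names df1 and df2 are assumed.
df1 <- data.frame(
  date = c("2015-01-01", "2015-01-01", "2015-01-02", "2015-01-03"),
  minutes_since_midnight = c(50, 60, 45, 99),
  value = c(2, 1.5, 3.3, 5.5),
  stringsAsFactors = FALSE
)
df2 <- data.frame(
  date = c("2015-01-01", "2015-01-01", "2015-01-02"),
  minutes_since_midnight = c(55, 80, 45),
  other_value = c(12, 33, 88),
  stringsAsFactors = FALSE
)

# One row per date: the earliest minute recorded in df2.
earliest <- df2 %>%
  group_by(date) %>%
  summarise(minMins = min(minutes_since_midnight))

result <- df1 %>%
  left_join(earliest, by = "date") %>%
  # Dates with no df2 rows have minMins = NA; !is.na() turns those into FALSE.
  mutate(has_other_value = !is.na(minMins) & minMins <= minutes_since_midnight) %>%
  # Drop the helper column once the flag has been computed.
  select(-minMins)

print(result)
```

Folding the NA handling into the mutate() step avoids the separate assignment afterwards, and the helper column never reaches the final data frame.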
Sam Firke answered Oct 20 '22 01:10


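Could you rename the minutes_since_midnight variables to minutes_since_midnight1 and minutes_since_midnight2, merge the two data frames on date, and then create the required has_other_value variable with an ifelse() call? Because one date can match several rows of the second data frame, the merged result then has to be collapsed back to one value per original row.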
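
A minimal base R sketch of this idea, assuming df1 and df2 hold the question's example data. Instead of renaming by hand it lets the suffixes argument of merge() produce the two column names. The names row_id, merged and hit are helpers invented here for illustration.

```r
# Example data from the question; the names df1 and df2 are assumed.
df1 <- data.frame(
  date = c("2015-01-01", "2015-01-01", "2015-01-02", "2015-01-03"),
  minutes_since_midnight = c(50, 60, 45, 99),
  value = c(2, 1.5, 3.3, 5.5),
  stringsAsFactors = FALSE
)
df2 <- data.frame(
  date = c("2015-01-01", "2015-01-01", "2015-01-02"),
  minutes_since_midnight = c(55, 80, 45),
  other_value = c(12, 33, 88),
  stringsAsFactors = FALSE
)

# Temporary key so that merged rows can be traced back to rows of df1.
df1$row_id <- seq_len(nrow(df1))

# all.x = TRUE keeps df1 rows whose date is absent from df2. The suffixes
# turn the clashing column into minutes_since_midnight1 (from df1) and
# minutes_since_midnight2 (from df2).
merged <- merge(df1, df2[, c("date", "minutes_since_midnight")],
                by = "date", all.x = TRUE, suffixes = c("1", "2"))

# Flag each merged row; unmatched dates (NA from df2) count as FALSE.
merged$hit <- ifelse(is.na(merged$minutes_since_midnight2), FALSE,
                     merged$minutes_since_midnight2 <= merged$minutes_since_midnight1)

# A row of df1 gets TRUE if any of its merged rows was a hit.
df1$has_other_value <- df1$row_id %in% merged$row_id[merged$hit]
df1$row_id <- NULL

print(df1)
```

Unlike the per-date minimum, this keeps every matching pair in merged, which can get large when a date has many rows on both sides.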

figurine answered Oct 20 '22 03:10