
Sum data from two data frames matched by rowname

I have two data frames with different numbers of rows, thus:

df1:
           Data1
2019-03-01 0.011
2019-04-01 0.021
2019-05-01 0.013
2019-06-01 0.032
2019-07-01 NA

df2:
           Data2
2019-01-01 0.012
2019-02-01 0.024
2019-03-01 0.033
2019-04-01 0.017
2019-05-01 0.055
2019-06-01 0.032
2019-07-01 0.029
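
For anyone who wants to run the answers, the two data frames printed above can be recreated roughly as follows. This is only a sketch: it assumes the row names are plain character strings in YYYY-MM-DD form, which the printout does not confirm.

```r
# Recreate the example data. Assumption: row names are character dates,
# generated here as month starts and formatted as "YYYY-MM-DD".
df1 <- data.frame(
  Data1 = c(0.011, 0.021, 0.013, 0.032, NA),
  row.names = format(seq(as.Date("2019-03-01"), by = "month", length.out = 5))
)

df2 <- data.frame(
  Data2 = c(0.012, 0.024, 0.033, 0.017, 0.055, 0.032, 0.029),
  row.names = format(seq(as.Date("2019-01-01"), by = "month", length.out = 7))
)
```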

The row names are dates. I want to add a second column, "Result", to df1. Each value in this column would be the sum of df1$Data1 and the value of df2$Data2 from the row with the same row name. (The row names in both data frames are unique and ordered.) So, for example:

df1$Result[1] <- df1$Data1[1] + df2$Data2[3]

The result would be:

df1:
           Data1 Result
2019-03-01 0.011 0.044
2019-04-01 0.021 0.038
2019-05-01 0.013 0.068
2019-06-01 0.032 0.064
2019-07-01 NA    NA

The only way I can figure out how to do this is with a looping construct, but I think there is a better way. I'm not finding it, though, so I imagine I'm looking for the wrong thing. Any ideas?

I am also open to other suggestions for getting to the same end. So, for example, if this would be easier to accomplish with the dates in a data column instead of in the row name, that would be fine. Or if it would be easier to do with a ts object, although I generally find data frames easier to work with.


2 Answers

You can merge the two data frames by row names (by = 0) and then add the corresponding columns:

transform(merge(df1, df2, by = 0), sum = Data1 + Data2)


#   Row.names Data1 Data2   sum
#1 2019-03-01 0.011 0.033 0.044
#2 2019-04-01 0.021 0.017 0.038
#3 2019-05-01 0.013 0.055 0.068
#4 2019-06-01 0.032 0.032 0.064
#5 2019-07-01    NA 0.029    NA

Or similarly with dplyr, after moving the row names into a regular column with tibble's rownames_to_column():

library(dplyr)
library(tibble)

inner_join(df1 %>% rownames_to_column(),
           df2 %>% rownames_to_column(), by = "rowname") %>%
  mutate(Result = Data1 + Data2)
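
Both joins return a new data frame rather than adding a column to df1. If the goal is to keep df1 and its row names and just attach Result, one base R option (an alternative to merging, not part of the answer above) is to look up the matching row positions with match(). The sketch below rebuilds the example data so it can be run on its own; the variable name idx is purely illustrative.

```r
# Example data, with row names assumed to be character dates.
df1 <- data.frame(
  Data1 = c(0.011, 0.021, 0.013, 0.032, NA),
  row.names = c("2019-03-01", "2019-04-01", "2019-05-01",
                "2019-06-01", "2019-07-01")
)
df2 <- data.frame(
  Data2 = c(0.012, 0.024, 0.033, 0.017, 0.055, 0.032, 0.029),
  row.names = c("2019-01-01", "2019-02-01", "2019-03-01", "2019-04-01",
                "2019-05-01", "2019-06-01", "2019-07-01")
)

# For each row of df1, find the position of the same row name in df2.
# match() is exact and gives NA where df2 has no such row name.
idx <- match(rownames(df1), rownames(df2))

# Vectorised sum; an NA in either input (or a missing match) yields NA.
df1$Result <- df1$Data1 + df2$Data2[idx]

print(df1)
```

Because match() returns NA for row names that are missing from df2, those rows would get NA in Result instead of being dropped, which is the main behavioural difference from the inner joins.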
Ronak Shah answered Sep 04 '25 04:09


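We can use data.table. setDT() converts each data frame to a data.table by reference, and keep.rownames = TRUE moves the row names into a column named rn that the join can use: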

library(data.table)
setDT(df1, keep.rownames = TRUE)
setDT(df2, keep.rownames = TRUE)
df2[df1, on = .(rn)][, sum := Data1 + Data2][]
#           rn Data2 Data1   sum
#1: 2019-03-01 0.033 0.011 0.044
#2: 2019-04-01 0.017 0.021 0.038
#3: 2019-05-01 0.055 0.013 0.068
#4: 2019-06-01 0.032 0.032 0.064
#5: 2019-07-01 0.029    NA    NA
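
A variation on the same idea is an update join, which adds the new column to the first table by reference instead of building a joined copy. This is a sketch rather than part of the answer above: dt1 and dt2 are illustrative names, the data is rebuilt so the block runs on its own, and as.data.table() is used instead of setDT() so the original data frames stay untouched.

```r
library(data.table)

# Example data, with row names assumed to be character dates.
df1 <- data.frame(
  Data1 = c(0.011, 0.021, 0.013, 0.032, NA),
  row.names = c("2019-03-01", "2019-04-01", "2019-05-01",
                "2019-06-01", "2019-07-01")
)
df2 <- data.frame(
  Data2 = c(0.012, 0.024, 0.033, 0.017, 0.055, 0.032, 0.029),
  row.names = c("2019-01-01", "2019-02-01", "2019-03-01", "2019-04-01",
                "2019-05-01", "2019-06-01", "2019-07-01")
)

# Copy to data.tables; keep.rownames = TRUE stores the row names in "rn".
dt1 <- as.data.table(df1, keep.rownames = TRUE)
dt2 <- as.data.table(df2, keep.rownames = TRUE)

# Update join: for rows of dt1 whose rn appears in dt2, add Result in place.
# The i. prefix refers to the column coming from dt2.
dt1[dt2, on = "rn", Result := Data1 + i.Data2]

print(dt1)
```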
akrun answered Sep 04 '25 06:09


