
R: update the values in df1 based on data in df2

Tags:

r

Hi, I have two data frames (df1 and df2) that share two variables (ID and Yr). I want to update a third variable (value) in df1 with the new values from the matching rows of df2. But the code below does not update df1; the values do not seem to be passed to the corresponding cells in df1.

df1 = data.frame(ID = c("a","b","c","d","e") ,
                 Yr = c(2000,2001,2002,2003,2004), 
                 value= c(100,100,100,100, 100))
df2 = data.frame(ID = c("a","b","c") ,
                 Yr = c(2000,2001,2002), 
                 valuenew= c(200,150,120))


for (i in 1:nrow(df2)){
  id <- df2[i,'ID']
  year <- df2[i, 'Yr']
  valuenew<- df2[i, 'valuenew']
  df1[which (df1$ID == id & df1$Yr == year), 'value'] <- valuenew
}
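
For comparison, the same update can be written without a loop. This is only a minimal base R sketch, assuming each (ID, Yr) pair occurs at most once in df2; the names key1, key2, idx and hit are made up for illustration.

```r
df1 <- data.frame(ID = c("a", "b", "c", "d", "e"),
                  Yr = c(2000, 2001, 2002, 2003, 2004),
                  value = c(100, 100, 100, 100, 100))
df2 <- data.frame(ID = c("a", "b", "c"),
                  Yr = c(2000, 2001, 2002),
                  valuenew = c(200, 150, 120))

# Build one composite key per row (key1, key2, idx, hit are illustrative names)
key1 <- paste(df1$ID, df1$Yr, sep = "_")
key2 <- paste(df2$ID, df2$Yr, sep = "_")

# For every row of df1, find the matching row of df2 (NA when there is none)
idx <- match(key1, key2)

# Overwrite only the rows of df1 that have a match in df2
hit <- !is.na(idx)
df1$value[hit] <- df2$valuenew[idx[hit]]
df1
```

Rows d and e have no counterpart in df2, so they keep their original value of 100.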

The desired result:

   ID  Yr   value
    a  2000   200
    b  2001   150
    c  2002   120
    d  2003   100
    e  2004   100

This is the real data I use, for which none of these solutions works:

df1    
head(df1, 5)
                           CoreID   Yr   FluxTot
1 Asmund2000_Greenland coast_4001 1987 0.3239693
2 Asmund2000_Greenland coast_4001 1986 0.2864100
3 Asmund2000_Greenland coast_4001 1985 0.2488508
4 Asmund2000_Greenland coast_4001 1984 0.2964794
5 Asmund2000_Greenland coast_4001 1983 0.3441080

df2
head(df2, 5)
                      CoreID   Yr GamfitHgdep
1       Beal2015_Mount Logan 2000  0.01105077
2 Eyrikh2017_Belukha glacier 2000  0.02632597
3       Zheng2014_Mt. Oxford 2000  0.01377599
4          Zheng2014_Agassiz 2000  0.01940151
5     Zheng2014_NEEM-2010-S3 2000 -0.01483026

#merged database
m<-merge(df1, df2)
head(m,5)

              CoreID   Yr     FluxTot  GamfitHgdep
1 Beal2014_Yanacocha 2000 0.003365556  0.024941373
2 Beal2014_Yanacocha 2001 0.003423333  0.027831253
3 Beal2014_Yanacocha 2002 0.003481111 -0.002908330
4 Beal2014_Yanacocha 2003 0.003538889 -0.004591100
5 Beal2014_Yanacocha 2004 0.003596667  0.005189858

Below is the exact code I used, which failed to do the trick. Replacing the value-assignment part with any of the other solutions makes no difference. No warning or error is raised.

library(readxl)
library(dplyr)

metal = 'Hg' 
df = read_excel('All core data.xlsx','Sheet1')
df = data.frame(df)
df1 <- df[which (df$Metal==metal),] 
rownames(df1) = seq(length=nrow(df1))
head(df1, 5)

dfgam = read_excel('GAM prediction.xlsx','Sheet1')
df2 <- data.frame(dfgam)
head(df2, 5)

for (i in 1:nrow(df2)){
  coreid <- df2[i,'CoreID']
  year <- df2[i, 'Yr']
  predicted<- df2[i, 'GamfitHgdep']
  df1[which (df1$CoreID == coreid & df1$Yr == year), 'FluxTot'] <- predicted
}

After running the code, the values in df1 have not changed. For instance, the FluxTot value for Beal2014_Yanacocha in 2000 should be 0.024941373, as shown in head(m, 5).
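
One way to check whether the loop changed anything is to compare the matched rows directly instead of looking at head(df1), since the first rows of df1 may belong to cores that do not occur in df2. The sketch below uses made-up toy data that only reuses the column names above; the core names and numbers are hypothetical, not the real data.

```r
# Toy stand-ins for the real data (core names and values are hypothetical)
df1 <- data.frame(CoreID = c("core A", "core A", "core B"),
                  Yr = c(1999, 2000, 2000),
                  FluxTot = c(0.30, 0.31, 0.32))
df2 <- data.frame(CoreID = c("core A", "core B"),
                  Yr = c(2000, 2000),
                  GamfitHgdep = c(0.011, 0.026))

# Same update loop as in the question
for (i in seq_len(nrow(df2))) {
  rows <- which(df1$CoreID == df2$CoreID[i] & df1$Yr == df2$Yr[i])
  df1[rows, "FluxTot"] <- df2$GamfitHgdep[i]
}

# After the update, FluxTot and GamfitHgdep should agree on every matched row
check <- merge(df1, df2, by = c("CoreID", "Yr"))
n_matched <- nrow(check)
n_differ <- sum(check$FluxTot != check$GamfitHgdep)
```

If n_differ is 0 while n_matched is greater than 0, the update did reach the matched rows, and the unchanged values seen earlier were probably rows without a counterpart in df2.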

Elizabeth asked Oct 29 '25 09:10


1 Answer

Since dplyr version 1.0.0, you can use rows_update for this:

# rows_update() returns a new data frame, so assign the result back to df1
df1 <- dplyr::rows_update(
    df1, 
    dplyr::rename(df2, value = valuenew), 
    by = c("ID", "Yr")
)
df1
#   ID   Yr value
# 1  a 2000   200
# 2  b 2001   150
# 3  c 2002   120
# 4  d 2003   100
# 5  e 2004   100
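
Applied to the column names of the real data in the question, the same idea might look like the sketch below. The toy values and the name updates are hypothetical. It also assumes dplyr 1.1.0 or later for the unmatched argument, which is set here so that keys present only in df2 are skipped instead of raising an error.

```r
library(dplyr)

# Toy stand-ins for the real data (core names and values are hypothetical)
df1 <- data.frame(CoreID = c("core A", "core A", "core B"),
                  Yr = c(1999, 2000, 2000),
                  FluxTot = c(0.30, 0.31, 0.32))
df2 <- data.frame(CoreID = c("core A", "core C"),
                  Yr = c(2000, 2000),
                  GamfitHgdep = c(0.011, 0.026))

# Keep only the key columns and rename the new values to the target column,
# because rows_update() expects every column of the update table to exist in df1
updates <- df2 %>%
  select(CoreID, Yr, FluxTot = GamfitHgdep)

# Assign the result back; "core C" has no match in df1 and is ignored
df1 <- rows_update(df1, updates, by = c("CoreID", "Yr"), unmatched = "ignore")
df1
```

Only the row for core A in 2000 changes in this toy example; the other two rows keep their original FluxTot.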
SamR answered Oct 31 '25 02:10


