I really hate to ask two questions in a row but this is something that I can’t wrap my head around. So let’s say I have a data frame, as follows:
df
Row# User Morning Evening Measure Date
1 1 NA NA 2/18/11
2 1 50 115 2/19/11
3 1 85 128 2/20/11
4 1 62 NA 2/25/11
5 1 48 100.8 3/8/11
6 1 19 71 3/9/11
7 1 25 98 3/10/11
8 1 NA 105 3/11/11
9 2 48 105 2/18/11
10 2 28 203 2/19/11
11 2 35 80.99 2/21/11
12 2 91 78.25 2/22/11
Is it possible in R to take the difference between the previous consecutive day (and only the previous day, not the previous result) evening value of 1 row and the morning value of a different row for each user group? So my desired results would be this.
df
Row# User Morning Evening Date Difference
1 1 NA NA 2/18/11 NA
2 1 50 115 2/19/11 NA
3 1 85 129 2/20/11 30
4 1 62 NA 2/25/11 NA
5 1 48 100.8 3/8/11 NA
6 1 19 71 3/9/11 81.8
7 1 25 98 3/10/11 46
8 1 10 105 3/11/11 88
9 2 48 105 2/18/11 NA
10 2 28 203 2/19/11 77
11 2 35 80.99 2/21/11 NA
12 2 91 78.25 2/22/11 -10.01
All I want this to do is to take the morning value and subtract it from the evening value of the previous consecutive day for each user group. As you can see, some parts of my data frame contain NA values in the morning and evening columns, in addition, not all of the dates are in consecutive order for each different user, so naturally, NA should be assigned.
I've tried searching google but there wasn't much information on being able to apply functions to different rows for each group of rows on different columns (if that makes any sense).
My attempts include many variations of this.
df$Difference<-ave((df$Morning,df$Evening),
df$User,
FUN=function(x){
c('NA',diff(df$Evening-df$Morning)),na.rm=T
})
Again, any help would be greatly appreciated. Thanks.
Note: The input data you show and the output data are not the same. There is a NA
which is replaced by 10
in output and the last date is 2/14/11
in input and 2/22/11
in output.
I've assumed the output to be the original data to create this answer to match your result.
df$Diff <- c(NA, head(df$Evening, -1) - tail(df$Morning, -1))
df$Diff[which(c(0, diff(as.Date(as.character(df$Measure_Date),
format="%m/%d/%Y"))) != 1)] <- NA
> df
# Row User Morning Evening Measure_Date Diff
# 1 1 1 NA NA 2/18/11 NA
# 2 2 1 50 115.00 2/19/11 NA
# 3 3 1 85 128.00 2/20/11 30.00
# 4 4 1 62 NA 2/25/11 NA
# 5 5 1 48 100.80 3/8/11 NA
# 6 6 1 19 71.00 3/9/11 81.80
# 7 7 1 25 98.00 3/10/11 46.00
# 8 8 1 10 105.00 3/11/11 88.00
# 9 9 2 48 105.00 2/18/11 NA
# 10 10 2 28 203.00 2/19/11 77.00
# 11 11 2 35 80.99 2/21/11 NA
# 12 12 2 91 78.25 2/22/11 -10.01
@user1342086's edit (that got rejected, but was right indeed):
df$Diff[which(diff(df$User) != 0)] <- NA
seems to take care of the grouping by "User".
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With