I was hoping someone could help me with this problem. I’m working with time series data that has records every five years. The data frame df
is an example, with the time step in df$Time
and record values in df$A
.
df
Time A
5 4.8
10 19.6
15 27.5
20 39.7
What I need to do is convert df
to an annual time series and populate all the new records with new values based on a calculation. I have successfully converted df
to annual data df2
using the following:
df2 <- data.frame("Time" = c(5:20), "A" = c(5:20))
df2$A[] <- sapply(df2$A, function(x) df$A[match(x, df$Time)])
df2[is.na(df2)] <- 0
df2
Time A
5 4.8
6 0
7 0
8 0
9 0
10 19.6
11 0
12 0
13 0
14 0
15 27.5
16 0
17 0
18 0
19 0
20 39.7
What I can’t figure out is how to calculate new values for df2$A
. The calculation depends on the relative positions of other row/record values – for example, fordf2$A[2]
the calculation would be df2$A[2] <- df2$A[1] + (df2$A[6] - df2$A[1])/5
. It also changes throughout the data frame, for example df2$A[7] <- df2$A[6] + (df2$A[11] – df2$A[6]) / 5
.
for
loops have yielded only frustration, and I'm trying to do this without doing a line of code for every new record (the actual data has hundreds of records). I'm trying to get to df3
in an efficient way - thank you!
df3
Time A
5 4.8
6 7.76
7 10.72
8 13.68
9 16.64
10 19.6
11 21.18
12 22.76
13 24.34
14 25.92
15 27.5
16 29.94
17 32.38
18 34.82
19 37.26
20 39.7
What you are looking for is called linear interpolation. In R you can use the approx
function like this:
df <- data.frame("Time" = c(5, 10, 15, 20), "A" = c(4.8, 19.6, 27.5, 39.7))
df2 <- as.data.frame(approx(x = df$Time, y = df$A, xout = 5:20))
names(df2) <- names(df)
Result:
> df2
Time A
1 5 4.80
2 6 7.76
3 7 10.72
4 8 13.68
5 9 16.64
6 10 19.60
7 11 21.18
8 12 22.76
9 13 24.34
10 14 25.92
11 15 27.50
12 16 29.94
13 17 32.38
14 18 34.82
15 19 37.26
16 20 39.70
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With