
Fitting a spline function in R to interpolate daily values from monthly values

Take a data frame that looks like this and contains data for some dates in 2005 and a measurement at each date.

df <- data.frame("date" = c('2005-04-04','2005-04-19', '2005-04-26', '2005-05-05', 
'2005-05-12', '2005-05-25', '2005-06-02', '2005-06-16', '2005-07-07', '2005-07-14', 
'2005-07-21', '2005-08-04'), "numbers" = c(90,50,50,48,44,37,34,30,36,31,49,54))

I want a value for each day of the year (1:365) based on this: essentially a new data frame running from 01/01/2005 to 31/12/2005, infilled with values from a spline function fitted to these 12 existing values.

When I try to do this using:

numbers <- df$numbers
x = spline(1:365, numbers)

I get

Error in xy.coords(x, y, setLab = FALSE) : 'x' and 'y' lengths differ

I'm not sure what is going wrong.

asked Jul 19 '18 13:07 by Pad




1 Answer

It is easy to get rid of the error, but hard to get a sensible answer.

x <- as.POSIXlt(as.character(df$date))$yday + 1  ## day of year (start from 1)
y <- df$numbers  ## full column name; df$number only works through partial matching
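To see why the original call failed, it may help to compare lengths: spline() expects one x per observed y, and the days to predict belong in xout rather than in x. A minimal check, assuming df is the data frame from the question (the tz = "UTC" argument is my addition to keep the day-of-year conversion independent of the local time zone):

```r
# Data frame from the question
df <- data.frame(
  date = c("2005-04-04", "2005-04-19", "2005-04-26", "2005-05-05",
           "2005-05-12", "2005-05-25", "2005-06-02", "2005-06-16",
           "2005-07-07", "2005-07-14", "2005-07-21", "2005-08-04"),
  numbers = c(90, 50, 50, 48, 44, 37, 34, 30, 36, 31, 49, 54)
)

x <- as.POSIXlt(as.character(df$date), tz = "UTC")$yday + 1  # day of year
y <- df$numbers

n_target <- length(1:365)            # 365 days we want values for
n_obs <- length(y)                   # 12 observations
lengths_match <- length(x) == length(y)  # TRUE: spline(x, y, ...) is now valid
obs_range <- range(x)                # 94 216: first and last observed day
```

The original spline(1:365, numbers) paired 365 x values with 12 y values, which is what xy.coords() complained about.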

There are many interpolation spline methods: "fmm", "periodic", "natural", "monoH.FC" and "hyman". But not all of them are applicable here (and "monoH.FC" is only accepted by splinefun(), not by spline()).

y1 <- spline(x, y, xout = 1:365, method = "fmm")

y2 <- spline(x, y, xout = 1:365, method = "periodic")
#Warning message:
#In spline(x, y, xout = 1:365, method = "periodic") :
#  spline: first and last y values differ - using y[1] for both

y3 <- spline(x, y, xout = 1:365, method = "natural")

y4 <- spline(x, y, xout = 1:365, method = "monoH.FC")
#Error in spline(x, y, xout = 1:365, method = "monoH.FC") : 
#  invalid interpolation method

y5 <- spline(x, y, xout = 1:365, method = "hyman")
#Error in spline(x, y, xout = 1:365, method = "hyman") : 
#  'y' must be increasing or decreasing

See ?spline for details of these methods and the assumptions / requirements of each.
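The "invalid interpolation method" error arises because spline() does not accept "monoH.FC", whereas splinefun() does. A hedged sketch of that route, assuming the same df as in the question; the names f, days and y4 are my own, and I evaluate only between the first and last observed day:

```r
# Data frame from the question
df <- data.frame(
  date = c("2005-04-04", "2005-04-19", "2005-04-26", "2005-05-05",
           "2005-05-12", "2005-05-25", "2005-06-02", "2005-06-16",
           "2005-07-07", "2005-07-14", "2005-07-21", "2005-08-04"),
  numbers = c(90, 50, 50, 48, 44, 37, 34, 30, 36, 31, 49, 54)
)

x <- as.POSIXlt(as.character(df$date), tz = "UTC")$yday + 1  # day of year
y <- df$numbers

# splinefun() returns a function instead of a list of points
f <- splinefun(x, y, method = "monoH.FC")

# Evaluate on every day inside the observed range only
days <- seq(min(x), max(x))
y4 <- f(days)
```

Unlike "hyman", this method does not require y to be monotone, so it runs on these data. Whether its shape is preferable to "fmm" or "natural" is a judgment call that is best made by plotting it.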

So only y1 and y3 were obtained without a warning or an error. Let's plot them.

par(mfrow = c(1, 2))
plot(y1, type = "l", main = "fmm"); points(x, y, pch = 19)
plot(y3, type = "l", main = "natural"); points(x, y, pch = 19)

[Figure: spline interpolation / extrapolation, panels "fmm" and "natural"]

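As we can see, we have a big problem when extrapolating: the observations only cover 4 April to 4 August (days 94 to 216), so every value outside that range is extrapolated rather than interpolated.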
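One possible way around this, sketched here under my own assumptions rather than taken from the answer, is to keep the spline values only between the first and last observation and leave the rest of the year as NA. The names days, fit, inside and daily are hypothetical, and the choice of method = "natural" is arbitrary:

```r
# Data frame from the question
df <- data.frame(
  date = c("2005-04-04", "2005-04-19", "2005-04-26", "2005-05-05",
           "2005-05-12", "2005-05-25", "2005-06-02", "2005-06-16",
           "2005-07-07", "2005-07-14", "2005-07-21", "2005-08-04"),
  numbers = c(90, 50, 50, 48, 44, 37, 34, 30, 36, 31, 49, 54)
)

x <- as.POSIXlt(as.character(df$date), tz = "UTC")$yday + 1  # day of year
y <- df$numbers

days <- 1:365
fit <- spline(x, y, xout = days, method = "natural")

# TRUE only for days between the first and last observation
inside <- days >= min(x) & days <= max(x)

# Full-year data frame; extrapolated days are set to NA
daily <- data.frame(
  date = as.Date("2005-01-01") + (days - 1),
  numbers = ifelse(inside, fit$y, NA_real_)
)
```

This gives the 365-row data frame the question asks for, but only the days from 94 to 216 carry values. Filling the remaining days would need more data or an explicit assumption about how the series behaves outside the observed window.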

answered Oct 18 '22 03:10 by Zheyuan Li