Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How can I interpolate station data with Kriging in Python?

Browsing the web I've found that some tools to use Kriging in Python are pyKriging and Gaussian Process Regression. However, I couldn't make any of them to work. The first one doesn't work for me (can't even import it):

import pyKriging

  File "~/python3.6/site-packages/pyKriging/krige.py", line 142
    except Exception, err: 
                    ^
  SyntaxError: invalid syntax

and the second one I don't understand how to use it. I couldn't find a simple working example (this rroowwllaanndd answer for instance is great but sadly the data is no longer available to download)

So my question is, how could I interpolate my data using Kriging? I have several station data saved in numpy arrays that look like this:

2000      1         1         5.0
2000      1         2         3.4
2000      1         3         0.2

and the columns are Year - Month - Day - Precipitation. I have several of these data arrays (st1, st2, st3), and another array which contains the ID of each station and the coordinates at which each station is located (stid, so station 1 is located in longitude 15.6865, latitude 62.6420, and so on).

import numpy as np
st1 = np.array([[2000,1,1,5.0],[2000,1,2,3.4],[2000,1,3,0.2]])
st2 = np.array([[2000,1,1,8.2],[2000,1,2,2.5],[2000,1,3,0.0]])
st3 = np.array([[2000,1,1,np.nan],[2000,1,2,4.5],[2000,1,3,1.2]])

stid = np.array([[1,15.6865,62.6420],[2,15.7325,62.1254],[3,16.1035,61.1449]])

What I need is an array per day (or a 3D array) which contains the data of all stations interpolated with Kriging in a grid like this for each day:

y = np.arange(61,63,0.125)
x = np.arange(14,17,0.125)
X,Y = np.meshgrid(x,y)

Any help is appreciated.

like image 952
lanadaquenada Avatar asked Jul 18 '17 19:07

lanadaquenada


2 Answers

It is good to know to find interesting documentation, packages, etc. that kriging is often called "Gaussian Process Regression".

In python, a good implementation with many examples is the one of the well-known machine learning package scikit-learn. It is based on the well-known DACE matlab implementation.

The documentation for Gaussian Process Regression includes 5 tutorials, as well as a list of available kernels.

With the data you provided, you just have to do the following to fit a simple model with the kernel of your choice:

import sklearn
gp = sklearn.gaussian_process.GaussianProcessRegressor(kernel=your_chosen_kernel)
gp.fit(X, y)  
like image 181
Pop Avatar answered Sep 23 '22 17:09

Pop


With OpenTURNS, the KrigingAlgorithm estimates the hyperparameters of a conditioned gaussian process. The purpose of the metamodel that you need is to have the (longitude,latitude) 2D point as input and the precipitation at a given date as output.

The first step is to prepare the data. In the following script, I create the coordinates_train variable which contains the longitude/latitude pairs and the precipitation_train variable which contains the precipitations. I used the precipitations at date 2000/1/2, because the data is missing for 2000/1/1 at station 3.

import openturns as ot
# Input points
coordinates_train = ot.Sample([[15.68,62.64],[15.73,62.12],[16.10,61.14]])
# Output points
precipitation_train = ot.Sample([[3.4],[2.5],[4.5]]) # At 2000/1/2

Then we can train the kriging. To do this I use a constant basis (the trend of the model) and an exponential covariance model. This should be appropriate, given that the precipitation must be quite regular with respect to the location of the station.

# Fit
inputDimension = 2
basis = ot.ConstantBasisFactory(inputDimension).build()
covarianceModel = ot.SquaredExponential([1.]*inputDimension, [1.0])
algo = ot.KrigingAlgorithm(coordinates_train, precipitation_train, covarianceModel, basis)
algo.run()
result = algo.getResult()
krigingMetamodel = result.getMetaModel()

Then we can use the metamodel to predict the precipitation at a location where it was not recorded. Since krigingMetamodel is a function, I just use the "()" operator.

# Predict
coordinates = [15.70,62.53] # A new latitude/longitude pair
precipitation = krigingMetamodel(coordinates)

Then precipitation is a 1D point containing the precipitation at the given location. This is the predicted precipitation.

>>> print(precipitation)
[3.46667]

You might as well get a more general kriging by having the (longitude,latitude,time) as input. In this case, all you have to do is to add a new dimension to the input training sample, containing the associated times, formatted as real values.

like image 29
Michael Baudin Avatar answered Sep 25 '22 17:09

Michael Baudin