
How to process panel data for use in a recurrent neural network (RNN)

I have been doing some research on recurrent neural networks, but I am having trouble understanding whether and how they could be used to analyze panel data (meaning cross-sectional data that is captured at different periods in time for several subjects -- see the sample data below for an example). Most examples of RNNs I have seen deal with sequences of text rather than true panel data, so I'm not sure if they are applicable to this type of data.

Sample data:

ID    TIME    Y    X1    X2    X3
1     1       5     3     0    10
1     2       5     2     2    6
1     3       6     6     3    11
2     1       2     2     7    2
2     2       3     3     1    19
2     3       3     8     6    1
3     1       7     0     2    0

If I want to predict Y at a particular time given the covariates X1, X2 and X3 (as well as their values in previous time periods), can this kind of sequence be evaluated by a recurrent neural network? If so, do you have any resources or ideas on how to turn this type of data into feature vectors and matching labels that can be passed to an RNN? (I'm using Python, but am open to other implementations.)
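One possible framing, offered as a sketch under assumptions rather than a definitive recipe: treat each ID as one sequence, order its rows by TIME, and build a nested structure of shape (subjects, time steps, features) with matching targets. The helper name `panel_to_sequences`, the zero padding, and the mask are illustrative choices, not part of any library; only the Python standard library is used.

```python
from collections import defaultdict

# Sample panel from the question: (ID, TIME, Y, X1, X2, X3)
rows = [
    (1, 1, 5, 3, 0, 10),
    (1, 2, 5, 2, 2, 6),
    (1, 3, 6, 6, 3, 11),
    (2, 1, 2, 2, 7, 2),
    (2, 2, 3, 3, 1, 19),
    (2, 3, 3, 8, 6, 1),
    (3, 1, 7, 0, 2, 0),
]


def panel_to_sequences(rows, pad_value=0.0):
    """Group rows by ID into equal-length, padded sequences (hypothetical helper).

    Returns (ids, X, y, mask) where
      X[i][t]    is the feature vector [X1, X2, X3] of subject i at step t
      y[i][t]    is the target Y of subject i at step t
      mask[i][t] is 1 for a real observation and 0 for padding
    """
    by_id = defaultdict(list)
    for subject, time, target, *features in rows:
        by_id[subject].append((time, target, features))

    ids = sorted(by_id)
    max_len = max(len(by_id[i]) for i in ids)
    n_features = len(rows[0]) - 3  # everything after ID, TIME, Y

    X, y, mask = [], [], []
    for i in ids:
        steps = sorted(by_id[i], key=lambda s: s[0])  # order by TIME
        n_pad = max_len - len(steps)
        X.append([[float(v) for v in f] for _, _, f in steps]
                 + [[pad_value] * n_features for _ in range(n_pad)])
        y.append([float(t) for _, t, _ in steps] + [pad_value] * n_pad)
        mask.append([1] * len(steps) + [0] * n_pad)
    return ids, X, y, mask


ids, X, y, mask = panel_to_sequences(rows)
print(len(X), len(X[0]), len(X[0][0]))  # prints: 3 3 3
```

Nested lists like these can be converted with `numpy.asarray` into the (batch, time steps, features) layout that RNN layers in the common Python frameworks generally expect. The mask matters here because subject 3 has only one observation, so its padded steps should not contribute to the loss.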

user1895076 asked Oct 12 '16 20:10

People also ask

How does a recurrent neural network (RNN) work, and what is it used for?

Recurrent neural networks (RNNs) are a class of neural networks suited to modeling sequence data. Unlike plain feedforward networks, they carry a hidden state from one time step to the next, so each output can depend on earlier inputs in the sequence. This makes them useful for prediction tasks where order and history matter.

What type of input data is best suited for RNN?

RNNs are best suited to sequential data. They can handle inputs and outputs of varying length, using an internal (hidden) state to process a sequence one step at a time.

Do we need to normalize data for RNN?

Yes, normalisation/scaling is typically recommended and sometimes very important. For neural networks in particular, feeding unnormalised inputs into saturating activation functions (such as tanh or sigmoid) can push them into a very flat region where gradients are close to zero, so the network may learn extremely slowly or not at all.
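As a rough illustration of what such scaling can look like for panel covariates, here is a minimal standard-library sketch of per-feature standardisation (zero mean, unit variance). The helper names `fit_scaler` and `transform` are hypothetical, and treating the rows of subjects 1 and 2 from the sample data as the training set is an assumption made only for the example.

```python
from statistics import mean, pstdev


def fit_scaler(feature_rows):
    """Compute per-column mean and standard deviation (hypothetical helper)."""
    columns = list(zip(*feature_rows))
    means = [mean(c) for c in columns]
    # Fall back to 1.0 so a constant column does not cause division by zero.
    stds = [pstdev(c) or 1.0 for c in columns]
    return means, stds


def transform(feature_rows, means, stds):
    """Standardise every row using previously fitted statistics."""
    return [[(v - m) / s for v, m, s in zip(row, means, stds)]
            for row in feature_rows]


# X1, X2, X3 for subjects 1 and 2, assumed here to be the training rows.
train = [[3, 0, 10], [2, 2, 6], [6, 3, 11], [2, 7, 2], [3, 1, 19], [8, 6, 1]]

means, stds = fit_scaler(train)
scaled = transform(train, means, stds)
```

Fitting the statistics on training rows only and reusing them for validation or test rows avoids leaking information from held-out data into the model.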


1 Answer

TSAI (based on fastai) offers a panel data preparation function, SlidingWindowPanel, which might be of use to you: https://timeseriesai.github.io/tsai/data.preparation.html#SlidingWindowPanel

FYI: it also has some great SOTA algorithms for time series classification & regression.
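The general idea of windowing a panel can be sketched without any library. The code below is not TSAI's implementation or API; the function name `sliding_windows_per_id` and its parameters `window_len` and `horizon` are hypothetical. It slides a fixed-length window over each subject's rows separately, so no window ever mixes rows from two IDs.

```python
from itertools import groupby
from operator import itemgetter

# Sample panel from the question: (ID, TIME, Y, X1, X2, X3)
rows = [
    (1, 1, 5, 3, 0, 10),
    (1, 2, 5, 2, 2, 6),
    (1, 3, 6, 6, 3, 11),
    (2, 1, 2, 2, 7, 2),
    (2, 2, 3, 3, 1, 19),
    (2, 3, 3, 8, 6, 1),
    (3, 1, 7, 0, 2, 0),
]


def sliding_windows_per_id(rows, window_len, horizon=0):
    """Build (subject, X_window, y_target) samples per subject.

    X_window holds `window_len` consecutive feature vectors. y_target is the
    Y value `horizon` steps after the window's last row; horizon=0 means the
    target is aligned with the last row of the window.
    """
    samples = []
    ordered = sorted(rows, key=itemgetter(0, 1))  # sort by ID, then TIME
    for subject, group in groupby(ordered, key=itemgetter(0)):
        series = list(group)
        last_start = len(series) - window_len - horizon
        for start in range(last_start + 1):
            window = series[start:start + window_len]
            target_row = series[start + window_len + horizon - 1]
            features = [list(r[3:]) for r in window]
            samples.append((subject, features, target_row[2]))
    return samples


samples = sliding_windows_per_id(rows, window_len=2)
for subject, features, target in samples:
    print(subject, features, target)
```

With `window_len=2`, subjects 1 and 2 each yield two windows, while subject 3 yields none because it has only one observation. Subjects with short histories therefore drop out unless the window is shortened or padding is used.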

Georg Heiler answered Nov 15 '22 21:11