Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

KDB+\q: How to fill regularly spaced time series?

Tags:

kdb+

I am trying to implement .ml.filltab[] located here on a table of timeseries data composed of regularly interspaced temporal aggregations i.e.:

second     |  amount price 
-----------|----------------
02:46:01   |  54     9953.5
02:46:04   |  150          
02:46:05   |         9954.5

Applying the .ml.filltab[] function does not update the index so the table incorrectly fills values without the appropriate index. i.e.

second     |  amount price 
-----------|----------------
02:46:01   |  54     9953.5
02:46:04   |  150    9953.5    
02:46:05   |  150    9953.5

As opposed to correctly filling in the index as follows:

second     |  amount price 
-----------|----------------
02:46:01   |  54     9953.5
02:46:02   |  54     9953.5
02:46:03   |  54     9953.5
02:46:04   |  150    9953.5    
02:46:05   |  150    9954.5

In addition to the above It also requires a column by which to group the data i.e. a `sym column which is unnecessary when only aggregating a single series.

How does one run a fill forward operation on a table accounting for missing index? Thanks

like image 550
James Avatar asked Nov 02 '25 09:11

James


1 Answers

One way of doing this would be to create a secondary empty table with the same schema where the second column has all of the missing time points. To do this create a list of all the times you are interested in. One way of doing this would be to take the first and last element and do the following:

q)f:exec first second from t
q)f
02:46:01
q)l:exec last second from t
q)l
02:46:05
q)s:f+`second$til 1+`long$l-f
q)s
02:46:01 02:46:02 02:46:03 02:46:04 02:46:05

Note that the second type needed to be casted to a long type to work with the til function, then cast back to a second type. Also we add 1 to ensure the last time (02:46:05) is not missed out.

We can create an empty table with the same schema by doing the following:

q)et:select by second:s from 0#t
q)et
second  | amount price
--------| ------------
02:46:01|
02:46:02|
02:46:03|
02:46:04|
02:46:05|

The 0# is a quick way of creating an empty table with the same schema as the original table t. Now an lj can be used to merge the two tables together, and fills will then forward fill in any nulls:

q)et lj t
second  | amount price
--------| -------------
02:46:01| 54     9953.5
02:46:02|
02:46:03|
02:46:04| 150
02:46:05|        9954.5
q)fills et lj t
second  | amount price
--------| -------------
02:46:01| 54     9953.5
02:46:02| 54     9953.5
02:46:03| 54     9953.5
02:46:04| 150    9953.5
02:46:05| 150    9954.5
like image 137
Eliot Robinson Avatar answered Nov 04 '25 19:11

Eliot Robinson



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!