I have this table for stock prices (simplified version here):
+----------+--------+-------+
| Time | Ticker | Price |
+----------+--------+-------+
| 10:00:00 | A | 5 |
| 10:00:01 | A | 6 |
| 10:00:00 | B | 3 |
+----------+--------+-------+
I want to select the row group by Ticker with maximum Time, e.g.
+----------+--------+-------+
| Time | Ticker | Price |
+----------+--------+-------+
| 10:00:01 | A | 6 |
| 10:00:00 | B | 3 |
+----------+--------+-------+
I know how to do it in SQL, similar question can be found here , but I have no idea how to do elegantly it in KDB.
I have a solution that do selection twice:
select first Time, first Ticker, first Price by Ticker from (`Time xdesc select Time, Ticker, Price from table where date=2018.06.21)
Is there more clean solution?
Whenever you're doing a double select involving a by
, it's a good sign that you can instead use fby
q)t:([]time:10:00:00 10:00:01 10:00:00;ticker:`A`A`B;price:5 6 3)
q)
q)select from t where time=(max;time) fby ticker
time ticker price
---------------------
10:00:01 A 6
10:00:00 B 3
Kdb also offers a shortcut of taking last records whenever do you a select by
with no specified columns but this approach isn't as general or customizable
q)select by ticker from t
ticker| time price
------| --------------
A | 10:00:01 6
B | 10:00:00 3
One additional thing to note, select by
can give wrong results if the data is not sorted correctly.
e.g.
select by ticker from reverse[t]
ticker| time price
------| --------------
A | 10:00:00 5 //wrong result
B | 10:00:00 3
The fby
can get the correct results regardless of the order:
select from (reverse t) where time=(max;time) fby ticker
time ticker price
---------------------
10:00:00 B 3
10:00:01 A 6
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With