I am learning q on a kdb+ database. I am concerned that q seems to have no loops. I need to write an algorithm that I would write with several nested for-loops in a verbose language like C, but in q I am stuck because I cannot loop.
Just to give a specific example (one of many), I have this simple vector (a one-column table):
q)closures
price
-----
18.54
18.53
18.53
18.52
18.57
18.9
18.9
18.77
18.59
18.51
18.37
I need a vector that groups these entries three at a time, with overlap, like (using R syntax): closures[0:2],closures[1:3],closures[2:4],closures[3:5]... How can I do this?
In general, how do I need to change my mentality to program correctly in q?
Thanks a lot for your suggestions Marco
Addressing your last point about "how do I need to change my mentality, to correctly program in q?":
You need to use over (/), scan (\) and, for recursion, .z.s (a function's reference to itself) rather than loops.
For example, your problem could be solved in the following ways: (note that these would not actually be the best solutions for your particular problem - indexing is the better approach there - but nonetheless these solutions below should help get the point across)
price:18.54 18.53 18.53 18.52 18.57 18.9 18.9 18.77 18.59 18.51 18.37
q)3#'{1_x}\[8;price]
18.54 18.53 18.53
18.53 18.53 18.52
18.53 18.52 18.57
18.52 18.57 18.9
18.57 18.9 18.9
18.9 18.9 18.77
18.9 18.77 18.59
18.77 18.59 18.51
18.59 18.51 18.37
i.e. iterate over the list, dropping the first item each time, then take the first 3 of each intermediate result
or similarly
q)3#'{1 rotate x}\[8;price]
18.54 18.53 18.53
18.53 18.53 18.52
18.53 18.52 18.57
18.52 18.57 18.9
18.57 18.9 18.9
18.9 18.9 18.77
18.9 18.77 18.59
18.77 18.59 18.51
18.59 18.51 18.37
i.e. rotate by 1 eight times, take first 3 of each rotation
Using a .z.s approach
q){$[2<count x;enlist[3#x],.z.s 1_x;()]}[price]
18.54 18.53 18.53
18.53 18.53 18.52
18.53 18.52 18.57
18.52 18.57 18.9
18.57 18.9 18.9
18.9 18.9 18.77
18.9 18.77 18.59
18.77 18.59 18.51
18.59 18.51 18.37
i.e. if there are at least 3 elements left, take first 3, then chop off first item and re-apply same function to the shortened list.
Using over (/) would be convoluted in this example, but in general over is useful for replacing "while" type constructs
i:0
a:0;
while[i<10;i+:1;a+:10]
is better achieved using
q){x+10}/[10;0]
100
i.e. add 10, ten times with a starting (seed) value of zero.
b:();
while[not 18~last b;b,:1?20]
i.e. keep appending random numbers from 0 to 19 (1?20 never returns 20) until you hit 18, then stop.
is better achieved using
q){x,1?20}/[{not 18~last x};()]
1 2 16 5 8 18
i.e. append a random number from 0 to 19, iterating so long as the check function returns true, starting off with () as the seed value
There are many other options for using scan and over, depending on whether the function you're iterating is monadic or dyadic, etc.
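To illustrate the monadic versus dyadic distinction, here is a minimal sketch. The names steps, total and running are made up for this example and do not come from the question.

```q
/ monadic function with an iteration count: scan returns the seed followed by each result
steps:{x+10}\[3;0]        / 0 10 20 30

/ dyadic function folded over a list, with 0 as the seed
total:{x+y}/[0;1 2 3]     / 6

/ the same fold with scan keeps the running totals
running:{x+y}\[0;1 2 3]   / 1 3 6
```

Swapping / for \ like this is a handy way to see what each step of an over is doing.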
As a broad generalisation: a "for i=1:10" type loop can be achieved in q using "function each i", a "do" type loop can be achieved in q using "function/[numOfTimes;seed]", a "while" type loop can be achieved in q using "function/[booleanCheckFunction;seed]"
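A minimal sketch of those three mappings, using made-up functions and values purely for illustration (squares, doubled and capped are hypothetical names):

```q
/ "for" style: apply a function to each item of a list
squares:{x*x} each 1+til 10      / 1 4 9 16 25 36 49 64 81 100

/ "do" style: apply a function a fixed number of times to a seed
doubled:{x*2}/[5;1]              / 32

/ "while" style: keep applying while the check function returns true
capped:{x*2}/[{x<100};1]         / 128, the first result that fails the check
```

Note that the "while" form returns the first value for which the check is false, so the result can overshoot the limit used in the check.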
flip (-2_price;-1_1_price;2_price)
As already mentioned, using builtins is always faster than iteration with ' or /. That's why, on a vector of 1 million elements, the code from the top answer takes forever to complete and uses GBs of heap. The indexing method from JPC and Harshal is miles faster, but the flip above is three times faster still.
Once you have the data in the format you need just use each to apply your closure to the elements of the list.
Always keeping an eye on what \t tells you is a good practice - KDB will not optimize anything for you.
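As a rough sketch of how such a comparison could be run: the test data and the names byFlip and byIndex below are assumptions made for this example, with both approaches fixed at windows of 3. The timings that \t prints will differ by machine and kdb+ version.

```q
price:1000000?100f                      / hypothetical test data: 1 million random floats
byFlip:{flip (-2_x;-1_1_x;2_x)}         / the flip approach from this answer
byIndex:{x til[3]+/:til count[x]-2}     / an indexing approach: build indices, then index
\t byFlip price
\t byIndex price
same:byFlip[price]~byIndex price        / 1b if both give identical windows
```

Checking that the two results match before comparing timings guards against benchmarking code that does not do the same work.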
As for nested loops, something I have found useful is creating a cross list. E.g.
for i=1:10
  for j=1:20
    for k=1:30
      f(i, j, k)
in q you can write
il:1_til 11
jl:1_til 21
kl:1_til 31
lst:il cross jl cross kl
raze g each til count lst
where g is defined (before the line that calls it) as
g:{[i]
  itr:lst[i;0];
  jtr:lst[i;1];
  ktr:lst[i;2];
  f[itr;jtr;ktr]}
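A shorter variant of the same idea, as a sketch: since each item of the cross list is an (i;j;k) triple, f can be applied to each triple directly with dot-apply, without the index-based helper. The f below is a hypothetical stand-in, and the ranges are shortened to keep the output small.

```q
f:{[i;j;k] i+j+k}            / hypothetical stand-in for the real f
il:1+til 2
jl:1+til 3
kl:1+til 2
lst:il cross jl cross kl     / all 12 (i;j;k) triples, with k varying fastest
res:f ./: lst                / apply f to each triple, one result per triple
```

The triples come out in the same order as the nested loops would visit them: i is the outermost index and k the innermost.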
Hope this clarifies. As for the closures part, I don't know the R syntax there. I can help if you tell me what output you want.
One method would be to calculate the indices you care about and then index into the array -
q) f:{y til[x]+/:til 1+count[y]-x} // [x] = sublist length [y] = list
q) f[3;18.54 18.53 18.53 18.52 18.57 18.9 18.9 18.77 18.59 18.51 18.37]
18.54 18.53 18.53
18.53 18.53 18.52
18.53 18.52 18.57
18.52 18.57 18.9
18.57 18.9 18.9
18.9 18.9 18.77
18.9 18.77 18.59
18.77 18.59 18.51
18.59 18.51 18.37