Given a table, I would like to get the ratio between rows that are removed and approved by company, delimited by a certain range (amount).
Sample Table
status company amount
-----------------------
removed a 100
removed b 200
approved b 300
removed a 400
approved b 500
removed b 600
approved b 700
removed a 800
approved a 900
removed a 1000
A Malformed Query:
update sample: count Status where (Status = `approved) % count Status where (Status = `removed) where (amount<= 500, amount > 0) by company from sample
update sample: count Status where (Status = `approved) % count Status where (Status = `removed) where (amount<= 1000, amount > 500) by company from sample
Resultant Table
company 0-500 600-1000
-----------------------
a 0 1/2
b 2 1
removed deals in the 0 - 500 range, hence its ratio is 0.approved deals and one removed deal in the 0 - 500 range, hence its ratio is 2.approved deal and two removed deals in the 600 - 1000 range, hence its ratio is 1/2.approved and oneremoved` deal in the 600 - 1000 range, hence its ratio is 1.Table Query
sample:([]status:`removed`removed`approved`removed`approved`removed`approved`removed`approved`removed; company:`a`b`b`a`b`b`b`a`a`a; amount: 100 200 300 400 500 600 700 800 900 1000)
One potential way of achieving your specifications would be to form two select queries and horizontally joining them.
q)a:select range1:(count status where status=`approved)%count status where status=`removed by company from sample where amount within (0;500)
q)b:select range2:(count status where status=`approved)%count status where status=`removed by company from sample where amount within (600;1000)
q)a ,' b
company| range1 range2
-------| -------------
a | 0 0.5
b | 2 1
Additionally you can rename the columns so that their names are as you wanted in your question:
q)(`company,(`$"0-500"),(`$"600-1000")) xcol a,'b
company| 0-500 600-1000
-------| --------------
a | 0 0.5
b | 2 1
The following will give you the answer you requested, however it does require you to be looking within the ranges of 0-600 and 600-1200, as xbar will split the amount column into chunks within multiples of 600. The above is more specific to your request (i.e. 0-500 and 600-1000).
q)ratios:select (sum status=`approved)%sum status=`removed by company,600 xbar amount from sample
q)ratios
company amount| x
--------------| ---
a 0 | 0
a 600 | 0.5
b 0 | 2
b 600 | 1
The table can then be pivoted to give the required table format:
q)exec ((`$"0-600"),(`$"600-1200"))!x by company:company from ratios
company| 0-600 600-1200
-------| --------------
a | 0 0.5
b | 2 1
First, I created the random sample table with:
n:100;t:([]status:n?`a`r;company:n?`a`b`c`d;amount:100*n?10)
status company amount
---------------------
r a 800
r c 100
a c 900
a d 500
a a 400
a a 600
a a 600
r c 100
r c 800
a a 500
r c 400
r a 900
r d 200
r c 700
a a 0
r b 100
a c 900
a d 200
a a 100
a d 800
Then, I created a dictionary with different sized ranges where the values are the lower bounds:
g:`g1`g2`g3`g4!0 200 300 500
Next, I used bin here to get the ratios that fall within each bucket range:
q)select rat:sum[status=`a]%sum[status=`r] by company , amount:key[g] value[g] bin amount from t
company amount| rat
--------------| ---------
a g1 | 1.666667
a g2 | 0w
a g3 | 0w
a g4 | 0.6666667
b g1 | 1
b g2 | 1
b g3 | 1.5
b g4 | 1.333333
c g1 | 0.3333333
c g2 | 2
c g3 | 0.3333333
c g4 | 0.875
d g1 | 0.6666667
d g2 | 3
d g3 | 3
d g4 | 5
Finally, I used a pivot on the whole thing to visualize the data better:
q){[x] exec key[g]#amount!rat by company:company from x} select rat:sum[status=`a]%sum[status=`r] by company , amount:key[g] value[g] bin amount from t
company| g1 g2 g3 g4
-------| --------------------------------
a | 1.666667 0w 0w 0.6666667
b | 1 1 1.5 1.333333
c | 0.3333333 2 0.3333333 0.875
d | 0.6666667 3 3 5
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With