Divide a Count by Count in a query and upsert to new table

Question

Given a table, I would like to get the ratio between rows that are removed and approved by company, delimited by a certain range (amount).

Sample Table

status   company amount
-----------------------
removed  a       100   
removed  b       200   
approved b       300   
removed  a       400   
approved b       500   
removed  b       600   
approved b       700   
removed  a       800
approved a       900   
removed  a       1000

A Malformed Query:

update sample: count Status where (Status = `approved) % count Status where (Status = `removed) where (amount<= 500, amount > 0) by company from sample

update sample: count Status where (Status = `approved) % count Status where (Status = `removed) where (amount<= 1000, amount > 500) by company from sample

Resultant Table

company  0-500  600-1000
-----------------------
  a       0       1/2
  b       2       1

A has two removed deals in the 0 - 500 range, hence its ratio is 0.
B has two approved deals and one removed deal in the 0 - 500 range, hence its ratio is 2.
A has one approved deal and two removed deals in the 600 - 1000 range, hence its ratio is 1/2.
B has one approved and oneremoved` deal in the 600 - 1000 range, hence its ratio is 1.

Table Query

sample:([]status:`removed`removed`approved`removed`approved`removed`approved`removed`approved`removed; company:`a`b`b`a`b`b`b`a`a`a; amount: 100 200 300 400 500 600 700 800 900 1000)

Disastron · Accepted Answer

One potential way of achieving your specifications would be to form two select queries and horizontally joining them.

q)a:select range1:(count status where status=`approved)%count status where status=`removed by company from sample where amount within (0;500)
q)b:select range2:(count status where status=`approved)%count status where status=`removed by company from sample where amount within (600;1000)
q)a ,' b
company| range1 range2
-------| -------------
a      | 0      0.5
b      | 2      1

Additionally you can rename the columns so that their names are as you wanted in your question:

q)(`company,(`$"0-500"),(`$"600-1000")) xcol a,'b
company| 0-500 600-1000
-------| --------------
a      | 0     0.5
b      | 2     1

The following will give you the answer you requested, however it does require you to be looking within the ranges of 0-600 and 600-1200, as xbar will split the amount column into chunks within multiples of 600. The above is more specific to your request (i.e. 0-500 and 600-1000).

q)ratios:select (sum status=`approved)%sum status=`removed by company,600 xbar amount from sample
q)ratios
company amount| x
--------------| ---
a       0     | 0
a       600   | 0.5
b       0     | 2
b       600   | 1

The table can then be pivoted to give the required table format:

q)exec ((`$"0-600"),(`$"600-1200"))!x by company:company from ratios
company| 0-600 600-1200
-------| --------------
a      | 0     0.5
b      | 2     1

Craig Neubieser · Answer

First, I created the random sample table with:

n:100;t:([]status:n?`a`r;company:n?`a`b`c`d;amount:100*n?10)
status company amount
---------------------
r      a       800
r      c       100
a      c       900
a      d       500
a      a       400
a      a       600
a      a       600
r      c       100
r      c       800
a      a       500
r      c       400
r      a       900
r      d       200
r      c       700
a      a       0
r      b       100
a      c       900
a      d       200
a      a       100
a      d       800

Then, I created a dictionary with different sized ranges where the values are the lower bounds:

g:`g1`g2`g3`g4!0 200 300 500

Next, I used bin here to get the ratios that fall within each bucket range:

q)select rat:sum[status=`a]%sum[status=`r] by company , amount:key[g] value[g] bin amount from t
company amount| rat
--------------| ---------
a       g1    | 1.666667
a       g2    | 0w
a       g3    | 0w
a       g4    | 0.6666667
b       g1    | 1
b       g2    | 1
b       g3    | 1.5
b       g4    | 1.333333
c       g1    | 0.3333333
c       g2    | 2
c       g3    | 0.3333333
c       g4    | 0.875
d       g1    | 0.6666667
d       g2    | 3
d       g3    | 3
d       g4    | 5

Finally, I used a pivot on the whole thing to visualize the data better:

q){[x] exec  key[g]#amount!rat by company:company from x}  select rat:sum[status=`a]%sum[status=`r] by company , amount:key[g] value[g] bin amount from t
company| g1        g2 g3        g4
-------| --------------------------------
a      | 1.666667  0w 0w        0.6666667
b      | 1         1  1.5       1.333333
c      | 0.3333333 2  0.3333333 0.875
d      | 0.6666667 3  3         5

Divide a Count by Count in a query and upsert to new table

Tags:

kdb+

Carrein

2 Answers

Disastron

Craig Neubieser

Recent Activity

Donate For Us

Divide a Count by Count in a query and upsert to new table

Tags:

kdb+

Carrein

2 Answers

Disastron

Craig Neubieser

Related questions

Recent Activity

Donate For Us