MySQL Advanced Query Brainteaser

Tags: sql, mysql
I've been asked to create a financial report, which needs to give the total commission earned between two dates by each of several 'referrers'. That's the easy part.

The difficult part is that the commission rate varies depending not only on the referrer but also on the type of referral and also on the number of referrals of that type that have been made by a given referrer.

The tracking of the number of referrals needs to take into account ALL referrals, rather than those in the given date range - in other words, the commission rate is on a sliding scale for each referrer, changing as their total referrals increase. Luckily, there are only a maximum of 3 commission levels for each type of referral.

The referrals are all stored in the same table, 1 row per referral, with a field denoting the referrer and the type of referral. An example to illustrate:

ID   Type    Referrer    Date
1    A       X           01/12/08
2    A       X           15/01/09
3    A       X           23/02/09
4    B       X           01/12/08
5    B       X           15/01/09
6    A       Y           01/12/08
7    A       Y           15/01/09
8    B       Y           15/01/09
9    B       Y           23/02/09

The commission rates are not stored in the referral table - and indeed may change - instead they are stored in the referrer table, like so:

Referrer    Comm_A1    Comm_A2    Comm_A3    Comm_B1    Comm_B2    Comm_B3
X           30         20         10         55         45         35
Y           45         35         25         60         40         30
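
For anyone who wants to reproduce this, the two tables could be set up roughly as follows. This is only a sketch: the column types and sizes are my assumptions rather than the real schema, and the dates are the dd/mm/yy values from the example rewritten as ISO dates.

```sql
-- Sketch only: column types and sizes are assumptions, not the real schema.
CREATE TABLE referrer (
    referrer VARCHAR(10) NOT NULL PRIMARY KEY,
    Comm_A1  INT NOT NULL,
    Comm_A2  INT NOT NULL,
    Comm_A3  INT NOT NULL,
    Comm_B1  INT NOT NULL,
    Comm_B2  INT NOT NULL,
    Comm_B3  INT NOT NULL
);

CREATE TABLE referral (
    id       INT         NOT NULL PRIMARY KEY,
    type     CHAR(1)     NOT NULL,
    referrer VARCHAR(10) NOT NULL,
    date     DATE        NOT NULL
);

-- Commission rates from the referrer table in the example.
INSERT INTO referrer (referrer, Comm_A1, Comm_A2, Comm_A3, Comm_B1, Comm_B2, Comm_B3) VALUES
    ('X', 30, 20, 10, 55, 45, 35),
    ('Y', 45, 35, 25, 60, 40, 30);

-- The nine example referrals (01/12/08 read as 1 December 2008, and so on).
INSERT INTO referral (id, type, referrer, date) VALUES
    (1, 'A', 'X', '2008-12-01'),
    (2, 'A', 'X', '2009-01-15'),
    (3, 'A', 'X', '2009-02-23'),
    (4, 'B', 'X', '2008-12-01'),
    (5, 'B', 'X', '2009-01-15'),
    (6, 'A', 'Y', '2008-12-01'),
    (7, 'A', 'Y', '2009-01-15'),
    (8, 'B', 'Y', '2009-01-15'),
    (9, 'B', 'Y', '2009-02-23');
```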

[Edit] To clarify: the commission rate has three levels for each type and each referrer. The initial rate Comm_A1 applies to the first referral, Comm_A2 to the second, and Comm_A3 to all subsequent referrals.

Looking at the referral table above as an example, and assuming the commission level steps up after the first and second referrals (then stays the same), running a commission report for December 2008 to February 2009 would return the following:

Referrer    Type_A_Comm    Type_A_Ref    Type_B_Comm    Type_B_Ref
X           60             3             100            2
Y           80             2             100            2

Running a commission report for just February 2009 would return:

Referrer    Type_A_Comm    Type_A_Ref    Type_B_Comm    Type_B_Ref
X           10             1             0              0
Y           0              0             40             1

[Edit] The above results have been adjusted from my original question in terms of the column / row grouping.

I'm quite sure that any solution will involve a sub-query (perhaps for each referral type) and also some kind of aggregate / Sum If - but I'm struggling to come up with a working query.

[Edit] I'm not sure about writing an equation of my requirements, but I'll try to list the steps as I see them:

1. Determine the number of previous referrals for each type and each referrer - that is, irrespective of any date range.

2. Based on the number of previous referrals, select the appropriate commission level - 0 previous = level 1, 1 previous = level 2, 2 or more previous = level 3.

   (Note: a referrer with no previous referrals but, say, 3 new referrals, would expect a commission of 1 x level 1, 1 x level 2, 1 x level 3 = total commission.)

3. Filter results according to a date range - so that commission payable for a period of activity may be determined.

4. Return data with a column for referrer, and a column with the total commission for each referral type (and ideally, also a column with a count for each referral type).

Does that help to clarify my requirements?

BrynJ asked Nov 24 '25 06:11


2 Answers

Assuming that you have a table called type that lists your particular referral types, this should work (if not, you could substitute another subselect for getting the distinct types from referral for this purpose).

-- coalesce() is used rather than two-argument isnull(), which is SQL Server syntax and not valid in MySQL
select
    r.referrer,
    t.type,
    -- Tier 1: referrals in the period that are paid at the first rate
    (case
        when coalesce(ref_prior.referrals, 0) < @max1 then
            (case
                when coalesce(ref_prior.referrals, 0) + coalesce(ref_period.referrals, 0) < @max1 then coalesce(ref_period.referrals, 0)
                else @max1 - coalesce(ref_prior.referrals, 0)
            end)
        else 0
    end) * (case t.type when 'A' then r.Comm_A1 when 'B' then r.Comm_B1 else null end) +
    -- Tier 2: referrals in the period up to @max2, minus those already counted in tier 1
    (case when coalesce(ref_prior.referrals, 0) + coalesce(ref_period.referrals, 0) > @max1 then
        (case
            when coalesce(ref_prior.referrals, 0) < @max2 then
                (case
                    when coalesce(ref_prior.referrals, 0) + coalesce(ref_period.referrals, 0) < @max2 then coalesce(ref_period.referrals, 0)
                    else @max2 - coalesce(ref_prior.referrals, 0)
                end)
            else 0
        end) -
        (case
            when coalesce(ref_prior.referrals, 0) < @max1 then
                (case
                    when coalesce(ref_prior.referrals, 0) + coalesce(ref_period.referrals, 0) < @max1 then coalesce(ref_period.referrals, 0)
                    else @max1 - coalesce(ref_prior.referrals, 0)
                end)
            else 0
        end)
    else 0 end) * (case t.type when 'A' then r.Comm_A2 when 'B' then r.Comm_B2 else null end) +
    -- Tier 3: whatever remains of the period's referrals after tiers 1 and 2
    (case when coalesce(ref_prior.referrals, 0) + coalesce(ref_period.referrals, 0) > @max2 then
        (coalesce(ref_period.referrals, 0)) -
            (
                (case when coalesce(ref_prior.referrals, 0) + coalesce(ref_period.referrals, 0) > @max1 then
                    (case
                        when coalesce(ref_prior.referrals, 0) < @max2 then
                            (case
                                when coalesce(ref_prior.referrals, 0) + coalesce(ref_period.referrals, 0) < @max2 then coalesce(ref_period.referrals, 0)
                                else @max2 - coalesce(ref_prior.referrals, 0)
                            end)
                        else 0
                    end) -
                    (case
                        when coalesce(ref_prior.referrals, 0) < @max1 then
                            (case
                                when coalesce(ref_prior.referrals, 0) + coalesce(ref_period.referrals, 0) < @max1 then coalesce(ref_period.referrals, 0)
                                else @max1 - coalesce(ref_prior.referrals, 0)
                            end)
                        else 0
                    end)
                else 0 end) +
                (case
                    when coalesce(ref_prior.referrals, 0) < @max1 then
                        (case
                            when coalesce(ref_prior.referrals, 0) + coalesce(ref_period.referrals, 0) < @max1 then coalesce(ref_period.referrals, 0)
                            else @max1 - coalesce(ref_prior.referrals, 0)
                        end)
                    else 0
                end)
            )
    else 0 end) * (case t.type when 'A' then r.Comm_A3 when 'B' then r.Comm_B3 else null end) as Total_Commission

from referrer r

join type t on 1 = 1 -- intentional cartesian product
-- referrals made before the reporting period, per referrer and type
left join (select referrer, type, count(1) as referrals from referral where date < @start_date group by referrer, type) ref_prior on ref_prior.referrer = r.referrer and ref_prior.type = t.type
-- referrals made inside the reporting period, per referrer and type
left join (select referrer, type, count(1) as referrals from referral where date between @start_date and @end_date group by referrer, type) ref_period on ref_period.referrer = r.referrer and ref_period.type = t.type

This assumes that you have @start_date and @end_date variables, and you'll obviously have to supply the logic missing from the case statement to make the proper selection of rates based upon the type and number of referrals from ref_total.

Edit

After reviewing the question, I saw the comment about the sliding scale. This greatly increased the complexity of the query, but it's still doable. The revised query now also depends on the presence of two variables @max1 and @max2, representing the maximum number of sales that can fall into category '1' and category '2' (for testing purposes, I used 1 and 2 respectively, and these produced the expected results).
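
For comparison, here is a sketch of a different approach, not the query above: number every referral within its referrer and type across the whole history, pick the rate from that sequence number, and only then filter by date. It assumes MySQL 8.0 or later (window functions and common table expressions are not available in older versions), that referral.date is a DATE column, and that ties on the same date can be broken by id. The CTE names ranked and priced and the column seq are mine. The thresholds are hard-coded to the 1 / 2 / 3-or-more scale described in the question.

```sql
-- Assumes MySQL 8.0+ and the referral / referrer tables from the question.
SET @start_date = '2009-02-01';
SET @end_date   = '2009-02-28';

WITH ranked AS (
    -- Number each referral per referrer and type over the FULL history,
    -- so the commission level does not depend on the report's date range.
    SELECT
        id, referrer, type, date,
        ROW_NUMBER() OVER (PARTITION BY referrer, type ORDER BY date, id) AS seq
    FROM referral
),
priced AS (
    -- Attach the rate that applies to each individual referral.
    SELECT
        k.referrer, k.type, k.date,
        CASE
            WHEN k.type = 'A' AND k.seq = 1 THEN r.Comm_A1
            WHEN k.type = 'A' AND k.seq = 2 THEN r.Comm_A2
            WHEN k.type = 'A'               THEN r.Comm_A3
            WHEN k.type = 'B' AND k.seq = 1 THEN r.Comm_B1
            WHEN k.type = 'B' AND k.seq = 2 THEN r.Comm_B2
            WHEN k.type = 'B'               THEN r.Comm_B3
        END AS commission
    FROM ranked k
    JOIN referrer r ON r.referrer = k.referrer
)
SELECT
    r.referrer,
    COALESCE(SUM(CASE WHEN p.type = 'A' THEN p.commission END), 0) AS Type_A_Comm,
    COUNT(CASE WHEN p.type = 'A' THEN 1 END)                       AS Type_A_Ref,
    COALESCE(SUM(CASE WHEN p.type = 'B' THEN p.commission END), 0) AS Type_B_Comm,
    COUNT(CASE WHEN p.type = 'B' THEN 1 END)                       AS Type_B_Ref
FROM referrer r
-- The date filter sits in the join condition so that referrers with no
-- activity in the period still appear with zeros.
LEFT JOIN priced p
       ON p.referrer = r.referrer
      AND p.date BETWEEN @start_date AND @end_date
GROUP BY r.referrer
ORDER BY r.referrer;
```

With the sample data from the question and the February 2009 range set above, this should give X a type A commission of 10 from 1 referral and Y a type B commission of 40 from 1 referral, with zeros elsewhere, which matches the expected report. It also returns one row per referrer with a column per type, so no separate type table is needed.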

Adam Robinson answered Nov 26 '25 19:11


Adam's answer is far more thorough than I'm going to be but I think trying to write this as a single query might not be the right approach.

Have you thought about creating a stored procedure which creates and then populates a temporary table, step by step?

The temporary table would have the shape of the results set you're looking for. The initial insert creates your basic data set (essentially the number of rows you're looking to return with key identifiers and then anything else you're looking to return which can be easily assembled as part of the same query).

You then have a series of updates to the temporary table assembling each section of the more complex data.

Finally select it all back and drop the temporary table.
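
As a rough illustration of those steps, a MySQL procedure might look something like the sketch below. Everything named here is an assumption made for the example: the procedure name commission_report, the temporary table tmp_commission, the working columns A_Prior and B_Prior, and the INT column types. It also hard-codes the question's scale (first referral, second referral, all later ones) and expects referral.date to be a DATE column. DELIMITER is a mysql command-line client directive, so other clients may need a different way of sending the procedure body.

```sql
DELIMITER //

CREATE PROCEDURE commission_report(IN p_start DATE, IN p_end DATE)
BEGIN
    DROP TEMPORARY TABLE IF EXISTS tmp_commission;

    -- Result-shaped table, plus two working columns for the prior counts.
    CREATE TEMPORARY TABLE tmp_commission (
        referrer    VARCHAR(10) NOT NULL PRIMARY KEY,
        A_Prior     INT NOT NULL DEFAULT 0,
        B_Prior     INT NOT NULL DEFAULT 0,
        Type_A_Comm INT NOT NULL DEFAULT 0,
        Type_A_Ref  INT NOT NULL DEFAULT 0,
        Type_B_Comm INT NOT NULL DEFAULT 0,
        Type_B_Ref  INT NOT NULL DEFAULT 0
    );

    -- Step 1: the basic data set, one row per referrer.
    INSERT INTO tmp_commission (referrer)
    SELECT r.referrer FROM referrer r;

    -- Step 2: referrals made before the period (these set the starting level).
    UPDATE tmp_commission t
    SET t.A_Prior = (SELECT COUNT(*) FROM referral f
                     WHERE f.referrer = t.referrer AND f.type = 'A' AND f.date < p_start),
        t.B_Prior = (SELECT COUNT(*) FROM referral f
                     WHERE f.referrer = t.referrer AND f.type = 'B' AND f.date < p_start);

    -- Step 3: referrals made inside the period.
    UPDATE tmp_commission t
    SET t.Type_A_Ref = (SELECT COUNT(*) FROM referral f
                        WHERE f.referrer = t.referrer AND f.type = 'A'
                          AND f.date BETWEEN p_start AND p_end),
        t.Type_B_Ref = (SELECT COUNT(*) FROM referral f
                        WHERE f.referrer = t.referrer AND f.type = 'B'
                          AND f.date BETWEEN p_start AND p_end);

    -- Step 4: commission. LEAST(prior + new, n) - LEAST(prior, n) is the number
    -- of new referrals that land within the first n positions of the scale.
    UPDATE tmp_commission t
    JOIN referrer r ON r.referrer = t.referrer
    SET t.Type_A_Comm =
            (LEAST(t.A_Prior + t.Type_A_Ref, 1) - LEAST(t.A_Prior, 1)) * r.Comm_A1
          + ((LEAST(t.A_Prior + t.Type_A_Ref, 2) - LEAST(t.A_Prior, 2))
             - (LEAST(t.A_Prior + t.Type_A_Ref, 1) - LEAST(t.A_Prior, 1))) * r.Comm_A2
          + (t.Type_A_Ref
             - (LEAST(t.A_Prior + t.Type_A_Ref, 2) - LEAST(t.A_Prior, 2))) * r.Comm_A3,
        t.Type_B_Comm =
            (LEAST(t.B_Prior + t.Type_B_Ref, 1) - LEAST(t.B_Prior, 1)) * r.Comm_B1
          + ((LEAST(t.B_Prior + t.Type_B_Ref, 2) - LEAST(t.B_Prior, 2))
             - (LEAST(t.B_Prior + t.Type_B_Ref, 1) - LEAST(t.B_Prior, 1))) * r.Comm_B2
          + (t.Type_B_Ref
             - (LEAST(t.B_Prior + t.Type_B_Ref, 2) - LEAST(t.B_Prior, 2))) * r.Comm_B3;

    -- Step 5: select it all back and tidy up.
    SELECT referrer, Type_A_Comm, Type_A_Ref, Type_B_Comm, Type_B_Ref
    FROM tmp_commission
    ORDER BY referrer;

    DROP TEMPORARY TABLE tmp_commission;
END //

DELIMITER ;
```

It would then be called with the report's date range, for example `CALL commission_report('2009-02-01', '2009-02-28');`. Each UPDATE can be run and inspected on its own while developing, which is the main point of doing it this way.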

The advantage of this is that you can break the problem down and assemble it a piece at a time, which makes it easier to find where you've gone wrong. It also means that the more complex parts can be assembled in a couple of stages.

In addition, if some poor sod comes along and has to debug the whole thing afterwards, it's going to be far easier for them to trace through what's happening where.

Jon Hopkins answered Nov 26 '25 20:11


