Collapse and summarize counts within group by

I have the following set of data:

SalesPerson PackageHistoryID    PackageID   SalesPersonID   EnrollmentAmount    PackageType
-------------------------------------------------------------------------------------------
Jim Jones   2895                310         59019           27.15               New Member
Jim Jones   2895                310         59019           53.21               New Member
Jim Jones   2895                310         59019           42.35               New Member
Jim Jones   2916                221         59019           379.01              Renewal
Jim Jones   2932                326         59019           53.21               New Member
Jim Jones   2932                326         59019           27.15               New Member
Jim Jones   2933                326         59019           53.21               Renewal
Jim Jones   2933                326         59019           27.15               Renewal
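
To reproduce the problem, the rows above could be loaded with a script along these lines. This is only a sketch: the table name Sales2 is the one used in the query, but the column types are assumptions, since the real schema is not shown.

```sql
-- Hypothetical setup: the column types are guesses, not the original schema.
CREATE TABLE Sales2 (
    SalesPerson      varchar(50)   NOT NULL,
    PackageHistoryID int           NOT NULL,
    PackageID        int           NOT NULL,
    SalesPersonID    int           NOT NULL,
    EnrollmentAmount decimal(10,2) NOT NULL,
    PackageType      varchar(20)   NOT NULL
);

-- One row per EnrollmentAmount, so a PackageHistoryID can repeat.
INSERT INTO Sales2
    (SalesPerson, PackageHistoryID, PackageID, SalesPersonID, EnrollmentAmount, PackageType)
VALUES
    ('Jim Jones', 2895, 310, 59019,  27.15, 'New Member'),
    ('Jim Jones', 2895, 310, 59019,  53.21, 'New Member'),
    ('Jim Jones', 2895, 310, 59019,  42.35, 'New Member'),
    ('Jim Jones', 2916, 221, 59019, 379.01, 'Renewal'),
    ('Jim Jones', 2932, 326, 59019,  53.21, 'New Member'),
    ('Jim Jones', 2932, 326, 59019,  27.15, 'New Member'),
    ('Jim Jones', 2933, 326, 59019,  53.21, 'Renewal'),
    ('Jim Jones', 2933, 326, 59019,  27.15, 'Renewal');
```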

Upon that data set I run the following query:

select Salesperson, PackageType, count(*) AS Packages, sum(EnrollmentAmount) AS Enrollment
from Sales2
group by SalesPerson, PackageType
order by SalesPerson, PackageType

...and I get these results:

Salesperson    PackageType    Packages     Enrollment
----------------------------------------------------
Jim Jones      New Member     5            203.07
Jim Jones      Renewal        3            459.37

My final results as shown above are almost perfect. The only problem is the counts in the Packages column. Instead of 5 and 3, the counts should be 2 and 2, because I want the number of distinct PackageHistoryIDs per PackageType, not the number of EnrollmentAmount rows. I want the EnrollmentAmounts summed so that the records collapse and no PackageHistoryID is counted more than once. The first data set shows a one-to-many relationship between PackageHistory records and EnrollmentAmounts. I thought my second query (the group by) would aggregate this correctly, but it reports 8 PackageHistories in total when there are really only 4.

Here is how the final result set should look:

Salesperson    PackageType    Packages     Enrollment
----------------------------------------------------
Jim Jones      New Member     2            203.07
Jim Jones      Renewal        2            459.37

The 2 and 2 indicate the fact that there are really only 4 PackageHistory records in the result set; 2 are New Member and 2 are Renewal. The multiple EnrollmentAmount records are causing too many records and thus the counts get wrongly expanded in the final query.

Important note: Although SalesPerson is always the same in the results shown, these can sometimes be different, though they will be the same for any given PackageHistory (1-1). The grouping needs to be (1) by SalesPerson, then (2) by PackageType, and summarize/flatten the EnrollmentAmounts within each unique PackageHistory.

What query will give me correct results?

856

asked Apr 02 '15 20:04

HerrimanCoder

1 Answer

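Use count(distinct PackageHistoryID) instead of count(*). count(*) counts every row in the group, so each EnrollmentAmount row adds one, whereas count(distinct PackageHistoryID) counts each PackageHistoryID only once per group: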

select Salesperson, PackageType, count(distinct PackageHistoryID) AS Packages,
       sum(EnrollmentAmount) AS Enrollment
from Sales2
group by SalesPerson, PackageType
order by SalesPerson, PackageType
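
If it helps to see the "flatten first, then summarize" step from the question spelled out, the same result could also be sketched as a two-level aggregation. This is an alternative, not part of the original answer, and the alias names ph and PackageEnrollment are made up for illustration.

```sql
-- Inner query: collapse to one row per PackageHistoryID, summing its EnrollmentAmounts.
-- Outer query: count the collapsed rows and total the amounts per SalesPerson and PackageType.
SELECT  ph.SalesPerson,
        ph.PackageType,
        COUNT(*)                  AS Packages,
        SUM(ph.PackageEnrollment) AS Enrollment
FROM (
    SELECT  SalesPerson,
            PackageType,
            PackageHistoryID,
            SUM(EnrollmentAmount) AS PackageEnrollment  -- illustrative alias
    FROM    Sales2
    GROUP BY SalesPerson, PackageType, PackageHistoryID
) AS ph
GROUP BY ph.SalesPerson, ph.PackageType
ORDER BY ph.SalesPerson, ph.PackageType;
```

Against the sample rows in the question, the inner query yields four rows (2895, 2916, 2932, 2933), so the outer query returns 2 / 203.07 for New Member and 2 / 459.37 for Renewal. The count(distinct ...) version is shorter; this form may be easier to extend if other per-PackageHistory values are needed later.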

153

answered Oct 03 '22 06:10

Giorgos Betsos

Tags: sql, sql-server, tsql
