This is a pretty basic problem and for whatever reason I can't find a reasonable solution. I'll do my best to explain.
Say you have an event ticket (section, row, seat #). Each ticket belongs to an attendee. Multiple tickets can belong to the same attendee. Each attendee has a worth (ex: Attendee #1 is worth $10,000). That said, here's what I want to do:
1. Group the tickets by their section
2. Get number of tickets (count)
3. Get total worth of the attendees in that section
Here's where I'm having problems: If Attendee #1 is worth $10,000 and is using 4 tickets, sum(attendees.worth) is returning $40,000. Which is not accurate. The worth should be $10,000. Yet when I make the result distinct on the attendee, the count is not accurate. In an ideal world it'd be nice to do something like
select
tickets.section,
count(tickets.*) as count,
sum(DISTINCT ON (attendees.id) attendees.worth) as total_worth
from
tickets
INNER JOIN
attendees ON attendees.id = tickets.attendee_id
GROUP BY tickets.section
Obviously this query doesn't work. How can I accomplish this same thing in a single query? OR is it even possible? I'd prefer to stay away from sub queries too because this is part of a much larger solution where I would need to do this across multiple tables.
Also, the worth should follow the ticket divided evenly. Ex: $10,000 / 4. Each ticket has an attendee worth of $5,000. So if the tickets are in different sections, they take their prorated worth with them.
Thanks for your help.
You can use DISTINCT when performing an aggregation. You'll probably use it most commonly with the COUNT function. In this case, you should run the query below that counts the unique values in the month column.
The PostgreSQL SUM() is an aggregate function that returns the sum of values or distinct values. The SUM() function ignores NULL . It means that SUM() doesn't consider the NULL in calculation. If you use the DISTINCT option, the SUM() function calculates the sum of distinct values.
From experiments, I founded that the GROUP BY is 10+ times faster than DISTINCT. They are different. So what I learned is: GROUP-BY is anyway not worse than DISTINCT, and it is better sometimes.
Removing duplicate rows from a query result set in PostgreSQL can be done using the SELECT statement with the DISTINCT clause. It keeps one row for each group of duplicates. The DISTINCT clause can be used for a single column or for a list of columns.
You need to aggregate the tickets before the attendees:
select ta.section, sum(ta.numtickets) as count, sum(a.worth) as total_worth
from (select attendee_id, section, count(*) as numtickets
from tickets
group by attendee_id, section
) ta INNER JOIN
attendees a
ON a.id = ta.attendee_id
GROUP BY ta.section
You still have a problem of a single attendee having seats in multiple sections. However, you do not specify how to solve that (apportion the worth? randomly choose one section? attribute it to all sections? canonically choose a section?)
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With