I've inherited some tables which I'm trying to clean up but first I'm trying to join everything I need but am having problems because there's more than one way to get to the table SpecialEvents
thru EventRegistrations
.
In some cases EventRegistrations
can be joined directly using event_registrations.scoreable_id
while in other cases you must first join another table SpecialPlaces
, you can know which way you need to go via event_registrations.scoreable_type
which is either SpecialEvent
or SpecialPlace
.
Basically, how can I join SpecialEvents
in the case where I must also join SpecialPlaces
first? Eg If I try to join SpecialEvents
in two different ways I get an error: "table name "special_events" specified more than once".
SELECT event_registrations.id, array_agg(teams.name), event_registrations.number_of_players, event_registrations.state, event_registrations.created_at, array_agg(players.email), array_agg(special_events.name), array_agg(special_places.id)
FROM event_registrations
LEFT JOIN teams ON event_registrations.team_id = teams.id
LEFT JOIN team_memberships ON teams.id = team_memberships.team_id
LEFT JOIN players ON team_memberships.player_id = players.id
LEFT JOIN special_events ON event_registrations.scoreable_id = special_events.id AND event_registrations.scoreable_type = 'SpecialEvent'
LEFT JOIN special_places ON event_registrations.scoreable_id = special_places.id AND event_registrations.scoreable_type = 'SpecialPlace'
GROUP BY event_registrations.id, event_registrations.number_of_players, event_registrations.state, event_registrations.created_at
+----+-----------+---------------------------+-----------+---------------------------+
| id | region_id | start_at | state | created_at |
+----+-----------+---------------------------+-----------+---------------------------+
| 2 | 1 | 2015-10-22 19:30:00 +0100 | published | 2015-09-21 09:41:05 +0100 |
| 4 | 1 | 2016-01-21 19:30:00 +0000 | published | 2015-11-26 15:11:25 +0000 |
| 3 | 1 | 2016-01-28 19:30:00 +0000 | published | 2015-11-23 16:16:27 +0000 |
| 5 | 1 | 2016-12-31 19:30:00 +0000 | draft | 2016-02-24 15:17:22 +0000 |
| 6 | 1 | 2016-05-16 19:30:00 +0100 | published | 2016-03-29 14:33:40 +0100 |
| 10 | 1 | 2016-09-12 19:30:00 +0100 | published | 2016-06-28 17:18:54 +0100 |
| 8 | 1 | 2016-10-07 19:30:00 +0100 | draft | 2016-06-09 15:03:36 +0100 |
| 7 | 1 | 2016-05-23 19:30:00 +0100 | published | 2016-03-30 19:30:21 +0100 |
| 9 | 1 | 2016-08-04 19:30:00 +0100 | published | 2016-06-09 15:18:56 +0100 |
| 11 | 1 | 2016-11-07 19:30:00 +0000 | draft | 2016-07-11 17:20:11 +0100 |
+----+-----------+---------------------------+-----------+---------------------------+
+----+------------------+----------+---------------------------+
| id | special_event_id | place_id | created_at |
+----+------------------+----------+---------------------------+
| 1 | 2 | 243 | 2015-10-12 18:07:09 +0100 |
| 3 | 2 | 83 | 2015-10-15 15:54:40 +0100 |
| 5 | 4 | 262 | 2015-11-26 16:29:35 +0000 |
| 4 | 3 | 262 | 2015-11-23 16:25:31 +0000 |
| 6 | 5 | 281 | 2016-02-24 15:20:33 +0000 |
| 7 | 6 | 262 | 2016-03-29 14:34:00 +0100 |
| 8 | 7 | 262 | 2016-04-11 13:28:00 +0100 |
| 9 | 8 | 262 | 2016-06-09 15:03:52 +0100 |
| 12 | 11 | 262 | 2016-07-11 17:20:26 +0100 |
| 10 | 9 | 262 | 2016-06-09 15:20:08 +0100 |
+----+------------------+----------+---------------------------+
+----+---------+--------------+----------------+-------+---------------------------+
| id | team_id | scoreable_id | scoreable_type | state | created_at |
+----+---------+--------------+----------------+-------+---------------------------+
| 1 | 3979 | 2 | SpecialEvent | draft | 2015-11-30 10:09:06 +0000 |
| 2 | 3717 | 2 | SpecialEvent | draft | 2015-11-30 10:09:06 +0000 |
| 3 | 3626 | 8 | SpecialPlace | draft | 2015-11-30 10:09:06 +0000 |
| 4 | 3202 | 8 | SpecialPlace | draft | 2015-11-30 10:09:06 +0000 |
| 5 | 703 | 2 | SpecialEvent | draft | 2015-11-30 10:09:06 +0000 |
| 6 | 278 | 2 | SpecialEvent | draft | 2015-11-30 10:09:06 +0000 |
| 7 | 3166 | 2 | SpecialEvent | draft | 2015-11-30 10:09:06 +0000 |
| 8 | 3147 | 2 | SpecialEvent | draft | 2015-11-30 10:09:06 +0000 |
| 9 | 3146 | 2 | SpecialEvent | draft | 2015-11-30 10:09:06 +0000 |
| 10 | 3145 | 2 | SpecialEvent | draft | 2015-11-30 10:09:06 +0000 |
+----+---------+--------------+----------------+-------+---------------------------+
What my colleagues mean to say is the way you want to do it is not feasible, however, there are a myriad of ways to do the same thing.
What would you could do to avoid two joins is create a combined table of both SpecialEvents and SpecialPlaces containing all the information you want there and then JOIN that.
eg something like this:
SELECT event_registrations.id, array_agg(teams.name), event_registrations.number_of_players, event_registrations.state, event_registrations.created_at, array_agg(players.email), array_agg(special_events.name), array_agg(special_places.id)
FROM event_registrations
LEFT JOIN teams ON event_registrations.team_id = teams.id
LEFT JOIN team_memberships ON teams.id = team_memberships.team_id
LEFT JOIN players ON team_memberships.player_id = players.id
LEFT JOIN special_places ON event_registrations.scoreable_id = special_places.id AND event_registrations.scoreable_type = 'SpecialPlace'
LEFT JOIN (
SELECT special_events.id AS special_event_id, special_places.id AS special_place_id, special_events.name
FROM special_places
LEFT JOIN special_events ON special_places.special_event_id = special_events.id
UNION
SELECT special_events.id AS special_event_id, null AS special_place_id, special_events.name
FROM special_events
) el1
ON (event_registrations.scoreable_id = el1.special_place_id AND event_registrations.scoreable_type = 'SpecialPlace') OR (event_registrations.scoreable_id = el1.special_event_id AND event_registrations.scoreable_type = 'SpecialEvent')
GROUP BY event_registrations.id, event_registrations.number_of_players, event_registrations.state, event_registrations.created_at
Assuming that id
is the PRIMARY KEY
column in each of the given tables, and based on some educated guesses:
SELECT er.id
, t.name AS team_name -- can only be 1, no array_agg
, er.number_of_players
, er.state
, er.created_at
, tp.player_emails -- pre-aggregated!
, se.name AS special_event_name -- can only be 1, no array_agg
, sp.id AS special_pace_id -- can only be 1, no array_agg
FROM event_registrations er
LEFT JOIN teams t ON t.id = er.team_id
LEFT JOIN (
SELECT tm.team_id, array_agg(p.email) AS player_emails
FROM team_memberships tm
JOIN players p ON p.id = tm.player_id
GROUP BY 1
) tp USING (team_id)
LEFT JOIN special_places sp ON sp.id = er.scoreable_id AND er.scoreable_type = 'SpecialPlace'
LEFT JOIN special_events se ON se.id = er.scoreable_id AND er.scoreable_type = 'SpecialEvent'
OR se.id = sp.special_event_id AND er.scoreable_type = 'SpecialPlace'
Much simpler and faster.
If you indeed need to join to the same table twice, you have to use table aliases like:
FROM event_registrations er
which is short for:
FROM event_registrations AS er
Turns out, you do not need to join to the same table twice. Still use table aliases to cut the noise. Related:
The only identifiable reason for the global GROUP BY
in the outer SELECT
was the join to team_memberships
that could potentially multiply rows. I moved the aggregation of player_emails
to a much cheaper subquery, removed the outer GROUP BY
and simplified the rest. Should also be substantially faster. Related:
If you need GROUP BY
in the outer query - and event_registrations.id
is indeed the PRIMARY KEY
- then this:
GROUP BY er.id, er.number_of_players, er.state, er.created_at
... is just another noisy way of saying:
GROUP BY er.id
Since Postgres 9.1, the PK covers all columns of a table in the GROUP BY
clause. See:
But you don't need that at all.
Finally, the core question is solved by joining to special_places
conditionally first and then, conditionally again, joining to special_events
. Missing columns are filled in with NULL values:
LEFT JOIN special_places sp ON sp.id = er.scoreable_id AND er.scoreable_type = 'SpecialPlace'
LEFT JOIN special_events se ON se.id = er.scoreable_id AND er.scoreable_type = 'SpecialEvent'
OR se.id = sp.special_event_id AND er.scoreable_type = 'SpecialPlace'
The final AND er.scoreable_type = 'SpecialPlace'
is redundant, strictly speaking, as there can be no sp.special_event_id
otherwise anyway. I kept it for clarity.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With