Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to query for the "parent" of an arbitrary child relationship in SQL?

Tags:

sql

postgresql

I'm looking to write the simplest, most efficient SQL query to retrieve a single team from a given event. But the event might not be directly related to the team, it might be related at any level in the table relationship hierarchy.


Setup

Here's a simplistic representation of what my schema looks like:

enter image description here

A couple things to note:

  • teams can have many collections, apps, and webhooks.
  • collections can also have many webhooks.
  • webhooks can belong to either a team or collection, but only one.
  • events can belong to any object, but only one.

This seems like a fairly basic setup that most SaaS-type companies would have (eg. Slack or Stripe). Everything is grouped under a "team" which users can then interact with as members.


Problem

Given that setup, I'd like to create a SQL query that solves...

Find the team of an event by id.

I can easily write queries that find the parent team either directly, or indirectly via a specific means. For example...

Find the team of a directly related "team event".

SELECT teams.*
FROM teams
JOIN events ON events.team_id = teams.id
WHERE events.id = ${id}

Or...

Find the team of an indirectly related "collection event".

SELECT teams.*
FROM teams
JOIN collections ON collections.team_id = teams.id
JOIN events ON events.collection_id = collections.id
WHERE events.id = ${id}

Webhooks get a more complex, because they can be related in two different ways...

Find the team of an indirectly related "webhook event".

SELECT teams.*
FROM teams
JOIN collections ON collections.team_id = teams.id
JOIN webhooks AS team_webhooks ON team_webhooks.team_id = teams.id
JOIN webhooks AS collection_webhooks ON collection_webhooks.collection_id = collections.id
JOIN events AS team_webhook_events ON team_webhook_events.webhook_id = team_webhooks.id
JOIN events AS collection_webhook_events ON collection_webhook_events.webhook_id = collection_webhooks.id
WHERE team_webhook_events.id = ${id}
OR collection_webhook_events.id = ${id}

I'm not sure if I should be writing it like that, or by using UNION instead...

SELECT teams.*
FROM teams
JOIN webhooks ON webhooks.team_id = teams.id
JOIN events ON events.webhook_id = webhooks.id
WHERE events.id = ${id}

UNION

SELECT teams.*
FROM teams
JOIN collections ON collections.team_id = teams.id
JOIN webhooks ON webhooks.collection_id = collections.id
JOIN events ON events.webhook_id = webhooks.id
WHERE events.id = ${id}

...unsure which is better for performance.


As you can see, there are a lot of different ways for a single event to be related to a given team, via all those paths! So when I try to write the "all encompassing" query, it ends up being super complex...

Potential Solution #1

Doing it the UNION way, I think I can just make a bunch of queries, and union them all together, although this seems like it might be wasteful?

SELECT teams.*
FROM teams
JOIN events ON events.team_id = teams.id
WHERE events.id = ${id}

UNION

SELECT teams.*
FROM teams
JOIN apps ON apps.team_id = teams.id
JOIN events ON events.app_id = apps.id
WHERE events.id = ${id}

UNION

SELECT teams.*
FROM teams
JOIN collections ON collections.team_id = teams.id
JOIN events ON events.collection_id = collections.id
WHERE events.id = ${id}

UNION

SELECT teams.*
FROM teams
JOIN webhooks ON webhooks.team_id = teams.id
JOIN events ON events.webhook_id = webhooks.id
WHERE events.id = ${id}

UNION

SELECT teams.*
FROM teams
JOIN collections ON collections.team_id = teams.id
JOIN webhooks ON webhooks.collection_id = collections.id
JOIN events ON events.webhook_id = webhooks.id
WHERE events.id = ${id}

Potential Solution #2

Using multiple JOINs I think I might be able to join all of the different combinations of event owners together, with their relationship to teams? Although this also seems like a lot of JOINs...

SELECT teams.*
FROM teams

JOIN apps ON apps.team_id = teams.id
JOIN collections ON collections.team_id = teams.id
JOIN webhooks AS team_webhooks ON team_webhooks.team_id = teams.id
JOIN webhooks AS collection_webhooks ON collection_webhooks.collection_id = collections.id

JOIN events AS app_events ON app_events.app_id = apps.id
JOIN events AS collection_events ON collection_events.collection_id = collections.id
JOIN events AS team_events ON team_events.team_id = teams.id
JOIN events AS collection_webhook_events ON collection_webhook_events.webhook_id = collection_webhooks.id
JOIN events AS team_webhook_events ON team_webhook_events.webhook_id = team_webhooks.id

WHERE app_events.id = ${id}
OR collection_events.id = ${id}
OR team_events.id = ${id}
OR collection_webhook_events.id = ${id}
OR team_webhook_events.id = ${id}

Questions

  • Are either of those potential solutions the best way to do it?
  • Is there a more efficient way to write it?
  • Is there a simpler, easier-to-read-later way to write it?

Edit: I don't think this question is a duplicate!

Yesterday, I asked "How to query for nested relationships in SQL?", which involves a similar data structure (because it's the same one), and was written in a similar way. But, as far as I can tell, that question is actually the inverse of this one.

In the first, I was asking how to...

Find all of the events that are related (directly or indirectly) to a given user by ID. Or put more generically, given a "parent"'s ID, find all of its possible descendants in the tree-like data model.

But in this one, I'm asking how to...

Find the team of an arbitrary event by ID. Or put more generically, given an arbitrary "child"'s ID, which can be attached to the tree-like data model at any level of the hierarchy, find its matching parent. (It's essentially the inverse of the first question.)

And as far as I can tell, they involve different solutions. The first seems to involve performing a series of queries as you traverse through the tree, collecting all of the events. The second (this one), I'm less sure about, but seems to require building up all of the joined models first, and then checking to see if which of them match the specific event.

Please help me to mark it as not a duplicate!

Double edit: Okay, no longer marked as duplicate! Thanks :D

like image 968
Ian Storm Taylor Avatar asked Apr 07 '17 01:04

Ian Storm Taylor


People also ask

How do you query a parent-child relationship in SQL?

To find out who that child's parent is, you have to look at the column parent_id , find the same ID number in the id column, and look in that row for the parent's name. In other words, Jim Cliffy has no parents in this table; the value in his parent_id column is NULL .

How do I get all parent children in SQL?

parent = c. child ) SELECT distinct parent, child , level FROM cte order by level, parent; This will give you all descendants and the level.

Which database allows only a parent-child relationship in SQL?

The Oracle NoSQL Database enables tables to exist in a parent-child relationship.


1 Answers

Find the team of an arbitrary event by ID:

Take left joins from events table and see which object returns a team_id use that.

select coalesce(e.team_id,a.team_id,c.team_id,w.team_id) eteam_id 
from events e 
   left join apps a on e.app_id=a.id 
   left join collections c on c.id=e.collection_id 
   left join webhooks w on w.id=e.webhook_id 
   left join teams t on t.id=e.team_id 
where e.id=${id};

Just for completeness if you want all data for resultant team:

select t.* from teams t 
join 
(select coalesce(e.team_id,a.team_id,c.team_id,w.team_id) eteam_id
from events e left join apps a on e.app_id=a.id 
left join collections c on c.id=e.collection_id 
left join webhooks w on w.id=e.webhook_id 
left join teams t on t.id=e.team_id 
where e.id=1) t1 
on t.id=eteam_id;
like image 173
bazinga012 Avatar answered Sep 22 '22 14:09

bazinga012