
Recursive/hierarchical query in BigQuery

I have a recursion/hierarchical problem that I'm trying to figure out in BigQuery.

I have a list of employees and each employee has a manager ID. I need to be able to enter a single Employee_ID and return an array of every person beneath them.

CREATE TABLE p_RLS.testHeirarchy
 (
   Employee_ID INT64,
   Employee_Name STRING,
   Position STRING,
   Line_Manager_ID INT64
 );

INSERT INTO p_RLS.testHeirarchy (Employee_ID, Employee_Name, Position, Line_Manager_ID)
VALUES(1,'Joe','Worker',11),
      (2,'James','Worker',11),
      (3,'Jack','Worker',11),
      (4,'Jill','Worker',12),
      (5,'Jan','Worker',12),
      (6,'Jacquie','Worker',13),
      (7,'Joaquin','Worker',14),
      (8,'Jeremy','Worker',14),
      (9,'Jade','Worker',15),
      (10,'Jocelyn','Worker',15),
      (11, 'Bob', 'Store Manager',16),
      (12, 'Bill', 'Store Manager',16),
      (13, 'Barb', 'Store Manager',16),
      (14, 'Ben', 'Store Manager',17),
      (15, 'Burt', 'Store Manager',17),
      (16, 'Sally','Group Manager',18),
      (17, 'Sam','Group Manager',19),
      (18, 'Anna', 'Ops Manager',20),
      (19, 'Amy', 'Ops Manager',20),
      (20, 'Zoe', 'State Manager', NULL);

My desired output would resemble:

SELECT 20 as Employee_ID, [19,18,17,16,15,14,13,12,11,10,9,8,7,6,5,4,3,2,1] as Reports;
SELECT 11 as Employee_ID, [3,2,1] as Reports;
SELECT 1 as Employee_ID, [] as Reports;

I have got the following working but it seems very ugly/inconvenient and doesn't support unlimited levels:

WITH test as (
SELECT L0.Employee_ID, L0.Employee_Name, L0.Position, L0.Line_Manager_ID,
ARRAY_AGG(DISTINCT L1.Employee_ID IGNORE NULLS) as Lvl1, 
ARRAY_AGG(DISTINCT L2.Employee_ID IGNORE NULLS) as Lvl2, 
ARRAY_AGG(DISTINCT L3.Employee_ID IGNORE NULLS) as Lvl3, 
ARRAY_AGG(DISTINCT L4.Employee_ID IGNORE NULLS) as Lvl4, 
ARRAY_AGG(DISTINCT L5.Employee_ID IGNORE NULLS) as Lvl5, 
ARRAY_AGG(DISTINCT L6.Employee_ID IGNORE NULLS) as Lvl6, 
ARRAY_AGG(DISTINCT L7.Employee_ID IGNORE NULLS) as Lvl7
FROM p_RLS.testHeirarchy as L0
LEFT OUTER JOIN p_RLS.testHeirarchy L1 ON L0.Employee_ID = L1.Line_Manager_ID
LEFT OUTER JOIN p_RLS.testHeirarchy L2 ON L1.Employee_ID = L2.Line_Manager_ID
LEFT OUTER JOIN p_RLS.testHeirarchy L3 ON L2.Employee_ID = L3.Line_Manager_ID
LEFT OUTER JOIN p_RLS.testHeirarchy L4 ON L3.Employee_ID = L4.Line_Manager_ID
LEFT OUTER JOIN p_RLS.testHeirarchy L5 ON L4.Employee_ID = L5.Line_Manager_ID
LEFT OUTER JOIN p_RLS.testHeirarchy L6 ON L5.Employee_ID = L6.Line_Manager_ID
LEFT OUTER JOIN p_RLS.testHeirarchy L7 ON L6.Employee_ID = L7.Line_Manager_ID
WHERE L0.Employee_ID = 16
GROUP BY 1,2,3,4)

SELECT
Employee_ID, ARRAY_CONCAT(
    IFNULL(Lvl1,[]),
    IFNULL(Lvl2,[]),
    IFNULL(Lvl3,[]),
    IFNULL(Lvl4,[]),
    IFNULL(Lvl5,[]),
    IFNULL(Lvl6,[]),
    IFNULL(Lvl7,[])) as All_reports
FROM test

Is there a better way to do this? Is a recursive approach possible in BigQuery?

Drx asked Apr 16 '26 05:04


1 Answer

Recursive CTEs were recently introduced in BigQuery, which makes this much easier:

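-- your_table is a placeholder for the employee table (p_RLS.testHeirarchy in the question)
with recursive iterations as (
  -- base case: every (manager, direct report) pair, at depth 1
  select line_manager_id, employee_id, 1 pos from your_table
  union all
  -- recursive step: attach each report's own reports to the same top-level manager
  select b.line_manager_id, a.employee_id, pos + 1
  from your_table a join iterations b
  on b.employee_id = a.line_manager_id
)
-- one row per manager, with reports listed nearest level first
select line_manager_id, string_agg('' || employee_id order by pos, employee_id desc) as reports_as_list
from iterations
where line_manager_id is not null
group by line_manager_id
order by line_manager_id desc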

Applied to the sample data in the question, the output is:

[Image: query result; not preserved]
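
The query in this answer returns the reports as a comma-separated string, one row per manager. The question asks for an array for a single Employee_ID, so here is a minimal sketch of that variant. It assumes the question's table `p_RLS.testHeirarchy` and a hierarchy with no cycles (a cycle would keep the recursion going until BigQuery's iteration limit is hit). The names `target_id`, `reports`, and `depth` are illustrative and not part of the original answer.

```sql
-- Hypothetical parameter: the employee whose reports we want.
DECLARE target_id INT64 DEFAULT 16;

WITH RECURSIVE reports AS (
  -- Base case: direct reports of the target employee.
  SELECT Employee_ID, 1 AS depth
  FROM p_RLS.testHeirarchy
  WHERE Line_Manager_ID = target_id
  UNION ALL
  -- Recursive step: anyone managed by someone already found.
  SELECT e.Employee_ID, r.depth + 1
  FROM p_RLS.testHeirarchy AS e
  JOIN reports AS r
    ON e.Line_Manager_ID = r.Employee_ID
)
SELECT
  target_id AS Employee_ID,
  -- ARRAY(subquery) yields an empty array when the employee has no reports.
  ARRAY(
    SELECT Employee_ID
    FROM reports
    ORDER BY depth, Employee_ID DESC
  ) AS Reports;
```

Filtering in the base case rather than after the recursion means only the target's subtree is walked, and the `ORDER BY depth, Employee_ID DESC` is there to mirror the ordering shown in the question's desired output.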

Mikhail Berlyant answered Apr 19 '26 01:04