I'm trying to write a SQL query to generate a summary row for the actions performed by a given user in a given period. I have the following relevant table structure:
users
audit_periods (can be processing, shipping, break, etc)
audit_tasks
audit_task_types
For each user for a given period, I'd like to create something like the following row of data:
users.id users.email time_spent_processing time_spent_shipping ... number_of_scans number_of_pallets
which would be calculated by figuring out for each user:
I've exhausted all of the SQL tricks I know (not many) and came up with something like the following:
select
u.id as user_id,
u.email as email,
u.team as team,
ap.period_type as period_type,
att.name,
time_to_sec(
timediff(least("2011-03-17 00:00:00", ifnull(ap.finished_at, utc_timestamp())), greatest("2011-03-16 00:00:00", ap.started_at))
) as period_duration,
sum(at.score) as period_score
from audit_periods as ap
inner join users as u on ap.user_id = u.id
left join audit_tasks as at on at.audit_period_id = ap.id
left join audit_task_types as att on at.audit_task_type_id = att.id
where (ap.started_at >= "2011-03-16 00:00:00" or (ap.finished_at >= "2011-03-17 00:00:00" and ap.finished_at <= "2011-03-17 00:00:00"))
and (ap.finished_at <= "2011-03-17 00:00:00" or (ap.started_at >= "2011-03-16 00:00:00" and ap.started_at <= "2011-03-16 00:00:00"))
and u.team in ("Foo", "Bar")
group by u.id, ap.id, at.id
but this seems to be functionally equivalent to just selecting all of the audit tasks in the end. I've tried some subqueries as well, but to little avail. More directly, this generates something like (skipping less important columns):
user_id | period_type | period_duration | name | score
1 processing 1800s scan 200
1 shipping 1000s place_in_pallet 100
1 shipping 1000s place_in_pallet 100
1 break 500s null null
when I want:
user_id | processing | shipping | break | scan | place_in_pallet | score
1 1800s 1000s 500s 1 2 400
I can easily fetch all of the audit_tasks for a given user and roll them up in code, but I might be fetching hundreds of thousands of audit_tasks over a given period, so it needs to be done in SQL.
Just to be clear -- I'm looking for a query to generate one row per user, containing summary data collected across the other 3 tables. So, for each user, I want to know how much time he spent in each type of audit_period (3600 seconds processing, 3200 seconds shipping, etc), as well as how many of each audit_task he performed (5 scans, 10 items placed in pallet, etc).
I think I have the elements of a solution, I'm just having trouble piecing them together. I know exactly how I would accomplish this in Ruby/Java/etc, but I don't think I understand SQL well enough to know which tool I'm missing. Do I need a temp table? A union? Some other construct entirely?
Any help is greatly appreciated, and I can clarify if the above is complete nonsense.
You will need to break this up into two crosstab queries: one that summarizes the audit_periods information by user and another that summarizes the audit_tasks information by user. Then join both of them to the users table. It isn't clear how you want to roll up the information in each case. For example, if a given user has 10 audit_period rows, how should the query roll up those durations? I assumed a sum of the durations here, but you might want a min or max or perhaps even an overall delta.
Select U.id As user_id
, AuditPeriodByUser.TotalDuration_Processing As processing
, AuditPeriodByUser.TotalDuration_Shipping As shipping
, AuditPeriodByUser.TotalDuration_Break As break
, AuditTasksByUser.TotalCount_Scan As scan
, AuditTasksByUser.TotalCount_Place_In_Pallet As place_in_pallet
, AuditTasksByUser.TotalScore As score
From users As U
Left Join (
Select AP.user_id
, Sum( Case When AP.period_type = 'processing'
Then Time_To_Sec(
TimeDiff(
Coalesce(AP.finished_at, UTC_TIMESTAMP()), AP.started_at ) )
End ) As TotalDuration_Processing
, Sum( Case When AP.period_type = 'shipping'
Then Time_To_Sec(
TimeDiff(
Coalesce(AP.finished_at, UTC_TIMESTAMP()), AP.started_at ) )
End ) As TotalDuration_Shipping
, Sum( Case When AP.period_type = 'break'
Then Time_To_Sec(
TimeDiff(
Coalesce(AP.finished_at, UTC_TIMESTAMP()), AP.started_at ) )
End ) As TotalDuration_Break
From audit_periods As AP
Where AP.started_at >= @StartDate
And AP.finished_at <= @EndDate
Group by AP.user_id
) As AuditPeriodByUser
On AuditPeriodByUser.user_id = U.id
Left Join (
Select AP.user_id
, Sum( Case When ATT.name = 'scan' Then 1 Else 0 End ) As TotalCount_Scan
, Sum( Case When ATT.name = 'place_in_pallet' Then 1 Else 0 End ) As TotalCount_Place_In_Pallet
, Sum( AT.score ) As TotalScore
From audit_tasks As AT
Join audit_task_types As ATT
On ATT.id = AT.audit_task_type_id
Join audit_periods As AP
On AT.audit_period_id = AP.id
Where AP.started_at >= @StartDate
And AP.finished_at <= @EndDate
Group By AP.user_id
) As AuditTasksByUser
On AuditTasksByUser.user_id = U.id
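To try the shape of this approach without a MySQL server, here is a minimal sketch that runs the same two-subquery pivot against an in-memory SQLite database from Python. Several things here are assumptions and not part of the original schema: the column types, the sample rows (chosen to match the example in the question), the second user with no activity, and the output column names. SQLite has no Time_To_Sec or TimeDiff, so the duration is computed with strftime('%s', ...) instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical schema and sample rows, modelled on the example in the question.
conn.executescript("""
CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT);
CREATE TABLE audit_periods (
    id INTEGER PRIMARY KEY, user_id INTEGER, period_type TEXT,
    started_at TEXT, finished_at TEXT);
CREATE TABLE audit_task_types (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE audit_tasks (
    id INTEGER PRIMARY KEY, audit_period_id INTEGER,
    audit_task_type_id INTEGER, score INTEGER);

INSERT INTO users VALUES (1, 'a@example.com'), (2, 'b@example.com');
INSERT INTO audit_periods VALUES
    (1, 1, 'processing', '2011-03-16 08:00:00', '2011-03-16 08:30:00'),
    (2, 1, 'shipping',   '2011-03-16 09:00:00', '2011-03-16 09:16:40'),
    (3, 1, 'break',      '2011-03-16 10:00:00', '2011-03-16 10:08:20');
INSERT INTO audit_task_types VALUES (1, 'scan'), (2, 'place_in_pallet');
INSERT INTO audit_tasks VALUES (1, 1, 1, 200), (2, 2, 2, 100), (3, 2, 2, 100);
""")

# Seconds between start and finish; SQLite stand-in for Time_To_Sec(TimeDiff(...)).
DURATION = (
    "(CAST(strftime('%s', COALESCE(ap.finished_at, 'now')) AS INTEGER)"
    " - CAST(strftime('%s', ap.started_at) AS INTEGER))"
)

QUERY = f"""
SELECT u.id,
       p.processing_s, p.shipping_s, p.break_s,
       t.scan, t.place_in_pallet, t.score
FROM users AS u
LEFT JOIN (
    -- One row per user: durations pivoted by period type.
    SELECT ap.user_id,
           SUM(CASE WHEN ap.period_type = 'processing' THEN {DURATION} END) AS processing_s,
           SUM(CASE WHEN ap.period_type = 'shipping'   THEN {DURATION} END) AS shipping_s,
           SUM(CASE WHEN ap.period_type = 'break'      THEN {DURATION} END) AS break_s
    FROM audit_periods AS ap
    WHERE ap.started_at >= :start AND ap.finished_at <= :end
    GROUP BY ap.user_id
) AS p ON p.user_id = u.id
LEFT JOIN (
    -- One row per user: task counts pivoted by task type, plus total score.
    SELECT ap.user_id,
           SUM(CASE WHEN tt.name = 'scan' THEN 1 ELSE 0 END) AS scan,
           SUM(CASE WHEN tt.name = 'place_in_pallet' THEN 1 ELSE 0 END) AS place_in_pallet,
           SUM(t.score) AS score
    FROM audit_tasks AS t
    JOIN audit_task_types AS tt ON tt.id = t.audit_task_type_id
    JOIN audit_periods AS ap ON ap.id = t.audit_period_id
    WHERE ap.started_at >= :start AND ap.finished_at <= :end
    GROUP BY ap.user_id
) AS t ON t.user_id = u.id
ORDER BY u.id
"""

params = {"start": "2011-03-16 00:00:00", "end": "2011-03-17 00:00:00"}
rows = conn.execute(QUERY, params).fetchall()
for row in rows:
    print(row)
```

Aggregating periods and tasks in separate subqueries before joining keeps each duration from being counted once per task, which is what happens when everything is joined first and grouped afterwards. The left joins mean a user with no activity in the window still gets a row, with nulls in the summary columns.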