I'm working on creating a SQL query that will pull records from a table based on the value of two aggregate functions. These aggregate functions are pulling data from the same table, but with different filter conditions. The problem that I run into is that the results of the SUMs are much larger than if I only include one SUM function. I know that I can create this query using temp tables, but I'm just wondering if there is an elegant solution that requires only a single query.
I've created a simplified version to demonstrate the issue. Here are the table structures:
EMPLOYEE table:

EMPID
-----
1
2
3

ABSENCE table:

EMPID  DATE      HOURS_ABSENT
-----  --------  ------------
1      6/1/2009  3
1      9/1/2009  1
2      3/1/2010  2
And here is the query:
SELECT
    E.EMPID
    ,SUM(ATOTAL.HOURS_ABSENT) AS ABSENT_TOTAL
    ,SUM(AYEAR.HOURS_ABSENT) AS ABSENT_YEAR
FROM
    EMPLOYEE E
    INNER JOIN ABSENCE ATOTAL ON
        ATOTAL.EMPID = E.EMPID
    INNER JOIN ABSENCE AYEAR ON
        AYEAR.EMPID = E.EMPID
WHERE
    AYEAR.DATE > '1/1/2010'
GROUP BY
    E.EMPID
HAVING
    SUM(ATOTAL.HOURS_ABSENT) > 10
    OR SUM(AYEAR.HOURS_ABSENT) > 3
Any insight would be greatly appreciated.
The principle when combining two aggregate functions is to use a subquery to calculate the 'inner' statistic. The outer query then applies its own aggregate function to that result.
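As a rough illustration of that principle, the sketch below computes each employee's total in a subquery and then averages those totals in the outer query. It runs the SQL through Python's built-in sqlite3 module purely so it can be executed as-is. The ISO date strings and the EMP_TOTAL and PER_EMPLOYEE names are assumptions made for this example, not part of the original question.

```python
import sqlite3

# In-memory database holding the ABSENCE rows from the question.
# Dates are stored as ISO strings (an assumption made for SQLite).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ABSENCE (EMPID INTEGER, DATE TEXT, HOURS_ABSENT INTEGER);
    INSERT INTO ABSENCE VALUES
        (1, '2009-06-01', 3),
        (1, '2009-09-01', 1),
        (2, '2010-03-01', 2);
""")

# Inner query: one SUM per employee. Outer query: AVG over those sums.
avg_total = conn.execute("""
    SELECT AVG(EMP_TOTAL)
    FROM (
        SELECT EMPID, SUM(HOURS_ABSENT) AS EMP_TOTAL
        FROM ABSENCE
        GROUP BY EMPID
    ) PER_EMPLOYEE
""").fetchone()[0]

print(avg_total)  # 3.0, the average of the per-employee totals 4 and 2
```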
The GROUP BY clause is normally used along with the five standard built-in aggregate functions: COUNT, SUM, AVG, MIN, and MAX.
You cannot use aggregate functions in a WHERE clause or in a JOIN condition. However, a SELECT statement with aggregate functions in its SELECT list often includes a WHERE clause that restricts the rows to which the aggregate is applied.
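The difference between the two clauses can be sketched with the question's sample rows: WHERE restricts the rows before they are summed, and HAVING filters the grouped results afterwards. The example uses Python's sqlite3 module with ISO date strings (an assumption), and the threshold of 3 is chosen only so the small sample returns a row. Other database engines word the error differently, but they reject an aggregate in WHERE in the same way.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ABSENCE (EMPID INTEGER, DATE TEXT, HOURS_ABSENT INTEGER);
    INSERT INTO ABSENCE VALUES
        (1, '2009-06-01', 3),
        (1, '2009-09-01', 1),
        (2, '2010-03-01', 2);
""")

# WHERE keeps only the 2009 rows; HAVING then filters on the per-group sum.
rows = conn.execute("""
    SELECT EMPID, SUM(HOURS_ABSENT) AS ABSENT_2009
    FROM ABSENCE
    WHERE DATE >= '2009-01-01' AND DATE < '2010-01-01'
    GROUP BY EMPID
    HAVING SUM(HOURS_ABSENT) > 3
""").fetchall()
print(rows)  # [(1, 4)]

# Putting the aggregate in WHERE instead is rejected by the engine.
try:
    conn.execute(
        "SELECT EMPID FROM ABSENCE WHERE SUM(HOURS_ABSENT) > 3 GROUP BY EMPID"
    )
    rejected = False
except sqlite3.Error:
    rejected = True
print(rejected)  # True
```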
An aggregate function performs a calculation on a set of values, and returns a single value. Except for COUNT(*) , aggregate functions ignore null values. Aggregate functions are often used with the GROUP BY clause of the SELECT statement. All aggregate functions are deterministic.
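The NULL handling is easy to check with a few rows. The data below is hypothetical and exists only to show the behaviour: COUNT(*) counts every row, while COUNT, SUM and AVG over a column skip NULLs, and SUM over nothing but NULLs returns NULL rather than 0.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical rows: two of the three have no HOURS_ABSENT value.
    CREATE TABLE ABSENCE (EMPID INTEGER, HOURS_ABSENT INTEGER);
    INSERT INTO ABSENCE VALUES (1, NULL), (1, 3), (2, NULL);
""")

# COUNT(*) sees all three rows; the column aggregates see only the value 3.
summary = conn.execute("""
    SELECT COUNT(*), COUNT(HOURS_ABSENT), SUM(HOURS_ABSENT), AVG(HOURS_ABSENT)
    FROM ABSENCE
""").fetchone()
print(summary)  # (3, 1, 3, 3.0)

# A SUM over only NULL values is NULL (None in Python), not 0.
all_null_sum = conn.execute(
    "SELECT SUM(HOURS_ABSENT) FROM ABSENCE WHERE EMPID = 2"
).fetchone()[0]
print(all_null_sum)  # None
```

This is one reason a conditional sum usually spells out `else 0`: without it, a group with no matching rows would sum to NULL.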
SELECT
    E.EMPID
    ,SUM(ABSENCE.HOURS_ABSENT) AS ABSENT_TOTAL
    ,SUM(case when year(Date) = 2010 then ABSENCE.HOURS_ABSENT else 0 end) AS ABSENT_YEAR
FROM
    EMPLOYEE E
    INNER JOIN ABSENCE ON
        ABSENCE.EMPID = E.EMPID
GROUP BY
    E.EMPID
HAVING
    SUM(ABSENCE.HOURS_ABSENT) > 10
    OR SUM(case when year(Date) = 2010 then ABSENCE.HOURS_ABSENT else 0 end) > 3
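To see why the original two-join query inflates the sums, and that a single join with a CASE expression does not, here is a small sketch using Python's sqlite3 module. Several things are assumptions made so the example runs on SQLite: dates are ISO strings, `strftime('%Y', ...)` stands in for `year(...)`, and one extra absence row for employee 2 is added (it is not in the question's data) so that the row multiplication becomes visible. The HAVING clause is left out so the raw sums can be compared.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE EMPLOYEE (EMPID INTEGER PRIMARY KEY);
    CREATE TABLE ABSENCE (EMPID INTEGER, DATE TEXT, HOURS_ABSENT INTEGER);
    INSERT INTO EMPLOYEE VALUES (1), (2), (3);
    INSERT INTO ABSENCE VALUES
        (1, '2009-06-01', 3),
        (1, '2009-09-01', 1),
        (2, '2010-03-01', 2),
        (2, '2010-04-01', 5);  -- hypothetical extra row, not in the question
""")

# Joining ABSENCE twice pairs every ATOTAL row with every AYEAR row for the
# same employee, so each value is summed once per matching row on the other side.
double_join = conn.execute("""
    SELECT E.EMPID, SUM(ATOTAL.HOURS_ABSENT), SUM(AYEAR.HOURS_ABSENT)
    FROM EMPLOYEE E
    INNER JOIN ABSENCE ATOTAL ON ATOTAL.EMPID = E.EMPID
    INNER JOIN ABSENCE AYEAR ON AYEAR.EMPID = E.EMPID
    WHERE AYEAR.DATE > '2010-01-01'
    GROUP BY E.EMPID
    ORDER BY E.EMPID
""").fetchall()
print(double_join)  # [(2, 14, 14)]: both sums doubled, employee 1 dropped

# One join plus a CASE expression reads each absence row exactly once.
conditional = conn.execute("""
    SELECT
        E.EMPID,
        SUM(A.HOURS_ABSENT),
        SUM(CASE WHEN strftime('%Y', A.DATE) = '2010'
                 THEN A.HOURS_ABSENT ELSE 0 END)
    FROM EMPLOYEE E
    INNER JOIN ABSENCE A ON A.EMPID = E.EMPID
    GROUP BY E.EMPID
    ORDER BY E.EMPID
""").fetchall()
print(conditional)  # [(1, 4, 0), (2, 7, 7)]
```

Employee 2 really has 7 hours in total, but the double join reports 14 because each of the two rows is counted twice. Employee 1 disappears from the double-join result altogether because the WHERE clause removes every row that lacks a 2010 absence.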
edit:
It's not a big deal, but I hate repeating conditions so we could refactor like:
Select * From
(
    SELECT
        E.EMPID
        ,SUM(ABSENCE.HOURS_ABSENT) AS ABSENT_TOTAL
        ,SUM(case when year(Date) = 2010 then ABSENCE.HOURS_ABSENT else 0 end) AS ABSENT_YEAR
    FROM
        EMPLOYEE E
        INNER JOIN ABSENCE ON
            ABSENCE.EMPID = E.EMPID
    GROUP BY
        E.EMPID
) EmployeeAbsences
Where ABSENT_TOTAL > 10 or ABSENT_YEAR > 3
This way, if you change your case condition, it's in one spot only.