I am running the below query to get 3rd highest salary from employee table and it worked correctly, but I can't understand its logic. How does the sub-query values match the main query(left part). Can someone please explain what is the logic working behind this query?
select e1.salary
from employee as e1
where 3 = (select count(salary)
from employee as e2
where e1.salary<=e2.salary)
PS: I can understand the count()
returns number of rows (where all records are unique).
SQL uses a three-valued logic: besides true and false, the result of logical expressions can also be unknown. SQL's three valued logic is a consequence of supporting null to mark absent data. If a null value affects the result of a logical expression, the result is neither true nor false but unknown.
Logical query processing is the conceptual interpretation of the query that defines the correct result, and unlike the keyed-in order of the query clauses, it starts by evaluating the FROM clause. Understanding logical query processing is crucial for correct understanding of T-SQL.
A query can either be a request for data results from your database or for action on the data, or for both. A query can give you an answer to a simple question, perform calculations, combine data from different tables, add, change, or delete data from a database.
Primarily, queries are used to find specific data by filtering explicit criteria. Queries also help automate data management tasks, summarize data and engage in calculations. Other examples of queries include append, crosstab, delete, make a table, parameter, totals and updates.
This query is basically saying:
for each row in employee assign to e1
count = 0
for each row in employee assign to e2
if e1.salary <= e2.salary
count = count + 1
end if
end for
if count = 3
add e1 to result set
end if
end for
return result set
In summary for every row in the employee table it is visiting the table a second time and counting the number of rows with a lower or equal salary. If there are exactly 3 it will add the row to the result.
It is worth pointing out that this may well go wrong if there is more than one employee with the same salary. What you probably want is a query with a ranking function instead. Something like this:
SELECT salary
FROM
(SELECT
salary
,DENSE_RANK () OVER (ORDER BY salary DESC) [rank]
FROM employee) t
WHERE
[rank] = 3
What exactly is meant by "3rd Highest" is perhaps a bit ambiguous. If we have salaries 8, 8, 6, 5 the above will return 5. If we wanted 6 you would need to change the DENSE_RANK to ROW_NUMBER like this:
SELECT salary
FROM
(SELECT
salary
,ROW_NUMBER () OVER (ORDER BY salary DESC) [rank]
FROM employee) t
WHERE
[rank] = 3
The DENSE_RANK version above also suffers from returning multiple rows if there is a tie for third place. Whether this is desirable or not depends upon exactly what is required but it is possible to cut this down by using an aggregate function on the salary.
SELECT MAX(salary)
FROM
(SELECT
salary
,DENSE_RANK() OVER (ORDER BY salary desc) [rank]
FROM employee) t
WHERE
[rank] = 3
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With