I'm trying to make a query with postgresql. The database contains two relations: "kingdom", which includes some english kings, and "dinasty", which contains some people from the stuart dinasty
The relation "kingdom" includes the king name and when his kingdom started and ended. The relation "dinasty" includes name, gender, birth and death.
What I'm trying to query is the king that was the oldest one when he died.
With my query I'm receiving this error at LINE 3 (NOT IN): subquery has too many columns
This is the query:
SELECT kingdom.king, dinasty.birth, dinasty.death
FROM kingdom, dinasty
WHERE kingdom.king = dinasty.name AND kingdom.king NOT IN
(
SELECT DISTINCT R1.king, R1.birth, R1.death
FROM
(
SELECT DISTINCT R1.king, D1.birth, D1.death
FROM kingdom AS R1, dinasty AS D1, dinasty AS D2
WHERE R1.king=D1.name
) AS R1,
(
SELECT DISTINCT R1.king, D1.birth, D1.death
FROM kingdom AS R1, dinasty AS D1, dinasty AS D2
WHERE R1.king=D1.name
) AS R2
WHERE R1.death-R1.birth < R2.death-R2.birth
);
What is inside the NOT IN is correct.
SQL: Multiple Column SubqueriesYou can write subqueries that return multiple columns. The following example retrieves the order amount with the lowest price, group by agent code.
If you want compare two or more columns. you must write a compound WHERE clause using logical operators Multiple-column subqueries enable you to combine duplicate WHERE conditions into a single WHERE clause.
You can use group by in a subquery, but your syntax is off.
Multiple-row subqueries are nested queries that can return more than one row of results to the parent query. Multiple-row subqueries are used most commonly in WHERE and HAVING clauses. Since it returns multiple rows, it must be handled by set comparison operators (IN, ALL, ANY).
You are projecting three columns in your subquery, but comparing a single one of them in the IN
clause. Select just the required column (r1.king
) for the IN
in the subquery:
SELECT kingdom.king, dinasty.birth, dinasty.death
FROM kingdom, dinasty
WHERE kingdom.king = dinasty.name AND kingdom.king NOT IN
(
SELECT DISTINCT R1.king
FROM
(
SELECT DISTINCT R1.king, D1.birth, D1.death
FROM kingdom AS R1, dinasty AS D1, dinasty AS D2
WHERE R1.king=D1.name
) AS R1,
(
SELECT DISTINCT R1.king, D1.birth, D1.death
FROM kingdom AS R1, dinasty AS D1, dinasty AS D2
WHERE R1.king=D1.name
) AS R2
WHERE R1.death-R1.birth < R2.death-R2.birth
);
As was answered, your column count didn't match, but there's a much simpler way to write this.
When writing queries, it's best to think them through in stages. First, you need to know how old each king was when they died:
SELECT *, death-birth AS lived_for FROM dinasty
Now that you have that, you can use DISTINCT ON to find the longest-lived king for each kingdom
SELECT DISTINCT ON( name ) name, birth, death, lived_for
FROM (
SELECT *, death-birth AS lived_for FROM dinasty
) a
ORDER BY name, lived_for DESC
;
The distinct on will take the first row for each distinct value, so it's important you pair it with the correct ORDER BY
. First we order by the name of the dinasty, then how long the king lived for in descending order. That means the first king shown for each dinasty will be the longest-lived one, and that is the record that DISTINCT ON will keep for each dinasty.
Note that I also removed the JOIN to kindgom, but you can add that back in if needed:
SELECT k.*, oldest.*
FROM (
SELECT DISTINCT ON( name ) name, birth, death, lived_for
FROM (
SELECT *, death-birth AS lived_for FROM dinasty
) a
ORDER BY name, lived_for DESC
) oldest
JOIN kingdom k ON k.king = oldest.name
;
Finally, if you ever need to use multiple columns in a sub-select, you can use the ROW() construct:
SELECT ...
FROM table_a
WHERE ROW(f1, f2, f3) NOT IN (SELECT f1a, f2a, f3a FROM ... )
;
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With