I'm studying for my Database Systems Exam (tomorrow) and I'm having trouble in exercises where I'm asked to write queries. Here's an example:
I am being asked to write a query to answer the following question: Among the authors with lowest Age, who has most books written?
The problem is my teacher forbids me to use sub-queries inside FROM clause, and to use TOP.
I have written an answer, but one I know is incorrect:
SELECT W.AName, COUNT(W.ID_B) AS NBooks
FROM Write W, Author A1
WHERE (A1.AName = W.AName) AND
(A1.AAge = (SELECT MIN(A2.Age)
FROM Author A2))
GROUP BY W.AName
ORDER BY NBooks DESC
This one gives all the authors with the lower age, and their respective number of books written (I hope..). The correct answer should be only the first line of this one.
Let me be clear:
Table Author
AName | AAge
---------------
John | 25
Rick | 30
Sean | 26
Lena | 25
Table Writes
AName | ID_B
---------------
John | 2
Lena | 1
John | 3
Lena | 4
Rick | 5
Rick | 6
Lena | 6
Rick | 7
Rick | 8
(notice that Sean did't write any book, book nº6 has 2 authors, and Rick is the author with most books (4) )
Now, the code I wrote above gives this result (I guess):
AName | NBooks
-----------------
Lena | 3
John | 2
(The lowest age is 25 and both Lena and John are 25)
Whats asked is:
AName | NBooks
-----------------
Lena | 3
(Lena is the Author, among all the authors with the lowest age (25), with most books written)
Thanks in advance
No, we can't use a MAX(COUNT(*) and we can not layer aggregate functions on top of one another in the same SELECT clause.
To find the max value of a column, use the MAX() aggregate function; it takes as its argument the name of the column for which you want to find the maximum value. If you have not specified any other columns in the SELECT clause, the maximum will be calculated for all records in the table.
Example - With Single ExpressionSELECT MAX(salary) AS "Highest salary" FROM employees; In this SQL MAX function example, we've aliased the MAX(salary) field as "Highest salary". As a result, "Highest salary" will display as the field name when the result set is returned.
COUNT(*) returns the number of rows in a specified table, and it preserves duplicate rows. It counts each row separately. This includes rows that contain null values.
Because you are a student, I'll answer part of the question. Here is an answer, ignoring the youngest part:
select a.AName, COUNT(*) as NumBooks
from Author a join
Write w
on a.AName = w.AName
group by a.AName
having count(*) >= all(select COUNT(*) as NumBooks
from write w
group by w.AName
)
I think you can figure out how to modify it.
By the way, the restriction on limit
and top
is, I hope, only for this example. Otherwise, you should get another teacher, since these are very important constructs.
Also, you need to learn conventional join syntax, rather than ,
in the from
clause. Once again, if you teacher is not teaching the modern sytnax (since about 1988), get a new teacher. And, I assume the restriction on subqueries applies to CTEs as well.
I also want to point out the "correct" version of the query:
select top 1 a.aname, count(*) as NumBooks
from Author a join
Write w
on a.AName = w.AName
group by author.name, author.Aage
order by author.Age asc, count(*) desc
This query is better than the query above along almost any dimension. It does one join
, one group by
and one sort. The complete version of my query does two join
s explicitly, two join
s implicitly (the age clause), and two group by
s. The former is going to have better performance than the latter.
From a readability perspective, this version is shorter and cleaner. I also think that it is much easier to teach what this is doing, rather than the "unusual" constructs in the first version. Most students will understand what top
and order by
are doing and can emulate this. Mimicking what happens in that having
clause requires some mental gymnastics.
If you want to get all the authors with the maximum count, the first thing is to realize that the previous query is equivalent to:
select aname, NumBooks
from (select a.aname, count(*) as NumBooks,
row_number() over (partition by author.Name order by a.aAge, count(*) desc) as seqnum
from Author a join
Write w
on a.AName = w.AName
group by author.name, author.Aage
) aw
where seqnum = 1
Switching this to get all the authors is easy:
select aname, NumBooks
from (select a.aname, count(*) as NumBooks,
dense_rank() over (partition by author.Name order by a.aAge, count(*) desc) as seqnum
from Author a join
Write w
on a.AName = w.AName
group by author.name, author.Aage
) aw
where seqnum = 1
This is also more efficient than the query that answers the question. Not being able to use top
or subqueries in the from
clause is like running a three-legged race. Yes, you can probably get there, but you'll get there much faster running on your own two legs.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With