Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

SQL - Query - max(count())

Tags:

sql

erd

I'm studying for my Database Systems Exam (tomorrow) and I'm having trouble in exercises where I'm asked to write queries. Here's an example:

enter image description here

I am being asked to write a query to answer the following question: Among the authors with lowest Age, who has most books written?

The problem is my teacher forbids me to use sub-queries inside FROM clause, and to use TOP.

I have written an answer, but one I know is incorrect:

SELECT W.AName, COUNT(W.ID_B) AS NBooks
FROM Write W, Author A1
WHERE (A1.AName = W.AName) AND
      (A1.AAge = (SELECT MIN(A2.Age)
                  FROM Author A2))
GROUP BY W.AName
ORDER BY NBooks DESC

This one gives all the authors with the lower age, and their respective number of books written (I hope..). The correct answer should be only the first line of this one.

Let me be clear:

Table Author
AName    | AAge
---------------
John     | 25
Rick     | 30
Sean     | 26
Lena     | 25

Table Writes
AName    | ID_B
---------------
John     | 2
Lena     | 1
John     | 3
Lena     | 4
Rick     | 5
Rick     | 6
Lena     | 6
Rick     | 7
Rick     | 8

(notice that Sean did't write any book, book nº6 has 2 authors, and Rick is the author with most books (4) )

Now, the code I wrote above gives this result (I guess):

AName    | NBooks
-----------------
Lena     | 3
John     | 2

(The lowest age is 25 and both Lena and John are 25)

Whats asked is:

AName    | NBooks
-----------------
Lena     | 3

(Lena is the Author, among all the authors with the lowest age (25), with most books written)

Thanks in advance

like image 916
Marco Castanho Avatar asked Jan 11 '13 20:01

Marco Castanho


People also ask

Can I use max count ()) in SQL?

No, we can't use a MAX(COUNT(*) and we can not layer aggregate functions on top of one another in the same SELECT clause.

How do you find Max count in SQL?

To find the max value of a column, use the MAX() aggregate function; it takes as its argument the name of the column for which you want to find the maximum value. If you have not specified any other columns in the SELECT clause, the maximum will be calculated for all records in the table.

What is the syntax of Max () in SQL?

Example - With Single ExpressionSELECT MAX(salary) AS "Highest salary" FROM employees; In this SQL MAX function example, we've aliased the MAX(salary) field as "Highest salary". As a result, "Highest salary" will display as the field name when the result set is returned.

What does count (*) do in SQL?

COUNT(*) returns the number of rows in a specified table, and it preserves duplicate rows. It counts each row separately. This includes rows that contain null values.


1 Answers

Because you are a student, I'll answer part of the question. Here is an answer, ignoring the youngest part:

select a.AName, COUNT(*) as NumBooks
from Author a join
     Write w
     on a.AName = w.AName
group by a.AName
having count(*) >= all(select COUNT(*) as NumBooks
                       from write w
                       group by w.AName
                      )

I think you can figure out how to modify it.

By the way, the restriction on limit and top is, I hope, only for this example. Otherwise, you should get another teacher, since these are very important constructs.

Also, you need to learn conventional join syntax, rather than , in the from clause. Once again, if you teacher is not teaching the modern sytnax (since about 1988), get a new teacher. And, I assume the restriction on subqueries applies to CTEs as well.

I also want to point out the "correct" version of the query:

select top 1 a.aname, count(*) as NumBooks
from Author a join
     Write w
     on a.AName = w.AName
group by author.name, author.Aage
order by author.Age asc, count(*) desc

This query is better than the query above along almost any dimension. It does one join, one group by and one sort. The complete version of my query does two joins explicitly, two joins implicitly (the age clause), and two group bys. The former is going to have better performance than the latter.

From a readability perspective, this version is shorter and cleaner. I also think that it is much easier to teach what this is doing, rather than the "unusual" constructs in the first version. Most students will understand what top and order by are doing and can emulate this. Mimicking what happens in that having clause requires some mental gymnastics.

If you want to get all the authors with the maximum count, the first thing is to realize that the previous query is equivalent to:

select aname, NumBooks
from (select a.aname, count(*) as NumBooks,
             row_number() over (partition by author.Name order by a.aAge, count(*) desc) as seqnum
      from Author a join
           Write w
           on a.AName = w.AName
      group by author.name, author.Aage
     ) aw
where seqnum = 1

Switching this to get all the authors is easy:

select aname, NumBooks
from (select a.aname, count(*) as NumBooks,
             dense_rank() over (partition by author.Name order by a.aAge, count(*) desc) as seqnum
      from Author a join
           Write w
           on a.AName = w.AName
      group by author.name, author.Aage
     ) aw
where seqnum = 1

This is also more efficient than the query that answers the question. Not being able to use top or subqueries in the from clause is like running a three-legged race. Yes, you can probably get there, but you'll get there much faster running on your own two legs.

like image 162
Gordon Linoff Avatar answered Sep 16 '22 14:09

Gordon Linoff