Continuing with the spirit of using the Stack Exchange Data Explorer to learn SQL, (see: Can we become our own “Northwind” for teaching SQL / databases?), I've decided to try to write a query to answer a simple question (on meta): What % of stackoverflow users have over 10,000 rep?.
Here's what I've done:
SELECT COUNT(*)
FROM Users
WHERE
Users.Reputation >= 10000
Result:
556
SELECT COUNT(*)
FROM
USERS
Result:
227691
Now, how do I put them together into one query? What is this query idiom called? What do I need to write so I can get, say, a one-row three-column result like this:
556 227691 0,00244190592
For queries like this, where I'm doing multiple counts on a single table based on different criteria, I like to use SUM
and CASE
:
SELECT
UsersCount.[10K],
UsersCount.[All],
(CAST(UsersCount.[10K] AS FLOAT) / UsersCount.[All]) AS [Ratio]
FROM
(SELECT
SUM(CASE
WHEN Users.Reputation >= 10000 THEN 1
ELSE 0
END) AS [10K],
COUNT(*) AS [All]
FROM Users) AS UsersCount
(query results)
The advantage is that you're only scanning the Users table once, which may be significantly faster.
You can use a Common Table Expression (CTE):
WITH c1 AS (
SELECT COUNT(*) AS cnt
FROM Users
WHERE Users.Reputation >= 10000
), c2 AS (
SELECT COUNT(*) AS cnt
FROM Users
)
SELECT c1.cnt, c2.cnt, CAST(c1.cnt AS FLOAT) / c2.cnt
FROM c1, c2
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With