Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

GROUP BY between dates to count unique users returns more than expected

Tags:

sql

Using the Stack Exchange Data Explorer I'm trying to figure out how many different users asked questions between 19 October 2012 and 9 November 2012. I've gotten my query to this:

SELECT Count(*)
FROM Posts
WHERE (CreationDate BETWEEN '2012-10-19' AND '2012-11-09') AND PostTypeId = 1
GROUP BY OwnerUserId

However, this returns 3,071 rows on Super User (I'm not going to try it on Stack Overflow). This seems way too high. Am I missing something?

like image 540
Aarthi Avatar asked Dec 15 '22 16:12

Aarthi


2 Answers

If you just want to get a simple count of the distinct users, you can use the DISTINCT keyword like this:

SELECT Count(DISTINCT OwnerUserID)
FROM Posts
WHERE (CreationDate BETWEEN '2012-10-19' AND '2012-11-09') AND PostTypeId = 1

The distinct keyword will ensure that you only count each OwnerUserID once - the original question will just return the number of questions that each distinct user asked (which is fairly useless without additional context).

In effect, this query returns a count of the number of rows returned in the original query.

like image 84
Andrew Avatar answered Jun 06 '23 18:06

Andrew


If all you want is the total number of distinct users, then you can use the distnct keyword as part of your Count and then leave out the GROUP BY

For example

SELECT Count(DISTINCT OwnerUserId)
FROM Posts
WHERE (CreationDate BETWEEN '2012-10-19' AND '2012-11-09') AND PostTypeId = 1
like image 42
Jack Avatar answered Jun 06 '23 17:06

Jack