Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to select most frequent value in a column per each id group?

I have a table in SQL that looks like this:

user_id | data1
0       | 6
0       | 6
0       | 6
0       | 1
0       | 1
0       | 2
1       | 5
1       | 5
1       | 3
1       | 3
1       | 3
1       | 7

I want to write a query that returns two columns: a column for the user id, and a column for what the most frequently occurring value per id is. In my example, for user_id 0, the most frequent value is 6, and for user_id 1, the most frequent value is 3. I would want it to look like below:

user_id | most_frequent_value
0       | 6
1       | 3

I am using the query below to get the most frequent value, but it runs against the whole table and returns the most common value for the whole table instead of for each id. What would I need to add to my query to get it to return the most frequent value for each id? I am thinking I need to use a subquery, but am unsure of how to structure it.

SELECT user_id, data1 AS most_frequent_value
FROM my_table
GROUP BY user_id, data1
ORDER BY COUNT(*) DESC LIMIT 1
like image 472
cjh193 Avatar asked Dec 14 '16 15:12

cjh193


People also ask

How do you find the most repeated value in a column SQL?

How do you find the most repeated value in a column SQL? select cnt1. column_name from (select COUNT(*) as total, column_name from table_name group by column_name) cnt1, (select MAX(total) as maxtotal from (select COUNT(*) as total, column_name from table_name group by column_name)) cnt2 where cnt1.

How do I find the most occuring value in SQL?

SELECT <column_name>, COUNT(<column_name>) AS `value_occurrence` FROM <my_table> GROUP BY <column_name> ORDER BY `value_occurrence` DESC LIMIT 1; Replace <column_name> and <my_table> . Increase 1 if you want to see the N most common values of the column.

How do you SELECT the first value in a GROUP BY a bunch of rows?

To do that, you can use the ROW_NUMBER() function. In OVER() , you specify the groups into which the rows should be divided ( PARTITION BY ) and the order in which the numbers should be assigned to the rows ( ORDER BY ). You assign the row numbers within each group (i.e., year).


2 Answers

You can use a window function to rank the userids based on their count of data1.

WITH cte AS (
SELECT 
    user_id 
  , data1
  , ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY COUNT(data1) DESC) rn
FROM dbo.YourTable
GROUP BY
  user_id,
  data1)

SELECT
    user_id,
    data1
FROM cte WHERE rn = 1 
like image 182
SQLChao Avatar answered Oct 04 '22 20:10

SQLChao


If you use proper "order by" then distinct on (user_id) make the same work because it takes 1.line from data partitioned by "user_id". DISTINCT ON is specialty of PostgreSQL.

select distinct on (user_id) user_id, most_frequent_value from (
SELECT user_id, data1 AS most_frequent_value, count(*) as _count
FROM my_table
GROUP BY user_id, data1) a
ORDER BY user_id, _count DESC 
like image 35
JosMac Avatar answered Oct 04 '22 20:10

JosMac