Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

PostgreSQL - Update table with subquery from updated table itself

I have a table with these values: "user_id, text, text_lang, user_lang".

The field text_lang contains a locale shortcut ('en','de') representing the language of the text. I now want to set the (unknown/empty) 'user_lang' by selecting the text_lang that has the maximum occurrence for each user_id.

I have a working select statement that returns the text_lang with the most occurrences for each user_id, but I cannot figure out how to update the user_lang of each row based on the ID.

SELECT user_id, text_lang
FROM (
    SELECT  user_id,
        text_lang,
        max(text_lang_count) OVER (PARTITION BY user_id) max_count,
        text_lang_count
    FROM (
        SELECT  user_id,
                text_lang,
                COUNT(text_lang) AS text_lang_count
        FROM test
        GROUP BY user_id, text_lang
    ) AS xx
) AS xy
WHERE tweet_lang_count = max_count

I assume (but am not certain) that MySQL/SQL: Update with correlated subquery from the updated table itself contains something closely related to my question, but I have been unable to adapt that solution to this specific problem.

like image 482
Senshi Avatar asked May 30 '13 11:05

Senshi


2 Answers

SQL Fiddle

with lang as (
    select distinct on (user_id)
        user_id, text_lang, text_lang_count
    from (
        select
            user_id,
            text_lang,
            count(text_lang) as text_lang_count
        from t
        group by user_id, text_lang
    ) s
    order by 1, 3 desc, text_lang != 'en' -- in a tie english wins
)
update t
set user_lang = lang.text_lang
from lang
where t.user_id = lang.user_id

Calculate the predominant language in the CTE (with) and use it in the from clause

like image 129
Clodoaldo Neto Avatar answered Nov 15 '22 05:11

Clodoaldo Neto


The correlated update in PostgreSQL has a different syntax than MySQL. It can be written like this:

UPDATE tablename T
 SET user_lang=subquery.text_lang
 FROM ( SELECT user_id,text_lang FROM... etc... ) subquery
 WHERE T.user_id=subquery.user_id

where the subquery is exactly your SELECT query from the question.

The above is non-standard, though. The form that complies to the SQL standard would be like:

UPDATE tablename T
 SET user_lang=(SELECT text_lang FROM ... WHERE... AND user_id=T.user_id)

but with this form you would have to reshape your initial query to compute the desired text_lang for just one specific user_id.

like image 22
Daniel Vérité Avatar answered Nov 15 '22 05:11

Daniel Vérité