Is it possible to identify distinct words and a count for each, from fields containing text strings in Postgres?
Should be split by a space ' ' or other delimit symbol between words; not by an 's', unless intended to do so, e.g., treating 'myWordshere' as 'myWord' and 'here'.
SELECT word, count(*)
FROM (
SELECT regexp_split_to_table(some_column, ' ') as word
FROM some_table
) t
GROUP BY word
Something like this?
SELECT some_pk, regexp_split_to_table(some_column, '\s') as word FROM some_table
Getting the distinct words is easy then:
SELECT DISTINCT word FROM ( SELECT regexp_split_to_table(some_column, '\s') as word FROM some_table ) t
or getting the count for each word:
SELECT word, count(*) FROM ( SELECT regexp_split_to_table(some_column, '\s') as word FROM some_table ) t GROUP BY word
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With