
How to find most popular word occurrences in MySQL?

I have a table called results with 5 columns.

I'd like to filter rows on the title column, say WHERE title LIKE '%for sale%', and then list the most popular words in that column. Two of them would be "for" and "sale", but I want to see which other words correlate with this phrase.

Sample data:

title
cheap cars for sale
house for sale
cats and dogs for sale
iphones and androids for sale
cheap phones for sale
house furniture for sale

Results (single words):

for        6
sale       6
cheap      2
and        2
house      2
furniture  1
cars       1
etc...

asked Sep 24 '15 12:09 by User



2 Answers

You can extract words with some string manipulation. Assuming you have a numbers table and that words are separated by single spaces:

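-- n.n runs from 1 to the number of words in the title (spaces + 1);
-- the nested substring_index calls pick out the n-th word
select substring_index(substring_index(r.title, ' ', n.n), ' ', -1) as word,
       count(*) as cnt
from results r join
     numbers n
     on n.n <= length(r.title) - length(replace(r.title, ' ', '')) + 1
group by word
order by cnt desc;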
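
As a minimal sketch of how this could be applied to the question's sample data, the script below sets up hypothetical results and numbers tables, adds the WHERE title LIKE '%for sale%' filter from the question, and sorts by frequency. The column size, the 1..5 range of the numbers table, and the alias occurrences are assumptions chosen for the sample titles, not part of the original answer.

```sql
-- Hypothetical setup: the question only shows the title column of results.
CREATE TABLE results (title VARCHAR(100));
INSERT INTO results (title) VALUES
    ('cheap cars for sale'),
    ('house for sale'),
    ('cats and dogs for sale'),
    ('iphones and androids for sale'),
    ('cheap phones for sale'),
    ('house furniture for sale');

-- Helper table of integers; 1..5 covers the longest sample title (five words).
CREATE TABLE numbers (n INT PRIMARY KEY);
INSERT INTO numbers (n) VALUES (1), (2), (3), (4), (5);

-- One output row per (title, word position), then counted per word.
SELECT SUBSTRING_INDEX(SUBSTRING_INDEX(r.title, ' ', n.n), ' ', -1) AS word,
       COUNT(*) AS occurrences
FROM results r
JOIN numbers n
  ON n.n <= LENGTH(r.title) - LENGTH(REPLACE(r.title, ' ', '')) + 1
WHERE r.title LIKE '%for sale%'
GROUP BY word
ORDER BY occurrences DESC, word;
```

On the six sample titles, for and sale should come first with 6 occurrences each, followed by and, cheap and house with 2 each. Titles with more words than the numbers table has rows would have their trailing words silently skipped, so the range needs to cover the longest title.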

If you don't have a numbers table, you can construct one manually using a subquery:

from results r join
     (select 1 as n union all select 2 union all select 3 union all . . .
     ) n
     . . .
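
On MySQL 8.0 or later, a recursive common table expression is another way to generate the numbers inline. This is a different technique from the UNION ALL subquery, shown only as a sketch; the upper bound of 20 words per title and the alias occurrences are assumptions.

```sql
-- Assumes the results table from the question already exists.
-- Requires MySQL 8.0+ (recursive CTE support).
WITH RECURSIVE numbers (n) AS (
    SELECT 1
    UNION ALL
    SELECT n + 1 FROM numbers WHERE n < 20   -- assumed maximum words per title
)
SELECT SUBSTRING_INDEX(SUBSTRING_INDEX(r.title, ' ', n.n), ' ', -1) AS word,
       COUNT(*) AS occurrences
FROM results r
JOIN numbers n
  ON n.n <= LENGTH(r.title) - LENGTH(REPLACE(r.title, ' ', '')) + 1
GROUP BY word
ORDER BY occurrences DESC, word;
```

The join condition is unchanged, so the only difference is where the integers come from.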

The SQL Fiddle (courtesy of @GrzegorzAdamKowalski) is here.

answered Oct 16 '22 21:10 by Gordon Linoff


You can use ExtractValue in an interesting way. See the SQL fiddle here: http://sqlfiddle.com/#!9/0b0a0/45

We need only one table:

CREATE TABLE text (`title` varchar(29));

INSERT INTO text (`title`)
VALUES
    ('cheap cars for sale'),
    ('house for sale'),
    ('cats and dogs for sale'),
    ('iphones and androids for sale'),
    ('cheap phones for sale'),
    ('house furniture for sale')
;

Now we construct a series of selects that extract whole words from the text after converting it to XML. Each select extracts the N-th word; five selects are enough here because the longest sample title has five words.

select words.word, count(*) as `count` from
(select ExtractValue(CONCAT('<w>', REPLACE(title, ' ', '</w><w>'), '</w>'), '//w[1]') as word from `text`
union all
select ExtractValue(CONCAT('<w>', REPLACE(title, ' ', '</w><w>'), '</w>'), '//w[2]') from `text`
union all
select ExtractValue(CONCAT('<w>', REPLACE(title, ' ', '</w><w>'), '</w>'), '//w[3]') from `text`
union all
select ExtractValue(CONCAT('<w>', REPLACE(title, ' ', '</w><w>'), '</w>'), '//w[4]') from `text`
union all
select ExtractValue(CONCAT('<w>', REPLACE(title, ' ', '</w><w>'), '</w>'), '//w[5]') from `text`) as words
where length(words.word) > 0
group by words.word
order by `count` desc, words.word asc
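
To see what each select is doing, it may help to look at a single title. The sketch below uses a literal string instead of the table (an illustration only, with hypothetical column aliases) and shows the generated XML fragment next to the word picked by //w[2].

```sql
-- Illustration only: one literal title instead of the `text` table.
SELECT CONCAT('<w>', REPLACE('cheap cars for sale', ' ', '</w><w>'), '</w>') AS xml_fragment,
       ExtractValue(
           CONCAT('<w>', REPLACE('cheap cars for sale', ' ', '</w><w>'), '</w>'),
           '//w[2]'
       ) AS second_word;
```

The fragment is <w>cheap</w><w>cars</w><w>for</w><w>sale</w>, and second_word is expected to be cars. Because the title is spliced into the markup unescaped, titles containing characters such as < or & would presumably need escaping before this approach works on them.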

answered Oct 16 '22 21:10 by Grzegorz Adam Kowalski