Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to perform simple map reduce in Postgres?

Tags:

postgresql

I am using trigrams for full-text search in postgres like this:

SELECT * 
FROM "Users" users 
WHERE 'search_text' % ANY(regexp_split_to_array(users.name,E'\\s+'))

Query above checks if search_text matches any word (split by whitespace) in User.displayName. This works however results are not ordered by actual "score".

Score can be calculated by similarity(text,text) function.

The problem is that I have to order by them sum of all similarities found for each word in User.name. So if user name is "A B C" then its score should be:

similarity('search_text','A') + similarity('search_text','B') + similarity('search_text','C')

So I need to map user name words to scores and then sum (reduce) them. How I can do this in postgres?

like image 317
user606521 Avatar asked Dec 03 '14 15:12

user606521


1 Answers

In instead of splitting to an array split to a set then regroup summing the similarity

with users as (
    select user_id, name
    from (values (1, 'John Smith')) t(user_id, name)
)
select user_id, sum(similarity('smth', name_part))
from (
    select user_id, regexp_split_to_table(name, E'\\s+') as name_part
    from users
) users
where 'smth' % name_part
group by user_id
order by 2 desc
;
 user_id |  sum  
---------+-------
       1 | 0.375

I'm not sure you want to eliminate those name parts for which the similarity is lower than the threshold. If not just omit the where clause

BTW, why do you need to split the name? Why not just compare the whole name to the search string?

with users as (
    select user_id, name
    from (values (1, 'John Smith')) t(user_id, name)
)
select similarity('jon smth', name), *
from users
order by 1 desc
;
 similarity | user_id |    name    
------------+---------+------------
   0.333333 |       1 | John Smith
like image 102
Clodoaldo Neto Avatar answered Nov 12 '22 17:11

Clodoaldo Neto