SQL select and group records with same words in different order

Question

I have the following values in a table:

Table Fruits

id | name                        | price
-----------------------------------------
 1 | 'APPLE BANANA ORANGE'       | 12.00
 2 | 'BANANA ORANGE APPLE'       |  4.00
 3 | 'ORANGE APPLE BANANA'       | 10.00
 4 | 'LEMON APPLE BANANA ORANGE' |  7.00
 5 | 'APPLE LEMON BANANA ORANGE' |  8.00

I want to select all the values of the 3 first rows and all I have is a text containing 'APPLE BANANA ORANGE'

Example

SELECT * 
FROM fruits 
WHERE name IN 
('APPLE BANANA ORANGE','BANANA ORANGE APPLE','ORANGE APPLE BANANA')

The thing is that those values comes from another table and i'm looking a way to generate the different combinations of possible values

Maybe using a function:

SELECT * 
FROM fruits f 
INNER JOIN order o ON o.name IN some_function(f.name)

Maybe using some regex:

SELECT * 
FROM fruits f 
INNER JOIN order o ON o.name ~ '(?=' || f.name || ')'

I've tried to use some regex found over the internet that uses lookaround and they bring all the values that contains those three words but includes the one that also have more words. Like 'LEMON APPLE BANANA ORANGE' and 'APPLE LEMON BANANA ORANGE'

The other thing is that I want to group those results no matter the order that they appear like

SELECT sum(price) 
FROM fruits f 
INNER JOIN order o ON o.name ~ '(?=' || f.name || ')'
GROUP BY somefunction(name);

Thanks for you help

Marcel Isaac

UPDATE

I've created a function that splits the content of the string, orders the words and return a new string with the ordered words

CREATE OR REPLACE FUNCTION order_words(txt character varying) 
returns character varying as
$BODY$
DECLARE 
    tmp character varying;
BEGIN
    SELECT string_agg(t,' ') 
    INTO tmp
    FROM (
      SELECT rstt 
      FROM regexp_split_to_table(txt, ' ') rstt 
      ORDER BY rstt) t; 

    RETURN tmp;
END;
$BODY$
LANGUAGE plpgsql VOLATILE
COST 100;

When executed on the values it always returns the same string

select order_words('APPLE BANANA ORANGE');
    order_words     
---------------------
APPLE BANANA ORANGE
(1 row)

select order_words('BANANA ORANGE APPLE');
    order_words     
---------------------
APPLE BANANA ORANGE
(1 row)

select order_words('ORANGE APPLE BANANA');
    order_words     
---------------------
APPLE BANANA ORANGE
(1 row)

Now I can write my code

SELECT order_words(name),sum(price) 
FROM fruits f 
INNER JOIN order o ON order_words(o.name) = order_words(f.name)
GROUP BY order_words(name);

I'll test for performance

a_horse_with_no_name · Accepted Answer

Not sure if this would be faster than your solution:

select f.*
from fruits f
  join orders o 
    on string_to_array(f.name, ' ') @> string_to_array(o.name, ' ')
   and cardinality(string_to_array(f.name, ' ')) = cardinality(string_to_array(o.name, ' '));

The idea is to split both values into array and check if they overlap. But because "overlaps" doesn't mean all elements are equal, I'm comparing the length of the array as well. If they are of equal length and overlap all elements are equal.

The expression string_to_array(f.name, ' ') can be indexed and a GIN index can even be used by the @> operator.

SQL select and group records with same words in different order

Tags:

regex

sql

postgresql

sql-order-by

group-by

Marcel Isaac

1 Answers

a_horse_with_no_name

Recent Activity

Donate For Us

SQL select and group records with same words in different order

Tags:

regex

sql

postgresql

sql-order-by

group-by

Marcel Isaac

1 Answers

a_horse_with_no_name

Related questions

Recent Activity

Donate For Us