How to search for different character sets in PostgreSQL?

I want to search a table in a PostgreSQL database for rows that contain both Arabic and English text. For example:

id | content
-----------------
1  | دجاج    
2  | chicken
3  | دجاج chicken

The query should return only row 3, the one row that contains both scripts.

I imagine this has to do with limiting characters using regex, but I cannot find a clean solution to select both. I tried:

SELECT regexp_matches(content, '^([x00-\xFF]+[a-zA-Z][x00-\xFF]+)*')
FROM mg.messages;

However, this only matches English and some non-English characters, and returns them wrapped in {}.

Slamice asked Oct 20 '22 17:10


1 Answer

I know nothing about Arabic text or RTL languages in general, but this worked:

create table phrase (
  id serial,
  phrase text
);

insert into phrase (phrase) values ('apple pie');
insert into phrase (phrase) values ('فطيرة التفاح');

select *
from phrase
where phrase like ('apple%')
or phrase like ('فطيرة%');

http://sqlfiddle.com/#!15/75b29/2
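
For the original goal of finding rows that contain both scripts, one possible approach is to test each script with its own regular expression and combine the two conditions with AND. The following is only a sketch: the table name messages_demo is hypothetical, the Arabic test covers only the basic Arabic block (U+0600 to U+06FF), and it assumes a UTF8 database with standard_conforming_strings left at its default of on.

```sql
-- Hypothetical demo table mirroring the sample data from the question.
CREATE TEMP TABLE messages_demo (
  id      serial PRIMARY KEY,
  content text
);

INSERT INTO messages_demo (content) VALUES
  ('دجاج'),
  ('chicken'),
  ('دجاج chicken');

-- Keep rows that contain at least one basic Latin letter AND at least one
-- character from the Arabic block (U+0600 to U+06FF).
-- Assumes a UTF8 database and standard_conforming_strings = on, so the
-- backslash escapes reach the regex engine unchanged.
SELECT id, content
FROM messages_demo
WHERE content ~ '[a-zA-Z]'
  AND content ~ '[\u0600-\u06FF]';
```

With the three sample rows, only the row holding both the Arabic word and "chicken" satisfies both conditions. Splitting the check into two simple patterns avoids having to describe every possible ordering of the two scripts in a single expression.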

dwurf answered Nov 01 '22 14:11
