Postgres - Full Text Search to accept emojis

Question

I want to create a full text search that accepts emojis in the query, or another type of index for searching text. For example, I have this text: 'Playa 🌊🌞🌴 @CobolIquique h', and PostgreSQL parses the emojis strangely.

Debugging with SELECT * FROM ts_debug('english','Playa 🌊🌞🌴 @CobolIquique h'); gives the following result:

[Image: Results 1]

I don't know why the emoji token is considered a space symbol. If I debug the parser with SELECT * FROM ts_parse('default', 'Playa 🌊🌞🌴 @CobolIquique h'); I get the same tokens, and the token types listed by ts_token_type('default') do not include an emoji type (or anything similar). So, how can I create a parser that splits the string on spaces and does not treat emojis as blank space? Or how can I create a text index that supports emojis in queries?

Artur · Accepted Answer

To create a parser that behaves differently from the default one, you need to write your own PostgreSQL extension in C. The extension must define the following functions:

start_function();
gettoken_function();
end_function();
lextypes_function();
headline_function(); // optional

As an example, you can examine the pg_tsparser module.
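
Once the C functions listed above exist, they are registered with CREATE TEXT SEARCH PARSER and used through a text search configuration. The following is only a rough sketch of that wiring. The names emoji_parser, emoji_english, the emoji_prs_* functions, and the token type word are all hypothetical: they stand for whatever your own extension defines, and nothing here comes from pg_tsparser.

```sql
-- Assumption: the four emoji_prs_* functions are C functions that your
-- extension has already created with CREATE FUNCTION ... LANGUAGE C.
CREATE TEXT SEARCH PARSER emoji_parser (
    START    = emoji_prs_start,     -- start_function
    GETTOKEN = emoji_prs_gettoken,  -- gettoken_function
    END      = emoji_prs_end,       -- end_function
    LEXTYPES = emoji_prs_lextypes   -- lextypes_function
    -- HEADLINE = ... may be added; it is optional
);

-- A configuration that uses the new parser instead of 'default'.
CREATE TEXT SEARCH CONFIGURATION emoji_english (PARSER = emoji_parser);

-- Assumption: the parser's LEXTYPES function reports a token type
-- named 'word'. Map it to the built-in 'simple' dictionary.
ALTER TEXT SEARCH CONFIGURATION emoji_english
    ADD MAPPING FOR word WITH simple;

-- Check how the sample text is tokenized under the new configuration.
SELECT * FROM ts_debug('emoji_english', 'Playa 🌊🌞🌴 @CobolIquique h');
```

Whether the emojis come out as their own tokens or stay attached to neighbouring text depends entirely on how the gettoken function is written.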

Tags: parsing, full-text-search, postgresql, emoji

FeanDoe
