I want to create a Full Text Search that accepts emojis on the query, or another type of index to search on text. For example, I have this text: Playa 🌊🌞🌴 @CobolIquique h'
and PostgreSQL parse it weirdly on the emojis.
Debugging, Using SELECT * FROM ts_debug('english','Playa 🌊🌞🌴 @CobolIquique h');
I have the following result:
And I don't know why the token is considered an space symbol. If I debug the parser SELECT * FROM ts_parse('default', 'Playa 🌊🌞🌴 @CobolIquique h');
I just get the same tokens and with the tokens types ts_token_type('default')
there is not a emoji type (or something similar). So, How can I create a parser to split the string correctly with the spaces and doesn't consider emojis as blank spaces? or How can I create a text index that can use emojis on the queries?
To create a new parser, which is different from default one, you should be a C programmer and you should write your own PostgreSQL extension. This extension should define the following functions:
start_function();
gettoken_function();
end_function();
lextypes_function();
headline_function(); // optional
As an example you can examine pg_tsparser module.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With