to_tsvector() supports several languages: english, german, french ...
How can I get the full list of these languages?
The to_tsvector function internally calls a parser which breaks the document text into tokens and assigns a type to each token. For each token, a list of dictionaries (Section 12.6) is consulted, where the list can vary depending on the token type.
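To see the parser and dictionary steps described above, the built-in ts_debug() function can be used. This is only a small sketch: the sample sentence is made up for illustration, and the rows returned depend on the configuration and dictionaries installed on your server.

```sql
-- Show how the 'english' configuration processes a sample sentence:
--   alias        = token type assigned by the parser
--   token        = the raw token text
--   dictionaries = dictionaries consulted for that token type
--   lexemes      = normalized output (empty for stop words)
-- The sample sentence is an illustrative assumption.
SELECT alias, token, dictionaries, lexemes
FROM   ts_debug('english', 'The quick brown foxes jumped over 2 lazy dogs');
```

Replacing 'english' with another configuration name shows how the dictionary list changes per configuration.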
The larger the data set you need to search, the greater the performance advantage Elasticsearch tends to have over PostgreSQL. You can also gain a lot of performance by pre-processing the posts into several well-indexed fields before storing them in Elasticsearch.
PostgreSQL Full Text Search is a technique for searching a single computer-stored document or a collection of documents in a full-text database. It lets you identify natural-language documents that satisfy a query.
The manual explains how to retrieve all of this information with psql:
12.10. psql Support
Information about text search configuration objects can be obtained in psql using a set of commands:
\dF{d,p,t}[+] [PATTERN]
In particular:
List text search dictionaries (add + for more detail).
=> \dFd
There is more, read the manual.
Ultimately, possible parameter values for to_tsvector(), to_tsquery() et al. are defined by entries in the system catalog pg_ts_config, from where you can get the definitive list. As of Postgres 14:
test=> SELECT cfgname FROM pg_ts_config;
  cfgname
------------
 simple
 arabic
 armenian
 basque
 catalan
 danish
 dutch
 english
 finnish
 french
 german
 greek
 hindi
 hungarian
 indonesian
 irish
 italian
 lithuanian
 nepali
 norwegian
 portuguese
 romanian
 russian
 serbian
 spanish
 swedish
 tamil
 turkish
 yiddish
(29 rows)
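A few related queries may help when working with pg_ts_config. This is a hedged sketch: the German sample sentence is an assumption for illustration, and the results depend on your server version and settings.

```sql
-- List each configuration together with the schema it lives in.
-- Built-in configurations are in pg_catalog; user-defined ones usually are not.
SELECT cfgnamespace::regnamespace AS schema_name, cfgname
FROM   pg_ts_config
ORDER  BY 1, 2;

-- Show the configuration used when none is passed explicitly.
SHOW default_text_search_config;

-- Pass a name from the list as the first argument
-- (the sample sentence is illustrative only).
SELECT to_tsvector('german', 'Die Katzen schlafen auf dem Sofa');
```

In psql, \dF lists the same text search configurations without writing a query.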
But more can be added.
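One way to add a configuration is to copy an existing one and adjust it later. The following is a minimal sketch; the name my_english is hypothetical and chosen only for illustration.

```sql
-- Create a new configuration as a copy of the built-in english one.
-- "my_english" is a hypothetical name, not something Postgres ships.
CREATE TEXT SEARCH CONFIGURATION my_english (COPY = pg_catalog.english);

-- The new entry now appears in the system catalog.
SELECT cfgname FROM pg_ts_config WHERE cfgname = 'my_english';

-- Clean up the illustrative configuration again.
DROP TEXT SEARCH CONFIGURATION my_english;
```

A copied configuration can then be modified with ALTER TEXT SEARCH CONFIGURATION, for example to map token types to different dictionaries.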