I'm running Django 1.10.1 against Postgres 9.4. My staging server and dev environments have psql servers at version 9.4.9 and production is an RDS instance at 9.4.7.
It seems like my SearchVectorField is not storing the search configuration given in production, though it is in staging and dev, and it seems to be either a version thing (unlikely, given the version difference and that it also worked on 9.3 in staging/dev) or the fact that production is on RDS instead of local on the server.
I'm using a custom configuration for full-text search called unaccent
, which looks like this:
Token | Dictionaries
-----------------+-----------------------
asciihword | english_stem
asciiword | english_stem
email | simple
file | simple
float | simple
host | simple
hword | unaccent,english_stem
hword_asciipart | english_stem
hword_numpart | simple
hword_part | unaccent,english_stem
int | simple
numhword | simple
numword | simple
sfloat | simple
uint | simple
url | simple
url_path | simple
version | simple
word | unaccent,english_stem
Unaccent is installed in both environments, and works in both environments.
I'm storing the search data in a django.contrib.postgres.search.SearchVectorField
on my Writer
model:
class Writer(models.Model):
#...
search = SearchVectorField(blank=True)
That column is updated with the following search vector:
writer_search_vector = (SearchVector('first_name', 'last_name', 'display_name',
config='unaccent', weight='A') +
SearchVector('raw_search_data', config='unaccent', weight='B'))
by the following statement, which runs periodically:
Writer.objects.update(search=search_utils.writer_search_vector)
And, for some reason, the configuration is storing successfully on my staging server and in dev, but not in production. E.g., this code returns the same results in all environments:
In [3]: Writer.objects.annotate(searchy=SearchVector('last_name')).filter(searchy='kostenberger')
Out[3]: <QuerySet []>
In [4]: Writer.objects.annotate(searchy=SearchVector('last_name', config='unaccent')).filter(searchy='kostenberger')
Out[4]: <QuerySet [<Writer: Andreas J. Köstenberger>, <Writer: Margaret Elizabeth Köstenberger>]>
But in staging, I get the following correct result if I use the stored vector:
In [5]: Writer.objects.filter(search='kostenberger')
Out[5]: <QuerySet [<Writer: Andreas J. Köstenberger>, <Writer: Margaret Elizabeth Köstenberger>]>
while in production, against the RDS instance, I get the following, incorrect result:
In [5]: Writer.objects.filter(search='kostenberger')
Out[5]: <QuerySet []>
and yet, in production still, the unaccent works but the english_stem does not, in that it will match the stemmed version of the text (below), but not the original version (above):
In [6]: Writer.objects.filter(search='kostenberg')
Out[6]: <QuerySet [<Writer: Margaret Elizabeth Köstenberger>, <Writer: Andreas J. Köstenberger>]>
Note that the database tables for Writer in the two environments are identical for this test.
Any ideas why the stored vector isn't working in production with the correct config, while if I create the vector on the fly it will work?
On RDS Postgres, you aren't allowed to change the default_text_search_config
parameter. So, you have to configure the text search with each query:
from django.contrib.postgres.search import SearchRank, SearchQuery
…
search_query = SearchQuery(value='kostenberger', config='unaccent')
Writer.objects.filter(search=search_query)
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With