
Hebrew dictionary for PostgreSQL on Heroku?

Reading Heroku help on enabling full text search in PostgreSQL I see that it doesn't support Hebrew by default. Does anyone know how to add support for Hebrew dictionary in PostgreSQL on Heroku?

MikeMarsian asked Apr 04 '13 07:04


1 Answer

I work on Heroku Postgres, and I would welcome input on this from anyone able to offer it.

I'm looking into this, but so far it is unclear how well Hebrew is supported in open source projects generally, including dedicated full text search projects like Lucene or Xapian. There are full-blown toolchains for Hebrew, e.g. hebstem, hspell and libhspell, and HebMorph, but as far as I know none of them is integrated with PostgreSQL yet.

If someone knows the current state of the art for this in Postgres, I can try to make it work on Heroku at a time of my choosing, depending on the precise details of the implementation, which I would have to review carefully.

As it stands, my attempts to locate an ispell dictionary have turned up only questionable results, and the efficacy of ispell-style dictionaries for Hebrew is itself doubtful, given the reportedly very different stemming rules.
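For reference, if a usable ispell-format Hebrew dictionary did turn up, wiring it into PostgreSQL would probably follow the usual ispell template steps. The sketch below only builds the SQL statements as strings. The helper `ispell_setup_sql`, the name `hebrew`, and the file names are hypothetical. It also assumes `hebrew.dict` and `hebrew.affix` have already been placed in the server's `$SHAREDIR/tsearch_data` directory, which on a hosted service like Heroku is not something a user can do on their own.

```python
def ispell_setup_sql(name, dict_file, aff_file):
    """Build DDL for an ispell-based text search configuration.

    All names are placeholders. PostgreSQL resolves DictFile and AffFile
    to <name>.dict and <name>.affix under $SHAREDIR/tsearch_data.
    """
    return [
        # Dictionary backed by the built-in ispell template.
        f"CREATE TEXT SEARCH DICTIONARY {name}_ispell ("
        f" TEMPLATE = ispell, DictFile = {dict_file}, AffFile = {aff_file} );",
        # New configuration, copied from the language-neutral 'simple' one.
        f"CREATE TEXT SEARCH CONFIGURATION {name} ( COPY = simple );",
        # Route non-ASCII word tokens through the ispell dictionary first,
        # falling back to 'simple' for words the dictionary does not know.
        f"ALTER TEXT SEARCH CONFIGURATION {name}"
        f" ALTER MAPPING FOR word, hword, hword_part"
        f" WITH {name}_ispell, simple;",
    ]


if __name__ == "__main__":
    for stmt in ispell_setup_sql("hebrew", "hebrew", "hebrew"):
        print(stmt)
```

Whether this would give useful stemming for Hebrew is exactly the open question above; the sketch shows only the plumbing, not a working Hebrew setup.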

Related work:

  • Lucene Hebrew analyzer, which links to HebMorph

  • Xapian Hspell integration, though it's not clear whether this was ever fully fleshed out (at the time, Xapian was working on extensibility in this area)

Thoughts?

fdr answered Nov 11 '22 12:11