
Enchant dictionary across different platforms

I get different results from the enchant library (enchant 1.6.6) on different platforms.

In Mac OS X 10.11.12 (El Capitan):

>>> import enchant
>>> d = enchant.Dict("en_US")
>>> d.suggest("prfomnc")
['performance', 'prominence', 'preform', 'perform']

In Linux Ubuntu 14.04 LTS:

>>> import enchant
>>> d = enchant.Dict("en_US")
>>> d.suggest("prfomnc")
['princedom', 'preferment', 'preform']

Any ideas why I get different results? Are there alternatives in NLTK for the "suggest" functionality?


Mac OS X:

>>> enchant.list_dicts()
[('de_DE', <Enchant: Myspell Provider>), ('en_AU', <Enchant: Myspell Provider>), ('en_GB', <Enchant: Myspell Provider>), ('en_US', <Enchant: Myspell Provider>), ('fr_FR', <Enchant: Myspell Provider>)]

Ubuntu:

>>> enchant.list_dicts()
[('en', <Enchant: Aspell Provider>), ('en_CA', <Enchant: Aspell Provider>), ('en_GB', <Enchant: Aspell Provider>), ('en_US', <Enchant: Aspell Provider>), ('en_ZA', <Enchant: Myspell Provider>), ('en_AU', <Enchant: Myspell Provider>)]

On Ubuntu I tried:

>>> b = enchant.Broker()
>>> b.set_ordering("en_US","myspell,aspell")
>>> b.set_ordering("*","aspell,myspell")
>>> b.request_dict("en_US").provider
<Enchant: Myspell Provider>
>>> b.request_dict("en_GB").provider
<Enchant: Aspell Provider>
>>> d.suggest("prfomnc")
['princedom', 'preferment', 'preform']

But I still get the same results.

gogasca asked Nov 09 '22 20:11


1 Answer

The enchant library is not a spell-correction library. Instead, it is an aggregator, searching for and interfacing with a variety of supported systems.

From the documentation:

Enchant is capable of having multiple backends loaded at once. Currently, Enchant has 8 backends:

Aspell/Pspell (intends to replace Ispell)
Ispell (old as sin, could be interpreted as a defacto standard)
MySpell/Hunspell (an OOo project, also used by Mozilla)
Uspell (primarily Yiddish, Hebrew, and Eastern European languages - hosted in AbiWord's CVS under the module "uspell")
Hspell (Hebrew)
Zemberek (Turkish)
Voikko (Finnish)
AppleSpell (Mac OSX)

Notice the last one?

I suspect, without expending any energy to confirm it, that you're getting different results because your macOS system and your Linux system have different spelling software installed, or they have the same software installed but in a different order in the search path used by enchant.
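The list_dicts() output in the question seems to support this: en_US is served by the Myspell provider on the Mac and by the Aspell provider on Ubuntu. It also looks like the Ubuntu experiment called suggest() on d, which was created with enchant.Dict("en_US") before the Broker b was configured, so the new ordering would not have applied to it. A minimal sketch of requesting the dictionary from the configured broker instead is below; the helper names suggest_via_broker and compare_orderings are made up for illustration, and the sketch assumes pyenchant is installed with both providers available.

```python
def suggest_via_broker(broker, tag, word, ordering):
    """Return (provider_name, suggestions) for `word`, using `broker`.

    `broker` is expected to behave like an enchant.Broker instance.
    """
    # Tell this broker which providers to try first for this language tag.
    broker.set_ordering(tag, ordering)
    # Request the dictionary from the same broker; a Dict built earlier with
    # enchant.Dict(tag) belongs to a different broker and ignores the ordering.
    dictionary = broker.request_dict(tag)
    return dictionary.provider.name, dictionary.suggest(word)


def compare_orderings(tag, word, orderings=("myspell,aspell", "aspell,myspell")):
    """Print the provider and suggestions obtained under each ordering."""
    import enchant  # pyenchant, imported lazily; assumed to be installed

    for ordering in orderings:
        name, words = suggest_via_broker(enchant.Broker(), tag, word, ordering)
        print(ordering, "->", name, words)
```

Calling compare_orderings("en_US", "prfomnc") on each machine should show which provider actually answers under each ordering. If the suggestions change together with the provider name, the difference comes from the backend rather than from enchant itself.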

aghast answered Nov 15 '22 13:11