Is there any way I can use multiple dictionaries in enchant? This is what I do:
import enchant
d = enchant.Dict("en_US")
d.check("materialise")
>> False
But if I use enchant.Dict("en_UK"), I get True. What is the best way to combine multiple dictionaries, so that check() returns True whether the input argument is materialise or materialize?
@Mass17 that is actually not correct. The expression "en_US" and "en_UK" is a logical AND operation on two strings, and its result is "en_UK". Here's how the and operator works in that expression: (1) any non-empty string is considered True; (2) if the left string is True, the right string is returned as the result of the whole expression. Read about Python's short-circuit evaluation for some insight into why it works this way.
So:
>>> "en_US" and "en_UK"
'en_UK'
And note, if you switch the order of the strings:
>>> "en_UK" and "en_US"
'en_US'
The words "materialise" and "materialize" BOTH appear in your "en_UK" dictionary, hence the results you got. You haven't actually "combined" the 2 dictionaries yet.
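To actually combine them, one option is to keep one Dict object per dialect and accept a word if any of them accepts it. The following is only a minimal sketch: MultiDict and load_enchant_dicts are made-up helper names, not part of pyenchant, and "en_GB" is assumed to be the tag under which the British dictionary is installed on your system. It relies only on enchant.Dict, its check() method, and enchant.dict_exists().

```python
class MultiDict:
    """Hypothetical wrapper: a word is accepted if ANY wrapped checker accepts it."""

    def __init__(self, checkers):
        # Each checker only needs a .check(word) -> bool method,
        # which is what enchant.Dict objects provide.
        self.checkers = list(checkers)

    def check(self, word):
        return any(checker.check(word) for checker in self.checkers)


def load_enchant_dicts(tags):
    """Hypothetical helper: build enchant.Dict objects for the installed tags only."""
    try:
        import enchant
    except ImportError:
        # pyenchant (or the underlying enchant C library) is not available.
        return []
    return [enchant.Dict(tag) for tag in tags if enchant.dict_exists(tag)]


if __name__ == "__main__":
    # Assumption: both en_US and en_GB dictionaries are installed.
    d = MultiDict(load_enchant_dicts(["en_US", "en_GB"]))
    for word in ("materialise", "materialize"):
        print(word, d.check(word))
```

If neither dictionary is installed, the wrapper holds no checkers and check() returns False for every word, so it is worth confirming which tags are available with enchant.list_languages() first.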
I may be late here, but this question intrigued me too.
So, the solution for using multiple dialects of the English language in Python's enchant is as below:
import enchant
'''
Use "en" simply to cover all available dialects and word usages of the English language
'''
d = enchant.Dict("en")
d.check("materialise") # UK (en_GB)
>>> True
d.check("materialize") # USA (en_US)
>>> True
Hope this helps future readers :)
For Hunspell dictionaries there's a workaround if both dictionaries share the same .aff file, and I suppose en_US and en_GB meet that condition.
The author is Sergey Kurakin, and the Bash script (dic_combine.sh) is as follows:
#!/bin/bash
# Combines two or more hunspell dictionaries.
# (C) 2010 Sergey Kurakin <kurakin_at_altlinux_dot_org>
# Attention! All source dictionaries MUST share the same affix file.
# Usage: dic_combine source1.dic source2.dic [source3.dic...] > combined.dic
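TEMPFILE=$(mktemp)
# Merge all inputs, drop duplicates, then remove the numeric count lines and blank lines.
cat "$@" | sort --unique | sed -r 's|^[0123456789]*$||;/^$/d' > "$TEMPFILE"
# A .dic file starts with its entry count, so print that first, then the entries.
wc -l < "$TEMPFILE"
cat "$TEMPFILE"
rm -f "$TEMPFILE"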
So, you have to put those dictionary files in a directory and run:
$ dic_combine en_US.dic en_GB.dic > en.dic
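If you would rather stay in Python, the same merge can be sketched roughly as below. This is not Sergey Kurakin's script, just an approximation of what it does; combine_dic is a made-up name, and the UTF-8 default is an assumption (many Hunspell .dic files use ISO-8859-1, so check the SET line of the .aff file). The same caveat applies: all source dictionaries must share the same affix file.

```python
import re
import sys


def combine_dic(paths, encoding="utf-8"):
    """Merge Hunspell .dic files into one list of lines (count first, then entries)."""
    entries = set()
    for path in paths:
        with open(path, encoding=encoding) as fh:
            for line in fh:
                entry = line.rstrip("\r\n")
                # Skip blank lines and the digits-only count line of each source file.
                if re.fullmatch(r"[0-9]*", entry):
                    continue
                entries.add(entry)
    # Sorted by code point, which may differ from the locale order of `sort`.
    ordered = sorted(entries)
    return [str(len(ordered))] + ordered


if __name__ == "__main__" and len(sys.argv) > 2:
    # Usage (hypothetical file name): python dic_combine.py en_US.dic en_GB.dic > en.dic
    print("\n".join(combine_dic(sys.argv[1:])))
```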