I'm looking at a dictionary file (".dic") and its associated ".aff" file. I'm trying to combine the rules in the ".aff" file with the words in the ".dic" file to produce a complete list of every word the dictionary covers.
The documentation behind these files is difficult to find. Does anyone know of a resource that I can learn from?
Is there existing code that already does this, so that I'm not duplicating effort?
thanks!
According to Pillowcase, here is a usage example:
# Create the working directory (wget -O fails if it does not exist), then download the dictionary
mkdir -p ./dic
wget -O ./dic/es_ES.aff "https://raw.githubusercontent.com/sbosio/rla-es/master/source-code/hispalabras-0.1/hispalabras/es_ES.aff"
wget -O ./dic/es_ES.dic "https://raw.githubusercontent.com/sbosio/rla-es/master/source-code/hispalabras-0.1/hispalabras/es_ES.dic"
# Compile program
wget -O ./dic/unmunch.cxx "https://raw.githubusercontent.com/hunspell/hunspell/master/src/tools/unmunch.cxx"
wget -O ./dic/unmunch.h "https://raw.githubusercontent.com/hunspell/hunspell/master/src/tools/unmunch.h"
g++ -o ./dic/unmunch ./dic/unmunch.cxx
# Generate dictionary
./dic/unmunch ./dic/es_ES.dic ./dic/es_ES.aff 2> /dev/null > ./dic/es_ES.txt.bk
sort ./dic/es_ES.txt.bk > ./dic/es_ES.txt # Optional
rm ./dic/es_ES.txt.bk # Optional
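Once the list is generated, you will probably want to clean it up and query it. The expanded output may contain repeated entries, so here is a minimal sketch of two helpers built only on standard `sort` and `grep`. The function names `dedupe_wordlist` and `has_word` are made up for this example and are not part of Hunspell; the file names in the usage comments are assumptions too.

```shell
# Hypothetical helpers (not part of Hunspell) for working with the unmunch output.

# Print the word list sorted, with duplicate lines removed.
# LC_ALL=C forces a byte-order sort so the result does not depend on the locale.
dedupe_wordlist() {
    LC_ALL=C sort -u -- "$1"
}

# Exit with status 0 if the exact word appears as a whole line in the list.
# -x matches whole lines only, -F treats the word as a literal string.
has_word() {
    grep -qxF -- "$2" "$1"
}

# Example usage, assuming ./dic/es_ES.txt was generated as above:
#   dedupe_wordlist ./dic/es_ES.txt > ./dic/es_ES.unique.txt
#   wc -l < ./dic/es_ES.unique.txt
#   has_word ./dic/es_ES.unique.txt "casa" && echo "found"
```

Matching whole lines with `-x` matters here: without it, a lookup for a short word would also succeed on any longer word that merely contains it.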