
How can I convert a dictionary file (.dic) with an affix file (.aff) into a list of words?

Tags:

dictionary

I'm looking at a dictionary file (".dic") and its associated "aff" file. I'm trying to combine the rules in the "aff" file with the words in the "dic" file to create a complete list of all the words the dictionary contains.

The documentation for these files is difficult to find. Does anyone know of a resource that I can learn from?

Is there any code out there that already does this (or am I duplicating existing work)?

Thanks!

wordless asked Jan 04 '11 19:01



1 Answer

According to Pillowcase, here is a usage example:

# Create the working directory (wget -O does not create it)
mkdir -p ./dic

# Download dictionary
wget -O ./dic/es_ES.aff "https://raw.githubusercontent.com/sbosio/rla-es/master/source-code/hispalabras-0.1/hispalabras/es_ES.aff"
wget -O ./dic/es_ES.dic "https://raw.githubusercontent.com/sbosio/rla-es/master/source-code/hispalabras-0.1/hispalabras/es_ES.dic"

# Compile program
wget -O ./dic/unmunch.cxx "https://raw.githubusercontent.com/hunspell/hunspell/master/src/tools/unmunch.cxx"
wget -O ./dic/unmunch.h "https://raw.githubusercontent.com/hunspell/hunspell/master/src/tools/unmunch.h"
g++ -o ./dic/unmunch ./dic/unmunch.cxx

# Generate dictionary
./dic/unmunch ./dic/es_ES.dic ./dic/es_ES.aff 2> /dev/null > ./dic/es_ES.txt.bk
sort ./dic/es_ES.txt.bk > ./dic/es_ES.txt # Optional
rm ./dic/es_ES.txt.bk # Optional
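
A possible follow-up step, not part of the commands above: unmunch writes the words in the same character encoding as the dictionary, which the .aff file normally declares on its SET line, and the output may contain duplicate lines. The sketch below assumes `iconv`, `awk`, `tr`, and `sort` are installed; `aff_encoding` and `to_utf8_unique` are hypothetical helper names made up for this illustration.

```shell
# Hypothetical helper: print the encoding declared by the SET line
# of a Hunspell .aff file (prints nothing if there is no SET line).
# $1: path to the .aff file
aff_encoding() {
    # Strip carriage returns first so files with CRLF endings work too.
    tr -d '\r' < "$1" | awk '$1 == "SET" { print $2; exit }'
}

# Hypothetical helper: convert a word list to UTF-8 and drop duplicates.
# $1: input word list, $2: encoding of the input, $3: output file
to_utf8_unique() {
    iconv -f "$2" -t UTF-8 "$1" | LC_ALL=C sort -u > "$3"
}
```

With the files from the example, this could be run as `to_utf8_unique ./dic/es_ES.txt "$(aff_encoding ./dic/es_ES.aff)" ./dic/es_ES.utf8.txt`, where `es_ES.utf8.txt` is just an illustrative output name. If the .aff file has no SET line, the encoding has to be supplied by hand.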
Rubén Morales answered Sep 29 '22 06:09
