I want to use GermaNet to lemmatize a list of words (the equivalent of getLemma() in WordNet). The words are actually DTM terms, and the goal is to improve text classification performance. However, I couldn't find any hint of an R package for GermaNet. Is it still possible to use it from R somehow?
I assume you have access to the raw files where the wordnet data is stored (GermaNet seems to offer a free license). You could parse them with a few regular expressions and extract the information you need. (I don't know exactly what a DTM is, but I suppose it has something to do with synsets or the links between them.) A wordnet I worked on (not a German one) was organized in multiple files: some contained the links, and others held the entries in a form like this:
0 @1@ WORD_MEANING
1 PART_OF_SPEECH "v"
1 VARIANTS
2 LITERAL "someverb"
3 SENSE 7
3 DEFINITION "adefinition"
3 EXAMPLES
4 EXAMPLE "anexample"
3 EXTERNAL_INFO
...
That shouldn't be too hard to parse.
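To make this concrete, here is a minimal sketch of such a parser. It is written in Python for illustration only; the regular expression itself should carry over to R's regexec()/regmatches(). It assumes every record line has the shape "level TAG [value]", exactly as in the sample above. GermaNet's own files may be laid out differently, so the pattern would need adapting. The names LINE_RE, parse_records and literals are made up for this example.

```python
import re

# The sample from above (a non-German wordnet; GermaNet's format may differ).
SAMPLE = '''0 @1@ WORD_MEANING
1 PART_OF_SPEECH "v"
1 VARIANTS
2 LITERAL "someverb"
3 SENSE 7
3 DEFINITION "adefinition"
3 EXAMPLES
4 EXAMPLE "anexample"
3 EXTERNAL_INFO
...'''

# Assumed line shape: "<level> <TAG> [value]"; the value is optional.
LINE_RE = re.compile(r'^\s*(\d+)\s+(\S+)(?:\s+(.*\S))?\s*$')


def parse_records(text):
    """Return (level, tag, value) tuples; lines that do not match are skipped."""
    records = []
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if match is None:
            continue  # e.g. the "..." placeholder or a blank line
        level, tag, value = match.groups()
        if value is not None:
            value = value.strip('"')  # drop the surrounding quotes, if any
        records.append((int(level), tag, value))
    return records


def literals(records):
    """Collect the values of all LITERAL lines, i.e. the word forms."""
    return [value for _, tag, value in records if tag == "LITERAL"]
```

Calling literals(parse_records(SAMPLE)) on the sample yields the single word form "someverb". From there you could build a lookup table from word forms to entries and match your terms against it.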