
How to use GermaNet (the German counterpart of WordNet) with R

I want to use GermaNet to lemmatize a list of terms (the equivalent of getLemma() in WordNet). The list is actually the terms of a DTM, and the goal is to improve text classification performance. However, I couldn't find any hint of GermaNet support in R, or an R package for it. Is it still possible to use it from R somehow?

alex asked Mar 19 '14 04:03



1 Answer

I assume you have access to the raw files where the wordnet data is stored (GermaNet seems to offer a free license). You could parse them with a few regular expressions and extract the information you need. (I don't know exactly what a DTM is, but I suppose it has something to do with synsets or the links between them.) A wordnet I worked on (not a German one) was organized in multiple files, some containing the links and others containing information in a form like:

0 @1@ WORD_MEANING
  1 PART_OF_SPEECH "v"
  1 VARIANTS
    2 LITERAL "someverb"
      3 SENSE 7
      3 DEFINITION "adefinition"
      3 EXAMPLES
        4 EXAMPLE "anexample"
      3 EXTERNAL_INFO
...

That shouldn't be too hard to parse.
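
For a layout like the one above, a minimal R sketch might look as follows. It assumes the records really follow that shape (a `0 @id@ WORD_MEANING` header, then indented `level KEY "value"` lines); GermaNet's own files may be laid out differently, so the patterns would need adapting. The `parse_wordnet` function and the sample lines are made up for illustration.

```r
# Hypothetical sample in the layout quoted above (not real GermaNet data).
lines <- c(
  '0 @1@ WORD_MEANING',
  '  1 PART_OF_SPEECH "v"',
  '  1 VARIANTS',
  '    2 LITERAL "someverb"',
  '      3 SENSE 7',
  '      3 DEFINITION "adefinition"',
  '0 @2@ WORD_MEANING',
  '  1 PART_OF_SPEECH "n"',
  '  1 VARIANTS',
  '    2 LITERAL "somenoun"',
  '      3 SENSE 1'
)

# Hypothetical helper: one output row per LITERAL, with its record's part of speech.
parse_wordnet <- function(lines) {
  # Each "0 @id@ WORD_MEANING" line starts a new record.
  record <- cumsum(grepl("^0 @[0-9]+@ WORD_MEANING", lines))

  # Extract the quoted value after a given key; NA on lines without that key.
  field <- function(key) {
    pattern <- paste0('^[[:space:]]*[0-9]+ ', key, ' "(.*)"[[:space:]]*$')
    ifelse(grepl(pattern, lines), sub(pattern, "\\1", lines), NA_character_)
  }

  pos <- field("PART_OF_SPEECH")
  literal <- field("LITERAL")
  has_literal <- !is.na(literal)

  # First part of speech found in each record (NA if the record has none).
  pos_by_record <- vapply(
    split(pos, record),
    function(x) x[!is.na(x)][1],
    character(1)
  )

  data.frame(
    record = record[has_literal],
    pos = unname(pos_by_record[as.character(record[has_literal])]),
    literal = literal[has_literal],
    stringsAsFactors = FALSE
  )
}

entries <- parse_wordnet(lines)
print(entries)
```

From there, something like `entries$literal[entries$pos == "v"]` gives the verb literals, which you could then match against your own terms. With real files you would read the lines using `readLines()` instead of typing them in.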

user3554004 answered Sep 22 '22 16:09
