
Noun Synonyms in WordNet

I want to use the synonym token filter in Elasticsearch for an index. I downloaded the Prolog version of WordNet 3.0 and found the wn_s.pl file, which Elasticsearch can read. However, the file contains synonyms for all sorts of words and phrases, while I am only interested in supporting synonyms for nouns. Is there a way to extract just those entries?

flamecto asked Jan 13 '23 19:01


1 Answer

Given that the format of wn_s.pl is

s(112947045,1,'usance',n,1,0).
s(200001742,1,'breathe',v,1,25).

A crude way to do that is to run the following in your terminal. It keeps only the lines of that file that contain the string ',n,', which is how the part-of-speech field looks for a noun entry.

grep ",n," wn_s.pl > wn_s_nouns_only.pl

The file wn_s_nouns_only.pl will only have the entries that are marked as nouns.
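
If you want something slightly stricter than a plain substring match, here is a minimal sketch that only accepts n when it sits right after the closing quote of the word and right before the two trailing numeric fields. It assumes every entry follows the layout of the two sample lines above. The function name filter_nouns and the sample file names are made up for illustration.

```shell
#!/bin/sh
# Sketch: keep only the noun entries of a WordNet Prolog synonym file.
# filter_nouns is a hypothetical helper name, not part of WordNet or Elasticsearch.
filter_nouns() {
    # Match the part-of-speech field by position: it follows the closing
    # quote of the word and precedes the two trailing numeric fields.
    # [[:space:]]* tolerates trailing whitespace or a carriage return.
    grep -E "',n,[0-9]+,[0-9]+\)\.[[:space:]]*$" "$1"
}

# Tiny sample in the wn_s.pl format, used here only for demonstration.
cat > sample_wn_s.pl <<'EOF'
s(112947045,1,'usance',n,1,0).
s(200001742,1,'breathe',v,1,25).
EOF

filter_nouns sample_wn_s.pl > sample_nouns_only.pl
```

For the real file, the call would be filter_nouns wn_s.pl > wn_s_nouns_only.pl. For typical entries the result should match the plain grep; the positional pattern just avoids matching ',n,' if it ever appeared inside a quoted word.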

arturomp answered Jan 15 '23 08:01