If you look at the original Wordnet search and select "Display options: Show Lexical File Info", you'll see an extremely useful classification of words called lexical file. Eg for "filling" we have:
<noun.substance>S: (n) filling, fill (any material that fills a space or container)
<noun.process>S: (n) filling (flow into something (as a container))
<noun.food>S: (n) filling (a food mixture used to fill pastry or sandwiches etc.)
<noun.artifact>S: (n) woof, weft, filling, pick (the yarn woven across the warp yarn in weaving)
<noun.artifact>S: (n) filling ((dentistry) a dental appliance consisting of ...)
<noun.act>S: (n) filling (the act of filling something)
The first thing in brackets is the "lexical file". Unfortunately I have not been able to find a SPARQL endpoint that provides this info
The latest RDF translation of Wordnet 3.0 points to two things:
Talis SPARQL endpoint. Use eg this query to check there's no such info:
DESCRIBE <http://purl.org/vocabularies/princeton/wn30/synset-chair-noun-1>
W3C's mapping description. Appendix D "Conversion details" describes something useful: wn:classifiedByTopic
.
But it's not the same as lexical file, and is quite incomplete. Eg "chair" has nothing, while one of the senses of "completion" is in the topic "American Football"
DESCRIBE <http://purl.org/vocabularies/princeton/wn30/synset-completion-noun-1>
->
<j.1:classifiedByTopic rdf:resource="http://purl.org/vocabularies/princeton/wn30/synset-American_football-noun-1"/>
The question: is there a public Wordnet query API, or a database, that provides the lexical file information?
Using the Python NLTK interface:
from nltk.corpus import wordnet as wn
for synset in wn.synsets('can'):
print synset.lexname
I don't think you can find it in the RDF/OWL Representation of WordNet. It's in the WordNet distribution though: dict/lexnames
. Here is the content of the file as of WordNet 3.0:
00 adj.all 3
01 adj.pert 3
02 adv.all 4
03 noun.Tops 1
04 noun.act 1
05 noun.animal 1
06 noun.artifact 1
07 noun.attribute 1
08 noun.body 1
09 noun.cognition 1
10 noun.communication 1
11 noun.event 1
12 noun.feeling 1
13 noun.food 1
14 noun.group 1
15 noun.location 1
16 noun.motive 1
17 noun.object 1
18 noun.person 1
19 noun.phenomenon 1
20 noun.plant 1
21 noun.possession 1
22 noun.process 1
23 noun.quantity 1
24 noun.relation 1
25 noun.shape 1
26 noun.state 1
27 noun.substance 1
28 noun.time 1
29 verb.body 2
30 verb.change 2
31 verb.cognition 2
32 verb.communication 2
33 verb.competition 2
34 verb.consumption 2
35 verb.contact 2
36 verb.creation 2
37 verb.emotion 2
38 verb.motion 2
39 verb.perception 2
40 verb.possession 2
41 verb.social 2
42 verb.stative 2
43 verb.weather 2
44 adj.ppl 3
For each entry of dict/data.*, the second number is the lexical file info. For example, this filling entry contains the number 13, which is noun.food.
07883031 13 n 01 filling 0 002 @ 07882497 n 0000 ~ 07883156 n 0000 | a food mixture used to fill pastry or sandwiches etc.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With