Understanding SemCor corpus structure

Tags: nlp, linguistics, corpus

I'm learning NLP and am currently experimenting with Word Sense Disambiguation. I'm planning to use the SemCor corpus as training data, but I have trouble understanding its XML structure. I searched online but could not find any resource describing how the SemCor content is structured.

<s snum="1">
<wf cmd="ignore" pos="DT">The</wf>
<wf cmd="done" lemma="group" lexsn="1:03:00::" pn="group" pos="NNP" rdf="group" wnsn="1">Fulton_County_Grand_Jury</wf>
<wf cmd="done" lemma="say" lexsn="2:32:00::" pos="VB" wnsn="1">said</wf>
<wf cmd="done" lemma="friday" lexsn="1:28:00::" pos="NN" wnsn="1">Friday</wf>
<wf cmd="ignore" pos="DT">an</wf>
<wf cmd="done" lemma="investigation" lexsn="1:09:00::" pos="NN" wnsn="1">investigation</wf>
<wf cmd="ignore" pos="IN">of</wf>
<wf cmd="done" lemma="atlanta" lexsn="1:15:00::" pos="NN" wnsn="1">Atlanta</wf>
<wf cmd="ignore" pos="POS">'s</wf>
<wf cmd="done" lemma="recent" lexsn="5:00:00:past:00" pos="JJ" wnsn="2">recent</wf>
<wf cmd="done" lemma="primary_election" lexsn="1:04:00::" pos="NN" wnsn="1">primary_election</wf>
<wf cmd="done" lemma="produce" lexsn="2:39:01::" pos="VB" wnsn="4">produced</wf>
<punc>``</punc>
<wf cmd="ignore" pos="DT">no</wf>
<wf cmd="done" lemma="evidence" lexsn="1:09:00::" pos="NN" wnsn="1">evidence</wf>
<punc>''</punc>
<wf cmd="ignore" pos="IN">that</wf>
<wf cmd="ignore" pos="DT">any</wf>
<wf cmd="done" lemma="irregularity" lexsn="1:04:00::" pos="NN" wnsn="1">irregularities</wf>
<wf cmd="done" lemma="take_place" lexsn="2:30:00::" pos="VB" wnsn="1">took_place</wf>
<punc>.</punc>
</s>

I'm assuming wnsn is 'word sense'. Is that correct?
What does the attribute lexsn mean? How does it map to WordNet?
What does the attribute pn refer to? (third line)
How is the rdf attribute assigned? (again third line)
In general, what are the possible attributes?

asked Jan 03 '11 10:01

Sharmila

1 Answer

The format is described in the "doc/cxtfile.txt" file in the SemCor 1.6 archive; for some reason, documentation is not included in later versions.
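Until you have that file in hand, a quick way to explore the markup is to parse a sentence and list what it actually contains. The following is a minimal sketch, not an official reader: it assumes the fragment is well-formed XML with quoted attributes, as in the question (files from other SemCor distributions may need a more forgiving parser). It also assumes that joining lemma and lexsn with "%" gives a WordNet sense key and that wnsn is the WordNet sense number, which you should confirm against cxtfile.txt. The names SAMPLE, tagged_words and attribute_names are made up for illustration.

```python
import xml.etree.ElementTree as ET

# A shortened copy of the sentence from the question.
SAMPLE = """<s snum="1">
<wf cmd="ignore" pos="DT">The</wf>
<wf cmd="done" lemma="group" lexsn="1:03:00::" pn="group" pos="NNP" rdf="group" wnsn="1">Fulton_County_Grand_Jury</wf>
<wf cmd="done" lemma="say" lexsn="2:32:00::" pos="VB" wnsn="1">said</wf>
<wf cmd="done" lemma="recent" lexsn="5:00:00:past:00" pos="JJ" wnsn="2">recent</wf>
<punc>.</punc>
</s>"""


def tagged_words(sentence_xml):
    """Yield (token, lemma, sense_key, wnsn) for each sense-tagged <wf>."""
    sentence = ET.fromstring(sentence_xml)
    for wf in sentence.iter("wf"):
        lemma = wf.get("lemma")
        lexsn = wf.get("lexsn")
        if lemma is None or lexsn is None:
            # Tokens such as cmd="ignore" carry no sense annotation.
            continue
        # Assumption: lemma + "%" + lexsn forms a WordNet sense key.
        yield wf.text, lemma, f"{lemma}%{lexsn}", wf.get("wnsn")


def attribute_names(sentence_xml):
    """Return the sorted attribute names seen on any <wf> element."""
    sentence = ET.fromstring(sentence_xml)
    names = set()
    for wf in sentence.iter("wf"):
        names.update(wf.attrib)
    return sorted(names)


if __name__ == "__main__":
    for row in tagged_words(SAMPLE):
        print(row)
    print(attribute_names(SAMPLE))
```

Running attribute_names over a whole file rather than one sentence gives an empirical answer to "what are the possible attributes?", which you can then check against the documentation.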

answered Sep 25 '22 15:09

Bkkbrad
