How to extract noun adjective pairs from a sentence

Tags: python, nltk

I want to extract noun-adjective pairs from the sentence in the code below. Basically, I want something like: (Mark, sincere) (John, sincere).

from nltk import word_tokenize, pos_tag, ne_chunk
sentence = "Mark and John are sincere employees at Google."
print(ne_chunk(pos_tag(word_tokenize(sentence))))
Aarushi Aiyyar asked Dec 18 '22 00:12



1 Answer

spaCy's POS tagging would be a better choice than NLTK's here: it is faster and tends to be more accurate. Here is an example of what you want to do:

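import spacy

# The 'en' shortcut was removed in spaCy v3; load a named model instead
# (install it with: python -m spacy download en_core_web_sm).
nlp = spacy.load('en_core_web_sm')
doc = nlp('Mark and John are sincere employees at Google.')

noun_adj_pairs = []
for i, token in enumerate(doc):
    # Only nouns and proper nouns can start a pair.
    if token.pos_ not in ('NOUN', 'PROPN'):
        continue
    # Pair the noun with the first adjective that follows it.
    for j in range(i + 1, len(doc)):
        if doc[j].pos_ == 'ADJ':
            noun_adj_pairs.append((token, doc[j]))
            break

print(noun_adj_pairs)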

Output:

[(Mark, sincere), (John, sincere)]
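
To make the heuristic explicit: the loop above pairs each noun or proper noun with the first adjective that appears later in the sentence. The sketch below reproduces that logic in plain Python on hand-written (word, tag) pairs, so it runs without a spaCy model. The function name `pair_nouns_with_next_adjective` and the `TAGGED` list are illustrative assumptions, and the tags are typed in by hand rather than produced by a tagger.

```python
# Hand-written (word, coarse POS tag) pairs standing in for a tagger's output.
# Assumption: these tags were typed in for illustration, not produced by
# running spaCy or NLTK.
TAGGED = [
    ("Mark", "PROPN"), ("and", "CCONJ"), ("John", "PROPN"), ("are", "AUX"),
    ("sincere", "ADJ"), ("employees", "NOUN"), ("at", "ADP"),
    ("Google", "PROPN"), (".", "PUNCT"),
]


def pair_nouns_with_next_adjective(tagged):
    """Pair each noun or proper noun with the first adjective after it."""
    pairs = []
    for i, (word, tag) in enumerate(tagged):
        if tag not in ("NOUN", "PROPN"):
            continue
        # Scan forward from the noun and stop at the first adjective.
        for later_word, later_tag in tagged[i + 1:]:
            if later_tag == "ADJ":
                pairs.append((word, later_word))
                break
    return pairs


print(pair_nouns_with_next_adjective(TAGGED))
# prints [('Mark', 'sincere'), ('John', 'sincere')]
```

Because the pairing is purely positional, a sentence tagged like "Google hires sincere employees" would yield (Google, sincere) under this rule, so it is worth checking the results on your own data. A dependency parse may be a more reliable basis for deciding which adjective belongs to which noun.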

vumaasha answered Dec 27 '22 01:12
