
Best way to extract keywords from input NLP sentence

I'm working on a project where I need to extract important keywords from a sentence. I've been using a rule-based system built on POS tags. However, I keep running into ambiguous terms that I've been unable to parse. Is there a machine learning classifier that I can use to extract relevant keywords, given a training set of different sentences?

Daniel Svoboda asked Dec 10 '14 16:12





1 Answer

Also try multi-rake, a multilingual RAKE (Rapid Automatic Keyword Extraction) implementation that works with any language.

Install it with pip:

pip install multi-rake

from multi_rake import Rake

text_en = (
    'Compatibility of systems of linear constraints over the set of '
    'natural numbers. Criteria of compatibility of a system of linear '
    'Diophantine equations, strict inequations, and nonstrict inequations '
    'are considered. Upper bounds for components of a minimal set of '
    'solutions and algorithms of construction of minimal generating sets '
    'of solutions for all types of systems are given. These criteria and '
    'the corresponding algorithms for constructing a minimal supporting '
    'set of solutions can be used in solving all the considered types of '
    'systems and systems of mixed types.'
)

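# Create the extractor with default settings
rake = Rake()

# apply() returns (keyphrase, score) pairs
keywords = rake.apply(text_en)

# Show the first ten results
print(keywords[:10])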

# [('minimal generating sets', 8.666666666666666),
#  ('linear diophantine equations', 8.5),
#  ('minimal supporting set', 7.666666666666666),
#  ('minimal set', 4.666666666666666),
#  ('linear constraints', 4.5),
#  ('natural numbers', 4.0),
#  ('strict inequations', 4.0),
#  ('nonstrict inequations', 4.0),
#  ('upper bounds', 4.0),
#  ('mixed types', 3.666666666666667)]
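
For anyone wondering where those scores come from, here is a rough sketch of the basic RAKE idea in plain Python: split the text into candidate phrases at stopwords and punctuation, score each word as degree / frequency, and sum the word scores per phrase. This is only an illustration and not multi-rake's actual code. The names STOPWORDS, candidate_phrases and rake_scores are made up for this example, and the stopword list is a tiny hand-picked assumption (real lists are much longer).

```python
import re
from collections import defaultdict

# Tiny illustrative stopword list (an assumption, not a real-world list).
STOPWORDS = {"of", "the", "a", "and", "for", "are", "over", "all",
             "in", "can", "be", "is", "to"}


def candidate_phrases(text, stopwords=STOPWORDS):
    """Return runs of consecutive non-stopwords as tuples of words."""
    phrases = []
    # Punctuation always ends a phrase, so process each fragment separately.
    for fragment in re.split(r"[.,;:!?()]", text.lower()):
        current = []
        for word in re.findall(r"\w+", fragment):
            if word in stopwords:
                # A stopword closes the phrase collected so far.
                if current:
                    phrases.append(tuple(current))
                current = []
            else:
                current.append(word)
        if current:
            phrases.append(tuple(current))
    return phrases


def rake_scores(text, stopwords=STOPWORDS):
    """Score phrases RAKE-style and return them sorted, best first."""
    phrases = candidate_phrases(text, stopwords)
    freq = defaultdict(int)    # how often each word occurs
    degree = defaultdict(int)  # total length of the phrases the word occurs in
    for phrase in phrases:
        for word in phrase:
            freq[word] += 1
            degree[word] += len(phrase)
    word_score = {w: degree[w] / freq[w] for w in freq}
    # A phrase's score is the sum of its word scores.
    scores = {" ".join(p): sum(word_score[w] for w in p) for p in set(phrases)}
    # Highest score first; ties broken alphabetically for stable output.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


if __name__ == "__main__":
    sample = "minimal set of solutions and minimal generating sets of solutions"
    print(rake_scores(sample))
```

Longer phrases made of words that rarely appear on their own score highest, which is why 'minimal generating sets' comes out on top in the output above. Note that this needs no training data, so it is an unsupervised alternative rather than the trained classifier the question asks about.
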
v.grabovets answered Nov 15 '22 17:11
