I've written an algorithm in Ruby, based on the arc90 readability code, to extract an article from a web page.
Now that I have the article, I want to extract keywords and specific information from it (names, author, etc.).
I heard Alchemy is a great Ruby gem for doing this, though it consumes a lot of resources. Are there any better gems I can use for this?
You can use a keyword extractor to pull out single words (keywords) or groups of two or more words that form a phrase (key phrases).
YAKE! is a lightweight, unsupervised automatic keyword extraction method that relies on statistical text features extracted from individual documents to identify the most relevant keywords in the text.
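To make the idea of statistical keyword and key-phrase extraction concrete, here is a minimal sketch in plain Ruby (2.7+ for `tally`). It is not YAKE!'s algorithm and not what any of the gems mentioned here do internally; it only counts word and two-word-phrase frequencies after dropping stopwords. The `STOPWORDS` list and the method names `tokenize`, `top_keywords`, and `top_phrases` are made up for illustration.

```ruby
require 'set'

# Hypothetical, deliberately tiny stopword list; a real one would be much longer.
STOPWORDS = Set.new(%w[a an and are as at be by for from in is it of on or that the this to was with])

# Split text into lowercase word tokens (letters, apostrophes, hyphens).
def tokenize(text)
  text.downcase.scan(/[a-z][a-z'-]*/)
end

# Return the top `limit` single-word keywords as [word, count] pairs.
# Stopwords and words shorter than three letters are ignored.
def top_keywords(text, limit: 5)
  counts = tokenize(text)
           .reject { |w| STOPWORDS.include?(w) || w.length < 3 }
           .tally
  # Sort by descending count, then alphabetically so ties are deterministic.
  counts.sort_by { |word, count| [-count, word] }.first(limit)
end

# Return the top `limit` two-word phrases as [phrase, count] pairs.
# Any adjacent pair containing a stopword is skipped. Simplification:
# pairs are formed across sentence boundaries too.
def top_phrases(text, limit: 5)
  counts = tokenize(text)
           .each_cons(2)
           .reject { |pair| pair.any? { |w| STOPWORDS.include?(w) } }
           .map { |pair| pair.join(' ') }
           .tally
  counts.sort_by { |phrase, count| [-count, phrase] }.first(limit)
end
```

Calling `top_keywords(article_text)` or `top_phrases(article_text)` on the extracted article returns ranked `[term, count]` pairs. Raw frequency is a crude signal; methods like YAKE! combine several features, so treat this only as a starting point.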
highscore is a fast, lightweight, and easy-to-use gem for extracting keywords from longer content:
https://rubygems.org/gems/highscore
I use it in production, and it works like a charm.
The question is a bit older, but I'll leave this here for others who find it through Google.
There is an OpenCalais gem which provides similar capability. In addition to entity extraction, it can also detect events and relations between entities. It's not lightweight, and I can't say whether it's better or worse than Alchemy, as I haven't used the Alchemy gem. Hope this helps.