How to get the same results as http://developer.yahoo.com/search/content/V1/termExtraction.html
This question has been asked quite a few times before.
best approach to analyze text in PHP?
What is a good keyword extraction web service?
What is a simple way to generate keywords from a text?
Trying to approach this problem with existing solutions I stumbled upon "Text Analysis" Solr performs on the document before indexing as described in http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters - which includes stemming as well.
So the final index will consist mostly of terms used to describe the document.
Is there a solution that provides analyzers, tokenizers, and token filters for direct use? If solr is the way out, what is the best way get this data from solr's index?
The process of finding these keywords is called Keyword Research.
Solr is a way to create a custom search engine. It does not seem to be the right tool for the job. The Wikipedia article about term extraction lists in its "external links" section several web applications for term extraction. OpenNLP has a list of tools which may be useful. Its Chunker may be helpful.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With