
Clever way of building a tag cloud? - Python

I've built a content aggregator and would like to add a tag cloud representing the current trends.

Unfortunately, this is quite complex, because I have to find keywords that represent the context of each article.

For example, words such as "I", "was", "the", "amazing", and "nice" say nothing about the context.


Help would be much appreciated! :)

RadiantHex asked Dec 23 '22 04:12



2 Answers

Use NLTK, and in particular its Stopwords corpus:

Besides regular content words, there is another class of words called stop words that perform important grammatical functions, but are unlikely to be interesting by themselves. These include prepositions, complementizers, and determiners. NLTK comes bundled with the Stopwords corpus, a list of 2400 stop words across 11 different languages (including English).
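As a rough sketch of how the stop word list could feed a tag cloud: count the words in each article, skip the stop words, and use the counts as tag weights. The names `load_stopwords`, `tag_weights`, and `extra_stopwords` are made up for this example, and the small fallback set is only an assumption so the code still runs when NLTK or its corpus (`nltk.download("stopwords")`) is not installed. Words like "amazing" and "nice" are not grammatical stop words, so they are passed in as extras here.

```python
import re
from collections import Counter


def load_stopwords():
    """Return English stop words from NLTK, or a tiny fallback set."""
    try:
        from nltk.corpus import stopwords
        return set(stopwords.words("english"))
    except (ImportError, LookupError):
        # NLTK is not installed, or the Stopwords corpus was never downloaded.
        return {"i", "was", "the", "a", "an", "and", "is", "it", "of", "to", "in"}


def tag_weights(articles, extra_stopwords=(), top_n=10):
    """Count non-stop words across all articles; the counts act as tag weights."""
    stop = load_stopwords() | set(extra_stopwords)
    counts = Counter()
    for text in articles:
        words = re.findall(r"[a-z']+", text.lower())
        # Drop stop words and very short tokens.
        counts.update(w for w in words if w not in stop and len(w) > 2)
    return counts.most_common(top_n)


articles = [
    "I was amazed: the Python release is nice",
    "The Python parser was rewritten",
    "Linux kernel and Python bindings",
]
for word, weight in tag_weights(articles, extra_stopwords={"amazing", "nice"}):
    print(word, weight)
```

The weights can then be mapped to font sizes in the template. Restricting `articles` to recent items would make the cloud reflect current trends rather than all-time totals.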

Alex Martelli answered Dec 24 '22 16:12



NLTK can help you analyze the content in order to pick out relevant terms.
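One common way to score how relevant a term is to a single article, shown here in plain Python rather than through any NLTK API, is TF-IDF: a word scores high when it is frequent in one article but rare across the others. This is only a sketch of that idea; the function name `tfidf_terms` and the sample documents are invented for illustration.

```python
import math
import re
from collections import Counter


def tfidf_terms(documents, top_n=3):
    """Return the top_n highest-scoring TF-IDF terms for each document."""
    tokenized = [re.findall(r"[a-z]+", doc.lower()) for doc in documents]

    # Document frequency: in how many documents does each word occur?
    doc_freq = Counter()
    for words in tokenized:
        doc_freq.update(set(words))

    n_docs = len(documents)
    results = []
    for words in tokenized:
        term_freq = Counter(words)
        scores = {
            w: (count / len(words)) * math.log(n_docs / doc_freq[w])
            for w, count in term_freq.items()
        }
        # Highest score first; ties are broken alphabetically.
        ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
        results.append([w for w, _ in ranked[:top_n]])
    return results


docs = [
    "the python parser was rewritten in python",
    "the linux kernel was patched",
    "the weather was nice",
]
print(tfidf_terms(docs))
```

Words that occur in every document, such as "the" and "was" in the sample, get a score of zero, so they sink to the bottom of the ranking without needing an explicit stop word list.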

Ignacio Vazquez-Abrams answered Dec 24 '22 16:12
