I just wrote a script that extracts all the spoken text of the Dutch parliament from a few thousand XML files. For every speaker it counts the number of times that speaker said each word.
After doing this I calculated the TF*IDF value of every word for each speaker in the Dutch parliament. If you are not familiar with this, see this link: TF IDF explanation
So now I have a dictionary for each speaker in the Dutch parliament, where the keys are the words the speaker said and the values are the corresponding TF*IDF values:
{u'asielzoekers': 0.0034861170591325486,
u'belastingverlaging': 0.0018551991553514675,
u'buma': 0.0020712555982839408,
u'islam': 0.0029519544163739155,
u'moslims': 0.0027958002747301355,
u'ouderen': 0.0022803123245457566,
u'pechtold': 0.0021525864470786928,
u'president': 0.003281844532743345,
u'rutte': 0.0023488684001475584,
u'samsom': 0.0019304632325980841}
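For readers who want to reproduce a dictionary like the one above, here is a minimal sketch of one common TF*IDF variant. The exact formula used in the question is not stated, so this assumes tf = (count of the word for a speaker) / (total words of that speaker) and idf = log(number of speakers / number of speakers who used the word). The function name `tf_idf_per_speaker` and the toy data are hypothetical.

```python
import math
from collections import Counter


def tf_idf_per_speaker(words_by_speaker):
    """Return {speaker: {word: tf * idf}} for a {speaker: [word, ...]} mapping."""
    n_speakers = len(words_by_speaker)
    counts = {s: Counter(ws) for s, ws in words_by_speaker.items()}

    # Document frequency: number of speakers who used each word at least once.
    doc_freq = Counter()
    for c in counts.values():
        doc_freq.update(c.keys())

    scores = {}
    for speaker, c in counts.items():
        total = sum(c.values())
        scores[speaker] = {
            word: (n / total) * math.log(n_speakers / doc_freq[word])
            for word, n in c.items()
        }
    return scores


# Toy data (hypothetical, not taken from the parliamentary records).
speeches = {
    "speaker_a": ["islam", "islam", "belasting", "de"],
    "speaker_b": ["ouderen", "belasting", "de", "de"],
    "speaker_c": ["ouderen", "de"],
}
scores = tf_idf_per_speaker(speeches)
```

With this variant, a word that every speaker uses (here "de") gets a score of zero, while a word that only one speaker uses often scores highest for that speaker.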
Now I want to create a word cloud from these values. I have briefly looked into the wordcloud module written by amueller, but as far as I can see it works only with plain text, not with a dictionary.
So any help on how to create a wordcloud from a dictionary's values would be highly appreciated.
Thanks in advance!
In the word cloud, select the word you wish to combine with other words (e.g., "convenient"). Type in a word or phrase you wish to combine it with (e.g., "ease"), and press Enter. Repeat this process for all other words or phrases you wish to combine (e.g., "easy") until you have exhausted the synonyms.
From the wordcloud documentation: stopwords : set of strings or None. The words that will be eliminated. If None, the built-in STOPWORDS list will be used.
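As far as I can tell, the stopwords argument is applied when the cloud is generated from raw text, so when you pass a dictionary of frequencies it is safest to remove unwanted words yourself first. Treat that as an assumption and check the documentation of your installed version. The sketch below filters a word-to-score dictionary with plain Python; the helper name `drop_stopwords` and the stop list are hypothetical.

```python
def drop_stopwords(frequencies, stopwords):
    """Return a copy of `frequencies` without the words listed in `stopwords`."""
    blocked = {w.lower() for w in stopwords}
    return {w: f for w, f in frequencies.items() if w.lower() not in blocked}


# Hypothetical stop list: politicians' names that would otherwise dominate the cloud.
names = {"rutte", "samsom", "pechtold", "buma"}
scores = {"asielzoekers": 0.0035, "rutte": 0.0023, "islam": 0.0030, "Samsom": 0.0019}

# The comparison is case-insensitive, so "Samsom" is dropped as well.
filtered = drop_stopwords(scores, names)
```

The filtered dictionary can then be passed to generate_from_frequencies in place of the original one.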
Unlike WordArt.com, WordClouds.com does not automatically duplicate words that you submit to fill out an image.
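import matplotlib.pyplot as plt
from wordcloud import WordCloud

# Maps each word to its weight in the cloud
word_cloud_dict = {'Git': 100, 'GitHub': 100, 'push': 50, 'pull': 10, 'commit': 80, 'add': 30, 'diff': 10,
                   'mv': 5, 'log': 8, 'branch': 30, 'checkout': 25}
wordcloud = WordCloud(width=1000, height=500).generate_from_frequencies(word_cloud_dict)
plt.figure(figsize=(15, 8))
plt.imshow(wordcloud)
plt.axis("off")  # hide the axis ticks around the image
plt.show()
This displays a word cloud in which each word is sized according to its value in the dictionary.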
import matplotlib.pyplot as plt
from wordcloud import WordCloud

# TF*IDF scores per word; only the first and last entries are shown
dictionary = {u'asielzoekers': 0.0034861170591325486,
              # ... remaining entries elided ...
              u'samsom': 0.0019304632325980841}
wc = WordCloud(background_color="white", width=1000, height=1000, max_words=10,
               relative_scaling=0.5, normalize_plurals=False).generate_from_frequencies(dictionary)
plt.imshow(wc)
plt.axis("off")  # hide the axis ticks around the image
plt.show()
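Since the question has one dictionary per speaker, it may be more convenient to write each cloud to an image file than to show it interactively. The following is only a sketch: `top_words`, `save_cloud`, and the speaker data are hypothetical names, and it relies on `WordCloud.generate_from_frequencies` and `WordCloud.to_file` from the wordcloud package.

```python
import heapq


def top_words(scores, n=10):
    """Keep only the n highest-scoring words of a {word: score} dictionary."""
    return dict(heapq.nlargest(n, scores.items(), key=lambda item: item[1]))


def save_cloud(scores, path, n=10):
    """Render the n highest-scoring words and write the image to `path`."""
    # Imported here so top_words() stays usable without the wordcloud package.
    from wordcloud import WordCloud

    wc = WordCloud(background_color="white", width=1000, height=1000)
    wc.generate_from_frequencies(top_words(scores, n))
    wc.to_file(path)


# Hypothetical per-speaker data.
scores_by_speaker = {
    "speaker_a": {"islam": 0.0030, "moslims": 0.0028, "ouderen": 0.0023},
}
selected = top_words(scores_by_speaker["speaker_a"], n=2)
```

Calling save_cloud(scores, speaker + ".png") inside a loop over scores_by_speaker.items() would then produce one PNG per speaker. Selecting the top words separately is optional, since max_words already limits the cloud, but it makes it easy to inspect which words will appear before rendering.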