Plotting word frequency with NLTK

I have a file containing various words. I want to count the frequency of each word in the document and plot it, with the words on the x-axis and their frequencies on the y-axis. However, my plot does not show any results. I am using NLTK, NumPy and Matplotlib.

Here's my code; maybe I did something wrong:

def graph():
    f = open("file.txt", "r")
    inputfile = f.read()
    words = nltk.tokenize.word_tokenize(inputfile)
    count = set(words)
    dic = nltk.FreqDist(words)
    FreqDist(f).plot(50, cumulative=False)
    f.close()
Given a list of words in the file file.txt:
southbound
stopped
travel
lane
started
around
stopped
stopped
started
KPavezC asked Apr 20 '15 18:04



1 Answer

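import nltk

def graph():
    # Read the whole file; the with-block closes it automatically.
    with open("file.txt", "r") as f:
        inputfile = f.read()
    # Build the distribution from the tokens, not from the file object.
    tokens = nltk.tokenize.word_tokenize(inputfile)
    fd = nltk.FreqDist(tokens)
    # Plot the 30 most frequent tokens: words on the x-axis, counts on the y-axis.
    fd.plot(30, cumulative=False)

graph()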

[Image: the resulting frequency plot]

You can adjust the graph by changing the arguments passed to plot().
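
As a rough sketch of what altering those parameters might look like: the first positional argument limits the plot to the N most frequent samples, cumulative=True switches to a running total, and title sets a heading. The helper names (top_counts, plot_variants) and the SAMPLE list are made up for illustration; the words are the ones from the question's file.txt. Since nltk.FreqDist subclasses collections.Counter, the counts can also be checked without plotting anything.

```python
from collections import Counter

# Words from the question's sample file.txt (one word per line there).
SAMPLE = ["southbound", "stopped", "travel", "lane", "started",
          "around", "stopped", "stopped", "started"]


def top_counts(words, n):
    """Return the n most frequent words with their counts.

    nltk.FreqDist subclasses collections.Counter, so fd.most_common(n)
    gives the same result as this plain-Counter version.
    """
    return Counter(words).most_common(n)


def plot_variants(words, n=3):
    """Draw the same distribution twice with different plot() arguments."""
    # Imported here so that top_counts() works even without NLTK/Matplotlib.
    import nltk

    fd = nltk.FreqDist(words)
    # Per-word counts for the n most frequent words.
    fd.plot(n, cumulative=False, title="Top words")
    # Same words, but each point is the running total of the counts so far.
    fd.plot(n, cumulative=True, title="Top words (cumulative)")


print(top_counts(SAMPLE, 2))  # [('stopped', 3), ('started', 2)]
```

Calling plot_variants(SAMPLE) should show the two plots one after the other, assuming NLTK and Matplotlib are installed.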

Anuj Gupta answered Sep 20 '22 09:09
