I have a text file containing the script of Act 1 of Romeo and Juliet, and I want to count how many times each speaker says each word.
Here is the text: http://pastebin.com/X0gaxAPK
There are 3 people speaking in the text: Gregory, Sampson, and Abraham.
Basically, I want to make three dictionaries, one for each speaker (if that's the best way to do it?), populate each with the words that speaker says, and then count how many times they say each word in the entire script.
How would I go about doing this? I think I can figure out the word count, but I am a bit confused about how to separate who says what and put it into a separate dictionary for each person.
My output should look something like this (this is not correct but an example):
Gregory -
25: the
15: a
5: from
3: while
1: hello
etc
Where the number is the frequency of the word said in the file.
Right now I have code that reads the text file, strips the punctuation, and compiles the text into a list. I also don't want to use any outside modules; I'd like to do it the old-fashioned way to learn. Thanks.
You don't have to post exact code, just explain what I need to do and hopefully I can figure it out. I'm using Python 3.
import collections
import string

# maps each speaker to a Counter of the words they say
c = collections.defaultdict(collections.Counter)
speaker = None
with open('/tmp/spam.txt') as f:
    for line in f:
        if not line.strip():
            # we're on an empty line, the last guy has finished blabbing
            speaker = None
            continue
        if line.count(' ') == 0 and line.strip().endswith(':'):
            # a new guy is talking now, you might want to refine this event
            speaker = line.strip()[:-1]
            continue
        c[speaker].update(x.strip(string.punctuation).lower() for x in line.split())
Example output:
In [1]: run /tmp/spam.py
In [2]: c.keys()
Out[2]: [None, 'Abraham', 'Gregory', 'Sampson']
In [3]: c['Gregory'].most_common(10)
Out[3]:
[('the', 7),
('thou', 6),
('to', 6),
('of', 4),
('and', 4),
('art', 3),
('is', 3),
('it', 3),
('no', 3),
('i', 3)]
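Since the question asks for the "old fashioned way", here is a rough sketch of the same idea using only plain dicts instead of collections. It assumes, like the code above, that each speaker label sits on its own line and ends with a colon, and that a blank line ends a speech. The function names count_words_by_speaker and format_counts and the sample lines are made up for illustration; they are not from the linked script.

```python
import string


def count_words_by_speaker(lines):
    """Return {speaker: {word: count}} using only plain dicts."""
    counts = {}
    speaker = None
    for line in lines:
        stripped = line.strip()
        if not stripped:
            # Blank line: the current speech is over.
            speaker = None
            continue
        if ' ' not in stripped and stripped.endswith(':'):
            # Assumption: a single token ending in ':' is a speaker label.
            speaker = stripped[:-1]
            continue
        if speaker is None:
            # Text outside any speech is skipped in this sketch.
            continue
        words = counts.setdefault(speaker, {})
        for token in stripped.split():
            word = token.strip(string.punctuation).lower()
            if word:  # ignore tokens that were pure punctuation
                words[word] = words.get(word, 0) + 1
    return counts


def format_counts(speaker, words):
    """Format one speaker's counts as 'count: word', most frequent first."""
    # Ties are broken alphabetically so the order is predictable.
    ranked = sorted(words.items(), key=lambda item: (-item[1], item[0]))
    out = [speaker + ' -']
    out.extend('{}: {}'.format(count, word) for word, count in ranked)
    return '\n'.join(out)


# Made-up sample lines standing in for the real file contents.
sample = [
    'Gregory:\n',
    'The quarrel is the quarrel.\n',
    '\n',
    'Sampson:\n',
    'A dog, a dog!\n',
]

counts = count_words_by_speaker(sample)
for name in counts:
    print(format_counts(name, counts[name]))
```

To run it on the real file, pass the open file object in place of sample, since iterating over a file yields its lines. Unlike the defaultdict version, this sketch drops words that appear before any speaker label rather than collecting them under None.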