
Given a text, count the occurrences of every pair of consecutive words


Input:

Once upon a time a time this upon a


Output:

dictionary {
    'Once upon': 1,
    'upon a': 2,
    'a time': 2,
    'time a': 1,
    'time this': 1,
    'this upon': 1
}


CODE:

def countTuples(path):
    dic = dict()
    with codecs.open(path, 'r', 'utf-8') as f:
        for line in f:
            s = line.split()
            for i in range (0, len(s)-1):
                dic[str(s[i]) + ' ' + str(s[i+1])] += 1
    return dic

I am getting this error:

File "C:/Users/user/Anaconda3/hw2.py", line 100, in countTuples
    dic[str(s[i]) + ' ' + str(s[i+1])] += 1
TypeError: list indices must be integers or slices, not str

If I replace += 1 with = 1, everything works fine. I guess the problem is that I am trying to read the value of an entry that doesn't exist yet?

What can I do to fix this?

Tony Tannous asked Feb 05 '23 16:02



1 Answer

You can use a defaultdict to make your solution work. A defaultdict takes a factory function that produces the value for any missing key; with defaultdict(int) that value is 0. This lets you apply += 1 to a key that has not been explicitly created yet:

import codecs
from collections import defaultdict

def countTuples(path):
    # defaultdict(int) supplies 0 for any key that is not present yet
    dic = defaultdict(int)
    with codecs.open(path, 'r', 'utf-8') as f:
        for line in f:
            s = line.split()
            # pair each word with the word that follows it on the same line
            for i in range(len(s) - 1):
                dic[s[i] + ' ' + s[i+1]] += 1
    return dic

>>> {'Once upon': 1,
     'a time': 2,
     'this upon': 1,
     'time a': 1,
     'time this': 1,
     'upon a': 2}
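
As an alternative, here is a minimal sketch of the same counting with collections.Counter and zip. It is not part of the original solution: the function name count_word_pairs is hypothetical, and it takes the text as a string instead of a file path so that it can be run without a file. Like the loop above, it only pairs words that sit on the same line.

```python
from collections import Counter

def count_word_pairs(text):
    """Count each pair of adjacent words within every line of `text`.

    Hypothetical helper for illustration; it takes a string, not a path.
    """
    counts = Counter()
    for line in text.splitlines():
        words = line.split()
        # zip(words, words[1:]) pairs each word with its successor
        counts.update(' '.join(pair) for pair in zip(words, words[1:]))
    return counts

result = count_word_pairs("Once upon a time a time this upon a")
print(dict(result))
```

A Counter returns 0 for a missing key, so there is no need to initialise entries before incrementing them, and the zip idiom avoids the index arithmetic.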
pansen answered Feb 07 '23 18:02
