I am trying to build a dictionary of dictionaries from a list of words. Each key is a word from the list, and its value is a dictionary that maps every word immediately following it to the number of times that word follows it. For example, given the list of words ['The', 'cat', 'chased', 'the', 'dog'], for the key 'the' I want the value to be {'dog': 1, 'cat': 1}. The entire output should be {'the': {'dog': 1, 'cat': 1}, 'chased': {'the': 1}, 'cat': {'chased': 1}}.
My code so far can generate keys and values, but not in this dictionary-in-dictionary format. Could someone help with this?
My code:
line = ['The', 'cat', 'chased', 'the', 'dog']
output = {}
for i, item in enumerate(line):
    print(i, item, len(line))
    if i != len(line) - 1:
        output[item] = line[i+1]
print(output)
Output:
{'The': 'cat', 'chased': 'the', 'the': 'dog', 'cat': 'chased'}
I have not tested it, but something like this might work, using defaultdict:
from collections import defaultdict
line = ['The', 'cat', 'chased', 'the', 'dog']
output = defaultdict(lambda: defaultdict(int))
for t, token in enumerate(line[:-1]):
    output[token.lower()][line[t + 1].lower()] += 1
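Since the answer above is marked as untested, here is a minimal sketch of how it could be checked. It pairs each word with its successor using `zip` instead of indexing, which is a swapped-in variation rather than the answerer's exact loop. The `to_plain_dict` helper is a hypothetical name introduced here only to turn the nested defaultdicts into ordinary dicts for printing. The printed order assumes Python 3.7 or later, where dicts keep insertion order.

```python
from collections import defaultdict

line = ['The', 'cat', 'chased', 'the', 'dog']

# The outer defaultdict creates an inner defaultdict(int) for each new word;
# the inner one starts every missing count at 0.
output = defaultdict(lambda: defaultdict(int))

# zip pairs each word with its successor: (The, cat), (cat, chased), ...
for current, following in zip(line, line[1:]):
    output[current.lower()][following.lower()] += 1


def to_plain_dict(nested):
    """Hypothetical helper: convert nested defaultdicts to plain dicts."""
    return {key: dict(counts) for key, counts in nested.items()}


result = to_plain_dict(output)
print(result)
# {'the': {'cat': 1, 'dog': 1}, 'cat': {'chased': 1}, 'chased': {'the': 1}}
```

Lowercasing both the key and the following word is what merges 'The' and 'the' into a single entry.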
You can use collections.Counter for this. Example:
line = ['The', 'cat', 'chased', 'the', 'dog','the','dog']
from collections import Counter
output = {}
for i, item in enumerate(line):
    print(i, item, len(line))
    if i != len(line) - 1:
        output.setdefault(item.lower(),Counter()).update(Counter({line[i+1]:1}))
print(output)
.setdefault() first checks whether the key exists. If it does not, it sets the key to the second argument. Either way, it returns the value stored at that key.
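As a small illustration of that behaviour, the sketch below uses a throwaway dictionary named `counts` with list values (both are assumptions made for the example, not part of the answer):

```python
counts = {}

# Key is missing: setdefault stores the default and returns it.
first = counts.setdefault('the', [])
first.append('cat')

# Key exists: the default is ignored and the stored value is returned.
second = counts.setdefault('the', ['ignored'])
second.append('dog')

print(counts)           # {'the': ['cat', 'dog']}
print(first is second)  # True
```

Because the stored object itself is returned, it can be modified in the same expression, which is what the chained `.update()` call in the answer relies on.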
When you call .update() on a Counter, the new counts are added to the existing ones instead of replacing them, so if the key already exists its count goes up by 1 here. That makes Counter a good fit for your case.
Also, a Counter behaves just like a normal dictionary, so you can use it later like any other dictionary.
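A short sketch of both points, using a standalone Counter named `followers` (a name chosen for this example only):

```python
from collections import Counter

followers = Counter()
followers.update(['dog'])      # count one occurrence from an iterable
followers.update({'dog': 1})   # add counts from a mapping
followers.update(['cat'])

print(followers['dog'])             # 2
print(followers['bird'])            # 0, missing keys read as zero
print(dict(followers))              # {'dog': 2, 'cat': 1}
print(isinstance(followers, dict))  # True
```

Counter is a subclass of dict, so the usual dictionary operations such as iteration, `.items()`, and membership tests work on it unchanged.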
Demo (please note the modified input, which shows a scenario where 'dog' follows 'the' twice):
>>> line = ['The', 'cat', 'chased', 'the', 'dog','the','dog']
>>> from collections import Counter
>>> output = {}
>>> for i, item in enumerate(line):
...     print(i, item, len(line))
...     if i != len(line) - 1:
...         output.setdefault(item.lower(),Counter()).update(Counter({line[i+1]:1}))
...
0 The 7
1 cat 7
2 chased 7
3 the 7
4 dog 7
5 the 7
6 dog 7
>>> print(output)
{'dog': Counter({'the': 1}), 'cat': Counter({'chased': 1}), 'chased': Counter({'the': 1}), 'the': Counter({'dog': 2, 'cat': 1})}