How to create a dictionary of dictionaries of dictionaries in Python

I am taking a natural language processing class, and I need to create a trigram language model that generates random text that looks "realistic" to a certain degree, based on some sample data.

Essentially, I need to create a "trigram" to hold the various three-word combinations. My professor hinted that this can be done with a dictionary of dictionaries of dictionaries, which I attempted to create using:

trigram = defaultdict( defaultdict(defaultdict(int)))

However I get an error that says:

trigram = defaultdict( dict(dict(int)))
TypeError: 'type' object is not iterable

How would I go about creating a three-layer nested dictionary, that is, a dictionary of dictionaries of dictionaries of int values?

I guess people vote down a question on Stack Overflow if they don't know how to answer it. I'll add some background to better explain the question for those willing to help.

This trigram is used to keep track of three-word patterns. Trigrams are used in text-processing software and almost everywhere throughout natural language processing (think Siri or Google Now).

If we designate the three levels of dictionaries as dict1, dict2, and dict3, then parsing a text file and reading the statement "The boy runs" would produce the following:

dict1 has a key of "the". Accessing that key returns dict2, which contains the key "boy". Accessing that key returns the final dict3, which contains the key "runs". Accessing that key returns the value 1.

This means that "the boy runs" has appeared 1 time in this text. If we encounter it again, we follow the same process and increment the 1 to 2. If we encounter "the girl walks", then dict2 (the dictionary stored under the "the" key) will now contain another key, "girl", whose dict3 has a key of "walks" with a value of 1, and so forth. Eventually, after parsing a ton of text (and keeping track of the word counts), you will have a trigram that can determine the likelihood of a certain starting word leading to a given three-word combination, based on how often each combination appeared in the previously parsed text.

This can help you create grammar rules to identify languages or, in my case, create randomly generated text that looks very much like grammatical English. I need a three-layer dictionary because at any position of a three-word combination there can be another word that creates a whole different set of combinations. I TRIED to explain trigrams and the purpose behind them to the best of my ability... granted, I just started the class a couple of weeks ago.

Now... with ALL of that being said, how would I go about creating a dictionary of dictionaries of dictionaries whose innermost dictionary holds values of type int in Python?

trigram = defaultdict( defaultdict(defaultdict(int)))

throws an error for me

asked Sep 28 '13 03:09 by crazyCoder



1 Answer

I've tried nested defaultdicts before, and the solution seems to be a lambda:

from collections import defaultdict

# Each level needs a zero-argument factory, hence the lambdas.
trigram = defaultdict(lambda: defaultdict(lambda: defaultdict(int)))
trigram['a']['b']['c'] += 1
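
As far as I can tell, the lambda is needed because defaultdict expects its first argument to be a zero-argument callable (a factory) that it calls to build each missing value. An expression like defaultdict(int) is already an instance, not a factory, so passing it directly fails. A minimal sketch of the difference (the names broken and error_message are just for illustration):

```python
from collections import defaultdict

# Passing a defaultdict *instance* as the factory fails, because the
# first argument must be callable (or None).
try:
    broken = defaultdict(defaultdict(int))
except TypeError as exc:
    error_message = str(exc)

# Wrapping each level in a lambda gives defaultdict a zero-argument
# factory, so the inner dictionaries are only built when a key is missing.
trigram = defaultdict(lambda: defaultdict(lambda: defaultdict(int)))
trigram['a']['b']['c'] += 1
```
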

It's not pretty, but I suspect the nested dictionary suggestion is for efficient lookup.
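
For the counting process described in the question, here is a rough sketch of how this structure might be filled and then used to pick a following word. The function names count_trigrams and next_word, the sample sentence, and the naive whitespace tokenization are all my own assumptions for illustration, not something from the course material:

```python
import random
from collections import defaultdict


def count_trigrams(words):
    """Count every run of three consecutive words in a list of tokens."""
    trigram = defaultdict(lambda: defaultdict(lambda: defaultdict(int)))
    # Zipping three offset views yields (w1, w2, w3) for each position.
    for first, second, third in zip(words, words[1:], words[2:]):
        trigram[first][second][third] += 1
    return trigram


def next_word(trigram, first, second):
    """Pick a third word at random, weighted by how often it followed the pair.

    Assumes the pair (first, second) was actually seen; an unseen pair
    has no followers and random.choices would raise an error.
    """
    followers = trigram[first][second]
    candidates = list(followers)
    weights = list(followers.values())
    return random.choices(candidates, weights=weights, k=1)[0]


# Made-up sample text, split on whitespace for simplicity.
sample = "the boy runs . the girl walks . the boy runs"
counts = count_trigrams(sample.lower().split())

the_boy_runs = counts["the"]["boy"]["runs"]
the_girl_walks = counts["the"]["girl"]["walks"]
guess = next_word(counts, "the", "boy")
```

One thing to keep in mind: merely reading a missing key from a defaultdict inserts an empty entry, so if that matters for you, check membership with "in" or use .get() when you only want to look something up.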

answered Sep 23 '22 10:09 by pcoving