I have a huge dictionary something like this:
d[id1][id2] = value
example:
books["auth1"]["humor"] = 20
books["auth1"]["action"] = 30
books["auth2"]["comedy"] = 20
and so on..
Each of the "auth" keys can have any set of "genres" associated with it. The value for each author/genre pair is the number of books that author wrote in that genre.
Now what I want is to convert it into a matrix, something like:
              "humor"    "action"    "comedy"
"auth1"          20         30           0
"auth2"           0          0          20
How do I do this? Thanks.
pandas does this very well:
books = {}
books["auth1"] = {}
books["auth2"] = {}
books["auth1"]["humor"] = 20
books["auth1"]["action"] = 30
books["auth2"]["comedy"] = 20

from pandas import DataFrame

df = DataFrame(books).T.fillna(0)

The output is:

       action  comedy  humor
auth1      30       0     20
auth2       0      20      0
Use a list comprehension to turn a dict into a list of lists and/or a numpy array:

import numpy as np

np.array([[books[author][genre] for genre in sorted(books[author])] for author in sorted(books)])

EDIT
Apparently you have an irregular number of keys in each sub-dictionary. Make a list of all the genres:
genres = ['humor', 'action', 'comedy']

And then iterate over the dictionaries in the normal manner:
import numpy

list_of_lists = []
for author_name, author in sorted(books.items()):
    titles = []
    for genre in genres:
        try:
            titles.append(author[genre])
        except KeyError:
            titles.append(0)
    list_of_lists.append(titles)

books_array = numpy.array(list_of_lists)

Basically, I'm trying to append the value for each key in genres to a list. If an author has no entry for a genre, the lookup raises a KeyError; I catch it and append a 0 to the list instead.
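If you would rather not hard-code the genre list, here is a minimal sketch of a variation (not part of the answer above) that collects the genres from the data itself and uses dict.get with a default of 0 in place of the try/except. The names authors and matrix are just illustrative, and sorting both axes is an assumption made so the layout is reproducible.

```python
# Sample data from the question.
books = {
    "auth1": {"humor": 20, "action": 30},
    "auth2": {"comedy": 20},
}

# Row and column labels, sorted so the layout is reproducible.
authors = sorted(books)
genres = sorted({genre for counts in books.values() for genre in counts})

# dict.get returns 0 for any genre an author has no entry for.
matrix = [[books[author].get(genre, 0) for genre in genres] for author in authors]

print(genres)
for author, row in zip(authors, matrix):
    print(author, row)
```

The result is a plain list of lists, so it can be passed to numpy.array the same way as list_of_lists above if you need an array.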