Suppose the following toy dataset (from a CSV file where the column names are the "keys" and I'm only interested in some rows, which I put in "data"):
keys = ['k1', 'k2', 'k3', 'k4']
data = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
I want to get a dictionary with a list for each column, like this:
{'k1': [1, 5, 9, 13], 'k2': [2, 6, 10, 14], 'k3': [3, 7, 11, 15], 'k4': [4, 8,
12, 16]}
In my code I first initialize the dictionary with empty lists and then iterate (in the order of the keys) to append each item to its list.
my_dict = dict.fromkeys(keys, [])
for row in data:
    for i, k in zip(row, keys):
        my_dict[k].append(i)
But it doesn't work. It builds this dictionary:
{'k3': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16], 'k2': [1, 2, 3,
4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16], 'k1': [1, 2, 3, 4, 5, 6, 7, 8,
9, 10, 11, 12, 13, 14, 15, 16], 'k4': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12,
13, 14, 15, 16]}
You can see that all the elements are in all lists instead of just four elements in each list. If I print i, k in the loop, it shows the correct pairs of items and keys, so I guess the problem occurs when I add item i to the list for key k.
Does anyone know why all elements are added to all lists and what would be the right way of building my dictionary?
Thanks in advance
In Python, we can add multiple key-value pairs to an existing dictionary with the update() method. It accepts either another dict or an iterable of key-value pairs (each of length two), such as ((key1, value1),), and updates the dictionary with those pairs, overwriting the value of any key that already exists.
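As a small illustration of update(), here is a minimal sketch. The dictionary d and its contents are made up for the example and are not part of the question's data.

```python
# Start with an existing dictionary (illustrative data only).
d = {'k1': 1}

# Merge in another dict.
d.update({'k2': 2})

# Merge in an iterable of (key, value) pairs.
d.update([('k3', 3), ('k4', 4)])

# A key that already exists is overwritten rather than duplicated.
d.update({'k1': 10})

print(d)
```

Note that update() replaces values; it does not append to a list stored under a key.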
The dictionary is one of Python's built-in data types. Its elements are key-value pairs, and there are several ways to add to one.
Appending element(s) to a dictionary: to add an element to an existing dictionary, write the dictionary name followed by the key in square brackets and assign a value to it.
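The difference between assigning to a key and appending to a list stored under a key can be sketched as follows. The dictionary d is again a made-up example.

```python
d = {}

# Assigning to a new key adds it to the dictionary.
d['k1'] = [1]

# Assigning to an existing key replaces its value.
d['k1'] = [1, 5]

# Appending mutates the list stored under the key; no reassignment is needed.
d['k1'].append(9)

print(d)
```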
Transpose the data first, then zip it with the keys:
>>> keys = ['k1', 'k2', 'k3', 'k4']
>>> data = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
>>> print(dict(zip(keys, zip(*data))))
{'k3': (3, 7, 11, 15), 'k2': (2, 6, 10, 14), 'k1': (1, 5, 9, 13), 'k4': (4, 8, 12, 16)}
If you want lists rather than tuples as the values:
>>> print(dict(zip(keys, [list(i) for i in zip(*data)])))
And if you want to keep your version, just use a dictionary comprehension instead of fromkeys:
my_dict = { k : [] for k in keys }
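Combined with the loop from the question, the whole thing could look like the sketch below. It assumes the same keys and data as in the question; the loop variables are renamed to value and key purely for readability.

```python
keys = ['k1', 'k2', 'k3', 'k4']
data = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]

# The comprehension evaluates [] once per key, so every key gets its own list.
my_dict = {k: [] for k in keys}

for row in data:
    # Pair each item in the row with the key of its column.
    for value, key in zip(row, keys):
        my_dict[key].append(value)

print(my_dict)
```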
The problem in your case is that you initialize my_dict with the same list object as the value for every key:
>>> my_dict = dict.fromkeys(keys, [])
>>> my_dict
{'k3': [], 'k2': [], 'k1': [], 'k4': []}
>>> my_dict['k3'].append(1)
>>> my_dict
{'k3': [1], 'k2': [1], 'k1': [1], 'k4': [1]}
When you do it right (creating a new list for each key):
>>> my_dict = dict((k, []) for k in keys )
>>> my_dict
{'k3': [], 'k2': [], 'k1': [], 'k4': []}
>>> my_dict['k3'].append(1)
>>> my_dict
{'k3': [1], 'k2': [], 'k1': [], 'k4': []}
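The sharing can also be checked directly with the is operator, which compares object identity rather than equality. This is a minimal sketch; the names shared and separate are chosen here for illustration.

```python
keys = ['k1', 'k2', 'k3', 'k4']

# fromkeys stores the one list it was given under every key.
shared = dict.fromkeys(keys, [])

# The comprehension builds a fresh list for every key.
separate = {k: [] for k in keys}

# True: both keys refer to one and the same list object.
shared_is_same = shared['k1'] is shared['k2']

# False: the two lists are equal (both empty) but are distinct objects.
separate_is_same = separate['k1'] is separate['k2']

print(shared_is_same, separate_is_same)
```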
You are running into the issue explained in this answer: your dictionary is initialised with the same list object reused for all values. Simply use
dict(zip(keys, zip(*data)))
instead. This will transpose the list of rows into a list of columns, and then zip the keys and columns together.
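To make the transposition step visible, the one-liner can be split into two steps, as in this sketch. The intermediate names columns and result are introduced here for illustration, and the tuples are converted to lists to match the output asked for in the question.

```python
keys = ['k1', 'k2', 'k3', 'k4']
data = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]

# zip(*data) is the same as zip(data[0], data[1], data[2], data[3]):
# it groups the first items of every row, then the second items, and so on.
columns = list(zip(*data))

# Pair each key with its column; list(col) turns each tuple into a list.
result = {key: list(col) for key, col in zip(keys, columns)}

print(columns[0])
print(result['k1'])
```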