Add items to a dictionary of lists

Tags:

python

dictionary

Suppose the following toy dataset (from a CSV file where the column names are the "keys" and I'm only interested in some rows, which I put in "data"):

keys = ['k1', 'k2', 'k3', 'k4']
data = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]

I want to get a dictionary with a list for each column, like this:

{'k1': [1, 5, 9, 13], 'k2': [2, 6, 10, 14], 'k3': [3, 7, 11, 15], 'k4': [4, 8, 12, 16]}

In my code I first initialize the dictionary with empty lists and then iterate (in the order of the keys) to append each item to its list.

my_dict = dict.fromkeys(keys, [])
for row in data:
    for i, k in zip(row, keys):
        my_dict[k].append(i)

But it doesn't work. It builds this dictionary:

{'k3': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16], 'k2': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16], 'k1': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16], 'k4': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16]}

You can see that every list contains all sixteen elements instead of just four. If I print i, k inside the loop, it shows the correct pairs of items and keys, so I guess the problem occurs when I append item i to the list for key k.

Does anyone know why all elements are added to all lists and what would be the right way of building my dictionary?

Thanks in advance

677

asked Jul 23 '12 13:07

julia

2 Answers

Zip it, but transpose it first:

>>> keys = ['k1', 'k2', 'k3', 'k4']
>>> data = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
>>> print dict(zip(keys, zip(*data)))
{'k3': (3, 7, 11, 15), 'k2': (2, 6, 10, 14), 'k1': (1, 5, 9, 13), 'k4': (4, 8, 12, 16)}

If you want lists rather than tuples as the dictionary values:

>>> print dict(zip(keys, [list(i) for i in zip(*data)]))

And if you want to keep your version, just use a dictionary comprehension instead of fromkeys:

my_dict = { k : [] for k in keys }
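For completeness, here is a sketch of the loop from the question with only the initialization changed. It assumes Python 3.7 or later (where dictionaries keep insertion order and print is a function); the variable name item is just an illustrative rename of i.

```python
keys = ['k1', 'k2', 'k3', 'k4']
data = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]

# The comprehension evaluates [] once per key, so each key gets its own list.
my_dict = {k: [] for k in keys}

# Same loop as in the question: pair each item in a row with its column key.
for row in data:
    for item, k in zip(row, keys):
        my_dict[k].append(item)

print(my_dict)
# {'k1': [1, 5, 9, 13], 'k2': [2, 6, 10, 14], 'k3': [3, 7, 11, 15], 'k4': [4, 8, 12, 16]}
```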

The problem in your case is that fromkeys initializes every key with the same list object:

>>> my_dict = dict.fromkeys(keys, [])
>>> my_dict
{'k3': [], 'k2': [], 'k1': [], 'k4': []}
>>> my_dict['k3'].append(1)
>>> my_dict
{'k3': [1], 'k2': [1], 'k1': [1], 'k4': [1]}

When you do it right (with a dictionary comprehension or a generator expression), each key gets its own list:

>>> my_dict = dict((k, []) for k in keys )
>>> my_dict
{'k3': [], 'k2': [], 'k1': [], 'k4': []}
>>> my_dict['k3'].append(1)
>>> my_dict
{'k3': [1], 'k2': [], 'k1': [], 'k4': []}
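If you want to check this for yourself, a small sketch like the following (written for Python 3; the names shared and separate are only illustrative) compares object identity in the two cases:

```python
keys = ['k1', 'k2', 'k3', 'k4']

# fromkeys evaluates its second argument once and stores that one object
# under every key.
shared = dict.fromkeys(keys, [])
print(shared['k1'] is shared['k2'])              # True
print(len({id(v) for v in shared.values()}))     # 1

# The comprehension evaluates [] again for each key.
separate = {k: [] for k in keys}
print(separate['k1'] is separate['k2'])          # False
print(len({id(v) for v in separate.values()}))   # 4
```

Using fromkeys with an immutable default such as None or 0 is harmless; the sharing only becomes visible when the shared value is mutated in place, as append does.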

122

answered Sep 29 '22 23:09

Igor Chubin

You are running into the issue explained in this answer: your dictionary is initialised with the same list object reused for all values. Simply use

dict(zip(keys, zip(*data)))

instead. This will transpose the list of rows into a list of columns, and then zip the keys and columns together.
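To make the two steps visible, here is a rough Python 3 sketch of the same idea. In Python 3 zip returns an iterator, so it is wrapped in list() here purely for inspection; the names columns and result are illustrative.

```python
keys = ['k1', 'k2', 'k3', 'k4']
data = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]

# Step 1: transpose. zip(*data) passes each row as a separate argument,
# so the i-th tuple collects the i-th item of every row.
columns = list(zip(*data))
print(columns[0])  # (1, 5, 9, 13)

# Step 2: pair each key with its column, converting tuples to lists.
result = {k: list(col) for k, col in zip(keys, columns)}
print(result)
# {'k1': [1, 5, 9, 13], 'k2': [2, 6, 10, 14], 'k3': [3, 7, 11, 15], 'k4': [4, 8, 12, 16]}
```

Note that zip stops at the shortest input, so rows of unequal length would be silently truncated.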

answered Sep 30 '22 00:09

Sven Marnach

