I have a very large collection of (p, q) tuples that I would like to convert into a dictionary of lists where the first item in each tuple is a key that indexes a list that contains q.
Example:
Original List: (1, 2), (1, 3), (2, 3)
Resultant Dictionary: {1:[2, 3], 2:[3]}
Furthermore, I would like to efficiently combine these dictionaries.
Example:
Original Dictionaries: {1:[2, 3], 2:[3]}, {1:[4], 3:[1]}
Resultant Dictionary: {1:[2, 3, 4], 2:[3], 3:[1]}
These operations reside within an inner loop, so I would prefer that they be as fast as possible.
Thanks in advance
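Since Python 3.9, the | operator merges two dictionaries into a new one. It is convenient, but when a key appears in both dictionaries the value from the right-hand operand replaces the one from the left, so lists stored under a shared key are not concatenated.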
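As a small illustration, here is the | operator applied to the two dictionaries from the question, assuming Python 3.9 or later. The variable names are made up for this sketch:

```python
# Requires Python 3.9+ (dict union operator).
d1 = {1: [2, 3], 2: [3]}
d2 = {1: [4], 3: [1]}

merged = d1 | d2  # builds a new dict; d1 and d2 are left unchanged

# For the shared key 1, the list from d2 replaces the list from d1.
print(merged)  # {1: [4], 2: [3], 3: [1]}
```

This is not the result the question asks for ({1:[2, 3, 4], ...}), so | alone does not solve the dict-of-lists merge.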
Method 1: Using += on a key with an empty value. In this method, we use the += operator to extend a list stored in the dictionary: start with a key whose value is an empty list, then add elements to it as a list.
In Python, the + operator concatenates two lists into a new list. For example, list_1 + list_2 returns a new list containing the contents of both list_1 and list_2, leaving the originals unchanged.
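A minimal sketch of list concatenation with +, using placeholder values for list_1 and list_2:

```python
list_1 = [2, 3]
list_2 = [4]

combined = list_1 + list_2  # builds a new list; the inputs are untouched

print(combined)  # [2, 3, 4]
print(list_1)    # [2, 3]
```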
If the list of tuples is sorted, itertools.groupby, as suggested by @gnibbler, is not a bad alternative to defaultdict, but it needs to be used differently than he suggested:

    import itertools
    import operator

    def lot_to_dict(lot):
        key = operator.itemgetter(0)
        # if lot's not sorted, you also need...:
        # lot = sorted(lot, key=key)
        # NOT in-place lot.sort to avoid changing it!
        grob = itertools.groupby(lot, key)
        return dict((k, [v[1] for v in itr]) for k, itr in grob)

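For comparison, the defaultdict approach mentioned above could look roughly like this. It makes a single pass and does not need sorted input. The function name lot_to_dict_dd is made up for this sketch:

```python
from collections import defaultdict

def lot_to_dict_dd(lot):
    # Hypothetical name: group the second items by the first item in one pass.
    result = defaultdict(list)
    for p, q in lot:
        result[p].append(q)
    # Convert to a plain dict so later lookups of missing keys raise KeyError.
    return dict(result)

pairs = [(1, 2), (2, 3), (1, 3)]  # deliberately unsorted
print(lot_to_dict_dd(pairs))  # {1: [2, 3], 2: [3]}
```

Which of the two is faster depends on the data, so it is worth timing both on a representative input.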
For "merging" dicts of lists into a new d.o.l...:

    def merge_dols(dol1, dol2):
        keys = set(dol1).union(dol2)
        no = []
        return dict((k, dol1.get(k, no) + dol2.get(k, no)) for k in keys)

I'm giving [] a nickname no to avoid uselessly constructing a lot of empty lists, given that performance is important. If the sets of the dols' keys overlap only modestly, faster would be:
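
    def merge_dols(dol1, dol2):
        # dict(dol1, **dol2) fails on Python 3 when keys are not strings,
        # so copy dol1 and update the copy instead.
        result = dict(dol1)
        result.update(dol2)
        result.update((k, dol1[k] + dol2[k]) for k in set(dol1).intersection(dol2))
        return result
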
since this uses list concatenation only for overlapping keys, so if those are few, it will be faster.
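Both versions above build a new dict. If the dicts are being accumulated in a loop and it is acceptable to mutate the first one (an assumption the question does not confirm), an in-place variant along these lines only touches the keys of the second dict. The name merge_dols_inplace is hypothetical:

```python
def merge_dols_inplace(target, other):
    # Hypothetical helper: extend the lists in `target` with those from `other`.
    for key, values in other.items():
        if key in target:
            target[key].extend(values)
        else:
            # Copy, so `target` does not share a list object with `other`.
            target[key] = list(values)
    return target

acc = {1: [2, 3], 2: [3]}
merge_dols_inplace(acc, {1: [4], 3: [1]})
print(acc)  # {1: [2, 3, 4], 2: [3], 3: [1]}
```

The trade-off is that the original contents of the first dict are lost, so this only fits when nothing else still needs them.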