
How to iterate over a dictionary - n key-value pairs at a time

I have a very large dictionary with thousands of elements. I need to execute a function with this dictionary as a parameter. Instead of passing the whole dictionary in a single execution, I want to execute the function in batches, with x key-value pairs of the dictionary at a time.

I am doing the following:

mydict = ##some large hash
x = ##batch size
def some_func(data):
    ##do something on data
temp = {}
for key, value in mydict.iteritems():
    if len(temp) != 0 and len(temp) % x == 0:
        some_func(temp)
        temp = {}
        temp[key] = value
    else:
        temp[key] = value
if temp != {}:
    some_func(temp)

This looks very hackish to me. I want to know if there is an elegant/better way of doing this.

nish asked Jan 19 '15 10:01




2 Answers

I often use this little utility:

import itertools

def chunked(it, size):
    it = iter(it)
    while True:
        p = tuple(itertools.islice(it, size))
        if not p:
            break
        yield p

For your use case:

for chunk in chunked(big_dict.iteritems(), batch_size):
    func(chunk)
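
Note that each chunk produced this way is a tuple of (key, value) pairs rather than a dictionary. If the function expects a dictionary, as in the question, each chunk can be passed through dict() first. Below is a minimal sketch of that, written for Python 3, where iteritems() no longer exists and items() is used instead. The sample data in big_dict and the value of batch_size are made up for illustration.

```python
import itertools

def chunked(it, size):
    """Yield successive tuples of up to `size` items from any iterable."""
    it = iter(it)
    while True:
        p = tuple(itertools.islice(it, size))
        if not p:
            break
        yield p

# Hypothetical sample data, for illustration only.
big_dict = {i: i * i for i in range(7)}
batch_size = 3

# Each chunk is a tuple of (key, value) pairs; dict() rebuilds a sub-dictionary.
batches = [dict(chunk) for chunk in chunked(big_dict.items(), batch_size)]
print(batches)
```

The last batch simply holds whatever is left over, so no special handling is needed when the dictionary size is not a multiple of the batch size.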
georg answered Nov 14 '22 23:11


Here are two solutions adapted from earlier answers of mine.

First, you can get the list of items from the dictionary and create new dicts from slices of that list. This is not optimal, though, because it copies every item of that huge dictionary into a list before slicing it.

def chunks(dictionary, size):
    items = dictionary.items()
    return (dict(items[i:i+size]) for i in range(0, len(items), size))

Alternatively, you can use some of the itertools module's functions to yield (generate) new sub-dictionaries as you loop. This is similar to @georg's answer, just using a for loop.

from itertools import chain, islice
def chunks(dictionary, size):
    iterator = dictionary.iteritems()
    for first in iterator:
        yield dict(chain([first], islice(iterator, size - 1)))

Example usage for both cases:

mydict = {i+1: chr(i+65) for i in range(26)}
for sub_d in chunks(mydict, 10):
    some_func(sub_d)
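
The iteritems() method used above is Python 2 only. As a rough sketch of how the generator approach might look on Python 3, the version below pulls size items at a time from a single iterator over items(). The name dict_batches and the size check are my own additions for illustration, not part of the answers above.

```python
from itertools import islice

def dict_batches(dictionary, size):
    """Yield sub-dictionaries holding at most `size` items each."""
    if size < 1:
        raise ValueError("size must be at least 1")
    # One shared iterator, so each islice() call continues where the last stopped.
    iterator = iter(dictionary.items())
    while True:
        batch = dict(islice(iterator, size))
        if not batch:
            return
        yield batch

mydict = {i + 1: chr(i + 65) for i in range(26)}
sizes = [len(sub_d) for sub_d in dict_batches(mydict, 10)]
print(sizes)  # [10, 10, 6]
```

Only one batch is materialised at a time, so the full dictionary is never copied. On Python 3.12 and later, itertools.batched offers similar batching of any iterable into tuples.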
tobias_k answered Nov 14 '22 23:11