I have found a solution but it is really slow:
    def chunks(self, data, SIZE=10000):
        # Rebuilds the full item list for every chunk, hence the slowness
        for i in range(0, len(data), SIZE):
            yield dict(list(data.items())[i:i+SIZE])
Do you have any ideas that avoid external modules (numpy, etc.)?
You can't really "slice" a dictionary, since it's a mutable mapping and not a sequence.
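For instance (a minimal illustration, not from the original post), slicing a dict raises a TypeError, and the obvious workaround of materializing the items first is exactly what makes the question's version slow:

    d = {i: i for i in range(10)}
    # d[0:3] raises TypeError: unhashable type: 'slice'
    first_three = dict(list(d.items())[0:3])  # works, but copies every item first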
Since the dictionary is so big, it is better to keep all the items involved as iterators and generators, like this:
    from itertools import islice

    def chunks(data, SIZE=10000):
        it = iter(data)
        for i in range(0, len(data), SIZE):
            # islice pulls the next SIZE keys from the shared iterator
            yield {k: data[k] for k in islice(it, SIZE)}
Sample run:
    for item in chunks({i: i for i in range(10)}, 3):
        print(item)
Output:

    {0: 0, 1: 1, 2: 2}
    {3: 3, 4: 4, 5: 5}
    {6: 6, 7: 7, 8: 8}
    {9: 9}
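A slightly shorter variant of the same idea (a sketch, not from the original answer; chunks_via_items is an illustrative name) slices the items view directly, avoiding the per-key lookup:

    from itertools import islice

    def chunks_via_items(data, size=10000):
        # One shared iterator over (key, value) pairs; each chunk drains `size` of them
        it = iter(data.items())
        for _ in range(0, len(data), size):
            yield dict(islice(it, size))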
Another approach is zipping iterators:
    >>> from itertools import zip_longest
    >>> d = {'a': 1, 'b': 2, 'c': 3, 'd': 4, 'e': 5, 'f': 6, 'g': 7, 'h': 8}
Create a list that repeats the same dictionary iterator (the number of repeats is the number of elements in each result dict; all entries reference one iterator, not independent copies). Passing that repeated iterator to zip_longest pulls the needed number of elements from the source dict on each step (the built-in filter removes the None padding from the zip results). A generator expression keeps memory usage low:
    >>> chunks = [iter(d.items())] * 3
    >>> g = (dict(filter(None, v)) for v in zip_longest(*chunks))
    >>> list(g)
    [{'a': 1, 'b': 2, 'c': 3}, {'d': 4, 'e': 5, 'f': 6}, {'g': 7, 'h': 8}]
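The same trick wrapped as a reusable generator (a sketch; dict_chunks and its parameter names are illustrative, not from the answer):

    from itertools import zip_longest

    def dict_chunks(data, size):
        # The same items iterator repeated `size` times, so zip_longest
        # consumes it in groups of `size` pairs, padding the last group with None
        args = [iter(data.items())] * size
        for group in zip_longest(*args):
            yield dict(pair for pair in group if pair is not None)

This is essentially the "grouper" idiom from the itertools recipes, applied to a dict's items.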