
How to create an optimized iterator for a long list of integers?

Say I have a very large list of integers that occupies a large amount of memory. If the integers were evenly spaced, I could easily express the list as an iterator that occupies almost no memory. With more complicated patterns, though, it becomes harder to express the list as an iterator.

Is there a Python package that can analyze a list of integers and return an "optimized" iterator? Or are there methodologies I can look into to accomplish this?
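To illustrate the evenly spaced case mentioned in the question, here is a minimal sketch (Python 3; the concrete values are made up for illustration). A `range` object stores only its start, stop, and step, while the equivalent list stores every element.

```python
import sys

# Hypothetical example data: one million integers in steps of 3
as_list = list(range(0, 3000000, 3))
as_range = range(0, 3000000, 3)

# The range object has a small fixed size regardless of its length;
# the list grows with the number of elements.
print(sys.getsizeof(as_range) < sys.getsizeof(as_list))

# Both produce exactly the same sequence when iterated
print(all(a == b for a, b in zip(as_list, as_range)))
```

Note that `sys.getsizeof` on the list counts only the list's own pointer array, not the integer objects it refers to, so the real gap is larger than this comparison suggests.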

bfletch asked Apr 29 '17 02:04



2 Answers

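Here is my proof of concept. It uses the lzma library (the backports.lzma package on Python 2) to compress the integers into an in-memory buffer, then reads them back one at a time through a generator. Instead of the memory buffer you can use a file on disk: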

import io
import random
import struct

try:
    import lzma  # standard library on Python 3.3+
except ImportError:
    from backports import lzma  # backports.lzma on Python 2

# Values must fit the 4-byte signed format chosen below
# (sys.maxint is larger than that on most 64-bit platforms).
MAX_INT = 2 ** 31 - 1

# Create a list of integers with long runs of duplicates
data = []
for _ in range(2000):
    data += [random.randint(-MAX_INT, MAX_INT)] * random.randint(0, 500)

print('Uncompressed: {}'.format(len(data)))
buff = io.BytesIO()

# 4-byte signed int; check https://docs.python.org/3/library/struct.html#format-characters
fmt = '<i'
lzma_writer = lzma.LZMAFile(buff, 'wb')
for i in data:
    lzma_writer.write(struct.pack(fmt, i))
lzma_writer.close()
print('Compressed: {}'.format(len(buff.getvalue())))

buff.seek(0)
lzma_reader = lzma.LZMAFile(buff, 'rb')

size_of = struct.calcsize(fmt)


def generate():
    # Decompress and yield one integer at a time
    r = lzma_reader.read(size_of)
    while len(r) != 0:
        yield struct.unpack(fmt, r)[0]
        r = lzma_reader.read(size_of)


# Check that the round trip reproduces the original list
res = list(generate())
print(res == data)

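Result from one run (the data is random, so the numbers vary from run to run). The first figure is the number of integers, about 2 MB when packed at 4 bytes each; the second is the compressed size in bytes: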

Uncompressed: 496225
Compressed: 11568
True
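The file-on-disk variant mentioned above could look roughly like the following sketch. It assumes Python 3, where `lzma.open` is part of the standard library; the helper names `write_ints` and `read_ints`, the file name, and the sample data are hypothetical.

```python
import lzma
import os
import struct
import tempfile

FMT = '<i'  # 4-byte signed int
SIZE = struct.calcsize(FMT)


def write_ints(path, values):
    # Pack each integer and write it to an .xz file on disk
    with lzma.open(path, 'wb') as f:
        for v in values:
            f.write(struct.pack(FMT, v))


def read_ints(path):
    # Lazily decompress and yield one integer at a time
    with lzma.open(path, 'rb') as f:
        chunk = f.read(SIZE)
        while chunk:
            yield struct.unpack(FMT, chunk)[0]
            chunk = f.read(SIZE)


# Hypothetical sample data with long runs of repeated values
sample = [7] * 1000 + [-42] * 1000 + list(range(100))

path = os.path.join(tempfile.mkdtemp(), 'ints.xz')
write_ints(path, sample)
print(list(read_ints(path)) == sample)
```

Because `read_ints` is a generator, only the decompressor's internal buffer and one integer are held in memory at a time, not the whole list.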
Bartek Jablonski answered Oct 10 '22 04:10



I agree with Efron Licht: it depends entirely on the complexity of the particular list you want to compact (not to say 'compress'). Unless your lists are simple enough to express as generators, your only choice is to use Bartek Jablonski's answer.
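As one illustration of a list that is "simple enough to express as generators", here is a minimal sketch, not taken from either answer or from an existing package. It greedily splits a list into arithmetic runs stored as (start, step, count) tuples and iterates over them lazily. The names `to_runs` and `iter_runs` and the sample data are hypothetical.

```python
def to_runs(values):
    """Greedily split values into (start, step, count) arithmetic runs."""
    runs = []
    i, n = 0, len(values)
    while i < n:
        start = values[i]
        if i + 1 == n:
            # A single trailing element forms a run of length 1
            runs.append((start, 0, 1))
            break
        step = values[i + 1] - start
        count = 2
        # Extend the run while consecutive differences stay equal to step
        while i + count < n and values[i + count] - values[i + count - 1] == step:
            count += 1
        runs.append((start, step, count))
        i += count
    return runs


def iter_runs(runs):
    """Lazily yield the original integers from the run tuples."""
    for start, step, count in runs:
        for k in range(count):
            yield start + k * step


# Hypothetical data: an increasing stretch, a constant stretch, a decreasing stretch
data = list(range(0, 1000, 5)) + [3] * 500 + list(range(100, 0, -1))
runs = to_runs(data)
print(len(data), len(runs))
print(list(iter_runs(runs)) == data)
```

This only saves memory when the list really consists of long evenly spaced stretches. For irregular data most runs hold just one or two elements, and a general-purpose compressor is likely the better option.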

internety answered Oct 10 '22 04:10
