
Faster Python List Comprehension

Tags:

python

I have a bit of code that runs many thousands of times in my project:

def resample(freq, data):
    output = []
    for i, elem in enumerate(freq):
        for _ in range(elem):
            output.append(data[i])
    return output

e.g. resample([1,2,3], ['a', 'b', 'c']) => ['a', 'b', 'b', 'c', 'c', 'c']

I want to speed this up as much as possible. It seems like a list comprehension could be faster. I have tried:

def resample(freq, data):
    return [item for sublist in [[data[i]]*elem for i, elem in enumerate(freq)] for item in sublist]

That is hideous, and also slow because it builds a list of lists and then flattens it. Is there a way to do this with a one-line list comprehension that is fast? Or maybe something with numpy?

Thanks in advance!

Edit: the answer does not necessarily need to eliminate the nested loops; the fastest code is best.

Luke Eller asked Jan 02 '23 04:01


1 Answer

I highly suggest using generators like so:

from itertools import repeat, chain
def resample(freq, data):
    return chain.from_iterable(map(repeat, data, freq))

This is likely to be one of the fastest approaches: map(), repeat() and chain.from_iterable() are all implemented in C (in CPython), so the looping happens in C rather than in Python bytecode, and there is little left to optimize at the Python level.
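If you want to check the speed claim on your own data, a rough timing sketch with timeit could look like the following. The names resample_loop and resample_chain and the test input are made up for illustration, and the itertools version is wrapped in list() so both functions do the same amount of work. Actual numbers depend on your machine and input sizes, so none are quoted here.

```python
from itertools import chain, repeat
from timeit import timeit


def resample_loop(freq, data):
    # The nested-loop version from the question.
    output = []
    for i, elem in enumerate(freq):
        for _ in range(elem):
            output.append(data[i])
    return output


def resample_chain(freq, data):
    # The itertools version, materialized so the comparison is fair.
    return list(chain.from_iterable(map(repeat, data, freq)))


# Hypothetical test input; substitute something shaped like your real data.
freq = [1, 2, 3] * 1000
data = ['a', 'b', 'c'] * 1000

# Both versions must agree before their timings are worth comparing.
assert resample_loop(freq, data) == resample_chain(freq, data)

for fn in (resample_loop, resample_chain):
    seconds = timeit(lambda: fn(freq, data), number=100)
    print(f"{fn.__name__}: {seconds:.4f} s for 100 calls")
```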

As for a small explanation:

repeat(i, n) returns an iterator that repeats an item i, n times.

map(repeat, data, freq) returns an iterator that calls repeat(d, f) for each pair of corresponding elements d from data and f from freq. Basically, it is an iterator that yields repeat() iterators.

chain.from_iterable() flattens the iterator of iterators to return the end items.
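The three steps above can be seen one at a time in a small sketch using the inputs from the question. The intermediate lists (stages, flat) are built only so the results can be printed; the real function never creates them.

```python
from itertools import chain, repeat

freq = [1, 2, 3]
data = ['a', 'b', 'c']

# repeat(i, n) yields the item i exactly n times.
print(list(repeat('b', 2)))  # ['b', 'b']

# map(repeat, data, freq) pairs the inputs: repeat('a', 1), repeat('b', 2), ...
stages = [list(r) for r in map(repeat, data, freq)]
print(stages)  # [['a'], ['b', 'b'], ['c', 'c', 'c']]

# chain.from_iterable() flattens those iterators into a single stream.
flat = list(chain.from_iterable(map(repeat, data, freq)))
print(flat)  # ['a', 'b', 'b', 'c', 'c', 'c']
```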

No intermediate list is created along the way, so memory use stays low, and as an added benefit it works with any type of data, not just one-character strings.
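To illustrate the laziness, here is a small sketch that pulls only a few items from a result that would be far too large to build as a list. The sample values in items are invented for the example, and islice is used purely to limit how much is consumed.

```python
from itertools import chain, islice, repeat


def resample(freq, data):
    return chain.from_iterable(map(repeat, data, freq))


# Any objects work as data; these values are made up for the example.
items = [(0, 0), {'k': 1}, None]
lazy = resample([2, 10**9, 1], items)

# Only the first four results are ever produced; the billion repeats
# of the second item are never materialized.
first_four = list(islice(lazy, 4))
print(first_four)  # [(0, 0), (0, 0), {'k': 1}, {'k': 1}]

# The iterator is single-use: it resumes where it stopped.
print(next(lazy))  # {'k': 1}
```

Note that repeat() yields the same object each time rather than a copy, which matches the behaviour of the original loop that appends data[i] repeatedly.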

While I don't suggest it, you can convert the result into a list like so:

result = list(resample([1,2,3], ['a','b','c']))
Bharel answered Jan 05 '23 06:01