Let's suppose that I have a list of elements, and I want to select only some of them, according to a certain function (for example a distance to an other element).
I want to have as a result a list of tuple, with the distance and the element. So, I wrote the following code
result = [ ( myFunction(C), C) for C in originalList if myFunction(C) < limit ]
But myFunction
is a very time-consuming function, and the originalList
quite big. So doing like that, myFunction
will be call twice for every selected element.
So, is there a way to avoid this ??
I have two other possibilities, but they are not so good:
the first one, is to create the unfiltered list
unfiltered = [ (myFunction(C),C) for C in originalList ]
and then sort it
result = [ (dist,C) for dist,C in unfiltered if dist < limit ]
but in that case, I duplicate my
originalList
and waste some memory
(the list could be quite big - more
than 10,000 elements)
the second one is tricky and not very pythonic, but efficient (the best we can do, since the function should be evaluated once per element). myFunction
stores it last
result in a global variable (lastResult
for example), and this value is re-used in the
List comprehension
result = [ (lastResult,C) for C in originalList if myFunction(C) < limit ]
Do you have any better idea to achieve that, in an efficient and pythonic way ??
Thanks for your answers.
Some options:
Sure, the difference between the following two:
[f(x) for x in list]
and this:
(f(x) for x in list)
is that the first will generate the list in memory, whereas the second is a new generator, with lazy evaluation.
So, simply write the "unfiltered" list as a generator instead. Here's your code, with the generator inline:
def myFunction(x):
print("called for: " + str(x))
return x * x
originalList = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
limit = 10
result = [C2 for C2 in ((myFunction(C), C) for C in originalList) if C2[0] < limit]
# result = [C2 for C2 in [(myFunction(C), C) for C in originalList] if C2[0] < limit]
Note that you will not see a difference in the printout from the two, but if you were to look at memory usage, the second statement which is commented out, will use more memory.
To do a simple change to your code in your question, rewrite unfiltered as this:
unfiltered = [ (myFunction(C),C) for C in originalList ]
^ ^
+---------- change these to (..) ---------+
|
v
unfiltered = ( (myFunction(C),C) for C in originalList )
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With