I have a Python function (call it myFunction) that takes a list of numbers as input and, after a complex calculation, returns the result of that calculation (a single number).
The function looks like this:
def myFunction(listNumbers):
    # initialize the result of the calculation
    calcResult = 0
    # loop through all indices, from 0 to the last one
    for i in xrange(0, len(listNumbers), 1):
        # some complex calculation goes here, changing the value of 'calcResult'
        pass
    # return the result of the calculation
    return calcResult
I tested the function, and it works as expected.
Normally, myFunction is given a listNumbers argument that contains 5,000,000 elements. As you may expect, the calculation takes time. I need this function to run as fast as possible.
Here comes the challenge: assume that the time is now 5am and that listNumbers contains just 4,999,999 values. In other words, its LAST VALUE is not yet available. This value will only be available at 6am.
Obviously, we can do the following (1st mode): wait until 6am, append the last value to listNumbers, and then run myFunction. This solution works, BUT it will take a while before myFunction returns the calculated result (as we need to process the entire list of numbers, starting from the first element). Remember, our goal is to get the result as soon as possible after 6am.
I was thinking about a more efficient way to solve this (2nd mode): since at 5am we already have listNumbers with 4,999,999 values in it, let us start running myFunction immediately. We process whatever we can (remember, we don't have the last piece of data yet), and then, exactly at 6am, 'plug in' the new piece of data and generate the computed result. This should be significantly faster, because most of the processing is done BEFORE 6am. After 6am we only have to deal with the new value, so the computed result should be available almost immediately.
Let's suppose that there's no way for us to inspect the code of myFunction or modify it. Is there ANY programming technique or design idea that will allow us to take myFunction AS IS and do something with it (without changing its code) so that it operates in the 2nd mode rather than the 1st one?
Please do not suggest using C++, numpy + Cython, parallel computing, etc. to solve this problem. The goal here is to see whether there's any programming technique or design pattern that can easily be used to solve such problems.
You could use a generator as an input. The generator will only return when there is data available to process.
Update: thanks for the brilliant comment, I wanted to remove this entry :)
class lazylist(object):
    def __init__(self):
        self.cnt = 0
        self.length = 5000000

    def __iter__(self):
        return self

    def __len__(self):
        return self.length

    def next(self):
        if self.cnt < self.length:
            self.cnt += 1
            # return data here or wait for it
            return self.cnt  # just return a counter for this example
        else:
            raise StopIteration()

    def __getitem__(self, i):
        # again, block till you have data
        return i + 1  # simple counter

myFunction(lazylist())
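To make the "block till you have data" comment concrete, here is a minimal sketch, not a definitive implementation. It assumes Python 3 (so range instead of xrange), and the names BlockingList, set_last and my_function are hypothetical stand-ins I made up for illustration; my_function just sums the values in place of the real calculation. A helper thread is used only so the main program can deliver the last value while the call is waiting for it, not to parallelize the calculation.

```python
import threading


class BlockingList(object):
    """List-like wrapper whose final element arrives later (hypothetical helper)."""

    def __init__(self, known_values):
        self._values = list(known_values)
        self._last_ready = threading.Event()
        self._last_value = None

    def set_last(self, value):
        # Called once the missing value becomes available ("6am").
        self._last_value = value
        self._last_ready.set()

    def __len__(self):
        # Report the final length up front, including the missing element.
        return len(self._values) + 1

    def __getitem__(self, i):
        if i < 0:
            i += len(self)
        if 0 <= i < len(self._values):
            return self._values[i]
        if i == len(self._values):
            # Block here until set_last() has been called.
            self._last_ready.wait()
            return self._last_value
        raise IndexError(i)


def my_function(list_numbers):
    # Stand-in for myFunction: same len() + index loop, summing as the "calculation".
    calc_result = 0
    for i in range(0, len(list_numbers), 1):
        calc_result += list_numbers[i]
    return calc_result


data = BlockingList([1, 2, 3, 4])
result = {}


def run():
    result["total"] = my_function(data)


worker = threading.Thread(target=run)
worker.start()       # "5am": start processing the values we already have
data.set_last(5)     # "6am": the last value arrives
worker.join()
print(result["total"])
```

The function itself is untouched: it sees an object that answers len() and indexing, and it simply stalls on the last index until the value shows up. This only helps if the function reads the elements in order, as the loop in the question does.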
Update: As you can see from the comments and the other solutions, your loop construct and the len call cause a lot of headaches. If you can eliminate them, you can use a much more elegant solution. for e in li or enumerate is the Pythonic way to go.
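For comparison, this is roughly what the generator idea looks like once the function iterates with enumerate instead of calling len and indexing. It is only a sketch under assumptions: Python 3, and the names delayed_values and my_function_iter are hypothetical, with a plain sum standing in for the real calculation. The helper thread is there only so the last value can be delivered while the function is waiting.

```python
import queue
import threading


def delayed_values(known_values, last_value_queue):
    """Yield the known values, then wait for the final one (hypothetical helper)."""
    for value in known_values:
        yield value
    # Blocks until the producer puts the final value on the queue.
    yield last_value_queue.get()


def my_function_iter(numbers):
    # Stand-in for a version of myFunction that iterates instead of indexing.
    calc_result = 0
    for index, e in enumerate(numbers):
        calc_result += e
    return calc_result


last_value = queue.Queue(maxsize=1)
result = {}


def run():
    result["total"] = my_function_iter(delayed_values([10, 20, 30, 40], last_value))


worker = threading.Thread(target=run)
worker.start()        # starts consuming the values that are already known
last_value.put(50)    # the final value arrives later
worker.join()
print(result["total"])
```

No __len__ or __getitem__ trickery is needed here, because the consumer never asks how long the input is.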