I've been reading through some of the MIT opencourseware quizes and they had a question that goes like this:
6) Consider the two functions specified below that are used to play a “guess a number game.”
def cmpGuess(guess):
"""Assumes that guess is an integer in range(maxVal). returns -1 if guess is < than the magic number, 0 if it is equal to the magic number and 1 if it is greater than the magic number."""
def findNumber(maxVal):
"""Assumes that maxVal is a positive integer. Returns a number, num, such that cmpGuess(num) == 0."""
Write a Python implementation of findNumber that guesses the magic number defined by cmpGuess. Your program should have the lowest time complexity possible. """
Here's my implementation:
def find(maxVal):
ceiling = maxVal
floor = 0
while 1:
med = ((ceiling + floor) / 2)
res = cmp(med)
if res == 1:
ceiling = med
elif res == -1:
floor = med
else:
return med
And here's the answer sheet implementation provided by the teacher:
def findNumber(maxVal):
""" Assumes that maxVal is a positive integer. Returns a number, num, such that cmpGuess(num) == 0 """
s = range(0, maxVal)
return bsearch(s, 0, len(s) -1)
def bsearch(s, first, last):
if (last-first) < 2:
if cmp(s[first]) == 0:
return first
else:
return last
mid = first + (last -first)/2
if cmp(s[mid]) == 0:
return s[mid]
if cmp(s[mid]) == -1:
return bsearch(s, first, mid -1)
return bsearch(s, mid + 1, last)
This is the cmp function used by both our functions, according to spec:
def cmp(guess):
if guess > num:
return 1
elif guess < num:
return -1
else: return 0
One major difference is that my solution is iterative and that of the teacher is recursive. I timed 1000 runs of both our functions with a maxVal = 1,000,000. Here's the timing snippet:
t = timeit.Timer('find(1000000)', 'from __main__ import find,cmp')
t1 = timeit.Timer('findNumber(1000000)', 'from __main__ import findNumber,bsearch')
print str(t.timeit(1000))
print str(t1.timeit(1000))
My run took: 0.000621605333677s
The teachers run took: 29.627s
This can't possibly be right. I timed this several times in a row, and in all instances the second function came in with ridiculous 30s results. I copy pasted the solution function straight from the document provided by MIT. Any ideas?
The most obvious thing I can see is that every time you call the teacher's function it creates a list of 1,000,000 integers (assuming Python 2.x), and then when it returns it destroys that list again.
That is going to take a while.
You said that you tested both scripts to make sure that they give the same answer, but they don't. The teacher's script, as you have it written, will always return the last element. This is because of these lines:
def bsearch(s, first, last):
if (last-first) < 2:
if cmp(s[first]) == 0:
return first
else:
return last
should be
def bsearch(s, first, last):
if (last-first) < 2:
if cmp(s[first]) == 0:
return first
else:
return last
In other words, there is an indentation error, which I realize is also in the pdf on MIT's site. Second, the teacher's script still won't work correctly (it will always return the last number) because his binary search is going in the wrong direction with the result of cmp. This is a fault with the teacher's solution according to the spec and easily fixed by changing the line
if cmp(s[mid]) == -1:
to
if cmp(s[mid]) == 1:
This wasn't really an answer to the question of slowness which is obviously (as has been pointed out) due to the allocation of O(N) memory (this is the big one), recursion, list look-ups, calling cmp potentially twice for each bsearch call rather than once and storing the result, and having to check explicitly if (last - first) < 2
for each call to bsearch
because he is using the variable last
as being included in the range of possible values rather than it being 1 more than the highest possible value. By the way, your own code can be made a tad bit faster by changing the line:
floor = med
to
floor = med + 1
so that you are excluding med from the search range. You already know that it isn't the value based on the cmp result. By the way, I wouldn't use cmp
as my own function name because it is a Python built-in function (I know that it is cmpGuess
in the spec).
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With