I was solving the Find the min
problem on facebook hackercup using python, my code works fine for sample inputs but for large inputs(10^9) it is taking hours to complete.
So, is it possible that the solution of that problem can't be computed within 6 minutes using python? Or may be my approaches are too bad?
Problem statement:
After sending smileys, John decided to play with arrays. Did you know that hackers enjoy playing with arrays? John has a zero-based index array, m
, which contains n
non-negative integers. However, only the first k
values of the array are known to him, and he wants to figure out the rest.
John knows the following: for each index i
, where k <= i < n
, m[i]
is the minimum non-negative integer which is not contained in the previous *k*
values of m
.
For example, if k = 3
, n = 4
and the known values of m
are [2, 3, 0]
, he can figure out that m[3] = 1
.
John is very busy making the world more open and connected, as such, he doesn't have time to figure out the rest of the array. It is your task to help him.
Given the first k
values of m
, calculate the nth value of this array. (i.e. m[n - 1]
).
Because the values of n
and k
can be very large, we use a pseudo-random number generator to calculate the first k
values of m
. Given positive integers a
, b
, c
and r
, the known values of m
can be calculated as follows:
m[0] = a
m[i] = (b * m[i - 1] + c) % r, 0 < i < k
Input
The first line contains an integer T (T <= 20), the number of test cases.
This is followed by T test cases, consisting of 2 lines each.
The first line of each test case contains 2 space separated integers,
n
, k
(1 <= k <= 10^5
, k < n <= 10^9
).
The second line of each test case contains 4 space separated integers
a
, b
, c
, r
(0 <= a, b, c <= 10^9, 1 <= r <= 10^9).
I tried two approaches but both failed to return results in 6 minutes, Here's my two approaches:
first:
import sys
cases=sys.stdin.readlines()
def func(line1,line2):
n,k=map(int,line1.split())
a,b,c,r =map(int,line2.split())
m=[None]*n #initialize the list
m[0]=a
for i in xrange(1,k): #set the first k values using the formula
m[i]= (b * m[i - 1] + c) % r
#print m
for j in range(0,n-k): #now set the value of m[k], m[k+1],.. upto m[n-1]
temp=set(m[j:k+j]) # create a set from the K values relative to current index
i=-1 #start at 0, lowest +ve integer
while True:
i+=1
if i not in temp: #if that +ve integer is not present in temp
m[k+j]=i
break
return m[-1]
for ind,case in enumerate(xrange(1,len(cases),2)):
ans=func(cases[case],cases[case+1])
print "Case #{0}: {1}".format(ind+1,ans)
Second:
import sys
cases=sys.stdin.readlines()
def func(line1,line2):
n,k=map(int,line1.split())
a,b,c,r =map(int,line2.split())
m=[None]*n #initialize
m[0]=a
for i in xrange(1,k): #same as above
m[i]= (b * m[i - 1] + c) % r
#instead of generating a set in each iteration , I used a
# dictionary this time.
#Now, if the count of an item is 0 then it
#means the item is not present in the previous K items
#and can be added as the min value
temp={}
for x in m[0:k]:
temp[x]=temp.get(x,0)+1
i=-1
while True:
i+=1
if i not in temp:
m[k]=i #set the value of m[k]
break
for j in range(1,n-k): #now set the values of m[k+1] to m[n-1]
i=-1
temp[m[j-1]] -= 1 #decrement it's value, as it is now out of K items
temp[m[k+j-1]]=temp.get(m[k+j-1],0)+1 # new item added to the current K-1 items
while True:
i+=1
if i not in temp or temp[i]==0: #if i not found in dict or it's val is 0
m[k+j]=i
break
return m[-1]
for ind,case in enumerate(xrange(1,len(cases),2)):
ans=func(cases[case],cases[case+1])
print "Case #{0}: {1}".format(ind+1,ans)
The last for-loop in second approach can also be written as :
for j in range(1,n-k):
i=-1
temp[m[j-1]] -= 1
if temp[m[j-1]]==0:
temp.pop(m[j-1]) #same as above but pop the key this time
temp[m[k+j-1]]=temp.get(m[k+j-1],0)+1
while True:
i+=1
if i not in temp:
m[k+j]=i
break
sample input :
5
97 39
34 37 656 97
186 75
68 16 539 186
137 49
48 17 461 137
98 59
6 30 524 98
46 18
7 11 9 46
output:
Case #1: 8
Case #2: 38
Case #3: 41
Case #4: 40
Case #5: 12
I already tried codereview, but no one replied there yet.
After at most k+1
steps, the last k+1
numbers in the array will be 0...k
(in some order). Subsequently, the sequence is predictable: m[i] = m[i-k-1]
. So the way to solve this problem is run your naive implementation for k+1
steps. Then you've got an array with 2k+1
elements (the first k
were generated from the random sequence, and the other k+1
from iterating).
Now, the last k+1 elements are going to repeat infinitely. So you can just return the result for m[n]
immediately: it's m[k + (n-k-1) % (k+1)]
.
Here's some code that implements it.
import collections
def initial_seq(k, a, b, c, r):
v = a
for _ in xrange(k):
yield v
v = (b * v + c) % r
def find_min(n, k, a, b, c, r):
m = [0] * (2 * k + 1)
for i, v in enumerate(initial_seq(k, a, b, c, r)):
m[i] = v
ks = range(k+1)
s = collections.Counter(m[:k])
for i in xrange(k, len(m)):
m[i] = next(j for j in ks if not s[j])
ks.remove(m[i])
s[m[i-k]] -= 1
return m[k + (n - k - 1) % (k + 1)]
print find_min(97, 39, 34, 37, 656, 97)
print find_min(186, 75, 68, 16, 539, 186)
print find_min(137, 49, 48, 17, 461, 137)
print find_min(1000000000, 100000, 48, 17, 461, 137)
The four cases run in 4 seconds on my machine, and the last case has the largest possible n
.
Here is my O(k) solution, which is based on the same idea as above, but runs much faster.
import os, sys
f = open(sys.argv[1], 'r')
T = int(f.readline())
def next(ary, start):
j = start
l = len(ary)
ret = start - 1
while j < l and ary[j]:
ret = j
j += 1
return ret
for t in range(T):
n, k = map(int, f.readline().strip().split(' '))
a, b, c, r = map(int, f.readline().strip().split(' '))
m = [0] * (4 * k)
s = [0] * (k+1)
m[0] = a
if m[0] <= k:
s[m[0]] = 1
for i in xrange(1, k):
m[i] = (b * m[i-1] + c) % r
if m[i] < k+1:
s[m[i]] += 1
p = next(s, 0)
m[k] = p + 1
p = next(s, p+2)
for i in xrange(k+1, n):
if m[i-k-1] > p or s[m[i-k-1]] > 1:
m[i] = p + 1
if m[i-k-1] <= k:
s[m[i-k-1]] -= 1
s[m[i]] += 1
p = next(s, p+2)
else:
m[i] = m[i-k-1]
if p == k:
break
if p != k:
print 'Case #%d: %d' % (t+1, m[n-1])
else:
print 'Case #%d: %d' % (t+1, m[i-k + (n-i+k+k) % (k+1)])
The key point here is, m[i] will never exceeds k, and if we remember the consecutive numbers we can find in previous k numbers from 0 to p, then p will never reduce.
If number m[i-k-1] is larger than p, then it's obviously we should set m[i] to p+1, and p will increase at least 1.
If number m[i-k-1] is smaller or equal to p, then we should consider whether the same number exists in m[i-k:i], if not, m[i] should set equal to m[i-k-1], if yes, we should set m[i] to p+1 just as the "m[i-k-1]-larger-than-p" case.
Whenever p is equal to k, the loop begin, and the loop size is (k+1), so we can jump out of the calculation and print out the answer now.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With