Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

My FB HackerCup code too slow of large inputs

I was solving the Find the min problem on facebook hackercup using python, my code works fine for sample inputs but for large inputs(10^9) it is taking hours to complete.

So, is it possible that the solution of that problem can't be computed within 6 minutes using python? Or may be my approaches are too bad?

Problem statement:

After sending smileys, John decided to play with arrays. Did you know that hackers enjoy playing with arrays? John has a zero-based index array, m, which contains n non-negative integers. However, only the first k values of the array are known to him, and he wants to figure out the rest.

John knows the following: for each index i, where k <= i < n, m[i] is the minimum non-negative integer which is not contained in the previous *k* values of m.

For example, if k = 3, n = 4 and the known values of m are [2, 3, 0], he can figure out that m[3] = 1.

John is very busy making the world more open and connected, as such, he doesn't have time to figure out the rest of the array. It is your task to help him.

Given the first k values of m, calculate the nth value of this array. (i.e. m[n - 1]).

Because the values of n and k can be very large, we use a pseudo-random number generator to calculate the first k values of m. Given positive integers a, b, c and r, the known values of m can be calculated as follows:

m[0] = a
m[i] = (b * m[i - 1] + c) % r, 0 < i < k

Input

  • The first line contains an integer T (T <= 20), the number of test cases.

  • This is followed by T test cases, consisting of 2 lines each.

  • The first line of each test case contains 2 space separated integers, n, k (1 <= k <= 10^5, k < n <= 10^9).

  • The second line of each test case contains 4 space separated integers a, b, c, r (0 <= a, b, c <= 10^9, 1 <= r <= 10^9).

I tried two approaches but both failed to return results in 6 minutes, Here's my two approaches:

first:

import sys
cases=sys.stdin.readlines()
def func(line1,line2):
    n,k=map(int,line1.split())
    a,b,c,r =map(int,line2.split())
    m=[None]*n                     #initialize the list
    m[0]=a
    for i in xrange(1,k):          #set the first k values using the formula
        m[i]= (b * m[i - 1] + c) % r
    #print m    
    for j in range(0,n-k):         #now set the value of m[k], m[k+1],.. upto m[n-1]

        temp=set(m[j:k+j])     # create a set from the K values relative to current index
        i=-1                   #start at 0, lowest +ve integer
        while True:           
            i+=1
            if i not in temp:  #if that +ve integer is not present in temp
                m[k+j]=i       
                break

    return m[-1]

for ind,case in enumerate(xrange(1,len(cases),2)):
    ans=func(cases[case],cases[case+1])
    print "Case #{0}: {1}".format(ind+1,ans)  

Second:

import sys
cases=sys.stdin.readlines()
def func(line1,line2):
    n,k=map(int,line1.split())
    a,b,c,r =map(int,line2.split())
    m=[None]*n                       #initialize
    m[0]=a                  
    for i in xrange(1,k):            #same as above          
        m[i]= (b * m[i - 1] + c) % r

    #instead of generating a set in each iteration , I used a 
    # dictionary this time.
    #Now, if the count of an item is 0 then it
    #means the item is not present in the previous K items
    #and can be added as the min value


    temp={}
    for x in m[0:k]:                   
        temp[x]=temp.get(x,0)+1       

    i=-1
    while True:
            i+=1
            if i not in temp:
                m[k]=i          #set the value of m[k]
                break
    for j in range(1,n-k):      #now set the values of m[k+1] to m[n-1]
        i=-1
        temp[m[j-1]] -= 1       #decrement it's value, as it is now out of K items
        temp[m[k+j-1]]=temp.get(m[k+j-1],0)+1   # new item added to the current K-1 items

        while True:
            i+=1
            if i not in temp or temp[i]==0:  #if i not found in dict or it's val is 0
                m[k+j]=i                     
                break

    return m[-1]

for ind,case in enumerate(xrange(1,len(cases),2)):
    ans=func(cases[case],cases[case+1])
    print "Case #{0}: {1}".format(ind+1,ans)  

The last for-loop in second approach can also be written as :

for j in range(1,n-k):
    i=-1
    temp[m[j-1]] -= 1
    if temp[m[j-1]]==0:
        temp.pop(m[j-1])      #same as above but pop the key this time
    temp[m[k+j-1]]=temp.get(m[k+j-1],0)+1

    while True:
        i+=1
        if i not in temp:
            m[k+j]=i
            break

sample input :

5
97 39
34 37 656 97
186 75
68 16 539 186
137 49
48 17 461 137
98 59
6 30 524 98
46 18
7 11 9 46

output:

Case #1: 8
Case #2: 38
Case #3: 41
Case #4: 40
Case #5: 12

I already tried codereview, but no one replied there yet.

like image 466
Ashwini Chaudhary Avatar asked Jan 26 '13 11:01

Ashwini Chaudhary


2 Answers

After at most k+1 steps, the last k+1 numbers in the array will be 0...k (in some order). Subsequently, the sequence is predictable: m[i] = m[i-k-1]. So the way to solve this problem is run your naive implementation for k+1 steps. Then you've got an array with 2k+1 elements (the first k were generated from the random sequence, and the other k+1 from iterating).

Now, the last k+1 elements are going to repeat infinitely. So you can just return the result for m[n] immediately: it's m[k + (n-k-1) % (k+1)].

Here's some code that implements it.

import collections

def initial_seq(k, a, b, c, r):
    v = a
    for _ in xrange(k):
        yield v
        v = (b * v + c) % r

def find_min(n, k, a, b, c, r):
    m = [0] * (2 * k + 1)
    for i, v in enumerate(initial_seq(k, a, b, c, r)):
        m[i] = v
    ks = range(k+1)
    s = collections.Counter(m[:k])
    for i in xrange(k, len(m)):
        m[i] = next(j for j in ks if not s[j])
        ks.remove(m[i])
        s[m[i-k]] -= 1
    return m[k + (n - k - 1) % (k + 1)]


print find_min(97, 39, 34, 37, 656, 97)
print find_min(186, 75, 68, 16, 539, 186)
print find_min(137, 49, 48, 17, 461, 137)
print find_min(1000000000, 100000, 48, 17, 461, 137)

The four cases run in 4 seconds on my machine, and the last case has the largest possible n.

like image 160
Paul Hankin Avatar answered Sep 22 '22 13:09

Paul Hankin


Here is my O(k) solution, which is based on the same idea as above, but runs much faster.

import os, sys

f = open(sys.argv[1], 'r')

T = int(f.readline())

def next(ary, start):
    j = start
    l = len(ary)
    ret = start - 1
    while j < l and ary[j]:
        ret = j
        j += 1
    return ret

for t in range(T):
    n, k = map(int, f.readline().strip().split(' '))
    a, b, c, r = map(int, f.readline().strip().split(' '))

    m = [0] * (4 * k)
    s = [0] * (k+1)
    m[0] = a
    if m[0] <= k:
        s[m[0]] = 1
    for i in xrange(1, k):
        m[i] = (b * m[i-1] + c) % r
        if m[i] < k+1:
            s[m[i]] += 1

    p = next(s, 0)
    m[k] = p + 1
    p = next(s, p+2)

    for i in xrange(k+1, n):
        if m[i-k-1] > p or s[m[i-k-1]] > 1:
            m[i] = p + 1
            if m[i-k-1] <= k:
                s[m[i-k-1]] -= 1
            s[m[i]] += 1
            p = next(s, p+2)
        else:
            m[i] = m[i-k-1]
        if p == k:
            break

    if p != k:
        print 'Case #%d: %d' % (t+1, m[n-1])
    else:
        print 'Case #%d: %d' % (t+1, m[i-k + (n-i+k+k) % (k+1)])

The key point here is, m[i] will never exceeds k, and if we remember the consecutive numbers we can find in previous k numbers from 0 to p, then p will never reduce.

If number m[i-k-1] is larger than p, then it's obviously we should set m[i] to p+1, and p will increase at least 1.

If number m[i-k-1] is smaller or equal to p, then we should consider whether the same number exists in m[i-k:i], if not, m[i] should set equal to m[i-k-1], if yes, we should set m[i] to p+1 just as the "m[i-k-1]-larger-than-p" case.

Whenever p is equal to k, the loop begin, and the loop size is (k+1), so we can jump out of the calculation and print out the answer now.

like image 29
zTrix Avatar answered Sep 19 '22 13:09

zTrix