Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Timeout Error in Fraudulent Activity Notification HackerRank

I am solving this problem: Farudulent Activity Notification on HackerRank. I am done with my code and is working, but it is inefficient as well for very large inputs.

I don't know but after all my efforts, I am able to give out good solution to a problem of a MEDIUM LEVEL but this timeout error happens every time for very large inputs. I have tried optimizing my code and still I get timeout errors. My agendas for this question and upcoming questions are:

  • How to put efficiency for very large inputs. What kind of intellect it requires.
  • How to reach to that level. What should I prepare for this.
  • Code optimization

I am open to learning, and I am really desperate to learn how to write a more advanced and optimized code to make myself better. I am open to do hard work.

My Algorithm:

  1. For this problem we must go from incrementing variable i till len(givenArray)-d
  2. Take a variable for the next variable to be compared, my case iterate is the variable
  3. Pass the values to the particular array to the method for counting countFraud()
  4. Add it to the count variable
  5. Increment iterate variable

Code:

# this is for counting the trailing array
def countFraud(arr, nextNum):
    count = 0
    median = 0.0
    d = len(arr)
    #for calculating the median correctly
    arr.sort()
    if d%2 != 0:
        median = arr[int(d/2)]
    else:
        n = int(d/2)
        median = (arr[n] + arr[n-1]) / 2

    #now the opeartion for count from the array
    if nextNum >= 2*median: count += 1
    return count

# Complete the activityNotifications function below.
def activityNotifications(expenditure, d):
    count = 0
    iterate = d
    # it will go upto the len of array - d, so that it will go upto the d length
    # leaving the last element everytime for the comparision
    for i in range(len(expenditure)-d):
        count += countFraud(expenditure[i:iterate], expenditure[iterate])
        iterate += 1
    return count

Now previously I was doing two loops, adding the items to the new_array and passing it to the the countFraud(). But now I have optimized it and made it sort of O(N).

I don't know but this code is not getting submitted for all TCs due to Timeout Error. There is no problem in operation part. It is just with the efficiency of the code.

Timeout Error Input Example:

200000 10000

Link for input - Input Data

Expected Output:

633

I have read upon this article: HackerRank Environment to learn about the timing issue. For Python/Python 3 it is 10 seconds. My code is definitely taking more than that for values greater than 10^3 or 4.

My code has successfully passed 3 TCs though. Please help. Thank You :)

like image 392
Alok Avatar asked Oct 06 '19 12:10

Alok


4 Answers

Since nobody actually gave me the answer. I really have to look to the solution in the leaderboard. I found every solution too much to assimilate, just one solution to be a good one.

Disclaimer: This is some advanced coding technique so you need to have a better understanding of the language before you proceed with the solution.

The Solution's algo:

  1. This takes two arrays, one is t having total number of array elem and other one let us name it as listD just the first d elements in the sorted manner
  2. A function to return the median value with the list containing first d elements
  3. With the loop starting from the d and going till n-1, if t[i] >= 2*median(): increment var noti
  4. Remove the first element from the listD using PYTHON BISECT ALGORITHM and add it the t[i] to the listD using PYTHON INSORT ALGORITHM in sorted manner
  5. Return noti

Code:

from bisect import bisect_left, insort_left

n, d = map(int, input().split())
t = list(map(int, input().split()))
noti = 0

listD = sorted(t[:d])

def median():
  return listD[d//2] if d%2 == 1 else ((listD[d//2] + listD[d//2-1])/2)

for i in range(d,n):
  if t[i] >= 2*median(): noti += 1
  del listD[bisect_left(listD, t[i-d])]
  insort_left(listD, t[i])
print(noti)

Here, we've used BISECT and INSORT, what they do is basically, return the position of the element to be added and returns the sorted list after adding the elements. So the headache of sorting the array again and again is reduced, hence the time complexity is reduced and solves all the test cases.

You can read about it here: Python Bisect and Insort Algo. Thanks, and hope it helps some one in future.

like image 70
Alok Avatar answered Nov 05 '22 23:11

Alok


Similarly to what you were doing, this approach uses two functions: one for the activity notifications and another to find the median.

Finding the median is easy. The approach used was to check if the lookback days for median spending, d, is odd or even and, based on that information, calculate accordingly.

Then, when it comes to activityNotifications, the key point was to know that expenditure[i] is between 0 and 200, including both numbers (201).

All in all

def findMedian(counter, d):
    count = 0
    median = 0

    if d%2 != 0:
        for i in range(len(counter)):
            count += counter[i]

            if count > d//2:
                median = i
                break
            
    else:
        first = 0
        second = 0

        for i, _ in enumerate(counter):
            count += counter[i]
            
            if first == 0 and count >= d//2:
                first = i
                
            if second == 0 and count >= d//2 + 1:
                second = i
                break
            
        median = (first+second) / 2
        
    return median


def activityNotifications(expenditure, d):
    count = 0
    counter = [0]*201
    
    for exp in range(d):
        counter[expenditure[exp]] += 1

    for i in range(d, len(expenditure)):
        new = expenditure[i]
        old = expenditure[i-d]
        median = findMedian(counter, d)
        
        if new >= 2*median:
            count += 1
            
        counter[old] -= 1
        counter[new] += 1
        
    return count

This will pass all the current 8 Test cases in HackerRank

It works Fraudulent Activity Notifications with Python

Sources of inspiration:

  • https://www.snoopybox.co.kr/1823
  • https://sungjun221.github.io/algorithm/fraudulent-activity-notifications/

Note that there's also a great answer in Code Review using pandas. While it is very interesting take to solve the problem, it won't work in HackerRank

import pandas as pd

def activityNotifications(expenditure, d):
    df = pd.DataFrame(expenditure)
    return (df.shift(-1) > 2 * df.rolling(d).median())[0].sum()
like image 37
Tiago Martins Peres Avatar answered Nov 05 '22 22:11

Tiago Martins Peres


This one passed all test cases:-

    public static double findMedian(int a[]) {
        int n = a.length;
        if (n % 2 != 0)
            return (double) a[n / 2];

        return (double) (a[(n - 1) / 2] + a[n / 2]) / 2.0;
    }

    static void swap(int[] arr, int i, int j) {
        int temp = arr[i];
        arr[i] = arr[j];
        arr[j] = temp;
    }

    static int activityNotifications(int[] expenditure, int d) {
        if (d >= expenditure.length) return 0;

        int numNotifications = 0;
        int[] trailingArr = new int[d];
        for (int i = 0; i < trailingArr.length; i++) {
            trailingArr[i] = expenditure[i];
        }
        Arrays.sort(trailingArr);
        for (int i = d; i < expenditure.length; i++) {
            double median = findMedian(trailingArr);
            if (expenditure[i] >= 2.0 * median) {
                numNotifications += 1;
            }
            int nextToRemoveElement = expenditure[i - d];
            int toInsertElement = expenditure[i];
            adjustTrailingArray(trailingArr, nextToRemoveElement, toInsertElement);
        }
        return numNotifications;
    }

    //This whole thing is O(d) time. Note that we are not sorting again as trailing array was already sorted
    // as preprocessing and now only one new element has to find its position in sorted array.

    private static void adjustTrailingArray(int[] trailingArr, int elementToRemove, int elementToInsert) {
        if (elementToInsert == elementToRemove) return;
        int foundIndex = 0;

        //The first element of unsorted trailing array will move out of the sliding window
        //Since the trailing array was sorted by us, we have lost the position of its first element in original array.
        //Hence, I search linearly for it and replace it with the new element.

        while (foundIndex < trailingArr.length) {
            if (trailingArr[foundIndex] != elementToRemove) {
                foundIndex++;
            } else {
                trailingArr[foundIndex] = elementToInsert;
                break;
            }
        }

        //Now we bubble the new element just inserted using bubble sort to left/right based on whether it was bigger
        //or smaller than the element that got removed.

        if (elementToInsert > elementToRemove) {
            int i = foundIndex;
            while (i < trailingArr.length - 1) {
                if (trailingArr[i] > trailingArr[i + 1]) {
                    swap(trailingArr, i, i + 1);
                    i += 1;
                } else break;
            }
        } else {
            int i = foundIndex;
            while (i > 0) {
                if (trailingArr[i] < trailingArr[i - 1]) {
                    swap(trailingArr, i, i - 1);
                    i -= 1;
                } else break;
            }
        }
    }
like image 33
Prateek Bhuwania Avatar answered Nov 05 '22 21:11

Prateek Bhuwania


I don't know why no one mentioned Median of medians algorithm, whose complexity of finding a median from an array is O(n) and its irrelevant of the order of the array.

like image 1
Piyush Kumar Avatar answered Nov 05 '22 22:11

Piyush Kumar