I am solving this problem: Farudulent Activity Notification on HackerRank. I am done with my code and is working, but it is inefficient as well for very large inputs.
I don't know but after all my efforts, I am able to give out good solution to a problem of a MEDIUM LEVEL but this
timeout error
happens every time for very large inputs. I have tried optimizing my code and still I get timeout errors. My agendas for this question and upcoming questions are:
- How to put efficiency for very large inputs. What kind of intellect it requires.
- How to reach to that level. What should I prepare for this.
- Code optimization
I am open to learning, and I am really desperate to learn how to write a more advanced and optimized code to make myself better. I am open to do hard work.
My Algorithm:
- For this problem we must go from
incrementing variable i
tilllen(givenArray)-d
- Take a variable for the next variable to be compared, my case
iterate
is the variable- Pass the values to the particular array to the method for counting
countFraud()
- Add it to the count variable
- Increment iterate variable
Code:
# this is for counting the trailing array
def countFraud(arr, nextNum):
count = 0
median = 0.0
d = len(arr)
#for calculating the median correctly
arr.sort()
if d%2 != 0:
median = arr[int(d/2)]
else:
n = int(d/2)
median = (arr[n] + arr[n-1]) / 2
#now the opeartion for count from the array
if nextNum >= 2*median: count += 1
return count
# Complete the activityNotifications function below.
def activityNotifications(expenditure, d):
count = 0
iterate = d
# it will go upto the len of array - d, so that it will go upto the d length
# leaving the last element everytime for the comparision
for i in range(len(expenditure)-d):
count += countFraud(expenditure[i:iterate], expenditure[iterate])
iterate += 1
return count
Now previously I was doing two loops, adding the items to the new_array
and passing it to the the countFraud()
. But now I have optimized it and made it sort of O(N)
.
I don't know but this code is not getting submitted for all TCs due to Timeout Error
. There is no problem in operation part. It is just with the efficiency of the code.
Timeout Error Input Example:
200000 10000
Link for input - Input Data
Expected Output:
633
I have read upon this article: HackerRank Environment to learn about the timing issue. For Python/Python 3 it is 10 seconds. My code is definitely taking more than that for values greater than 10^3 or 4
.
My code has successfully passed 3 TCs though. Please help. Thank You :)
Since nobody actually gave me the answer. I really have to look to the solution in the leaderboard. I found every solution too much to assimilate, just one solution to be a good one.
Disclaimer: This is some advanced coding technique so you need to have a better understanding of the language before you proceed with the solution.
The Solution's algo:
- This takes two arrays, one is t having total number of array elem and other one let us name it as
listD
just thefirst d elements
in the sorted manner- A function to return the median value with the list containing first d elements
- With the loop starting from the d and going till n-1,
if t[i] >= 2*median(): increment var noti
- Remove the first element from the
listD
using PYTHON BISECT ALGORITHM and add it thet[i]
to the listD using PYTHON INSORT ALGORITHM in sorted manner- Return noti
Code:
from bisect import bisect_left, insort_left
n, d = map(int, input().split())
t = list(map(int, input().split()))
noti = 0
listD = sorted(t[:d])
def median():
return listD[d//2] if d%2 == 1 else ((listD[d//2] + listD[d//2-1])/2)
for i in range(d,n):
if t[i] >= 2*median(): noti += 1
del listD[bisect_left(listD, t[i-d])]
insort_left(listD, t[i])
print(noti)
Here, we've used BISECT
and INSORT
, what they do is basically, return the position of the element to be added and returns the sorted list after adding the elements. So the headache of sorting the array again and again is reduced, hence the time complexity is reduced and solves all the test cases.
You can read about it here: Python Bisect and Insort Algo. Thanks, and hope it helps some one in future.
Similarly to what you were doing, this approach uses two functions: one for the activity notifications and another to find the median.
Finding the median is easy. The approach used was to check if the lookback days for median spending, d, is odd or even and, based on that information, calculate accordingly.
Then, when it comes to activityNotifications, the key point was to know that expenditure[i] is between 0 and 200, including both numbers (201).
All in all
def findMedian(counter, d):
count = 0
median = 0
if d%2 != 0:
for i in range(len(counter)):
count += counter[i]
if count > d//2:
median = i
break
else:
first = 0
second = 0
for i, _ in enumerate(counter):
count += counter[i]
if first == 0 and count >= d//2:
first = i
if second == 0 and count >= d//2 + 1:
second = i
break
median = (first+second) / 2
return median
def activityNotifications(expenditure, d):
count = 0
counter = [0]*201
for exp in range(d):
counter[expenditure[exp]] += 1
for i in range(d, len(expenditure)):
new = expenditure[i]
old = expenditure[i-d]
median = findMedian(counter, d)
if new >= 2*median:
count += 1
counter[old] -= 1
counter[new] += 1
return count
This will pass all the current 8 Test cases in HackerRank
Sources of inspiration:
Note that there's also a great answer in Code Review using pandas. While it is very interesting take to solve the problem, it won't work in HackerRank
import pandas as pd
def activityNotifications(expenditure, d):
df = pd.DataFrame(expenditure)
return (df.shift(-1) > 2 * df.rolling(d).median())[0].sum()
This one passed all test cases:-
public static double findMedian(int a[]) {
int n = a.length;
if (n % 2 != 0)
return (double) a[n / 2];
return (double) (a[(n - 1) / 2] + a[n / 2]) / 2.0;
}
static void swap(int[] arr, int i, int j) {
int temp = arr[i];
arr[i] = arr[j];
arr[j] = temp;
}
static int activityNotifications(int[] expenditure, int d) {
if (d >= expenditure.length) return 0;
int numNotifications = 0;
int[] trailingArr = new int[d];
for (int i = 0; i < trailingArr.length; i++) {
trailingArr[i] = expenditure[i];
}
Arrays.sort(trailingArr);
for (int i = d; i < expenditure.length; i++) {
double median = findMedian(trailingArr);
if (expenditure[i] >= 2.0 * median) {
numNotifications += 1;
}
int nextToRemoveElement = expenditure[i - d];
int toInsertElement = expenditure[i];
adjustTrailingArray(trailingArr, nextToRemoveElement, toInsertElement);
}
return numNotifications;
}
//This whole thing is O(d) time. Note that we are not sorting again as trailing array was already sorted
// as preprocessing and now only one new element has to find its position in sorted array.
private static void adjustTrailingArray(int[] trailingArr, int elementToRemove, int elementToInsert) {
if (elementToInsert == elementToRemove) return;
int foundIndex = 0;
//The first element of unsorted trailing array will move out of the sliding window
//Since the trailing array was sorted by us, we have lost the position of its first element in original array.
//Hence, I search linearly for it and replace it with the new element.
while (foundIndex < trailingArr.length) {
if (trailingArr[foundIndex] != elementToRemove) {
foundIndex++;
} else {
trailingArr[foundIndex] = elementToInsert;
break;
}
}
//Now we bubble the new element just inserted using bubble sort to left/right based on whether it was bigger
//or smaller than the element that got removed.
if (elementToInsert > elementToRemove) {
int i = foundIndex;
while (i < trailingArr.length - 1) {
if (trailingArr[i] > trailingArr[i + 1]) {
swap(trailingArr, i, i + 1);
i += 1;
} else break;
}
} else {
int i = foundIndex;
while (i > 0) {
if (trailingArr[i] < trailingArr[i - 1]) {
swap(trailingArr, i, i - 1);
i -= 1;
} else break;
}
}
}
I don't know why no one mentioned Median of medians algorithm, whose complexity of finding a median from an array is O(n) and its irrelevant of the order of the array.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With