Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Best (fastest) way to find the number most frequently entered in C?

Tags:

c

optimization

Well, I think the title basically explains my doubt. I will have n numbers to read, this n numbers go from 1 to x, where x is at most 105. What is the fastest (less possible time to run it) way to find out which number were inserted more times? That knowing that the number that appears most times appears more than half of the times.

What I've tried so far:

//for (1<=x<=10⁵)
int v[100000+1];

//multiple instances , ends when n = 0
while (scanf("%d", &n)&&n>0) {
    zerofill(v);
    for (i=0; i<n; i++) {
        scanf("%d", &x);
        v[x]++;
        if (v[x]>n/2)
            i=n;
    }
    printf("%d\n", x);
}

Zero-filling a array of x positions and increasing the position vector[x] and at the same time verifying if vector[x] is greater than n/2 it's not fast enough.

Any idea might help, thank you.

Observation: No need to care about amount of memory used.

like image 896
Thums Avatar asked Mar 21 '23 11:03

Thums


2 Answers

The trivial solution of keeping a counter array is O(n) and you obviously can't get better than that. The fight is then about the constants and this is where a lot of details will play the game, including exactly what are the values of n and x, what kind of processor, what kind of architecture and so on.

On the other side this seems really the "knockout" problem, but that algorithm will need two passes over the data and an extra conditional, thus in practical terms in the computers I know it will be most probably slower than the array of counters solutions for a lot of n and x values.

The good point of the knockout solution is that you don't need to put a limit x on the values and you don't need any extra memory.

If you know already that there is a value with the absolute majority (and you simply need to find what is this value) then this could make it (but there are two conditionals in the inner loop):

  • initialize count = 0
  • loop over all elements
  • if count is 0 then set champion = element and count = 1
  • else if element != champion decrement count
  • else increment count

at the end of the loop your champion will be the value with the absolute majority of elements, if such a value is present.

But as said before I'd expect a trivial

for (int i=0,n=size; i<n; i++) {
    if (++count[x[i]] > half) return x[i];
}

to be faster.

EDIT

After your edit seems you're really looking for the knockout algorithm, but caring about speed that's probably still the wrong question with modern computers (100000 elements is nothing even for a nail-sized single chip today).

like image 165
6502 Avatar answered Mar 24 '23 02:03

6502


I think you can create a max heap for the count of number you read,and use heap sort to find all the count which greater than n/2

like image 31
sundq Avatar answered Mar 24 '23 02:03

sundq