Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Count ones in a segment (binary)

There is a problem which i am working on it right now and it's as the following :

there are two numbers x1 and x2 and x2 > x1.

for example x1 = 5; and x2 = 10;

and I must find the sum of ones between x1 and x2 in binary representations.

5 = 101   => 2 ones
6 = 110   => 2 ones
7 = 111   => 3 ones
8 = 1000  => 1 one
9 = 1001  => 2 ones
10= 1010  => 2 ones
so the sum will be 
sum = 2 + 2 + 3 + 1 + 2 + 2 = 12 ones;

so I have managed to make a code without even transfer the numbers to binary and wasting execution time.

I noticed that the numbers of ones in every 2^n with n >= 1 is 1 Ex : 2^1 => num of ones is 1 2^2 => 1 2^15 => 1

you can test it here if you want: https://www.rapidtables.com/convert/number/decimal-to-binary.html?x=191

and between each 2^n and 2^(n+1) there are Consecutive numbers as you will see in this example :

      num              number of ones
2^4 = 16                      1
      17                      2
      18                      2
      19                      3
      20                      2
      21                      3
      22                      3
      23                      4
      24                      2
      25                      3
      26                      3
      27                      4
      28                      3
      29                      4
      30                      4
      31                      5
2^5 = 32                      1

so I write a code that can find how many ones between 2^n and 2^(n+1)

int t;                ////turns
int bin = 1;              //// numbers of ones in the binary format ,,, and 1 for 2^5
int n1 = 32;          //// 2^5  this is just for clarification 
int n2 = 64;          //// 2^6
int *keep = malloc(sizeof(int) * (n2 - n1);             ///this is to keep numbers because 
                                      /// i'll need it later in my consecutive numbers
int i = 0;
int a = 0;
n1 = 33               //// I'll start from 33 cause "bin" of 32 is "1";

while (n1 < n2)       /// try to understand it now by yourself
{
    t = 0;
    while (t <= 3)
    {
        
        if (t == 0 || t == 2) 
            bin = bin + 1;
        
        else if (t == 1) 
            bin = bin;
        
        else if (t == 3)
        {
            bin = keep[i];
            i++;
        }
        keep[a] = bin;
        a++;
        t++;
    }
    n1++;
}

anyway as you see I am close to solve the problem but they give me huge numbers and I must find the ones between them, unfortunately I have tried a lot of methods to calculate the "sum" using this above code and I ended up with time execution problem.
Ex: 1, 1000000000 the numbers of ones is >>> 14846928141

so can you give me a little hint what to do next, thanks in advance.

I'm doing this for CodeWar challenge: https://www.codewars.com/kata/596d34df24a04ee1e3000a25/train/c

like image 904
Holy semicolon Avatar asked Sep 03 '25 13:09

Holy semicolon


2 Answers

Here is a proposal for a speedup:

  1. Find smallest y1 such that y1 >= x1 and that y1 is a power of 2

  2. Find largest y2 such that y2 <= x2 and that y2 is a power of 2

  3. Find p1 and p2 such that 2^p1=y1 and 2^p2=y2

  4. Calculate the amount of 1:s between y1 and y2

  5. Deal with x1 to y1 and y2 to x2 separately

  6. Sum the results from 4 and 5

Let's focus on step 4. Let f(n) be the sum of ones up to (2^n)-1. We can quickly realize that f(n) = 2f(n-1) + 2^(n-1) and that f(1)=1. This can be even further refined so that you don't have to deal with recursive calls, but I highly doubt it will be of any importance. Anyway, f(n) = n2^(n-1)

To get the result between y1 and y2, just use f(p2)-f(p1)

For step 5, you can likely use a modified version of step 4.

EDIT:

Maybe I was to quick to say "quickly realize". Here is a way to understand it. The amounts of ones up to 2¹-1 is easy to see. The only two binary numbers below 2¹ are 0 and 1. To get the number of ones up to 2² we take the numbers below 2¹ and make a column:

0
1

Clone it:

0
1
0
1

And put 0:s before the first half and 1:s before the second half:

00
01
10
11

To get 2³ we do the same. Clone it:

00
01
10
11
00
01
10
11

And add 0 and 1:

000
001
010
011
100
101
110
111

Now it should be easy to see why f(n) = 2*f(n-1) + 2^(n-1). The cloning gives 2f(n-1) and adding the 0:s and 1:s gives 2^(n-1). If 2^(n-1) is hard to understand, remember that 2^(n-1)=(2^n)/2. In each step we have 2^n rows and half of them get an extra 1.

EDIT 2:

When I looked at these columns, I got an idea for how to do step 5. Let's say that you want to find the amounts of 1:s from 10 to 15. Binary table for this would be:

10: 1010
11: 1011
12: 1100
13: 1101
14: 1110
15: 1111

Look at the interval 12-15. The last two digits in binary is a copy of the corresponding table for 0-3. That could be utilized, but I leave that to you.

EDIT 3:

This was a fun problem. I wrote some python code that does this. I get some problems with too many recursive calls, but that could be solved pretty easily, and it should not be too complicated to convert this to C:

def f(n):
    return n*2**(n-1)

def numberOfOnes(x):
    if(x==0):
        return 0
    p = floor(log(x,2))
    a = f(p)
    b = numberOfOnes(x-2**p)
    c = x - 2**p +1
    return a+b+c

EDIT 4, Oct 31, 2024, @abhijit-sarkar:

Let f(n) be the number of 1s until 2^n where n >= 1. Since we clone the numbers from the previous step, and then prepend 1s to half of them, f(n) = f(n-1) + 2^(n-1). The 2nd factor comes from the fact that there are 2^n numbers from 0 until 2^n, and we append 1 to half of them.

f(n) = 2 * f(n-1) + 2^(n-1)
     = 2 * 2 * f(n-2) + 2 * 2^(n-2) + 2^(n-1)  # Expanding f(n-1)
     = 2 * 2 * 2 * f(n-3) + 2 * 2 * 2^(n-3) + 2 * 2^(n-2) + 2^(n-1)  # Expanding f(n-2)
     = 2^(n-1) + (n-1) * 2^(n-1)  # When i=n-1, f(1)=1
     = n * 2^(n-1)

This formula is used to calculate a.

-- End EDIT 4

I made an image so that you easier can understand what a, b and c does in the function numberOfOnes if we call it with numberOfOnes(12):

enter image description here

I have finally converted it to C. Of course I have used some code I found here on Stack overflow. I borrowed code for integer versions of log2 and pow, and made some small modifications.

This code is probably possible to optimize further, but it is not necessary. It is lighting fast, and I was not able to measure it's performance.

#include <stdio.h>
#include <math.h>
#include <assert.h>
#include <stdint.h>
#include <inttypes.h>
typedef uint64_t T;

// https://stackoverflow.com/a/11398748/6699433                                                                    
const int tab64[64] = {
    63,  0, 58,  1, 59, 47, 53,  2,
    60, 39, 48, 27, 54, 33, 42,  3,
    61, 51, 37, 40, 49, 18, 28, 20,
    55, 30, 34, 11, 43, 14, 22,  4,
    62, 57, 46, 52, 38, 26, 32, 41,
    50, 36, 17, 19, 29, 10, 13, 21,
    56, 45, 25, 31, 35, 16,  9, 12,
    44, 24, 15,  8, 23,  7,  6,  5};

T log2_64 (T value) {
    value |= value >> 1;
    value |= value >> 2;
    value |= value >> 4;
    value |= value >> 8;
    value |= value >> 16;
    value |= value >> 32;
    return tab64[((T)((value - (value >> 1))*0x07EDD5E59A4E28C2)) >> 58];
}

// https://stackoverflow.com/a/101613/6699433                                                                      
T ipow(T base, T exp) {
    T result = 1;
    for (;;) {
        if (exp & 1) result *= base;
        exp >>= 1;
        if (!exp) break;
        base *= base;
    }
    return result;
}

T f(T n) { return ipow(2,n-1)*n; }

T numberOfOnes(T x) {
    if(x==0) return 0;
    
    T p = floor(log2(x));
    T a = f(p);
    T e = ipow(2,p);
    T b = numberOfOnes(x-e);
    T c = x - e + 1;
    return a+b+c;
}

void test(T u, T v) {
    assert(numberOfOnes(u) == v);
}

int main() {
    // Sanity checks
    test(0,0);
    test(1,1);
    test(2,2);
    test(3,4);
    test(4,5);
    test(5,7);
    test(6,9);
    // Test case provided in question
    test(1000000000,14846928141);
}
like image 117
klutt Avatar answered Sep 05 '25 10:09

klutt


You can solve this problem efficiently by computing the number of bits in the range 1 to n and use a simple subtraction for any subrange. There is a rather simple algorithm that runs in logarithmic time:

#include <stdio.h>
#include <stdlib.h>

/* compute the number of bits set in all numbers between `0` and `x` included */
unsigned long long bitpop(unsigned long long x) {
    unsigned long long count = 0;
    unsigned long long p = 1;     // slice width: 2**(n + 1)
    unsigned long long limit = x + 1;
    for (int n = 0; n < 64; n++) {
        /* count the number of numbers below limit that have the n-th bit set:
         * split the set of numbers into complete slices of 2**(n+1)
         * numbers and a potentially empty last slice.
         */
        unsigned long long complete_slices, last_slice;
        p += p;
        complete_slices = limit / p;
        last_slice = limit % p;
        /* half the numbers in complete slices have the n-th bit set */
        count += complete_slices * p / 2;
        if (last_slice > p / 2) {
            /* only the numbers above p / 2 in the last partial slice have the n-th bit set */
            count += last_slice - p / 2;
        }
        /* stop this loop when p >= limit, ie: no complete slices */ 
        if (complete_slices == 0)
            break;
    }
    return count;
}

int main(int argc, char *argv[]) {
    unsigned long long from = 1000, to = 2000;
    unsigned long long count = 0;

    if (argc > 1) {
        to = from = strtoull(argv[1], NULL, 0);
        if (argc > 2) {
            to = strtoull(argv[2], NULL, 0);
        }
    }

    if (to < from) {
        unsigned long long tmp = to;
        to = from;
        from = tmp;
    }
    count = bitpop(to);
    if (from) {
        count -= bitpop(from - 1);
    }
    printf("bitpop from %llu to %llu: %llu (0x%llx)\n", from, to, count, count);
    return 0;
}
like image 29
chqrlie Avatar answered Sep 05 '25 11:09

chqrlie