Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

More bit-twiddling: Efficiently implementing a binary search over a fixed-size array

Tags:

c

optimization

Once again, I have a problem for which I would like to shave off nanoseconds. I have a small, constant array and I would like to search it to see if a given number is a member*.

Input: A 64-bit number n.

Output: True if n is in the array, false if n is not.

What are good techniques for making binary searches fast, given the possibility to optimize for the specific elements and their distribution.

Specifics

I have an array with around 136 members (though see below: there's some flexibility) to search. The members are not equally distributed through the range: they cluster toward the beginning and end of the range. The input numbers can be assumed to the chosen with uniform probability. It's probably worthwhile to take advantage of this irregularity.

Here's a sample picture of the distribution for the 136-element array. Note that only 12 of the 136 elements are between 1% and 99% of the range; the balance are below 1% or over 99%.


(source: crg4.com)

I assume that branch misprediction will be the largest cost of any implementation. I'd be happy to be proved wrong.

Notes

* Actually, I have two arrays. Actually actually, I have a choice of what arrays to use: efficiency suggests that the first should have perhaps 10-40 members, while the second can have no more than (exactly) 136 members. My problem gives real flexibility in selecting sizes, along with limited freedom to decide precisely which members to use. If a method performs better with certain sizes or restrictions, please mention this because I may be able to use it. All things equal, I'd prefer to have the second array as large as possible. For reasons unrelated to the binary search I may need to reduce the size of the second array to <= 135 or <= 66 (this is related to the difficulty of determining the input number, which depends on the array selected).

Here's one of the possible arrays, if it helps in testing ideas. (This pretty well reveals my purpose...!) Don't jump to unwarranted conclusions on the basis of the first few members, though.

0, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, 233, 377, 610, 987, 1597, 2584, 4181, 6765, 10946, 17711, 28657, 46368, 75025, 121393, 196418, 317811, 514229, 832040, 1346269, 2178309, 3524578, 5702887, 9227465, 14930352, 24157817, 39088169, 63245986, 102334155, 165580141, 267914296, 433494437, 701408733, 1134903170, 1836311903, 2971215073, 4807526976, 7778742049, 12586269025, 20365011074, 32951280099, 53316291173, 86267571272, 139583862445, 225851433717, 365435296162, 591286729879, 956722026041, 1548008755920, 2504730781961, 4052739537881, 6557470319842, 10610209857723, 17167680177565, 27777890035288, 44945570212853, 72723460248141, 117669030460994, 190392490709135, 308061521170129, 498454011879264, 806515533049393, 1304969544928657, 2111485077978050, 3416454622906707, 5527939700884757, 8944394323791464, 14472334024676221, 23416728348467685, 37889062373143906, 61305790721611591, 99194853094755497, 160500643816367088, 259695496911122585, 420196140727489673, 679891637638612258, 1100087778366101931, 1779979416004714189, 2880067194370816120, 4660046610375530309, 7540113804746346429, 9320093220751060618, 9999984858389672876, 10259680355300795461, 10358875208395550958, 10396764270768694864, 10411236604793371085, 10416764544494255842, 10418876029572233892, 10419682545105283285, 10419990606626453414, 10420108275656914408, 10420153221227127261, 10420170388907304826, 10420176946377624668, 10420179451108406629, 10420180407830432670, 10420180773265728832, 10420180912849591277, 10420180966165882450, 10420180986530893524, 10420180994309635573, 10420180997280850646, 10420180998415753816, 10420180998849248253, 10420180999014828394, 10420180999078074380, 10420180999102232197, 10420180999111459662, 10420180999114984240, 10420180999116330509, 10420180999116844738, 10420180999117041156, 10420180999117116181, 10420180999117144838, 10420180999117155784, 10420180999117159965, 10420180999117161562, 10420180999117162172, 10420180999117162405, 10420180999117162494, 10420180999117162528, 10420180999117162541, 10420180999117162546, 10420180999117162548

I will initially run the program on a Phenom II x4, but optimizations for other architectures are welcome.

like image 646
Charles Avatar asked Mar 08 '11 22:03

Charles


2 Answers

If all you're interested in is member/not-member, rather than location, you could eliminate some conditional branches with the following arrangement:

bool b = false;
b |= (n == x[i]);
b |= (n == x[i+1]);
// ... etc. ...

Clearly, you probably don't want to do this for all 136 entries. But there may be a sweet spot where you are able to mix a coarse-grained binary search to first locate which batch of e.g. 4 elements n could be in, and then switch to the above approach.

like image 171
Oliver Charlesworth Avatar answered Oct 07 '22 00:10

Oliver Charlesworth


Since you know the array data at compile time, you might consider using a hash instead of a binary search. A carefully chosen hash could be faster, especially if you can find a simple hash function that is collision-free for your data, and fits within your memory constraints.

Edit: to further explain...

You hash the 64-bit value to a smaller value, which is an index into an array. In your case, you would aim to have a collision-free hash, so your array would just be an array of a) for hits, the 1 valid value that hashes to that array index, or b) for misses, an invalid value.

You pick the hash function that suits your purposes. In your case, the main parameters are:

  • the size of the hash output, which determines the size of the array, and also affects the probability of a collision.
  • a function which is as simple (fast) as possible.
  • a function which doesn't produce collisions.

Assuming no collisions, you use it at run-time as follows:

  1. Hash your input.
  2. Use the resulting hash value to index into your array.
  3. Test whether the array value matches your input.

If your hash function produces collisions, your choices are:

  • Make your hash output bigger, to reduce the chance of collisions.
  • Try to find a different hash function that doesn't produce collisions with your particular data set.
  • Make an array of something slightly more complicated, like a linked list of valid 64-bit inputs that hash to that value. But this slows you down, since step 3 above becomes more complicated: you have to scan a linked list, rather than just testing a single value.
like image 21
Craig McQueen Avatar answered Oct 07 '22 02:10

Craig McQueen