I am looking for a fast way to generate k non-negative integers smaller than 2^64, of which, in base 2, the minimal Hamming distance between any two of the numbers is as high as possible.
For example, if I were looking for k=4 numbers and they should be smaller than 2^4 they could be:
0000
0011
1100
1111
And the minimal Hamming distance would be 2.
Is there a fast algorithm to generate those numbers for a given k? I will have k's in the order of 10^4.
Alternatively an algorithm generating a set of numbers, which have all pairwise an Hamming distance greater than a given value would work fine too.
The Hamming space consists of 8 words 000, 001, 010, 011, 100, 101, 110 and 111. The codeword "000" and the single bit error words "001","010","100" are all less than or equal to the Hamming distance of 1 to "000".
Since the two codewords differ in 6 bit positions as indicated by bold letter bits, the hamming distance is 6.
The Hamming distance is a metric (in the mathematical sense) used in error correction theory to measure the distance between two codewords. In detail, the Hamming distance measures the number of different bits in two strings of the same length.
Here's a fairly trivial way. Find the smallest number of bits = b that can express k different numbers. E.g. for k=4 use b = 2 bits. Divide 64 bits up into chunks of size 2. For each chunk, give each number to be generated a different number among the 2^b >= k available.
E.g. for k=4 numbers the b bits are 00, 01, 10, 11 and here are 4 possibilities:
0000
0101
1010
1111
If you have c chunks, then each number is different from each other number in at least one bit of each of c chunks, so the minimum guaranteed hamming separation is c.
You could also permute the selection of numbers within each chunk, which would produce more random looking examples such as
0011
0101
1000
1110
The answer due to mcdowella is a very good way to rapidly generate numbers with a certain minimum Hamming distance from each other. But, it doesn't guarantee that the resulting Hamming distances are particularly large (if I understand it correctly it would guarantee a Hamming distance of at least 4 between any two of 10^4 64-bit numbers, although the actual minimum Hamming distance could be larger). If you really want to get the minimum Hamming distance as large as possible, and are willing to expend some more CPU cycles in doing so, I would recommend using a Reed-Solomon code. If you need 10^4 numbers, you can interpret each number in 1,2,...,10000 as a binary message of length 14 to be coded in binary code words of length 64. For that you would use the Reed-Solomon code [64,14,51]_2, which would guarantee a Hamming distance of at least 51 between any two of the resulting 64-bit numbers. As you can see from the linked Wikipedia article, the construction is fairly complex, though you should be able to use an open-source implementation (the Wikipedia article gives some links) so that you wouldn't have to reinvent the wheel.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With