Consider the set, S, of all binary vectors of length n where each contains exactly m ones; so there are n-m zeros in each vector.
My goal is to construct a number, k, of vectors from S such that these vectors are as different as possible from each other.
As a simple example, take n=4, m=2 and k=2, then a possible solution is: [1,1,0,0] and [0,0,1,1].
It seems that this is an open problem in the coding theory literature (?).
Is there any way (i.e. algorithm) to find a suboptimal yet good solution ?
Is Hamming distance the right performance measure to use in this case ?
Some thoughts:
In this paper, the authors propose a couple of algorithms to find the subset of vectors such that the pairwise Hamming distance is >= a certain value, d.
I have implemented the Random approach as follows: take a set SS, which is initialized by any vector from S. Then, I consider the remaining vectors
in S. For each of these vectors, I check if this vector has at least a distance d with respect to each vector in SS. If so, then it is added to SS.
By taking the maximal possible d, if the size of SS is >= k, then I consider SS as an optimal solution, and I choose any subset of k vectors from SS.
Using this approach, I think that the resulting SS will depend on the identity of the initial vector in SS; i.e. there are multiple solutions(?).
But how to proceed if the size of SS is < k ?
From the proposed algorithms in the paper, I have only understood the Random one. I am interested in the Binary lexicographic search (section 2.3) but I don't know how to implement it (?).
Maybe you find this paper useful (I wrote it). It contains algorithms that efficiently create permutations of bitstrings.
For example, the inc() algorithm:
long inc(long h_in , long m0 , long m1) {
long h_out = h_in | (~m1); //pre -mask
h_out ++;
// increment
h_out = (h_out & m1) | m0; //post -mask
return h_out;
}
It takes an input h_in and return the next higher value that is at least 1 larger than h_in and 'matches' the boundaries m0 and m1. 'Matching' means: the result has a 1 whereever m0 has a 1, and the result has a 0 whereever m1 has a 0. Not that h_in MUST BE a valid value with regards to mo and m1! Also, note that m0 has to be bitwise smaller than m1, which means that m0 cannot have a 1 in a position where m1 has a 0.
This could be used to generate permutations with a minimum edit distance to a given input string:
Let's assume you have 0110, you first NEGATE it to 1001 (edit distance = k).
Set 'm0=1001' and 'm1=1001'. Using this would result only on '1001' itself.
Now to get all values with edit distance k-1, you can do the following, simply flip one of the bits of m0 or m1, then inc() will return an ordered series of all bitstring that have a difference of k or k-1.
I know, not very interesting yet, but you can modify up to k bits, and inc() will always return all permutations with the maximum allowed edit difference with regard to m0 and m1.
Now, to get all permutations, you would have to re-run the algorithm with all possibly combinations of m0 and m1.
Example: To get all possible permutations of 0110 with edit distance 2, you would have to run inc() with the following permutations of m0=0110 and m1=0110 (to get permutations, a bit position has to be expanded, meaning that m0 is set to 0 and m1 is set to 1:
m0=0010 and m1=1110 m0=0100 and m1=1110 m0=0110 and m1=1111 m0=0000 and m1=0110 m0=0010 and m1=0111 m0=0100 and m1=0111 As starting value for h_0 I suggest to use simply m0. Iteration can be aborted once inc() returns m1.
Summary
The above algorithm generates in O(x) all x binary vectors that differ in at least y bits (configurable) from a given vector v.
Using your definition of n=number of bits in a vector v, setting y=n generates exactly 1 vector which is the exact opposite of the input vector v. For y=n-1, it will generate n+1 vectors: n vectors which differ in all but one bits and 1 vector that differs in all bits. And so on different values of y.
**EDIT: Added summary and replaced erroneous 'XOR' with 'NEGATE' in the text above.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With