Consider the set, S, of all binary vectors of length n where each contains exactly m ones; so there are n-m zeros in each vector.
My goal is to construct a number, k, of vectors from S such that these vectors are as different as possible from each other.
As a simple example, take n=4, m=2 and k=2, then a possible solution is: [1,1,0,0] and [0,0,1,1].
It seems that this is an open problem in the coding theory literature (?).
Is there any way (i.e. algorithm) to find a suboptimal yet good solution ?
Is Hamming distance the right performance measure to use in this case ?
Some thoughts:
In this paper, the authors propose a couple of algorithms to find the subset of vectors such that the pairwise Hamming distance is >= a certain value, d.
I have implemented the Random approach as follows: take a set SS, which is initialized by any vector from S. Then, I consider the remaining vectors
in S. For each of these vectors, I check if this vector has at least a distance d with respect to each vector in SS. If so, then it is added to SS.
By taking the maximal possible d, if the size of SS is >= k, then I consider SS as an optimal solution, and I choose any subset of k vectors from SS.
Using this approach, I think that the resulting SS will depend on the identity of the initial vector in SS; i.e. there are multiple solutions(?).
But how to proceed if the size of SS is < k ?
From the proposed algorithms in the paper, I have only understood the Random one. I am interested in the Binary lexicographic search (section 2.3) but I don't know how to implement it (?).
Maybe you find this paper useful (I wrote it). It contains algorithms that efficiently create permutations of bitstrings.
For example, the inc()
algorithm:
long inc(long h_in , long m0 , long m1) {
long h_out = h_in | (~m1); //pre -mask
h_out ++;
// increment
h_out = (h_out & m1) | m0; //post -mask
return h_out;
}
It takes an input h_in
and return the next higher value that is at least 1 larger than h_in
and 'matches' the boundaries m0
and m1
. 'Matching' means: the result has a 1
whereever m0
has a 1
, and the result has a 0
whereever m1
has a 0
. Not that h_in
MUST BE a valid value with regards to mo
and m1
! Also, note that m0
has to be bitwise smaller than m1
, which means that m0
cannot have a 1
in a position where m1
has a 0
.
This could be used to generate permutations with a minimum edit distance to a given input string:
Let's assume you have 0110
, you first NEGATE it to 1001
(edit distance = k).
Set 'm0=1001' and 'm1=1001'. Using this would result only on '1001' itself.
Now to get all values with edit distance k-1, you can do the following, simply flip one of the bits of m0
or m1
, then inc()
will return an ordered series of all bitstring that have a difference of k
or k-1
.
I know, not very interesting yet, but you can modify up to k
bits, and inc()
will always return all permutations with the maximum allowed edit difference with regard to m0
and m1
.
Now, to get all permutations, you would have to re-run the algorithm with all possibly combinations of m0
and m1
.
Example: To get all possible permutations of 0110
with edit distance 2
, you would have to run inc() with the following permutations of m0=0110
and m1=0110
(to get permutations, a bit position has to be expanded, meaning that m0
is set to 0
and m1
is set to 1
:
m0=0010
and m1=1110
m0=0100
and m1=1110
m0=0110
and m1=1111
m0=0000
and m1=0110
m0=0010
and m1=0111
m0=0100
and m1=0111
As starting value for h_0
I suggest to use simply m0
. Iteration can be aborted once inc()
returns m1
.
Summary
The above algorithm generates in O(x) all x
binary vectors that differ in at least y
bits (configurable) from a given vector v
.
Using your definition of n
=number of bits in a vector v
, setting y=n
generates exactly 1 vector which is the exact opposite of the input vector v
. For y=n-1
, it will generate n+1
vectors: n
vectors which differ in all but one bits and 1 vector that differs in all bits. And so on different values of y
.
**EDIT: Added summary and replaced erroneous 'XOR' with 'NEGATE' in the text above.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With