 

A fast way to find an all zero answer

For every array of length n+h-1 with values from {0, 1}, I would like to check whether there exists a non-zero array of length n with values from {-1, 0, 1} such that all h inner products (of the short array with each length-n window of the long array) are zero. My naive way to do this is:

import numpy as np
import itertools

(n, h) = 4, 3
for longtuple in itertools.product([0, 1], repeat=n + h - 1):
    bad = False
    for v in itertools.product([-1, 0, 1], repeat=n):
        if not any(v):          # skip the all-zero short array
            continue
        # all h inner products of v with the windows of longtuple are zero
        if not np.correlate(v, longtuple, 'valid').any():
            bad = True
            break
    if not bad:
        print("Good")
        print(longtuple)

This is very slow when n = 19 and h = 10, which is the case I would like to test.

My goal is to find a single "Good" array of length n+h-1. Is there a way to speed this up so that n = 19 and h = 10 is feasible?

The current naive approach takes 2^(n+h-1) * 3^n iterations, each of which takes roughly nh operations (the h inner products of length n). That is 311,992,186,885,373,952 iterations for n = 19 and h = 10, which is infeasible.
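For reference, the iteration count is straightforward to reproduce:

n, h = 19, 10
# every 0/1 long array paired with every {-1, 0, 1} short array
print(2**(n + h - 1) * 3**n)  # 311992186885373952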

Note 1: I changed convolve to correlate so that the code considers v the right way round.


Update (July 10, 2015)

The problem is still open; no solution fast enough for n = 19 and h = 10 has been given yet.

asked May 15 '15 by graffe


3 Answers

Consider the following "meet in the middle" approach.

First, recast the situation in the matrix formulation provided by leekaiinthesky.

Next, note that we only have to consider "short" vectors s of the form {0,1}^n (i.e., short vectors containing only 0's and 1's) if we change the problem to finding an h x n Hankel matrix H of 0's and 1's such that Hs1 is never equal to Hs2 for two different short vectors of 0's and 1's. That is because Hs1 = Hs2 implies H(s1-s2)=0 which implies there is a vector v of 1's, 0's and -1's, namely s1-s2, such that Hv = 0; conversely, if Hv = 0 for v in {-1,0,1}^n, then we can find s1 and s2 in {0,1}^n such that v = s1 - s2 so Hs1 = Hs2.

When n=19 there are only 524,288 vectors s in {0,1}^n to try; hash the results Hs, and if the same result occurs twice then H is no good, so try another H. In terms of memory this approach is quite feasible. There are 2^(n+h-1) Hankel matrices H to try; when n=19 and h=10 that's 268,435,456 matrices. In the worst case that is 2^28 * 2^19 = 2^47 matrix-vector tests, each with about nh = 190 operations to multiply the matrix H by the vector s (the early exit on the first repeated product cuts this down in practice). That seems feasible, no?
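As a rough Python sketch of this collision test for a single candidate long vector (just to illustrate the idea; the helper name is_good is arbitrary, and the C++ implementation further down is what I actually ran):

import itertools
import numpy as np

def is_good(longvec, n, h):
    # rows of H are the h length-n windows of longvec (the Hankel matrix above)
    H = np.array([longvec[i:i + n] for i in range(h)])
    seen = set()
    for s in itertools.product([0, 1], repeat=n):
        key = tuple(H.dot(s))
        if key in seen:      # Hs1 == Hs2 for two distinct short vectors, so H is bad
            return False
        seen.add(key)
    return True

# e.g. is_good((0, 0, 0, 1, 0, 0), n=4, h=3)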

Since you're now only dealing with 0's and 1's, not -1's, you might also be able to speed up the process by using bit operations (shifts, bitwise AND, and counting 1 bits).
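For instance, the dot product of two 0/1 vectors packed into machine words is just the popcount of their bitwise AND; a small Python illustration:

def dot01(a_bits, b_bits):
    # dot product of two 0/1 vectors stored as integer bit masks
    return bin(a_bits & b_bits).count("1")

print(dot01(0b1011, 0b1101))  # (1,0,1,1) . (1,1,0,1) = 2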

Update

I implemented my idea in C++. I use bit operations to calculate the dot products, encode the resulting vector of h dot products as a single 64-bit integer, and use an unordered_set to detect duplicates, taking an early exit from a given long vector as soon as a duplicate vector of dot products is found.

I obtained 00000000010010111000100100 for n=17 and h=10 after a few minutes, and 000000111011110001001101011 for n=18 and h=10 in a little while longer. I'm just about to run it for n=19 and h=10.

#include <iostream>
#include <bitset>
#include <unordered_set>

/* Count the number of 1 bits in a 32-bit int x in 21 instructions.
 * From Hacker's Delight by Henry S. Warren, Jr., section 5-2.
 */
int count1Bits(int x) {
  x = x - ((x >> 1) & 0x55555555);
  x = (x & 0x33333333) + ((x >> 2) & 0x33333333);
  x = (x + (x >> 4)) & 0x0F0F0F0F;
  x = x + (x >> 8);
  x = x + (x >> 16);
  return x & 0x0000003F;
}

int main () {
  const int n = 19;
  const int h = 10;
  std::unordered_set<long long> dotProductSet;

  // look at all 2^(n+h-1) possibilities for longVec
  // the upper n bits cannot all be 0, so start with a 1 in bit position h-1
  for (int longVec = (1 << (h-1)); longVec < (1 << (n+h-1)); ++longVec) {

    dotProductSet.clear();
    bool good = true;

    // now look at all n digit non-zero shortVecs
    for (int shortVec = 1; shortVec < (1 << n); ++shortVec) {

      // the dot products of longVec with the h shifted copies of shortVec give
      // h values, each between 0 and n inclusive; encode them as an h-digit
      // number in base n+1, up to (n+1)^h = 20^10 ~ 1e13, so 64 bits are needed
      long long dotProduct = 0;

      // encode h dot products of shifted shortVec with longVec
      // as base n+1 integer
      for(int startShort = 0; startShort < h; ++startShort) {
        int shortVecShifted = shortVec << startShort;
        dotProduct *= n+1;
        dotProduct += count1Bits(longVec & shortVecShifted);
      }

      auto ret = dotProductSet.insert(dotProduct);
      if (!ret.second) {
        good = false;
        break;
      }
    }

    if (good) {
      std::cout << std::bitset<(n+h-1)>(longVec) << std::endl;
      break;
    }
  }

  return 0;
}
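To build and run it, something like the following should work (C++11 is needed for unordered_set; the file name is arbitrary):

g++ -std=c++11 -O2 search.cpp -o search
./search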

Second Update

The program for n=19 and h=10 ran for two weeks in the background on my laptop. At the end, it just exited without printing any results. Barring some kind of error in the program, it looks like there are no long vectors with the property you want. I suggest looking for theoretical reasons why there are no such long vectors. Perhaps some kind of counting argument will work.

answered by Edward Doolittle


There may be a faster way*

What you're looking for is related to the concept of the kernel or null space of a matrix.

In particular, for each "longtuple" of length n+h-1 and given n, construct an h by n matrix whose rows are the h consecutive length-n subtuples of the longtuple. In other words, if your longtuple is [0,0,0,1,0,0] and n = 3, then your matrix is:

[[0 0 0]
 [0 0 1]
 [0 1 0]
 [1 0 0]]

Call this matrix A. You're looking for a vector x such that Ax = 0, where 0 is a vector of all 0s. If such an x exists (that is not itself all 0s) and can be scaled to contain only {-1, 0, 1}, then you want to throw out A and move on to the next longtuple.
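As a small sketch (the helper name window_matrix is just for illustration), this matrix can be built directly from the longtuple with NumPy:

import numpy as np

def window_matrix(longtuple, n):
    # rows are the h = len(longtuple) - n + 1 consecutive length-n windows
    h = len(longtuple) - n + 1
    return np.array([longtuple[i:i + n] for i in range(h)])

print(window_matrix([0, 0, 0, 1, 0, 0], n=3))  # the 4 x 3 matrix shown above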

I'm not quite sure what the (theoretically most efficient) computational complexity of computing the kernel is, but it seems to be on the order of O((h+n)^3) or so, which is in any case a lot better than O(3^n). See the Wikipedia article on the kernel linked above, or the Stack Overflow question "Python (NumPy, SciPy), finding the null space of a matrix", for examples of how to compute the kernel.

Anyway, once you identify the kernel, you'll have to do some additional work to figure out if any vectors of the form {-1, 0, 1}^n reside in there, but I don't think that's as big of a computational burden.

*NB: In the comments, @vib points out that this could in fact be a big computational burden. I am not sure what the best algorithm is for figuring out whether these vectors intersect the kernel. Perhaps it cannot be solved in polynomial time, in which case this answer doesn't provide a speedup to the original problem!

Example code

Adapting code from the other Stack Overflow question linked to above for the example you gave in the comments:

import numpy as np
from scipy import linalg

def null(A, eps=1e-15):
    # columns of the result form an orthonormal basis of the null space of A
    u, s, vh = linalg.svd(A)
    padded = np.zeros(vh.shape[0])
    padded[:s.size] = s          # pad with zeros so wide (h < n) matrices work too
    null_space = vh[padded <= eps]
    return np.transpose(null_space)

A = np.array([[0, 0, 0, 1], [0, 0, 1, 0], [0, 1, 0, 0], [0, 0, 0, 0]])
print(null(A))

#> [[-1.]
#>  [ 0.]
#>  [ 0.]
#>  [ 0.]]

The code gives an example (in fact, the same example you gave) of an n-tuple that invalidates [0, 0, 0, 1, 0, 0] as a "good" longtuple. If the code returned [], then presumably there's no such n-tuple, and the longtuple is "good". (If the code does return something though, you still have to check the {-1, 0, 1} part.)

Further thoughts

Whether such an x exists, disregarding the {-1, 0, 1} constraint for now, is equivalent to the question of whether the nullity (dimension of the kernel) of A is greater than 0. By rank-nullity, that is the same as asking whether the rank of A is less than n. So if you found some way to be clever about the {-1, 0, 1} constraint and broke it down to just needing to calculate the rank of A, I am sure this could be done even faster.
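As a quick illustration of the rank criterion, using the small 4 x 3 example matrix from above (note that when h < n, as in your n = 19, h = 10 case, the rank is at most h < n, so a real null vector always exists and the {-1, 0, 1} restriction is where all the difficulty lies):

import numpy as np

A = np.array([[0, 0, 0],
              [0, 0, 1],
              [0, 1, 0],
              [1, 0, 0]])
n = A.shape[1]
# an unconstrained real null vector exists exactly when rank(A) < n
print(np.linalg.matrix_rank(A) < n)  # False: the three distinct unit rows give rank 3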

By the way, it seems highly likely that you (or whoever gave you this problem) know all this already... otherwise, why would you have called the length of the longtuple "n+h-1" if you hadn't already started from a matrix of height h?

answered by leekaiinthesky


This is only a partial answer, since the approach below still seems too slow to check the case n=19, h=10 (and no "good" vectors may even exist in that case).

Here is a Mathematica implementation of the checking algorithm that @vib describes, combined with random sampling over the 2^(n+h-1) long vectors.

TestFullNullSpace[A_, ns_] := Module[{dim, n, i, testvec},
  {dim, n} = Dimensions[ns];
  For[i = 1, i < 3^dim, i++,
   (* combine the mod-3 null space basis vectors and map 2 -> -1 *)
   testvec = Mod[IntegerDigits[i, 3, dim].ns, 3] /. {2 -> -1};
   If[Norm[A.testvec] == 0,
    Return[False]];
   ];
  Return[True];
]

n = 17;
h = 10;

Do[
 v = Table[RandomChoice[{0, 1}], {n + h - 1}];
 A = Table[v[[i ;; i + n - 1]], {i, 1, h}];
 ns = NullSpace[A, Modulus -> 3] /. {2 -> -1};
 If[TestFullNullSpace[A, ns],
  Print[v]];,
 {1000}]

Sample output for the above run, after a few seconds of computation:

{0,0,1,1,0,0,0,0,0,0,1,0,1,1,1,0,1,0,1,1,0,0,0,1,1,0}
{1,1,0,1,0,0,0,1,1,0,1,1,1,1,1,0,1,0,1,0,0,1,1,0,0,0}
{1,0,1,1,1,1,1,0,0,0,1,1,0,1,0,1,1,0,0,1,1,0,1,1,1,0}
{0,0,0,0,1,0,1,1,1,0,1,1,0,0,1,1,1,1,0,1,0,1,0,0,1,1}

So from 1000 checked vectors, 4 were "good" (unless I have a bug). Unfortunately, for n=18, I ran this for several minutes and still didn't find a "good" vector. I don't know if they don't exist or are just exceedingly rare.

answered by cfh