I need to run millions of queries of the following kind.
Each input consists of a small collection (<100) of boolean vectors of various sizes (<20000 elements), each with a few 1s and many 0s:
A = [ 0 0 0 1 0 0 0 0 0 0 0 ... ]
B = [ 0 0 0 0 1 0 ... ]
...
I also have many (>20000) boolean AND expressions. These expressions are constant for all queries.
S[1] = A[10] AND B[52] AND F[15] AND U[2]
S[2] = I[8] AND Z[4]
...
Each expression can reference zero or one element from each vector. Variables rarely appear in more than one expression. For each query, output is the set of true expressions.
What's a good algorithm to compute the queries quickly, preferably more quickly than evaluating each expression in order? The algorithm needs to run once for each input, and there are millions of inputs to run on, so speed is important. Since the expressions are constant, I can optimize them ahead of time. I'm working with C.
Return early. As soon as you find a false boolean, you know that the AND expression will return false, so don't check the rest.
In C, you get this behavior by default in hardcoded boolean expressions:
(A[10] && B[52] && F[15] && U[2])
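A minimal sketch of this baseline, assuming each expression is precompiled into an array of pointers into the input vectors, and that the vectors are fixed buffers overwritten on each query so the pointers stay valid (the Expr struct and eval_expr are hypothetical names):

#include <stddef.h>

/* Hypothetical compiled form of one AND expression: pointers to the
   boolean elements it references in the input vectors. */
typedef struct {
    const unsigned char **terms;
    size_t nterms;
} Expr;

/* Evaluate with early out: stop at the first false term, like &&. */
static int eval_expr(const Expr *e)
{
    for (size_t i = 0; i < e->nterms; i++)
        if (!*e->terms[i])
            return 0;
    return 1;
}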
Depending on how predictable the inputs are, you might be able to get a lot of performance out of ranking each expression's variables by their likelihood of being false, and reordering the expression from most likely to least likely.
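Since the expressions are constant, that reordering can happen once, ahead of time. A hedged sketch, assuming you have estimated each term's false rate from sample inputs (the Term struct and p_false field are illustrative assumptions):

#include <stdlib.h>

/* Hypothetical per-term record: where the boolean lives, plus an
   offline-estimated probability that it is false. */
typedef struct {
    const unsigned char *ptr;
    double p_false;
} Term;

/* Sort descending by p_false so the most-likely-false term is tested
   first and short-circuits the expression soonest. */
static int by_p_false_desc(const void *a, const void *b)
{
    double pa = ((const Term *)a)->p_false;
    double pb = ((const Term *)b)->p_false;
    return (pa < pb) - (pa > pb);
}

/* One-time setup per expression:
   qsort(terms, nterms, sizeof(Term), by_p_false_desc); */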
You seem to be using a lot of data. It's a guess, but I'd say you'll get optimal behavior by preprocessing your expressions into cache-optimal passes. Consider the two expressions given:
S[1] = A[10] AND B[52] AND F[15] AND U[2]
S[2] = I[8] AND Z[4]
rewrite these as:
S[1] = 1;
S[1] &= A[10];
S[1] &= B[52];
S[1] &= F[15];
S[1] &= U[2];
S[2] = 1;
S[2] &= I[8];
S[2] &= Z[4];
Then sort all of the expressions together to create one long list of operations:
S[1] = 1;
S[2] = 1;
S[1] &= A[10];
S[1] &= B[52];
S[1] &= F[15];
S[2] &= I[8];
S[1] &= U[2];
S[2] &= Z[4];
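In C, that merged list might be a flat array of (result index, source pointer) operations run in one branch-free streaming loop; the Op struct and run_ops below are a sketch under that assumption, not a definitive implementation:

#include <stddef.h>

/* One '&=' operation from the merged list. */
typedef struct {
    unsigned short expr;        /* index into the result array S */
    const unsigned char *src;   /* the input boolean it reads */
} Op;

static void run_ops(unsigned char *S, size_t nexprs,
                    const Op *ops, size_t nops)
{
    for (size_t i = 0; i < nexprs; i++)
        S[i] = 1;                       /* the S[x] = 1; initializers */
    for (size_t i = 0; i < nops; i++)   /* no data-dependent branches */
        S[ops[i].expr] &= *ops[i].src;
}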
Consider the size of the machine cache on hand. We want all of the input vectors in cache. That probably can't happen, so we know we will be pulling the input vectors and the result vectors in and out of memory multiple times. We want to partition the available machine cache into three parts: an input-vector chunk, a result-vector chunk, and some working space (where the current list of operations will be pulled from).
Now walk the list of operations, pulling out the ones that fall into the A-I and S[1]-S[400] ranges. Then walk it again, pulling J-T (or whatever fits in cache) next; once you get to the end of the operations list, repeat for S[401]-S[800]. This is the final order of execution for the operations. Note that this can be parallelized without contention across the S bands.
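A sketch of that one-time partitioning, assuming each operation also records which input vector it reads so it can be bucketed; the band sizes and names here are illustrative, to be tuned against the actual cache:

#include <stdlib.h>

enum { RESULT_BAND = 400, INPUT_BAND = 9 };  /* e.g. S[1]-S[400], A-I */

typedef struct {
    unsigned short expr;        /* index into S */
    unsigned short vec;         /* input vector id: A=0, B=1, ... */
    const unsigned char *src;
} BandedOp;

/* Order by result band first, then by input band within it, matching
   the walk described above. */
static int by_band(const void *a, const void *b)
{
    const BandedOp *x = a, *y = b;
    int xr = x->expr / RESULT_BAND, yr = y->expr / RESULT_BAND;
    if (xr != yr) return xr - yr;
    return x->vec / INPUT_BAND - y->vec / INPUT_BAND;
}

/* One-time: qsort(ops, nops, sizeof(BandedOp), by_band);
   each result band then touches a disjoint slice of S, so bands can
   run on separate threads without contention. */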
The downside is that you do not get the early-out behavior. The upside is that you only take cache misses as you transition between blocks of computation. For such a large data set, I suspect this (and the elimination of all branching) will overwhelm the early-out advantage.
If you still want to try the early-out optimization, you can; it is just harder to implement. Consider: once you have your cache bracket A-I & S[1]-S[400], and you have created a list of operations across that bracket:
S[1] &= A[10];
S[1] &= B[52];
S[1] &= F[15];
S[2] &= I[8];
You can then reorder the operations to group them by S[x] (which this example already does). Now if you find A[10] is false, you can "early out" to the S[2] block. As for how to implement this: each operation now needs to know how far to skip forward from the current operation:
Operation[x ] => (S[1] &= A[10], on false, x+=3)
Operation[x+1] => (S[1] &= B[52], on false, x+=2)
Operation[x+2] => (S[1] &= F[15], on false, x+=1)
Operation[x+3] => (S[2] &= I[8]...
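A sketch of such a skip-aware executor, reading the x+= counts above as the distance from the current operation to the first operation of the next S[x] group (SkipOp and run_skip_ops are hypothetical names):

#include <stddef.h>

typedef struct {
    unsigned short expr;        /* index into S */
    const unsigned char *src;   /* the input boolean it reads */
    unsigned short skip;        /* distance to the next S[x] group */
} SkipOp;

static void run_skip_ops(unsigned char *S, const SkipOp *ops, size_t nops)
{
    for (size_t i = 0; i < nops; ) {
        if (*ops[i].src) {
            i++;                    /* term true: keep AND-ing */
        } else {
            S[ops[i].expr] = 0;     /* the AND has failed */
            i += ops[i].skip;       /* early out to the next group */
        }
    }
}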
Again, I suspect simply adding the branching will be slower than just performing all of the other work. This is not a full early-out process, since when you move to the next input block chunk you'll have to re-inspect each S[x] value accessed to determine whether it has already failed and should be skipped.