Consider the following code:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>
int main (int argc, char *argv[])
{
    time_t seed;
    time (&seed);
    srand ((unsigned int) seed);
    int i, j, k, l;
    // init random values s1 .. s8
    int s[8];
    for (l = 0; l < 8; l++) s[l] = rand ();
    // zero result (unsigned, so the running sum wraps instead of overflowing a signed int)
    unsigned int r[16];
    for (j = 0; j < 16; j++) r[j] = 0;
    // do 100 random xor functions
    for (i = 0; i < 100; i++)
    {
        // generate a random function to show why CSE must be computed at run time
        int steps[16];
        for (j = 0; j < 16; j++) steps[j] = rand ();
        // _here_ is optimization possible
        // run function MANY times to show that optimization makes sense
        for (l = 0; l < 1000000; l++)
        {
            for (j = 0; j < 16; j++)
            {
                int tmp = 0;
                for (k = 0; k < 8; k++) tmp ^= ((steps[j] >> k) & 1) ? s[k] : 0;
                r[j] += tmp;
            }
        }
        for (j = 0; j < 16; j++) printf ("%08x\n", r[j]);
        puts ("");
    }
    return 0;
}
Inside the code, the following unrolled function is executed many times in a loop:
r[ 0] += s01 ^ s03;
r[ 1] += s02 ^ s04;
r[ 2] += s03 ^ s05;
r[ 3] += s02;
r[ 4] += s03;
r[ 5] += s04 ^ s06;
r[ 6] += s03;
r[ 7] += s04;
r[ 8] += s02 ^ s04 ^ s05 ^ s07;
r[ 9] += s03 ^ s04 ^ s05 ^ s07;
r[10] += s04 ^ s05 ^ s06;
r[11] += s05 ^ s06 ^ s08;
r[12] += s03 ^ s06;
r[13] += s06;
r[14] += s02 ^ s03 ^ s04 ^ s05 ^ s06 ^ s07;
r[15] += s03 ^ s04 ^ s05 ^ s06;
That makes a total of 23 XORs.
But this implementation is wasteful, because it recomputes the same subexpressions several times. An optimized version is this:
int s04___s05 = s04 ^ s05;
int s03___s06 = s03 ^ s06;
int s04___s05___s07 = s04___s05 ^ s07;
int s03___s04___s05___s06 = s03___s06 ^ s04___s05;
r[ 0] += s01 ^ s03;
r[ 1] += s02 ^ s04;
r[ 2] += s03 ^ s05;
r[ 3] += s02;
r[ 4] += s03;
r[ 5] += s04 ^ s06;
r[ 6] += s03;
r[ 7] += s04;
r[ 8] += s02 ^ s04___s05___s07;
r[ 9] += s03 ^ s04___s05___s07;
r[10] += s04___s05 ^ s06;
r[11] += s05 ^ s06 ^ s08;
r[12] += s03___s06;
r[13] += s06;
r[14] += s02 ^ s03___s04___s05___s06 ^ s07;
r[15] += s03___s04___s05___s06;
That makes a total of 15 XORs.
I am searching for an algorithm that automates this step and finds a solution that uses the lowest number of XORs.
If there are multiple solutions, it should pick the one that needs the least storage for precomputed values.
If there are still multiple solutions, it does not matter which one is chosen.
Some additional information:
I am a bit lost on how to write this.
We want to compute r[i]. Each one is the XOR of at most 8 inputs.
Now think of an expression such as s8 ^ s6 ^ s5 ^ s4 ^ s3 ^ s2 ^ s1 as the binary number 10111111: a bit is 1 if the corresponding s is used in the XOR, and 0 if not.
We can pre-compute all possible 2^8 variations:
t[0] = 0 (00000000, nothing)
t[1] = s1 (00000001)
t[2] = s2 (00000010)
t[3] = s2 ^ s1 (00000011)
t[4] = s3 (00000100)
t[5] = s3 ^ s1 (00000101)
...
t[255] = s8 ^ s7 ^ s6 ^ s5 ^ s4 ^ s3 ^ s2 ^ s1 (11111111)
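Then, inside the loop, if you want to calculate for example:
r[0] = s1 ^ s3
s1 ^ s3 in our representation is 00000101 = 5, which gives us the index into the pre-computed lookup table:
r[0] = t[5]
That solves your problem without any XOR in the loop: each r[i] costs one table lookup, and the table only has to be rebuilt when the s values change.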
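A minimal sketch of how that table could be built and used, assuming the layout of the question's code (s[0] is s1, and bit k of steps[j] selects s[k]). The names build_xor_table and apply_steps are made up for illustration, and r is taken as unsigned so the running sum wraps. Each entry reuses the entry with its lowest bit cleared, so filling the table costs at most one XOR per entry:

```c
/* Fill t[0..255] so that t[m] is the XOR of every s[k] whose bit k is set in m. */
static void build_xor_table(const int s[8], int t[256])
{
    t[0] = 0;
    for (unsigned m = 1; m < 256; m++) {
        unsigned rest = m & (m - 1);      /* m with its lowest set bit cleared */
        int k = 0;
        while (!((m >> k) & 1)) k++;      /* index of the lowest set bit */
        t[m] = t[rest] ^ s[k];            /* one XOR on top of an earlier entry */
    }
}

/* One pass of the inner function: a table lookup replaces the XOR chain. */
static void apply_steps(const int steps[16], const int t[256], unsigned r[16])
{
    for (int j = 0; j < 16; j++)
        r[j] += (unsigned)t[steps[j] & 0xff];   /* only the low 8 bits select inputs */
}
```

In the question's program the table would be built once after s is initialized, and apply_steps would replace the body of the 1000000-iteration loop.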
Let's first search for an abstract problem definition: You have a bitvector type with a length of 8 bits, which represents a combination of your 8 input signals. For each signal, you have a bitvector value like 10000000 (first signal) or 00100000 (third signal). These values are given. You want to generate the following values (I left out the trivial ones):
r[0] = 10100000
r[1] = 01010000
r[2] = 00101000
r[5] = 00010100
r[8] = 01011010
r[9] = 00111010
r[10] = 00011100
r[11] = 00001101
r[12] = 00100100
r[14] = 01111110
r[15] = 00111100
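To make the representation concrete, here is a small sketch that stores each goal as an 8-bit mask (s01 is the leftmost bit, as above) and counts the XORs needed when every goal is computed from scratch. The helper names popcount8 and naive_xor_count are mine, not part of the question:

```c
#include <stddef.h>

/* Count the bits set in an 8-bit mask, i.e. how many inputs a goal uses. */
static int popcount8(unsigned char m)
{
    int n = 0;
    while (m) {
        m &= (unsigned char)(m - 1);   /* clear the lowest set bit */
        n++;
    }
    return n;
}

/* XORs needed without any sharing: (inputs used - 1) for each goal. */
static int naive_xor_count(const unsigned char *goals, size_t count)
{
    int total = 0;
    for (size_t i = 0; i < count; i++) {
        int bits = popcount8(goals[i]);
        if (bits > 1) total += bits - 1;   /* trivial goals cost nothing */
    }
    return total;
}
```

Fed with the masks of all 16 results from the question (the trivial ones included), this gives the 23 XORs of the unoptimized version.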
We now want to search for the minimum number of combinations (executions of XOR) to generate these values. This is an optimization problem. I won't do a complete proof for the lowest number of XOR executions here, but this is what I get:
int i1 = s02 ^ s04; // 01010000
int i2 = s03 ^ s05; // 00101000
int i3 = s04 ^ s06; // 00010100
int i4 = s05 ^ s07; // 00001010
int i5 = s03 ^ s06; // 00100100
int i6 = i1 ^ i4; // 01011010
int i7 = i2 ^ i3; // 00111100
int i8 = s06 ^ s07; // 00000110
r[0] = s01 ^ s03;
r[1] = i1;
r[2] = i2;
r[5] = i3;
r[8] = i6;
r[9] = i7 ^ i8;
r[10] = i3 ^ s05;
r[11] = i4 ^ i8 ^ s08;
r[12] = i5;
r[14] = i6 ^ i5;
r[15] = i7;
That makes 14 XORs.
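One way to check such a hand-derived solution is to run it on the unit bitvectors: because every result is a plain XOR of inputs, feeding s01 = 10000000, s02 = 01000000 and so on must reproduce exactly the goal masks, and agreement there implies agreement for all inputs. The sketch below does that; eval_shared and shared_matches_goals are illustrative names, and the trivial results (r[3], r[4], ...) are filled in from the question:

```c
/* s[1]..s[8] hold s01..s08 (s[0] is unused); r receives all 16 results. */
static void eval_shared(const unsigned s[9], unsigned r[16])
{
    unsigned i1 = s[2] ^ s[4], i2 = s[3] ^ s[5], i3 = s[4] ^ s[6];
    unsigned i4 = s[5] ^ s[7], i5 = s[3] ^ s[6];
    unsigned i6 = i1 ^ i4, i7 = i2 ^ i3, i8 = s[6] ^ s[7];   /* 8 XORs so far */
    r[0] = s[1] ^ s[3]; r[1] = i1;   r[2] = i2;   r[3] = s[2];
    r[4] = s[3];        r[5] = i3;   r[6] = s[3]; r[7] = s[4];
    r[8] = i6;          r[9] = i7 ^ i8;
    r[10] = i3 ^ s[5];  r[11] = i4 ^ i8 ^ s[8];
    r[12] = i5;         r[13] = s[6];
    r[14] = i6 ^ i5;    r[15] = i7;                          /* 6 more XORs */
}

/* Return 1 if eval_shared reproduces every goal mask on the unit vectors. */
static int shared_matches_goals(void)
{
    static const unsigned goal[16] = {
        0xA0, 0x50, 0x28, 0x40, 0x20, 0x14, 0x20, 0x10,
        0x5A, 0x3A, 0x1C, 0x0D, 0x24, 0x04, 0x7E, 0x3C
    };
    unsigned s[9] = {0}, r[16];
    for (int k = 1; k <= 8; k++) s[k] = 0x80u >> (k - 1);    /* s01 is the leftmost bit */
    eval_shared(s, r);
    for (int j = 0; j < 16; j++)
        if (r[j] != goal[j]) return 0;
    return 1;
}
```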
To formulate a general algorithm: You start with a set S = {10000000, 01000000, ..., 00000001}. You need a weighting function that tells you the value of your set. Define this as: the number of XORs needed to calculate all goal values from values in S without storing additional temporary values, plus the number of values in S minus 8 (the initial values). The first part of the weighting function can be implemented with brute force (find all possible combinations for a goal value that use each value in S at most once, and choose the one with the fewest XOR executions).
To optimize the value of your weighting function, you combine two values from S with XOR and add the result to S, giving S1. Choose those two values which grant the lowest new value of the weighting function (again, this can be determined by brute force). S1 now has one more value (which will be a temporary value like the i values in my solution). To create this value, one XOR is needed (therefore, the weighting function counts the number of values in S).
Continue this step until you don't find any new value to add to S that reduces the value of the weighting function. The resulting set contains the initial values plus all temporary values you have to calculate. The steps you took will tell you how to calculate the intermediate values.
This is a greedy algorithm. It doesn't necessarily find the minimum number of XORs, but it shows you an easy way to at least get a good solution. It might be that the algorithm actually always finds the best solution, but this would have to be proven. If you want to be absolutely sure, you can do a complete traversal of all possible steps that reduce the value of the weighting function, starting with the initial S values. This would be a tree traversal, and the tree will be finite (the value cannot drop below 0), so it's definitely solvable.
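A rough sketch of the greedy search described above, under a few assumptions of mine: goals holds the 11 nontrivial masks from the example, the names weight and greedy are made up, and the brute-force part of the weighting function is replaced by a small dynamic program over all 256 masks that yields the same minimum (the fewest values of S, each used at most once, that XOR to a goal). As said, it is not guaranteed to find the optimum:

```c
#include <stddef.h>
#include <string.h>

#define NIN 8   /* number of input signals */

/* Nontrivial goal masks from the example; s01 is the leftmost bit. */
static const unsigned char goals[] = {
    0xA0, 0x50, 0x28, 0x14, 0x5A, 0x3A, 0x1C, 0x0D, 0x24, 0x7E, 0x3C
};

/* Weighting function: XORs to build every goal directly from the set,
   plus one XOR for each temporary value the set holds beyond the inputs. */
static int weight(const unsigned char *set, int n)
{
    int dist[256], next[256];   /* dist[v]: fewest set members that XOR to v */
    for (int v = 0; v < 256; v++) dist[v] = 1000;
    dist[0] = 0;
    for (int i = 0; i < n; i++) {            /* each member used at most once */
        for (int v = 0; v < 256; v++) {
            int via = dist[v ^ set[i]] + 1;
            next[v] = via < dist[v] ? via : dist[v];
        }
        memcpy(dist, next, sizeof dist);
    }
    int w = n - NIN;
    for (size_t g = 0; g < sizeof goals; g++)
        w += dist[goals[g]] - 1;             /* k members need k - 1 XORs */
    return w;
}

/* Repeatedly add the pairwise XOR that lowers the weight the most.
   set needs room for 256 entries; returns the new number of entries. */
static int greedy(unsigned char *set, int n)
{
    int best = weight(set, n);
    for (;;) {
        int pick = -1;
        for (int i = 0; i < n; i++)
            for (int j = i + 1; j < n; j++) {
                unsigned char c = set[i] ^ set[j];
                if (memchr(set, c, (size_t)n)) continue;   /* already known */
                set[n] = c;                                /* try it out */
                int w = weight(set, n + 1);
                if (w < best) { best = w; pick = c; }
            }
        if (pick < 0) break;                 /* nothing improves the weight */
        set[n++] = (unsigned char)pick;
    }
    return n;
}
```

Starting from the eight unit vectors, the weight is 23 (the unoptimized count), and every accepted step lowers it. The entries appended to set are the temporary values, in the order in which they have to be computed.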