I am writing a tablebase for a Japanese chess variant. To index the table base, I encode each chess position as an integer. In one of the encoding steps, I encode where the pieces are on the board. Since the actual method is a bit complicated, let me explain the problem in a simplified manner.
In the endgame tablebase, I have (let's say) six distinct chess pieces that I want to distribute over a board with 9 squares. I can naïvely represent their positions by a six-tuple (a, b, c, d, e, f), where each of the variables a to f is a number in the range 0 to 8 inclusive indicating where the corresponding chess piece is located.
However, this representation is not optimal: no two chess pieces can occupy the same square, but the encoding above happily allows this. We can encode the same position by a six-tuple (a, b', c', d', e', f'), where a is the same as before and b' is a number from 0 to 7 inclusive identifying the square the second piece is on. This works by assigning a number from 0 to 7 to each square the first piece is not on. For example, if the first piece is on square 3, the square numbers for the second piece are:
1st piece: 0 1 2 3 4 5 6 7 8
2nd piece: 0 1 2 - 3 4 5 6 7
The other pieces are encoded similarly: c' as a number from 0 to 6, d' as a number from 0 to 5, and so on. For example, the naïve encoding (5, 2, 3, 0, 7, 4) yields the compact encoding (5, 2, 2, 0, 3, 1):
1st: 0 1 2 3 4 5 6 7 8 --> 5
2nd: 0 1 2 3 4 - 5 6 7 --> 2
3rd: 0 1 - 2 3 - 4 5 6 --> 2
4th: 0 1 - - 2 - 3 4 5 --> 0
5th: - 0 - - 1 - 2 3 4 --> 3
6th: - 0 - - 1 - 2 - 3 --> 1
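For reference, the mapping described above can be written directly as a pair of simple loops. This is only a minimal sketch of the straightforward approach, not an efficient one; the names `encode_simple` and `decode_simple` and the `squares` parameter are made up for illustration, and the decoder assumes the board has at most 32 squares so that an occupancy bitmask fits in an `unsigned long`.

```c
/* Naive -> compact: a piece's index drops by one for every earlier
 * piece that sits on a lower-numbered square. */
void encode_simple(int dest[], const int src[], int n) {
    for (int i = 0; i < n; i++) {
        int idx = src[i];
        for (int j = 0; j < i; j++) {
            if (src[j] < src[i])
                idx--;
        }
        dest[i] = idx;
    }
}

/* Compact -> naive: walk the board and count only the squares that
 * are still free. Assumes squares <= 32 (illustrative limit). */
void decode_simple(int dest[], const int src[], int n, int squares) {
    unsigned long used = 0;  /* bit sq is set once square sq is occupied */
    for (int i = 0; i < n; i++) {
        int remaining = src[i];
        for (int sq = 0; sq < squares; sq++) {
            if (used & (1UL << sq))
                continue;            /* occupied squares get no number */
            if (remaining-- == 0) {
                dest[i] = sq;
                used |= 1UL << sq;
                break;
            }
        }
    }
}
```

Calling `encode_simple` on (5, 2, 3, 0, 7, 4) gives (5, 2, 2, 0, 3, 1), matching the table above, and `decode_simple` with `squares` set to 9 reverses it. The encoder is quadratic in the number of pieces and the decoder scans the whole board for each piece, which is the cost the question is trying to avoid.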
In my actual encoding, the number of pieces I want to encode is not fixed. The number of squares on the board however is.
How can I efficiently convert the naïve representation to the compact representation and vice versa? I use standard C99 for the program. In the context of this question, I am not interested in answers that use non-standard constructs, inline assembly or intrinsics.
As there seems to be some confusion about the question:
I have found a more elegant solution for up to 16 positions using 64-bit integers with a single loop for both encoding and decoding:
#include <stdio.h>
#include <stdlib.h>

/* Naive -> compact. Nibble k of state holds the current compact index of square k. */
void encode16(int dest[], int src[], int n) {
    unsigned long long state = 0xfedcba9876543210ULL;
    for (int i = 0; i < n; i++) {
        int p4 = src[i] * 4;
        dest[i] = (state >> p4) & 15;
        /* Decrement every nibble above square src[i]. The ULL suffix keeps
           the shift unsigned, so bits shifted out are simply discarded. */
        state -= 0x1111111111111110ULL << p4;
    }
}

/* Compact -> naive. Nibble k of state holds the k-th square that is still free. */
void decode16(int dest[], int src[], int n) {
    unsigned long long state = 0xfedcba9876543210ULL;
    for (int i = 0; i < n; i++) {
        int p4 = src[i] * 4;
        dest[i] = (state >> p4) & 15;
        /* Delete nibble src[i]: keep the nibbles below it, shift the rest down. */
        unsigned long long mask = ((unsigned long long)1 << p4) - 1;
        state = (state & mask) | ((state >> 4) & ~mask);
    }
}

int main(int argc, char *argv[]) {
    int naive[argc], compact[argc];
    int n = argc - 1;
    for (int i = 0; i < n; i++) {
        naive[i] = atoi(argv[i + 1]);
    }
    encode16(compact, naive, n);
    for (int i = 0; i < n; i++) {
        printf("%d ", compact[i]);
    }
    printf("\n");
    decode16(naive, compact, n);
    for (int i = 0; i < n; i++) {
        printf("%d ", naive[i]);
    }
    printf("\n");
    return 0;
}
The code uses 64-bit unsigned integers to hold arrays of 16 values in the range 0..15. Such an array can be updated in parallel in a single step; extracting a value is straightforward, and deleting a value is a bit more cumbersome but still takes only a few steps.
You could extend this method to 25 positions using non-portable 128-bit integers (the type __int128 is supported by both gcc and clang), encoding each position in 5 bits and taking advantage of the fact that 5 * 25 < 128, but the magic constants are more cumbersome to write.
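As a rough illustration of that 25-position variant, the sketch below builds the two constants in loops instead of writing them out as literals. It relies on the `unsigned __int128` extension, so it is not standard C99, and the names `u128`, `identity_state`, `ones_above_zero`, `encode25` and `decode25` are invented here for the example.

```c
/* Assumption: unsigned __int128 is a gcc/clang extension, not standard C99. */
typedef unsigned __int128 u128;

enum { SLOT_BITS = 5, SLOTS = 25 };

/* State whose k-th 5-bit slot holds k (analogue of 0xfedcba9876543210). */
static u128 identity_state(void) {
    u128 s = 0;
    for (int k = 0; k < SLOTS; k++)
        s |= (u128)k << (SLOT_BITS * k);
    return s;
}

/* A 1 in every slot except slot 0 (analogue of 0x1111111111111110). */
static u128 ones_above_zero(void) {
    u128 s = 0;
    for (int k = 1; k < SLOTS; k++)
        s |= (u128)1 << (SLOT_BITS * k);
    return s;
}

/* Naive -> compact for up to 25 squares numbered 0..24. */
void encode25(int dest[], const int src[], int n) {
    u128 state = identity_state();
    const u128 ones = ones_above_zero();
    for (int i = 0; i < n; i++) {
        int p = src[i] * SLOT_BITS;
        dest[i] = (int)((state >> p) & 31);
        state -= ones << p;  /* decrement every slot above src[i] */
    }
}

/* Compact -> naive for up to 25 squares numbered 0..24. */
void decode25(int dest[], const int src[], int n) {
    u128 state = identity_state();
    for (int i = 0; i < n; i++) {
        int p = src[i] * SLOT_BITS;
        dest[i] = (int)((state >> p) & 31);
        u128 mask = ((u128)1 << p) - 1;
        /* Delete slot src[i]: keep the slots below it, shift the rest down. */
        state = (state & mask) | ((state >> SLOT_BITS) & ~mask);
    }
}
```

Both functions follow the same steps as `encode16` and `decode16`, with 5-bit slots in place of nibbles. As with the 64-bit version, they assume the input squares are distinct and in range; otherwise a slot could borrow from its neighbour and corrupt the state.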