 

Obfuscated C Code Contest 2006. Please explain sykes2.c

How does this C program work?

main(_){_^448&&main(-~_);putchar(--_%64?32|-~7[__TIME__-_/8%8][">'txiZ^(~z?"-48]>>";;;====~$::199"[_*2&8|_/64]/(_&2?1:8)%8&1:10);} 

It compiles as is (tested on gcc 4.6.3) and prints the time at which it was compiled. On my system:

    !!  !!!!!!              !!  !!!!!!              !!  !!!!!!
    !!  !!  !!              !!      !!              !!  !!  !!
    !!  !!  !!              !!      !!              !!  !!  !!
    !!  !!!!!!    !!        !!      !!    !!        !!  !!!!!!
    !!      !!              !!      !!              !!  !!  !!
    !!      !!              !!      !!              !!  !!  !!
    !!  !!!!!!              !!      !!              !!  !!!!!!

Source: sykes2 - A clock in one line, sykes2 author hints

Some hints: there are no compile warnings by default. When compiled with -Wall, the following warnings are emitted:

sykes2.c:1:1: warning: return type defaults to ‘int’ [-Wreturn-type]
sykes2.c: In function ‘main’:
sykes2.c:1:14: warning: value computed is not used [-Wunused-value]
sykes2.c:1:1: warning: implicit declaration of function ‘putchar’ [-Wimplicit-function-declaration]
sykes2.c:1:1: warning: suggest parentheses around arithmetic in operand of ‘|’ [-Wparentheses]
sykes2.c:1:1: warning: suggest parentheses around arithmetic in operand of ‘|’ [-Wparentheses]
sykes2.c:1:1: warning: control reaches end of non-void function [-Wreturn-type]
corny asked Mar 13 '13 18:03



1 Answer

Let's de-obfuscate it.

Indenting:

main(_) {
    _^448 && main(-~_);
    putchar(--_%64
        ? 32 | -~7[__TIME__-_/8%8][">'txiZ^(~z?"-48] >> ";;;====~$::199"[_*2&8|_/64]/(_&2?1:8)%8&1
        : 10);
}

Introducing variables to untangle this mess:

main(int i) {
    if(i^448)
        main(-~i);
    if(--i % 64) {
        char a = -~7[__TIME__-i/8%8][">'txiZ^(~z?"-48];
        char b = a >> ";;;====~$::199"[i*2&8|i/64]/(i&2?1:8)%8;
        putchar(32 | (b & 1));
    } else {
        putchar(10); // newline
    }
}

Note that -~i == i+1 in two's complement: ~i equals -i-1, so negating it gives i+1. Therefore, we have

main(int i) {
    if(i != 448)
        main(i+1);
    i--;
    if(i % 64 == 0) {
        putchar('\n');
    } else {
        char a = -~7[__TIME__-i/8%8][">'txiZ^(~z?"-48];
        char b = a >> ";;;====~$::199"[i*2&8|i/64]/(i&2?1:8)%8;
        putchar(32 | (b & 1));
    }
}

Now, note that a[b] is the same as b[a], and apply the -~ == 1+ change again:

main(int i) {
    if(i != 448)
        main(i+1);
    i--;
    if(i % 64 == 0) {
        putchar('\n');
    } else {
        char a = (">'txiZ^(~z?"-48)[(__TIME__-i/8%8)[7]] + 1;
        char b = a >> ";;;====~$::199"[(i*2&8)|i/64]/(i&2?1:8)%8;
        putchar(32 | (b & 1));
    }
}

Converting the recursion to a loop and sneaking in a bit more simplification:

// please don't pass any command-line arguments
main() {
    int i;
    for(i=447; i>=0; i--) {
        if(i % 64 == 0) {
            putchar('\n');
        } else {
            char t = __TIME__[7 - i/8%8];
            char a = ">'txiZ^(~z?"[t - 48] + 1;
            int shift = ";;;====~$::199"[(i*2&8) | (i/64)];
            if((i & 2) == 0)
                shift /= 8;
            shift = shift % 8;
            char b = a >> shift;
            putchar(32 | (b & 1));
        }
    }
}

This outputs one character per iteration. Every 64th character is a newline. Otherwise, it uses a pair of data tables to figure out what to output, and prints either character 32 (a space) or character 33 (a !). The first table (">'txiZ^(~z?") is a set of 11 bitmaps (one for each of the digits 0-9, plus one for the colon) describing the appearance of each character, and the second table (";;;====~$::199") selects the appropriate bit to display from the bitmap.

The second table

Let's start by examining the second table, int shift = ";;;====~$::199"[(i*2&8) | (i/64)];. i/64 is the line number (6 to 0) and i*2&8 is 8 iff i is 4, 5, 6 or 7 mod 8.

if((i & 2) == 0) shift /= 8; shift = shift % 8 selects either the high octal digit (for i%8 = 0,1,4,5) or the low octal digit (for i%8 = 2,3,6,7) of the table value. The shift table ends up looking like this:

row col val
6   6-7 0
6   4-5 0
6   2-3 5
6   0-1 7
5   6-7 1
5   4-5 7
5   2-3 5
5   0-1 7
4   6-7 1
4   4-5 7
4   2-3 5
4   0-1 7
3   6-7 1
3   4-5 6
3   2-3 5
3   0-1 7
2   6-7 2
2   4-5 7
2   2-3 3
2   0-1 7
1   6-7 2
1   4-5 7
1   2-3 3
1   0-1 7
0   6-7 4
0   4-5 4
0   2-3 3
0   0-1 7

or in tabular form

00005577
11775577
11775577
11665577
22773377
22773377
44443377

Note that the author used the null terminator for the first two table entries (sneaky!).

This is designed after a seven-segment display, with 7s as blanks. So, the entries in the first table must define the segments that get lit up.

The first table

__TIME__ is a special macro defined by the preprocessor. It expands to a string constant containing the time at which the preprocessor was run, in the form "HH:MM:SS". Observe that it contains exactly 8 characters. Note that 0-9 have ASCII values 48 through 57 and : has ASCII value 58. The output is 64 characters per line, so that leaves 8 characters per character of __TIME__.

7 - i/8%8 is thus the index of __TIME__ that is presently being output (the 7- is needed because we are iterating i downwards). So, t is the character of __TIME__ being output.

a ends up equalling the following in binary, depending on the input t:

0 00111111
1 00101000
2 01110101
3 01111001
4 01101010
5 01011011
6 01011111
7 00101001
8 01111111
9 01111011
: 01000000

Each number is a bitmap describing the segments that are lit up in our seven-segment display. Since the characters are all 7-bit ASCII, the high bit is always cleared. Thus, 7 in the segment table always prints as a blank. The second table looks like this with the 7s as blanks:

000055
11  55
11  55
116655
22  33
22  33
444433

So, for example, 4 is 01101010 (bits 1, 3, 5, and 6 set), which prints as

----!!--
!!--!!--
!!--!!--
!!!!!!--
----!!--
----!!--
----!!--

To show we really understand the code, let's adjust the output a bit with this table:

  00
11  55
11  55
  66
22  33
22  33
  44

This is encoded as "?;;?==? '::799\x07". For artistic purposes, we'll add 64 to a few of the characters (since only the low 6 bits are used, this won't affect the output); this gives "?{{?}}?gg::799G" (note that the 8th character is unused, so we can actually make it whatever we want). Putting our new table in the original code:

main(_){_^448&&main(-~_);putchar(--_%64?32|-~7[__TIME__-_/8%8][">'txiZ^(~z?"-48]>>"?{{?}}?gg::799G"[_*2&8|_/64]/(_&2?1:8)%8&1:10);} 

we get

          !!              !!                              !!
    !!  !!              !!  !!  !!  !!              !!  !!  !!
    !!  !!              !!  !!  !!  !!              !!  !!  !!
          !!      !!              !!      !!
    !!  !!  !!          !!  !!      !!              !!  !!  !!
    !!  !!  !!          !!  !!      !!              !!  !!  !!
          !!              !!                              !!

just as we expected. It's not as solid-looking as the original, which explains why the author chose to use the table he did.

nneonneo answered Oct 07 '22 10:10
