 

Obfuscated way of accessing a character in string

Today I found an interesting piece of code:

auto ch = (double(), (float(), int()))["\t\a\r\n\0"]["abcdefghij"];

which works the same as:

char str[] = "abcdefghij";
char ch = str['\t'];

Why is this even possible? In particular, why does the compiler pick the first char from the string and use it as a subscript instead of reporting an error?

erjot asked Sep 23 '10 16:09


2 Answers

First of all, all that double and float stuff is pure misdirection. The comma operator evaluates its left operand, discards the result, and yields its right operand, so (double(), (float(), int())) boils down to just int(), although it creates and discards a double and a float value along the way. So consider:

 auto ch = int()["\t\a\r\n\0"]["abcdefghij"];
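
The comma-operator step can be checked in isolation. The following is only a minimal sketch, assuming a C++11 compiler; the function name `comma_result` is made up for this illustration and does not appear in the original code.

```cpp
#include <type_traits>

// The parenthesized comma expression takes the type of its rightmost operand.
static_assert(
    std::is_same<decltype((double(), (float(), int()))), int>::value,
    "the comma expression should have type int");

// Hypothetical helper: returns the value of the comma expression.
// The double and float temporaries are created and thrown away;
// only the value-initialized int (which is 0) survives.
int comma_result() {
    return (double(), (float(), int()));
}
```

A compiler may warn that the left operands have no effect, which is exactly the point: they are discarded.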

The first part of this that will be evaluated is

 int()["\t\a\r\n\0"]

Now, recognize that int() value-initializes an int, which gives it the value 0. So the expression is equivalent to:

 0["\t\a\r\n\0"]

It's a fairly well-known trick in C and C++ that a[b] and b[a] are equivalent for the built-in subscript operator, since it is defined as a[b] === *(a + b) and addition is commutative. So this is really the same as:

 "\t\a\r\n\0"[0]

which is of course equal to '\t'. Now the full piece of code is:

 auto ch = '\t'["abcdefghij"];

which for the same reason is equivalent to:

 auto ch = "abcdefghij"['\t'];

which of course could also be written as:

char str[] = "abcdefghij";
char ch = str['\t'];

This last form simply gives the "abcdefghij" string a name and forgoes the C++0x auto keyword when declaring ch.

Finally, note that '\t' is equal to 9, since the tab character has ASCII value 9, so str['\t'] is the same as str[9]. str consists of 10 characters followed by a NUL terminator (\0), which is implicitly appended to the string literal it was initialized from, so index 9 refers to the tenth and last visible character.

So in both cases the final value of ch is 'j'.
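
As a sanity check, each intermediate form from the walkthrough can be wrapped in a small function and compared. This is just a sketch: the `step_*` function names are invented here for illustration, and the first static_assert makes the ASCII assumption explicit.

```cpp
// Assumption: an ASCII-compatible execution character set, where tab is 9.
static_assert('\t' == 9, "tab is expected to have the value 9");
// Ten visible characters plus the implicit NUL terminator.
static_assert(sizeof("abcdefghij") == 11, "string literal should occupy 11 chars");

// The original obfuscated expression.
char step_original() { return (double(), (float(), int()))["\t\a\r\n\0"]["abcdefghij"]; }

// After dropping the discarded comma operands.
char step_int() { return int()["\t\a\r\n\0"]["abcdefghij"]; }

// After evaluating 0["\t\a\r\n\0"] to '\t'.
char step_tab() { return '\t'["abcdefghij"]; }

// After swapping the operands of the subscript.
char step_plain() { return "abcdefghij"['\t']; }

// The readable version with a named array.
char step_named() {
    char str[] = "abcdefghij";
    return str['\t'];
}
```

Some compilers warn about using a char as an array subscript here; the code is still well-formed.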

Tyler McHenry answered Oct 21 '22 06:10


I'll explain it as a series of rewrites:

auto ch = (double(), (float(), int()))["\t\a\r\n\0"]["abcdefghij"];

is equivalent to the following (just evaluate all the double, float, and int temporaries joined by the comma operator):

auto ch = (0["\t\a\r\n\0"])["abcdefghij"];

Now the standard says that:

x[y] == *(x + y)

It does not matter which of the two operands is the pointer, so you get:

0["\t\a\r\n\0"] == "\t\a\r\n\0"[0] == '\t';
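
The identity can also be observed directly with pointer arithmetic. The sketch below is illustrative only; the function name `subscript_matches_pointer_arithmetic` and the local variable `s` are made up for this example.

```cpp
// Hypothetical helper: returns true if both operand orders of the built-in
// subscript agree with explicit pointer arithmetic on the same string.
bool subscript_matches_pointer_arithmetic() {
    const char* s = "\t\a\r\n\0";

    // x[y] and y[x] are both defined as *(x + y).
    bool first_element = (0[s] == s[0]) && (s[0] == *(s + 0)) && (s[0] == '\t');

    // The same holds for any in-range index, here 2 (the '\r' character).
    bool third_element = (2[s] == *(2 + s)) && (s[2] == '\r');

    // Both spellings even designate the same object, not just equal values.
    bool same_address = (&2[s] == s + 2);

    return first_element && third_element && same_address;
}
```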
Yakov Galka answered Oct 21 '22 05:10