How is this valid Java code? (obfuscated Java)

This code looks obviously incorrect and yet it happily compiles and runs on my machine. Can someone explain how this works? For example, what makes the ")" after the class name valid? What about the random words strewn around?

class M‮{public static void main(String[]a‭){System.out.print(new char[]{'H','e','l','l','o',' ','W','o','r','l','d','!'});}}

Test online: https://ideone.com/t1W5Vm
Source: https://codegolf.stackexchange.com/a/60561

WoodenKitty asked Apr 24 '16 14:04



3 Answers

One way to decipher what is going on is to look at the program character-by-character (demo).

There you will find that the characters at positions 7 and 42 (counting from zero) are two special Unicode characters: RLO (right-to-left override, U+202E) and LRO (left-to-right override, U+202D).
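A character-by-character scan like this can be sketched in a few lines of Java. This is only an illustration, not the linked demo: the class name `FormatCharScan` and the helper `formatCharPositions` are made up here, and the invisible characters are written as `\u` escapes so that they stay visible.

```java
import java.util.ArrayList;
import java.util.List;

public class FormatCharScan {
    /** Returns the zero-based indices of all Unicode format (category Cf) characters. */
    public static List<Integer> formatCharPositions(String text) {
        List<Integer> positions = new ArrayList<>();
        for (int i = 0; i < text.length(); i++) {
            if (Character.getType(text.charAt(i)) == Character.FORMAT) {
                positions.add(i);
            }
        }
        return positions;
    }

    public static void main(String[] args) {
        // The program from the question, with the two hidden characters spelled out as escapes.
        String source = "class M\u202E{public static void main(String[]a\u202D)"
                + "{System.out.print(new char[]{'H','e','l','l','o',' ','W','o','r','l','d','!'});}}";
        for (int i : formatCharPositions(source)) {
            // Print the index, the code point, and the official Unicode name.
            System.out.printf("index %d: U+%04X %s%n",
                    i, (int) source.charAt(i), Character.getName(source.charAt(i)));
        }
    }
}
```

For the string above, the scan reports indices 7 and 42, matching the code points U+202E and U+202D.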

Once you remove them, the program starts to look normal:

class M{public static void main(String[]a){System.out.print(new char[]{'H','e','l','l','o',' ','W','o','r','l','d','!'});}}

The obfuscated program compiles because both are Unicode format characters, which the Java compiler treats as ignorable inside identifiers.
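The standard library exposes this classification through `Character.isIdentifierIgnorable`, so the claim can be checked directly. The sketch below assumes that this is the property the compiler relies on; the class `StripIgnorable` and its helper `stripIgnorable` are hypothetical names used only for illustration.

```java
public class StripIgnorable {
    /** Returns the text with every identifier-ignorable code point removed. */
    public static String stripIgnorable(String source) {
        StringBuilder out = new StringBuilder(source.length());
        source.codePoints()
              .filter(cp -> !Character.isIdentifierIgnorable(cp))
              .forEach(out::appendCodePoint);
        return out.toString();
    }

    public static void main(String[] args) {
        // U+202E (RLO) and U+202D (LRO), the two characters found in the question.
        int[] codePoints = {0x202E, 0x202D};
        for (int cp : codePoints) {
            System.out.printf("U+%04X ignorable=%b identifierPart=%b%n",
                    cp, Character.isIdentifierIgnorable(cp), Character.isJavaIdentifierPart(cp));
        }
        // Stripping the ignorable characters recovers the readable declaration.
        System.out.println(stripIgnorable("class M\u202E{public static void main(String[]a\u202D){}}"));
    }
}
```

Both code points are reported as ignorable and as valid identifier parts, which is why `M` followed by RLO and `a` followed by LRO are still accepted as identifiers.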

Sergey Kalinichenko answered Oct 23 '22 01:10



This is valid Java code, but it contains invisible, zero-width Unicode "right-to-left" control characters of the kind used with Arabic text. Try placing your cursor in the text and pressing the right arrow. There's one between "M" and ")", and one between "char[]" and "a[]".

I tried to format the code, but it's just frustrating to navigate in it.

Bálint answered Oct 22 '22 23:10



You will find two Unicode characters in your source, shown here as their UTF-8 byte sequences:

0xE2 0x80 0xAE (U+202E, right-to-left override) http://www.fileformat.info/info/unicode/char/202e/index.htm

0xE2 0x80 0xAD (U+202D, left-to-right override) http://www.fileformat.info/info/unicode/char/202d/index.htm

Together they cause the part {public static void main(String[]a to be displayed right to left.
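To confirm that these byte sequences are the UTF-8 encodings of the two override characters, a short check along these lines should do. The class `Utf8Bytes` and the helper `utf8Hex` are illustrative names, not part of any library.

```java
import java.nio.charset.StandardCharsets;

public class Utf8Bytes {
    /** Formats the UTF-8 encoding of a code point as space-separated hex bytes. */
    public static String utf8Hex(int codePoint) {
        byte[] bytes = new String(Character.toChars(codePoint)).getBytes(StandardCharsets.UTF_8);
        StringBuilder hex = new StringBuilder();
        for (byte b : bytes) {
            if (hex.length() > 0) {
                hex.append(' ');
            }
            // Mask to 0..255 so negative byte values print as two hex digits.
            hex.append(String.format("0x%02X", b & 0xFF));
        }
        return hex.toString();
    }

    public static void main(String[] args) {
        System.out.println("U+202E -> " + utf8Hex(0x202E)); // 0xE2 0x80 0xAE
        System.out.println("U+202D -> " + utf8Hex(0x202D)); // 0xE2 0x80 0xAD
    }
}
```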

revau.lt answered Oct 23 '22 00:10
