
Why proguard does not obfuscate method body?

I am using ProGuard to obfuscate my .jar program. Everything works fine, except for the fact that ProGuard does not obfuscate local variables in method bodies. Here is an example:

Raw:

[image: screenshot of the raw code]

Obfuscated:

[image: screenshot of the obfuscated code]

The variable names that are highlighted in yellow should be obfuscated, but they are not. How can I obfuscate them too (i.e. have them renamed to a, b, c, etc.)?

Here is my ProGuard config: http://pastebin.com/sb3DMRcC (the above method is NOT from one of the excluded classes).

Victor2748 asked Jul 20 '15 02:07



1 Answer

Why proguard does not obfuscate method body?

Because it can't.
By default, the names of method arguments and local variables are simply not stored when compiling (javac only records them as optional debug information, e.g. when invoked with -g).
The names you're seeing are generated by your decompiler.
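
If you want to check this from inside Java, reflection can at least tell you whether argument names made it into the class file (local variable names aren't reachable through reflection at all). The following is a minimal sketch; the class ParameterNameCheck and its sample method are made up for illustration. Parameter.isNamePresent() only returns true if the class was compiled with javac's -parameters flag, otherwise getName() hands back a synthesized name of the form argN.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Parameter;

// Illustration only: class and method names are hypothetical.
class ParameterNameCheck
{
    static void sample(String[] myArgs) {}

    // Reports what reflection knows about the first parameter of sample().
    static String reportedName()
    {
        try
        {
            Method m = ParameterNameCheck.class.getDeclaredMethod("sample", String[].class);
            Parameter p = m.getParameters()[0];
            // True only if the class file carries parameter names (javac -parameters).
            return p.isNamePresent()
                ? "stored: " + p.getName()
                : "synthesized: " + p.getName();
        }
        catch (NoSuchMethodException e)
        {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args)
    {
        System.out.println(reportedName());
    }
}
```

Compiled with a plain javac call, this should report a synthesized name rather than myArgs.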

For compiled code, there are two ways to store data locally (i.e. within a method):

  • On the operand stack
  • In local variables

The operand stack is really just a stack.
See Table 7.2 from the Java VM Specification for stack operators.
You can pop values (pop), duplicate the top value (dup), swap the top two values (swap), and do the same with slightly altered behaviour (pop2, dup_x1, dup_x2, dup2, dup2_x1, dup2_x2).
Most, if not all, instructions that produce a result push that value onto the stack.

The important thing for this question is how things on the stack are referred to, which is the same as with any other stack:
relative to the top position, and based on the instruction used.
There are no assigned numbers or names; it's just whatever's currently there.
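
To make that a bit more tangible, here is a toy model of such a stack in plain Java. It's only a sketch to show that pop, dup and swap work purely on "whatever is on top"; the class OperandStackDemo is invented for this example and has nothing to do with how a real JVM implements its operand stack.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Toy model of an operand stack (illustration only, not real JVM code).
class OperandStackDemo
{
    private final Deque<Object> stack = new ArrayDeque<>();

    void push(Object value) { stack.push(value); }

    // pop: remove the top value
    Object pop() { return stack.pop(); }

    // dup: duplicate the top value
    void dup() { stack.push(stack.peek()); }

    // swap: exchange the top two values
    void swap()
    {
        Object top = stack.pop();
        Object below = stack.pop();
        stack.push(top);
        stack.push(below);
    }

    int size() { return stack.size(); }

    public static void main(String[] args)
    {
        OperandStackDemo s = new OperandStackDemo();
        s.push("a");
        s.push("b");
        s.swap(); // "a" is on top now
        s.dup();  // top to bottom: a, a, b
        System.out.println(s.pop() + " " + s.pop() + " " + s.pop()); // prints "a a b"
    }
}
```

Note that nothing in there has a name or an index; every operation only knows about the top of the stack.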

Now, for the so-called "local variables":

Think of them more as an ArrayList than variables in Java.
Because that's exactly how you access them: by index.
For the variables 0 to 3, there are special single-byte instructions because they are used so often; all other variables can only be accessed via a two-byte instruction, where the second byte is the index (indices above 255 additionally need the wide prefix).
See Table 7.2 again, "Loads" and "Stores".
The first five entries in both tables are the general (two-byte) store/load instructions for each data type.
Note that, for single values, boolean, char, byte and short are all converted to int, leaving only int, float and Object as single-slot values and long and double as double-slot ones.
The next twenty instructions are the instructions for direct access to variables 0 to 3, and the last eight instructions are to access array elements.
Note that inside arrays, boolean, byte, char and short are not converted to int, to not waste space, which is why there are three more instructions (not four, since boolean and byte share the same ones).

Both the maximum stack size and the number of local variables are limited, and must be given in the header of the Code attribute of each method, as defined in Section 4.7.3 (max_stack and max_locals).

The interesting thing about local variables, though, is that they double as method arguments, meaning that the number of local variables can never be lower than the number of method arguments.
Note that when counting values for the Java VM, variables of the type long and double are treated as two values, and need two "slots" accordingly.
Also note that for non-static methods, local variable 0 will be this, which requires another "slot" for itself.
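
The counting rules above can be sketched with a bit of reflection. This is just an illustration under the rules stated here (two slots for long and double, one extra slot for this); SlotCounter and its example methods are hypothetical names, and the result is only the number of slots taken by arguments, not the full max_locals value.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

// Illustration only: counts the local variable slots taken up by a method's arguments.
class SlotCounter
{
    static int argumentSlots(Method method)
    {
        // Non-static methods hold "this" in slot 0.
        int slots = Modifier.isStatic(method.getModifiers()) ? 0 : 1;
        for (Class<?> type : method.getParameterTypes())
        {
            // long and double need two slots, everything else needs one.
            slots += (type == long.class || type == double.class) ? 2 : 1;
        }
        return slots;
    }

    // Convenience lookup so callers don't have to handle the checked exception.
    static int argumentSlots(Class<?> owner, String name, Class<?>... parameterTypes)
    {
        try
        {
            return argumentSlots(owner.getDeclaredMethod(name, parameterTypes));
        }
        catch (NoSuchMethodException e)
        {
            throw new IllegalArgumentException(e);
        }
    }

    static void staticExample(int a, long b, String c) {}
    void instanceExample(double a, long b) {}

    public static void main(String[] args)
    {
        // 1 (int) + 2 (long) + 1 (String) = 4
        System.out.println(argumentSlots(SlotCounter.class, "staticExample", int.class, long.class, String.class));
        // 1 (this) + 2 (double) + 2 (long) = 5
        System.out.println(argumentSlots(SlotCounter.class, "instanceExample", double.class, long.class));
    }
}
```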

That being said, let's look at some code!

Example:

class Test
{
    public static void main(String[] myArgs) throws NumberFormatException
    {
        String myString = "42";
        int myInt = Integer.parseInt(myString);
        double myDouble = (double)myInt * 42.0d;
        System.out.println(myDouble);
    }
}

Here we have three local variables myString, myInt and myDouble, plus one argument myArgs.
In addition, we have two constants "42" and 42.0d, and a lot of external references:

  • java.lang.String[] - class
  • java.lang.NumberFormatException - class
  • java.lang.String - class
  • java.lang.Integer.parseInt - method
  • java.lang.System.out - field
  • java.io.PrintStream.println - method

And some exports: Test and main, plus the default constructor that the compiler will generate for us.

All constants, references and exports end up in the Constant Pool - the names of the local variables and arguments do not.

Compiling and disassembling the class (using javap -c Test) yields:

Compiled from "Test.java"
class Test {
  Test();
    Code:
       0: aload_0
       1: invokespecial #1                  // Method java/lang/Object."<init>":()V
       4: return

  public static void main(java.lang.String[]) throws java.lang.NumberFormatException;
    Code:
       0: ldc           #2                  // String 42
       2: astore_1
       3: aload_1
       4: invokestatic  #3                  // Method java/lang/Integer.parseInt:(Ljava/lang/String;)I
       7: istore_2
       8: iload_2
       9: i2d
      10: ldc2_w        #4                  // double 42.0d
      13: dmul
      14: dstore_3
      15: getstatic     #6                  // Field java/lang/System.out:Ljava/io/PrintStream;
      18: dload_3
      19: invokevirtual #7                  // Method java/io/PrintStream.println:(D)V
      22: return
}

Besides the default constructor, we can see our main method, step by step.
Note how myString is accessed with astore_1 and aload_1, myInt with istore_2 and iload_2, and myDouble with dstore_3 and dload_3.
myArgs isn't accessed anywhere, so there's no bytecode dealing with it either, but at the beginning of the method, a reference to the String array will be in local variable 0 (main is static, so there is no this), which is why myString starts at index 1.
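
If it helps, the listing of main can be replayed by hand in Java, with an array standing in for the local variables and a Deque for the operand stack. This is only a sketch (FrameReplay is a made-up class and the println call is left out), and unlike a real JVM it keeps the double in a single array element instead of two slots.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Illustration only: replays the bytecode of Test.main with an array and a stack.
class FrameReplay
{
    static double run(String[] args)
    {
        Object[] locals = new Object[5];               // 1 argument + String + int + double (2 slots)
        Deque<Object> stack = new ArrayDeque<>();
        locals[0] = args;                              // arguments occupy the first slots

        stack.push("42");                                    // ldc
        locals[1] = stack.pop();                             // astore_1
        stack.push(locals[1]);                               // aload_1
        stack.push(Integer.parseInt((String) stack.pop()));  // invokestatic Integer.parseInt
        locals[2] = stack.pop();                             // istore_2
        stack.push(locals[2]);                               // iload_2
        stack.push(((Integer) stack.pop()).doubleValue());   // i2d
        stack.push(42.0d);                                   // ldc2_w
        double right = (Double) stack.pop();
        double left = (Double) stack.pop();
        stack.push(left * right);                            // dmul
        locals[3] = stack.pop();                             // dstore_3 (slot 4 stays unused here)
        stack.push(locals[3]);                               // dload_3
        return (Double) stack.pop();                         // the value println would receive
    }

    public static void main(String[] args)
    {
        System.out.println(run(args)); // prints 1764.0
    }
}
```

Everything is addressed by index or by stack position; the names myString, myInt and myDouble never show up.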

javap will also show you the Constant Pool if you pass it the -v flag, but it doesn't really add any value to the output, since all relevant information from the Constant Pool is displayed in comments anyway.

But now, let's look at what the decompilers produce!

JD-GUI 0.3.5 (JD-Core 0.6.2):

import java.io.PrintStream;

class Test
{
  public static void main(String[] paramArrayOfString)
    throws NumberFormatException
  {
    String str = "42";
    int i = Integer.parseInt(str);
    double d = i * 42.0D;
    System.out.println(d);
  }
}

Procyon 0.5.28:

class Test
{
    public static void main(final String[] array) throws NumberFormatException {
        System.out.println(Integer.parseInt("42") * 42.0);
    }
}

Note how everything that was exported to the Constant Pool persists, while JD-GUI simply picks some names for local variables, and Procyon optimizes them out entirely.
The name of the argument - paramArrayOfString vs array (vs the original myArgs) - is a perfect example, though, to show that there is no "correct" name anymore, and the decompilers simply have to rely on some pattern of picking a name.
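
To illustrate what "some pattern of picking a name" could look like, here is a deliberately naive sketch that derives a name from nothing but the variable's type. It is a hypothetical heuristic written for this answer (NameGuesser is an invented class), not the actual algorithm of JD-GUI, Procyon or any other decompiler.

```java
// Illustration only: a made-up naming heuristic, not taken from any real decompiler.
class NameGuesser
{
    static String guessName(Class<?> type)
    {
        if (type.isArray())
        {
            // String[] -> arrayOfString
            return "arrayOf" + capitalize(guessName(type.getComponentType()));
        }
        if (type.isPrimitive())
        {
            // int -> i, double -> d
            return type.getName().substring(0, 1);
        }
        String simple = type.getSimpleName();
        // Anonymous classes have no simple name, so fall back to a generic one.
        return simple.isEmpty() ? "obj" : decapitalize(simple);
    }

    private static String capitalize(String s)
    {
        return Character.toUpperCase(s.charAt(0)) + s.substring(1);
    }

    private static String decapitalize(String s)
    {
        return Character.toLowerCase(s.charAt(0)) + s.substring(1);
    }

    public static void main(String[] args)
    {
        System.out.println(guessName(String[].class)); // arrayOfString
        System.out.println(guessName(int.class));      // i
        System.out.println(guessName(double.class));   // d
    }
}
```

A real decompiler also has to avoid name clashes and keywords, which this sketch ignores.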

I don't know where the "true" names in your decompiled code are coming from, but I'm fairly certain that they're not contained in the jar file.
Feature of your IDE maybe?

Siguza answered Sep 27 '22 16:09