Question

I have a strange jar file. When I open some of its classes in JD Decompiler, it shows code like this:

public final void a(ak aa) {
    this.jdField_a_of_type_Ak = aa;
}

public final void a(cn ccn) {
  this.jdField_a_of_type_Cn = ccn;
}

public final cN a() {
  return this.jdField_a_of_type_CN;
}

public final void a() {
  super.b();
}

public final boolean a() {
    return this.jdField_a_of_type_Boolean;
}

I just wonder how a compiler or obfuscator can produce class bytecode like that; I mean the method signatures. Does anyone know of an obfuscator that can do this?


Solution

As @Joachim Sauer correctly points out: the JVM specification imposes fewer constraints on method overloading in bytecode than the JLS does on Java programs.

From the JVM Specification (Section 4.6, Methods):

No two methods in one class file may have the same name and descriptor (§4.3.3).

And a method descriptor includes the return type (§4.3.3, Method Descriptors):

MethodDescriptor:
    ( ParameterDescriptor* ) ReturnDescriptor

The methods you mentioned in your question all have distinct descriptors, so they are okay:

public final void a(ak aa)     ->     (Lsomepkg1/ak;)V
public final void a(cn ccn)    ->     (Lsomepkg2/cn;)V
public final cN a()            ->     ()Lsomepkg3/cN;
public final void a()          ->     ()V
public final boolean a()       ->     ()Z
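
The mapping above can be reproduced with the JDK itself. The following is a minimal sketch, assuming JDK types (String, Integer, Thread) as hypothetical stand-ins for the obfuscated types ak, cn and cN, which are not available here. It uses java.lang.invoke.MethodType to print the descriptor for each of the five shapes and checks that all five are distinct:

```java
import java.lang.invoke.MethodType;
import java.util.LinkedHashSet;
import java.util.Set;

public class DescriptorDemo {

    /** Returns the JVM method descriptor for the given return and parameter types. */
    public static String descriptor(Class<?> returnType, Class<?>... parameterTypes) {
        return MethodType.methodType(returnType, parameterTypes).toMethodDescriptorString();
    }

    public static void main(String[] args) {
        // String, Integer and Thread are stand-ins for ak, cn and cN (assumption).
        Set<String> descriptors = new LinkedHashSet<>();
        descriptors.add(descriptor(void.class, String.class));   // (Ljava/lang/String;)V
        descriptors.add(descriptor(void.class, Integer.class));  // (Ljava/lang/Integer;)V
        descriptors.add(descriptor(Thread.class));               // ()Ljava/lang/Thread;
        descriptors.add(descriptor(void.class));                 // ()V
        descriptors.add(descriptor(boolean.class));              // ()Z

        descriptors.forEach(System.out::println);
        // Five distinct descriptors, so five methods named "a" can coexist in one class file.
        System.out.println("distinct descriptors: " + descriptors.size());
    }
}
```

The last three entries differ only in the part after the closing parenthesis, which is exactly the part that Java source-level overloading ignores.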

This is cleverly exploited by obfuscators. As a result, a valid bytecode program may no longer have a "directly corresponding" Java program. ProGuard does this, for instance. Here is a snippet from its manual:

-overloadaggressively

Specifies to apply aggressive overloading while obfuscating. Multiple fields and methods can then get the same names, as long as their arguments and return types are different (not just their arguments).

There are other similar techniques, for instance using the jsr bytecode instruction or using variable identifiers that are reserved words in the Java language. Here is a webpage listing a few techniques.


To answer the obvious follow-up question: How does the JVM know which method to call at the call-site?

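The invoke instructions (invokevirtual, invokestatic and so on) each reference a constant-pool entry that names the target class, the method name, and the complete method descriptor, including the return type. Two methods that differ only in return type are therefore never ambiguous at the call site.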
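
This can be observed without an obfuscator, because javac itself emits such methods: a covariant override gets a synthetic bridge method with the same name and parameters but a different return type. The sketch below is an illustration only; the class names Base, Derived and CallSiteDemo are made up for the example. It uses MethodHandles.Lookup.findVirtual, which looks a method up by name plus full method type, as an analogy for how an invoke instruction names its target:

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.util.Arrays;

public class CallSiteDemo {

    public static class Base {
        public Object get() { return "base"; }
    }

    // javac adds a synthetic bridge "Object get()" next to "String get()",
    // so Derived's class file holds two get() methods differing only in return type.
    public static class Derived extends Base {
        @Override
        public String get() { return "derived"; }
    }

    /** Counts the methods named "get" declared in Derived's class file. */
    public static long declaredGetCount() {
        return Arrays.stream(Derived.class.getDeclaredMethods())
                .filter(m -> m.getName().equals("get"))
                .count();
    }

    /** Looks up get() by name and exact return type, then invokes it on a Derived. */
    public static Object call(Class<?> returnType) {
        try {
            MethodHandle handle = MethodHandles.lookup()
                    .findVirtual(Derived.class, "get", MethodType.methodType(returnType));
            return handle.invoke(new Derived());
        } catch (Throwable t) {
            throw new IllegalStateException("no get() returning " + returnType.getName(), t);
        }
    }

    public static void main(String[] args) {
        System.out.println("declared get() methods: " + declaredGetCount());
        System.out.println("lookup with String return: " + call(String.class));
        System.out.println("lookup with Object return: " + call(Object.class));
        try {
            call(Integer.class); // no method has the descriptor ()Ljava/lang/Integer;
        } catch (IllegalStateException e) {
            System.out.println("lookup with Integer return failed: " + e.getCause());
        }
    }
}
```

Both successful lookups end up running the String-returning body, since the bridge simply delegates to it. The point is only that the return type is part of what identifies the method being looked up.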

OTHER TIPS

Java bytecode supports constructs that are not valid in Java source code. Obfuscators exploit that fact by rewriting bytecode to use those constructs (while still producing the same behavior as the un-obfuscated bytecode).

... An obfuscator produces method names/signatures like this because that's its job. Any obfuscator should work for this purpose.

The class has been compiled without debugging information (at least local variable information is missing) and obfuscated later.

One basic obfuscation strategy is to replace (nearly) all package, class and method names with new, meaningless names, so that the decompiled code is hard to understand.

Additional strategies are obfuscating strings and adding bytecode constructs that can't be decompiled to Java code.

You'll still be able to create a Java source equivalent of the obfuscated class file, but only with great effort.

Licensed under: CC-BY-SA with attribution