Question

I decompiled some Java code the other day and found this:

String s1 = "something";
String s2 = "something_else";

if (s1 == s2) {
    // Path 1
} else {
    // Path 2
}

Obviously, using '==' to test for string equality is bad.

But I wondered: this code has been compiled and decompiled. If all the strings were defined at compile time and interned, is it possible that s1.equals(s2) was optimized down to 's1 == s2'?


Solution

I highly doubt it. As a rule, Java compilers do very little by way of bytecode optimization, leaving optimization to the JIT phase.

I've experimented with this a little, and my compiler doesn't do anything interesting with the following:

public class Clazz {

    public static void main(String args[]) {
        final String s1 = "something";
        final String s2 = "something_else";
        if (s1.equals(s2)) {
            System.out.println("yes");
        } else {
            System.out.println("no");
        }
    }

}

This would probably be the easiest case to optimize. However, the bytecodes are:

  public static void main(java.lang.String[]);
    Code:
       0: ldc           #16                 // String something
       2: astore_1      
       3: ldc           #18                 // String something_else
       5: astore_2      
       6: ldc           #16                 // String something
       8: ldc           #18                 // String something_else
      10: invokevirtual #20                 // Method java/lang/String.equals:(Ljava/lang/Object;)Z
      13: ifeq          27
      16: getstatic     #26                 // Field java/lang/System.out:Ljava/io/PrintStream;
      19: ldc           #32                 // String yes
      21: invokevirtual #34                 // Method java/io/PrintStream.println:(Ljava/lang/String;)V
      24: goto          35
      27: getstatic     #26                 // Field java/lang/System.out:Ljava/io/PrintStream;
      30: ldc           #40                 // String no
      32: invokevirtual #34                 // Method java/io/PrintStream.println:(Ljava/lang/String;)V
      35: return        

I therefore strongly suspect the == was part of the original source code.

Other tips

No, it does not look like Java optimizes this away (by default).

I just benchmarked both solutions. If it is unoptimized, we expect to see s1.equals(s2) slower than s1 == s2. This is exactly what we see. If it were optimized, then s1.equals(s2) would take the same amount of time as s1==s2. However, they take different amounts of time (on the order of 50,000 nanoseconds). This is not a direct measurement of this compilation, but it is a reasonable inference.
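For anyone who wants to repeat that kind of measurement, here is a rough sketch of a timing loop. It is not the benchmark used above: the class name, method names and iteration count are made up for illustration, there is no warm-up phase, and the JIT can distort the numbers considerably, so a harness such as JMH is a better choice for anything serious.

```java
public class EqualsTimingSketch {

    // Counts how often a.equals(b) holds. The count is returned so that the
    // comparison result is actually used by the caller.
    static int countEquals(String a, String b, int iterations) {
        int hits = 0;
        for (int i = 0; i < iterations; i++) {
            if (a.equals(b)) {
                hits++;
            }
        }
        return hits;
    }

    // Same loop, but comparing references with == instead of contents.
    static int countSameReference(String a, String b, int iterations) {
        int hits = 0;
        for (int i = 0; i < iterations; i++) {
            if (a == b) {
                hits++;
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        String s1 = "something";
        String s2 = "something_else";
        int n = 10_000_000; // arbitrary iteration count (assumption)

        long t0 = System.nanoTime();
        int viaEquals = countEquals(s1, s2, n);
        long t1 = System.nanoTime();
        int viaReference = countSameReference(s1, s2, n);
        long t2 = System.nanoTime();

        // Timings vary from run to run and machine to machine.
        System.out.println("equals: " + (t1 - t0) + " ns, hits=" + viaEquals);
        System.out.println("==:     " + (t2 - t1) + " ns, hits=" + viaReference);
    }
}
```

Both loops report zero hits for these two strings; only the elapsed times are of interest, and they should be read as a rough indication at best.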

The reason this will not be optimized to == is that the == operator, for objects, compares references (the objects' memory addresses), not the contents of the objects themselves. Rewriting s1.equals(s2) as s1 == s2 would therefore only be safe if s1 and s2 were guaranteed to refer to the same object whenever their contents match.

The compiler generally cannot guarantee that, because two strings with identical contents can be distinct objects. The rewrite risks breaking code, so the compiler won't do it. It will leave the call to equals in place.
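The difference between comparing references and comparing contents can be seen in a small sketch. The class and method names here are invented for illustration; the behaviour of new String(...) and intern() is standard Java.

```java
public class StringIdentityDemo {

    // Reference comparison: true only when both operands are the same object.
    static boolean sameReference(String a, String b) {
        return a == b;
    }

    public static void main(String[] args) {
        String literal = "something";
        // new String(...) always allocates a distinct object with equal contents.
        String runtimeCopy = new String("something");

        System.out.println(sameReference(literal, runtimeCopy));          // false
        System.out.println(literal.equals(runtimeCopy));                  // true
        // intern() returns the canonical pooled instance, which is the literal.
        System.out.println(sameReference(literal, runtimeCopy.intern())); // true
    }
}
```

A blanket rewrite of equals to == would change the second result to the first, which is why it is only valid when both operands are known to be interned.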

The main rule is that the compiler optimizes only when it can deduce the exact value from the source code of a single class, because it performs all of its optimizations using only the smallest compilation unit, the class. Suppose I write this code:

public class Test
{
    private static final String i = "1";
    public static void main(String[] args)
    {
        if(i == "2")
            System.out.println("hello");
        System.out.println("world");
    }
}

The compiler sees all the code related to the statement within this class and optimizes out the if condition. After decompiling, the code looks like this:

public class Test
{
  private static final String i = "1";

  public static void main(String[] paramArrayOfString)
  {
    System.out.println("world");
  }
}

(I've used jd-gui)
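The same compile-time evaluation can be observed at run time without a decompiler. The sketch below uses invented class and method names and relies on the language rule that constant expressions of type String are folded by the compiler and interned, whereas a concatenation involving a non-constant operand creates a new object.

```java
public class ConstantFoldingDemo {

    private static final String PREFIX = "some"; // constant variable

    // Both operands are compile-time constants, so the concatenation is
    // evaluated by the compiler and the result is the interned "something".
    static boolean foldedAtCompileTime() {
        return (PREFIX + "thing") == "something";
    }

    // The parameter is not a constant, so the concatenation happens at run
    // time and produces a newly created String object.
    static boolean builtAtRunTime(String prefix) {
        return (prefix + "thing") == "something";
    }

    public static void main(String[] args) {
        System.out.println(foldedAtCompileTime());  // true
        System.out.println(builtAtRunTime("some")); // false
    }
}
```

The first comparison succeeds only because the compiler could deduce the value from constants visible in the class itself; once the value arrives through a parameter, nothing is folded.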

However, if you replace == with .equals, the compiler cannot assume anything about how the .equals method works. After the Test class has been compiled, you could hack your JDK and drop in another version of the java.lang.String class that returns true for "1".equals("2").

So, when thinking about which optimizations the compiler could perform, first consider how it would have to behave if any other class could be recompiled later.

As another example, look at how enum is implemented and why it needs such a "weird" approach.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow