Question

I have been told that creating a String instance like this

String s = new String("Don't do this"); // explicit

has a performance problem, since it creates two String instances: one for the double-quoted literal "Don't do this" and one for the new String() constructor call!

Today I had time to test it myself. I created two classes:

public class String1 {
    public static void main(String[] args) {
        String s = new String("Hello");
        System.out.println(s);
    }
}

public class String2 {
    public static void main(String[] args) {
        String s = "Hello";
        System.out.println(s);
    }
}

Here is the output of javap:

C:\jav>javap String1
Compiled from "String1.java"
public class String1 extends java.lang.Object{
    public String1();
    public static void main(java.lang.String[]);
}

C:\jav>javap String2
Compiled from "String2.java"
public class String2 extends java.lang.Object{
    public String2();
    public static void main(java.lang.String[]);
}

They seem to be the same; however, with the -c flag the outputs are different.

C:\jav>javap -c String1
Compiled from "String1.java"
public class String1 extends java.lang.Object{
public String1();
  Code:
  0:   aload_0
  1:   invokespecial   #1; //Method java/lang/Object."<init>":()V
  4:   return

public static void main(java.lang.String[]);
  Code:
  0:   new     #2; //class java/lang/String
  3:   dup
  4:   ldc     #3; //String Hello
  6:   invokespecial   #4; //Method java/lang/String."<init>":(Ljava/lang/String;)V
  9:   astore_1
  10:  getstatic       #5; //Field java/lang/System.out:Ljava/io/PrintStream;
  13:  aload_1
  14:  invokevirtual   #6; //Method java/io/PrintStream.println:(Ljava/lang/String;)V
  17:  return

}


C:\jav>javap -c String2
Compiled from "String2.java"
public class String2 extends java.lang.Object{
public String2();
  Code:
  0:   aload_0
  1:   invokespecial   #1; //Method java/lang/Object."<init>":()V
  4:   return

public static void main(java.lang.String[]);
  Code:
  0:   ldc     #2; //String Hello
  2:   astore_1
  3:   getstatic       #3; //Field java/lang/System.out:Ljava/io/PrintStream;
  6:   aload_1
  7:   invokevirtual   #4; //Method java/io/PrintStream.println:(Ljava/lang/String;)V
  10:  return

}

So here are my questions :) First, what are ldc, astore_1, etc.? Is there any documentation describing them? Second, can javac really not figure out that these two statements are equivalent?


Solution

Wikipedia has a very convenient summary of all the possible Java Bytecode instructions. Also, to get the full picture, it's better to use javap -v, to see the entire content of the file, including the constant pool:

Classfile /.../String1.class
  Last modified 02/05/2013; size 458 bytes
  MD5 checksum e3c355bf648c7441784ffc6b9765ba4d
  Compiled from "String1.java"
public class String1
  SourceFile: "String1.java"
  minor version: 0
  major version: 51
  flags: ACC_PUBLIC, ACC_SUPER
Constant pool:
   #1 = Methodref          #8.#17         //  java/lang/Object."<init>":()V
   #2 = Class              #18            //  java/lang/String
   #3 = String             #19            //  Hello
   #4 = Methodref          #2.#20         //  java/lang/String."<init>":(Ljava/lang/String;)V
   #5 = Fieldref           #21.#22        //  java/lang/System.out:Ljava/io/PrintStream;
   #6 = Methodref          #23.#24        //  java/io/PrintStream.println:(Ljava/lang/String;)V
   #7 = Class              #25            //  String1
   #8 = Class              #26            //  java/lang/Object
   #9 = Utf8               <init>
  #10 = Utf8               ()V
  #11 = Utf8               Code
  #12 = Utf8               LineNumberTable
  #13 = Utf8               main
  #14 = Utf8               ([Ljava/lang/String;)V
  #15 = Utf8               SourceFile
  #16 = Utf8               String1.java
  #17 = NameAndType        #9:#10         //  "<init>":()V
  #18 = Utf8               java/lang/String
  #19 = Utf8               Hello
  #20 = NameAndType        #9:#27         //  "<init>":(Ljava/lang/String;)V
  #21 = Class              #28            //  java/lang/System
  #22 = NameAndType        #29:#30        //  out:Ljava/io/PrintStream;
  #23 = Class              #31            //  java/io/PrintStream
  #24 = NameAndType        #32:#27        //  println:(Ljava/lang/String;)V
  #25 = Utf8               String1
  #26 = Utf8               java/lang/Object
  #27 = Utf8               (Ljava/lang/String;)V
  #28 = Utf8               java/lang/System
  #29 = Utf8               out
  #30 = Utf8               Ljava/io/PrintStream;
  #31 = Utf8               java/io/PrintStream
  #32 = Utf8               println
{
  public String1();
    flags: ACC_PUBLIC
    Code:
      stack=1, locals=1, args_size=1
         0: aload_0
         1: invokespecial #1                  // Method java/lang/Object."<init>":()V
         4: return
      LineNumberTable:
        line 1: 0

  public static void main(java.lang.String[]);
    flags: ACC_PUBLIC, ACC_STATIC
    Code:
      stack=3, locals=2, args_size=1
         0: new           #2                  // class java/lang/String
         3: dup
         4: ldc           #3                  // String Hello
         6: invokespecial #4                  // Method java/lang/String."<init>":(Ljava/lang/String;)V
         9: astore_1
        10: getstatic     #5                  // Field java/lang/System.out:Ljava/io/PrintStream;
        13: aload_1
        14: invokevirtual #6                  // Method java/io/PrintStream.println:(Ljava/lang/String;)V
        17: return
      LineNumberTable:
        line 3: 0
        line 4: 10
        line 5: 17
}

And now it's clear where ldc loads the constant from: entry #3 of the constant pool, a String entry that points to the Utf8 entry #19 (Hello).

As for why javac doesn't bother with this optimization: almost all optimization in Java is deferred to run time, where a different compiler, the JIT compiler, translates Java bytecode into native machine code. javac does make some effort to optimize the "common" cases, but it is nowhere near as aggressive as the JIT compiler.
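One of the "common" cases javac does handle is folding compile-time constant expressions. Here is a minimal sketch of that, not taken from the original answer: the class name ConstantFoldingDemo and its method names are made up for illustration. It relies on the fact that string literals and constant expressions are interned, so == can be used here to observe whether folding happened.

```java
class ConstantFoldingDemo {
    // A static final String initialized with a literal is a compile-time constant.
    static final String PREFIX = "Hel";

    // "Hel" + "lo" is a constant expression: javac folds it to the literal "Hello".
    static boolean literalConcatIsFolded() {
        String folded = "Hel" + "lo";
        return folded == "Hello";
    }

    // Concatenation with a constant variable is folded by javac as well.
    static boolean constantFieldConcatIsFolded() {
        String folded = PREFIX + "lo";
        return folded == "Hello";
    }

    // A non-final local is not a constant, so the concatenation runs at
    // run time and produces a new, non-interned String object.
    static boolean runtimeConcatIsFolded() {
        String part = "Hel";
        String joined = part + "lo";
        return joined == "Hello";
    }

    public static void main(String[] args) {
        System.out.println(literalConcatIsFolded());       // true
        System.out.println(constantFieldConcatIsFolded()); // true
        System.out.println(runtimeConcatIsFolded());       // false
    }
}
```

Running javap -c on this class should show a single ldc for the two folded cases, whereas new String("Hello") is not a constant expression and is compiled as written.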

OTHER TIPS

ldc, astore_n, ... are bytecode instructions. You can find a list of them on Wikipedia, and deeper information by reading the JVM specification.

ldc pushes a constant (here a string) onto the operand stack for a later instruction to use. astore_1 pops the value on top of the stack and stores it in local variable #1 (local variable #0 holds the method's parameter, args). So in your second example, it loads "Hello" from the constant pool and stores it in local variable #1.
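To make the stack and local-variable mechanics concrete, here is a toy model in plain Java of the first two instructions of String2.main. It is only an illustration of the idea, not how the JVM is implemented; the class OperandStackSketch and its run method are hypothetical names.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class OperandStackSketch {
    // Simulates "ldc #2; astore_1" and returns the local variable slots.
    static Object[] run(String[] args) {
        Deque<Object> operandStack = new ArrayDeque<>();
        Object[] locals = new Object[2];   // javap -v would report locals=2 for this method

        locals[0] = args;                  // slot 0: the method parameter

        operandStack.push("Hello");        // ldc: push the constant onto the stack
        locals[1] = operandStack.pop();    // astore_1: pop the top value into slot 1

        return locals;
    }

    public static void main(String[] args) {
        Object[] locals = run(args);
        System.out.println(locals[1]);     // prints Hello
    }
}
```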

Your first implementation shows that a new instance of String is created and then stored in local variable #1, so it is less efficient than your second snippet. Besides, you cannot compare the two strings with == in your first implementation, since the new string is a different instance from the literal and is not interned.
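A small sketch of the == point, using a made-up class name (StringIdentityDemo) and the same "Hello" literal as in the question:

```java
class StringIdentityDemo {
    // Returns { literal == literal, literal == copy,
    //           literal.equals(copy), literal == copy.intern() }.
    static boolean[] compare() {
        String a = "Hello";              // literal, interned
        String b = "Hello";              // same interned instance as a
        String c = new String("Hello");  // a distinct object with equal content

        return new boolean[] {
            a == b,          // same instance
            a == c,          // different instances
            a.equals(c),     // equal content
            a == c.intern()  // intern() returns the pooled instance
        };
    }

    public static void main(String[] args) {
        boolean[] r = compare();
        System.out.println("literal == literal:        " + r[0]); // true
        System.out.println("literal == new String:     " + r[1]); // false
        System.out.println("literal.equals(new):       " + r[2]); // true
        System.out.println("literal == new.intern():   " + r[3]); // true
    }
}
```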

I think this answers your main question, though I'm not sure about the ldc part: see "Difference between string object and string literal". The main point here is that the literal is interned. Of course the compiler could figure out that they are equivalent, but that is what the equals method is for; == tests object identity, and in the first case Java is required to instantiate them as separate objects.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow