Question

Assuming String a and b:

a += b
a = a.concat(b)

Under the hood, are they the same thing?

Here is concat decompiled as reference. I'd like to be able to decompile the + operator as well to see what that does.

public String concat(String s) {

    int i = s.length();
    if (i == 0) {
        return this;
    }
    else {
        char ac[] = new char[count + i];
        getChars(0, count, ac, 0);
        s.getChars(0, i, ac, count);
        return new String(0, count + i, ac);
    }
}
Was it helpful?

Solution

No, not quite.

Firstly, there's a slight difference in semantics. If a is null, then a.concat(b) throws a NullPointerException but a+=b will treat the original value of a as if it were null. Furthermore, the concat() method only accepts String values while the + operator will silently convert the argument to a String (using the toString() method for objects). So the concat() method is more strict in what it accepts.

To look under the hood, write a simple class with a += b;

public class Concat {
    String cat(String a, String b) {
        a += b;
        return a;
    }
}

Now disassemble with javap -c (included in the Sun JDK). You should see a listing including:

java.lang.String cat(java.lang.String, java.lang.String);
  Code:
   0:   new     #2; //class java/lang/StringBuilder
   3:   dup
   4:   invokespecial   #3; //Method java/lang/StringBuilder."<init>":()V
   7:   aload_1
   8:   invokevirtual   #4; //Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
   11:  aload_2
   12:  invokevirtual   #4; //Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
   15:  invokevirtual   #5; //Method java/lang/StringBuilder.toString:()Ljava/lang/    String;
   18:  astore_1
   19:  aload_1
   20:  areturn

So, a += b is the equivalent of

a = new StringBuilder()
    .append(a)
    .append(b)
    .toString();

The concat method should be faster. However, with more strings the StringBuilder method wins, at least in terms of performance.

The source code of String and StringBuilder (and its package-private base class) is available in src.zip of the Sun JDK. You can see that you are building up a char array (resizing as necessary) and then throwing it away when you create the final String. In practice memory allocation is surprisingly fast.

Update: As Pawel Adamski notes, performance has changed in more recent HotSpot. javac still produces exactly the same code, but the bytecode compiler cheats. Simple testing entirely fails because the entire body of code is thrown away. Summing System.identityHashCode (not String.hashCode) shows the StringBuffer code has a slight advantage. Subject to change when the next update is released, or if you use a different JVM. From @lukaseder, a list of HotSpot JVM intrinsics.

OTHER TIPS

Niyaz is correct, but it's also worth noting that the special + operator can be converted into something more efficient by the Java compiler. Java has a StringBuilder class which represents a non-thread-safe, mutable String. When performing a bunch of String concatenations, the Java compiler silently converts

String a = b + c + d;

into

String a = new StringBuilder(b).append(c).append(d).toString();

which for large strings is significantly more efficient. As far as I know, this does not happen when you use the concat method.

However, the concat method is more efficient when concatenating an empty String onto an existing String. In this case, the JVM does not need to create a new String object and can simply return the existing one. See the concat documentation to confirm this.

So if you're super-concerned about efficiency then you should use the concat method when concatenating possibly-empty Strings, and use + otherwise. However, the performance difference should be negligible and you probably shouldn't ever worry about this.

I ran a similar test as @marcio but with the following loop instead:

String c = a;
for (long i = 0; i < 100000L; i++) {
    c = c.concat(b); // make sure javac cannot skip the loop
    // using c += b for the alternative
}

Just for good measure, I threw in StringBuilder.append() as well. Each test was run 10 times, with 100k reps for each run. Here are the results:

  • StringBuilder wins hands down. The clock time result was 0 for most the runs, and the longest took 16ms.
  • a += b takes about 40000ms (40s) for each run.
  • concat only requires 10000ms (10s) per run.

I haven't decompiled the class to see the internals or run it through profiler yet, but I suspect a += b spends much of the time creating new objects of StringBuilder and then converting them back to String.

Most answers here are from 2008. It looks that things have changed over the time. My latest benchmarks made with JMH shows that on Java 8 + is around two times faster than concat.

My benchmark:

@Warmup(iterations = 5, time = 200, timeUnit = TimeUnit.MILLISECONDS)
@Measurement(iterations = 5, time = 200, timeUnit = TimeUnit.MILLISECONDS)
public class StringConcatenation {

    @org.openjdk.jmh.annotations.State(Scope.Thread)
    public static class State2 {
        public String a = "abc";
        public String b = "xyz";
    }

    @org.openjdk.jmh.annotations.State(Scope.Thread)
    public static class State3 {
        public String a = "abc";
        public String b = "xyz";
        public String c = "123";
    }


    @org.openjdk.jmh.annotations.State(Scope.Thread)
    public static class State4 {
        public String a = "abc";
        public String b = "xyz";
        public String c = "123";
        public String d = "!@#";
    }

    @Benchmark
    public void plus_2(State2 state, Blackhole blackhole) {
        blackhole.consume(state.a+state.b);
    }

    @Benchmark
    public void plus_3(State3 state, Blackhole blackhole) {
        blackhole.consume(state.a+state.b+state.c);
    }

    @Benchmark
    public void plus_4(State4 state, Blackhole blackhole) {
        blackhole.consume(state.a+state.b+state.c+state.d);
    }

    @Benchmark
    public void stringbuilder_2(State2 state, Blackhole blackhole) {
        blackhole.consume(new StringBuilder().append(state.a).append(state.b).toString());
    }

    @Benchmark
    public void stringbuilder_3(State3 state, Blackhole blackhole) {
        blackhole.consume(new StringBuilder().append(state.a).append(state.b).append(state.c).toString());
    }

    @Benchmark
    public void stringbuilder_4(State4 state, Blackhole blackhole) {
        blackhole.consume(new StringBuilder().append(state.a).append(state.b).append(state.c).append(state.d).toString());
    }

    @Benchmark
    public void concat_2(State2 state, Blackhole blackhole) {
        blackhole.consume(state.a.concat(state.b));
    }

    @Benchmark
    public void concat_3(State3 state, Blackhole blackhole) {
        blackhole.consume(state.a.concat(state.b.concat(state.c)));
    }


    @Benchmark
    public void concat_4(State4 state, Blackhole blackhole) {
        blackhole.consume(state.a.concat(state.b.concat(state.c.concat(state.d))));
    }
}

Results:

Benchmark                             Mode  Cnt         Score         Error  Units
StringConcatenation.concat_2         thrpt   50  24908871.258 ± 1011269.986  ops/s
StringConcatenation.concat_3         thrpt   50  14228193.918 ±  466892.616  ops/s
StringConcatenation.concat_4         thrpt   50   9845069.776 ±  350532.591  ops/s
StringConcatenation.plus_2           thrpt   50  38999662.292 ± 8107397.316  ops/s
StringConcatenation.plus_3           thrpt   50  34985722.222 ± 5442660.250  ops/s
StringConcatenation.plus_4           thrpt   50  31910376.337 ± 2861001.162  ops/s
StringConcatenation.stringbuilder_2  thrpt   50  40472888.230 ± 9011210.632  ops/s
StringConcatenation.stringbuilder_3  thrpt   50  33902151.616 ± 5449026.680  ops/s
StringConcatenation.stringbuilder_4  thrpt   50  29220479.267 ± 3435315.681  ops/s

Tom is correct in describing exactly what the + operator does. It creates a temporary StringBuilder, appends the parts, and finishes with toString().

However, all of the answers so far are ignoring the effects of HotSpot runtime optimizations. Specifically, these temporary operations are recognized as a common pattern and are replaced with more efficient machine code at run-time.

@marcio: You've created a micro-benchmark; with modern JVM's this is not a valid way to profile code.

The reason run-time optimization matters is that many of these differences in code -- even including object-creation -- are completely different once HotSpot gets going. The only way to know for sure is profiling your code in situ.

Finally, all of these methods are in fact incredibly fast. This might be a case of premature optimization. If you have code that concatenates strings a lot, the way to get maximum speed probably has nothing to do with which operators you choose and instead the algorithm you're using!

How about some simple testing? Used the code below:

long start = System.currentTimeMillis();

String a = "a";

String b = "b";

for (int i = 0; i < 10000000; i++) { //ten million times
     String c = a.concat(b);
}

long end = System.currentTimeMillis();

System.out.println(end - start);
  • The "a + b" version executed in 2500ms.
  • The a.concat(b) executed in 1200ms.

Tested several times. The concat() version execution took half of the time on average.

This result surprised me because the concat() method always creates a new string (it returns a "new String(result)". It's well known that:

String a = new String("a") // more than 20 times slower than String a = "a"

Why wasn't the compiler capable of optimize the string creation in "a + b" code, knowing the it always resulted in the same string? It could avoid a new string creation. If you don't believe the statement above, test for your self.

Basically, there are two important differences between + and the concat method.

  1. If you are using the concat method then you would only be able to concatenate strings while in case of the + operator, you can also concatenate the string with any data type.

    For Example:

    String s = 10 + "Hello";
    

    In this case, the output should be 10Hello.

    String s = "I";
    String s1 = s.concat("am").concat("good").concat("boy");
    System.out.println(s1);
    

    In the above case you have to provide two strings mandatory.

  2. The second and main difference between + and concat is that:

    Case 1: Suppose I concat the same strings with concat operator in this way

    String s="I";
    String s1=s.concat("am").concat("good").concat("boy");
    System.out.println(s1);
    

    In this case total number of objects created in the pool are 7 like this:

    I
    am
    good
    boy
    Iam
    Iamgood
    Iamgoodboy
    

    Case 2:

    Now I am going to concatinate the same strings via + operator

    String s="I"+"am"+"good"+"boy";
    System.out.println(s);
    

    In the above case total number of objects created are only 5.

    Actually when we concatinate the strings via + operator then it maintains a StringBuffer class to perform the same task as follows:-

    StringBuffer sb = new StringBuffer("I");
    sb.append("am");
    sb.append("good");
    sb.append("boy");
    System.out.println(sb);
    

    In this way it will create only five objects.

So guys these are the basic differences between + and the concat method. Enjoy :)

For the sake of completeness, I wanted to add that the definition of the '+' operator can be found in the JLS SE8 15.18.1:

If only one operand expression is of type String, then string conversion (§5.1.11) is performed on the other operand to produce a string at run time.

The result of string concatenation is a reference to a String object that is the concatenation of the two operand strings. The characters of the left-hand operand precede the characters of the right-hand operand in the newly created string.

The String object is newly created (§12.5) unless the expression is a constant expression (§15.28).

About the implementation the JLS says the following:

An implementation may choose to perform conversion and concatenation in one step to avoid creating and then discarding an intermediate String object. To increase the performance of repeated string concatenation, a Java compiler may use the StringBuffer class or a similar technique to reduce the number of intermediate String objects that are created by evaluation of an expression.

For primitive types, an implementation may also optimize away the creation of a wrapper object by converting directly from a primitive type to a string.

So judging from the 'a Java compiler may use the StringBuffer class or a similar technique to reduce', different compilers could produce different byte-code.

The + operator can work between a string and a string, char, integer, double or float data type value. It just converts the value to its string representation before concatenation.

The concat operator can only be done on and with strings. It checks for data type compatibility and throws an error, if they don't match.

Except this, the code you provided does the same stuff.

I don't think so.

a.concat(b) is implemented in String and I think the implementation didn't change much since early java machines. The + operation implementation depends on Java version and compiler. Currently + is implemented using StringBuffer to make the operation as fast as possible. Maybe in the future, this will change. In earlier versions of java + operation on Strings was much slower as it produced intermediate results.

I guess that += is implemented using + and similarly optimized.

When using +, the speed decreases as the string's length increases, but when using concat, the speed is more stable, and the best option is using the StringBuilder class which has stable speed in order to do that.

I guess you can understand why. But the totally best way for creating long strings is using StringBuilder() and append(), either speed will be unacceptable.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top