Getting better performance appending strings than going through standard Java StringBuilder.append

StackOverflow https://stackoverflow.com/questions/1750604

  •  20-09-2019

Question

As part of the process of populating a search engine, I also populate a Berkeley DB value store. This process is repeated each night, and at the moment roughly 60% of the total running time each night is spent creating the values to be inserted into the value store (that is, excluding the actual insertion into Berkeley DB and the time spent in the Berkeley DB client).

These values are created by assigning a StringBuilder to each key and appending, on average, about 1000 strings to each StringBuilder. The resulting values are about 10k on average. I'm wondering if this can be done more efficiently, given that:

- the (on average) 1000 strings appended to each StringBuilder are of fixed length, i.e. each String has the same length and this length is known up front;
- all strings are appended to the end.

Would, for example, swapping out the StringBuilder for a char[] or a character stream / Writer be more performant? That way I could keep an index of where to write to in the char[].

Thanks, Geert-Jan

Solution

Revision III:

If string concatenation in StringBuilders is taking inordinately long, perhaps your memory is very full. So our goal is to achieve string concatenation without chewing up a lot of memory. Hopefully the savings in CPU time will follow automatically.

My plan went like this: instead of concatenating those substrings into a long StringBuilder, you could build a List of references to those (pre-existing) Strings. The list of references should be smaller than the sum of the substrings and thus consume less memory.

Only when it is time to store that big String do we concatenate the parts in one big StringBuilder, pull out the String, store the String, throw away the reference to the String, clear the StringBuilder, and repeat. I felt this was a brilliant solution!

However, according to this article from 2002, a String reference in an array (and probably likewise in an ArrayList) takes a whopping 8 bytes! A recent StackOverflow post confirmed that this is still so. Thus, a list of references to 10-byte Strings saves only 2 bytes per String. So I present my "solution" as a possibility for similar problems, but I don't see this particular problem benefiting from it.
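For readers with a similar problem, the deferred-concatenation idea above could be sketched roughly as follows. This is only an illustration: the class name DeferredValue, its methods, and the sample strings are hypothetical and not taken from the original post, and the sketch assumes the appended Strings already exist elsewhere, so that the list only holds references to them.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: collects references to existing Strings and only
// concatenates them when the value is about to be stored.
final class DeferredValue {
    private final List<String> parts = new ArrayList<>();
    private int totalLength = 0;

    // Stores a reference to the part; no characters are copied here.
    void add(String part) {
        parts.add(part);
        totalLength += part.length();
    }

    // Concatenates all parts in one pass. The caller passes in a scratch
    // builder so the same buffer can be reused for every key.
    String build(StringBuilder scratch) {
        scratch.setLength(0);                 // clear previous contents
        scratch.ensureCapacity(totalLength);  // at most one resize, up front
        for (String part : parts) {
            scratch.append(part);
        }
        return scratch.toString();
    }

    public static void main(String[] args) {
        StringBuilder scratch = new StringBuilder();
        DeferredValue value = new DeferredValue();
        value.add("doc0000001");
        value.add("doc0000002");
        System.out.println(value.build(scratch));
    }
}
```

Whether this saves anything depends on the ratio between the size of a reference and the size of each substring, which is exactly the caveat raised above.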

OTHER TIPS

You could create your stringbuilders with a higher initial capacity to reduce the amount of resizing, i.e. there's a constructor that allows you to say

int SIZE=10000;
StringBuilder b = new StringBuilder(SIZE);

I would expect that manually juggling char[] and indexes wouldn't improve much on this, as (I assume) that's what StringBuilder is already doing for you.
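To make the comparison concrete, here is a minimal sketch of both variants for the fixed-length case described in the question. The class and method names are made up for illustration, and both methods assume that every part really has length partLength (the char[] version throws if a part is shorter). The capacity is computed from the known count and length instead of a guessed constant.

```java
// Hypothetical sketch: two ways to concatenate equally long Strings.
final class FixedWidthAppend {

    // Pre-sized StringBuilder: the capacity is exact, so no resize happens.
    static String withPresizedBuilder(String[] parts, int partLength) {
        StringBuilder sb = new StringBuilder(parts.length * partLength);
        for (String p : parts) {
            sb.append(p);
        }
        return sb.toString();
    }

    // Manual char[] plus write index, as suggested in the question.
    static String withCharArray(String[] parts, int partLength) {
        char[] buf = new char[parts.length * partLength];
        int pos = 0;
        for (String p : parts) {
            p.getChars(0, partLength, buf, pos); // copy into buf at pos
            pos += partLength;
        }
        return new String(buf); // copies buf once more into the String
    }

    public static void main(String[] args) {
        String[] parts = {"aaaa", "bbbb", "cccc"};
        System.out.println(withPresizedBuilder(parts, 4));
        System.out.println(withCharArray(parts, 4));
    }
}
```

Both versions copy every character into a buffer once and once more when the final String is created, which is why the manual version is not expected to gain much over a correctly sized StringBuilder.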

Where are these 1000 Strings coming from? I have trouble believing that the creation time for those 1000 objects doesn't completely dwarf the time needed for amortized expansion of your StringBuilder.

You should give ropes a try. The site is skimpy on details, but there's a great article here with a better description and some good benchmarks comparing append performance.

I haven't actually used the ropes package; I haven't had a good enough excuse to. It looks promising, though.
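To give a feel for why appending to a rope can be cheap, here is a toy version of the idea. It is not the API of the ropes package: the ToyRope class and its methods are invented for this sketch. An append only allocates a small node that points at the existing pieces, and characters are copied only when the rope is flattened into a String.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Toy rope for illustration only; names are hypothetical.
final class ToyRope {
    private final ToyRope left;   // null for a leaf
    private final ToyRope right;  // null for a leaf
    private final String text;    // null for a concatenation node
    private final int length;

    private ToyRope(String text) {
        this.left = null;
        this.right = null;
        this.text = text;
        this.length = text.length();
    }

    private ToyRope(ToyRope left, ToyRope right) {
        this.left = left;
        this.right = right;
        this.text = null;
        this.length = left.length + right.length;
    }

    static ToyRope of(String s) {
        return new ToyRope(s);
    }

    // Constant-time append: no characters are copied.
    ToyRope append(String s) {
        return new ToyRope(this, new ToyRope(s));
    }

    int length() {
        return length;
    }

    // Copies all characters once, walking the tree left to right with an
    // explicit stack so that deep trees do not overflow the call stack.
    String flatten() {
        StringBuilder out = new StringBuilder(length);
        Deque<ToyRope> stack = new ArrayDeque<>();
        stack.push(this);
        while (!stack.isEmpty()) {
            ToyRope node = stack.pop();
            if (node.text != null) {
                out.append(node.text);
            } else {
                stack.push(node.right); // pushed first, so left is visited first
                stack.push(node.left);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        ToyRope rope = ToyRope.of("ab").append("cd").append("ef");
        System.out.println(rope.flatten());
    }
}
```

Note that if the value must end up as a flat String before it is stored, the flattening cost still has to be paid once per value, so the gain depends on how much work the appends themselves were causing.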

Edit: Additional benchmark info

I downloaded the PerformanceTest class from the ropes article and added tests for StringBuilder in addition to StringBuffer. The performance improvement of StringBuilder over StringBuffer seems negligible.

Append plan length: 260
[StringBuilder] Average=     117,146,000 ns Median=     114,717,000ns
[StringBuffer]  Average=     117,624,400 ns Median=     115,552,000ns
[Rope]          Average=         484,600 ns Median=         483,000ns

Append plan length: 300
[StringBuilder] Average=     178,329,000 ns Median=     178,009,000ns
[StringBuffer]  Average=     217,147,800 ns Median=     216,819,000ns
[Rope]          Average=         252,800 ns Median=         253,000ns

Append plan length: 500
[StringBuilder] Average=     221,356,200 ns Median=     214,435,000ns
[StringBuffer]  Average=     227,432,200 ns Median=     219,650,000ns
[Rope]          Average=         510,000 ns Median=         507,000ns

The difference between StringBuilder and StringBuffer is small. For the task at hand, ropes look like a clear win.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow