Question

In languages like Java and C#, strings are immutable and it can be computationally expensive to build a string one character at a time. In said languages, there are library classes to reduce this cost such as C# System.Text.StringBuilder and Java java.lang.StringBuilder.

Does PHP (4 or 5; I'm interested in both) share this limitation? If so, are there similar solutions to the problem available?


Solution

No, there is no type of stringbuilder class in PHP, since strings are mutable.

That being said, there are different ways of building a string, depending on what you're doing.

echo, for example, will accept comma-separated tokens for output.

// This...
echo 'one', 'two';

// Is the same as this
echo 'one';
echo 'two';

What this means is that you can output a complex string without actually using concatenation, which would be slower because it has to build the combined string first.

// This...
echo 'one', 'two';

// Is faster than this...
echo 'one' . 'two';

If you need to capture this output in a variable, you can do that with the output buffering functions.
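As a minimal sketch of that idea (the variable name $captured is only for illustration), ob_start() begins buffering and ob_get_clean(), available since PHP 4.3, returns the buffered text and discards the buffer:

```php
<?php
// Start buffering: nothing echoed after this point is sent to the client yet.
ob_start();

echo 'one', 'two';

// Fetch the buffered output as a string and turn buffering off again.
$captured = ob_get_clean();

// $captured now holds the text that echo would otherwise have printed.
echo strlen($captured), "\n";
```

Whether this is actually faster than plain concatenation depends on the workload; one of the benchmarks further down includes a buffering variant.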

Also, PHP's array performance is really good. If you want to do something like a comma-separated list of values, just use implode():

$values = array( 'one', 'two', 'three' );
$valueList = implode( ', ', $values );

Lastly, make sure you familiarize yourself with PHP's string type and its different delimiters, and the implications of each.
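To illustrate the difference between the delimiters, here is a small sketch (the variable names are made up for the example). Single-quoted strings are taken almost literally, while double-quoted strings and heredocs expand variables and escape sequences:

```php
<?php
$name = 'world';

// Single quotes: no variable expansion, and \n stays as two literal characters.
$single = 'Hello $name\n';

// Double quotes: $name is expanded and \n becomes a newline character.
$double = "Hello $name\n";

// Heredoc: behaves like a double-quoted string, useful for multi-line text.
$heredoc = <<<EOT
Hello $name
EOT;

echo $single, "\n";
echo $double;
echo $heredoc, "\n";
```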

OTHER TIPS

I was curious about this, so I ran a test. I used the following code:

<?php
ini_set('memory_limit', '1024M');
define ('CORE_PATH', '/Users/foo');
define ('DS', DIRECTORY_SEPARATOR);

$numtests = 1000000;

function test1($numtests)
{
    $CORE_PATH = '/Users/foo';
    $DS = DIRECTORY_SEPARATOR;
    $a = array();

    $startmem = memory_get_usage();
    $a_start = microtime(true);
    for ($i = 0; $i < $numtests; $i++) {
        $a[] = sprintf('%s%sDesktop%sjunk.php', $CORE_PATH, $DS, $DS);
    }
    $a_end = microtime(true);
    $a_mem = memory_get_usage();

    $timeused = $a_end - $a_start;
    $memused = $a_mem - $startmem;

    echo "TEST 1: sprintf()\n";
    echo "TIME: {$timeused}\nMEMORY: $memused\n\n\n";
}

function test2($numtests)
{
    $CORE_PATH = '/Users/shigh';
    $DS = DIRECTORY_SEPARATOR;
    $a = array();

    $startmem = memory_get_usage();
    $a_start = microtime(true);
    for ($i = 0; $i < $numtests; $i++) {
        $a[] = $CORE_PATH . $DS . 'Desktop' . $DS . 'junk.php';
    }
    $a_end = microtime(true);
    $a_mem = memory_get_usage();

    $timeused = $a_end - $a_start;
    $memused = $a_mem - $startmem;

    echo "TEST 2: Concatenation\n";
    echo "TIME: {$timeused}\nMEMORY: $memused\n\n\n";
}

function test3($numtests)
{
    $CORE_PATH = '/Users/shigh';
    $DS = DIRECTORY_SEPARATOR;
    $a = array();

    $startmem = memory_get_usage();
    $a_start = microtime(true);
    for ($i = 0; $i < $numtests; $i++) {
        ob_start();
        echo $CORE_PATH,$DS,'Desktop',$DS,'junk.php';
        $aa = ob_get_contents();
        ob_end_clean();
        $a[] = $aa;
    }
    $a_end = microtime(true);
    $a_mem = memory_get_usage();

    $timeused = $a_end - $a_start;
    $memused = $a_mem - $startmem;

    echo "TEST 3: Buffering Method\n";
    echo "TIME: {$timeused}\nMEMORY: $memused\n\n\n";
}

function test4($numtests)
{
    $CORE_PATH = '/Users/shigh';
    $DS = DIRECTORY_SEPARATOR;
    $a = array();

    $startmem = memory_get_usage();
    $a_start = microtime(true);
    for ($i = 0; $i < $numtests; $i++) {
        $a[] = "{$CORE_PATH}{$DS}Desktop{$DS}junk.php";
    }
    $a_end = microtime(true);
    $a_mem = memory_get_usage();

    $timeused = $a_end - $a_start;
    $memused = $a_mem - $startmem;

    echo "TEST 4: Braced in-line variables\n";
    echo "TIME: {$timeused}\nMEMORY: $memused\n\n\n";
}

function test5($numtests)
{
    $a = array();

    $startmem = memory_get_usage();
    $a_start = microtime(true);
    for ($i = 0; $i < $numtests; $i++) {
        $CORE_PATH = CORE_PATH;
        $DS = DIRECTORY_SEPARATOR;
        $a[] = "{$CORE_PATH}{$DS}Desktop{$DS}junk.php";
    }
    $a_end = microtime(true);
    $a_mem = memory_get_usage();

    $timeused = $a_end - $a_start;
    $memused = $a_mem - $startmem;

    echo "TEST 5: Braced inline variables with loop-level assignments\n";
    echo "TIME: {$timeused}\nMEMORY: $memused\n\n\n";
}

test1($numtests);
test2($numtests);
test3($numtests);
test4($numtests);
test5($numtests);

... and got the following results (posted as an image). Clearly, sprintf() is the least efficient way to do it, both in terms of time and memory consumption.

When you do a timed comparison, the differences are so small that they aren't very relevant. It would make more sense to go for the choice that makes your code easier to read and understand.
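If you want to check this on your own setup, a rough sketch like the following compares repeated .= against collecting pieces in an array and calling implode(). The iteration count and variable names are arbitrary choices for illustration, and the timings will vary with PHP version and machine:

```php
<?php
$iterations = 10000; // arbitrary size, chosen only for this sketch

// Method 1: append to a single string with .=
$start = microtime(true);
$concat = '';
for ($i = 0; $i < $iterations; $i++) {
    $concat .= $i . ',';
}
$concatTime = microtime(true) - $start;

// Method 2: collect the pieces in an array and join them once at the end
$start = microtime(true);
$parts = array();
for ($i = 0; $i < $iterations; $i++) {
    $parts[] = $i . ',';
}
$joined = implode('', $parts);
$implodeTime = microtime(true) - $start;

// Both methods build the same string; only the elapsed time differs.
printf("concat:  %.6f s\n", $concatTime);
printf("implode: %.6f s\n", $implodeTime);
```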

A StringBuilder analog is not needed in PHP.

I made a couple of simple tests:

in PHP:

$iterations = 10000;
$stringToAppend = 'TESTSTR';
$timer = new Timer(); // based on microtime()
$s = '';
for($i = 0; $i < $iterations; $i++)
{
    $s .= ($i . $stringToAppend);
}
$timer->VarDumpCurrentTimerValue();

$timer->Restart();

// Used purlogic's implementation.
// I tried other implementations, but they are not faster
$sb = new StringBuilder(); 

for($i = 0; $i < $iterations; $i++)
{
    $sb->append($i);
    $sb->append($stringToAppend);
}
$ss = $sb->toString();
$timer->VarDumpCurrentTimerValue();

in C# (.NET 4.0):

const int iterations = 10000;
const string stringToAppend = "TESTSTR";
string s = "";
var timer = new Timer(); // based on Stopwatch

for(int i = 0; i < iterations; i++)
{
    s += (i + stringToAppend);
}

timer.ShowCurrentTimerValue();

timer.Restart();

var sb = new StringBuilder();

for(int i = 0; i < iterations; i++)
{
    sb.Append(i);
    sb.Append(stringToAppend);
}

string ss = sb.ToString();

timer.ShowCurrentTimerValue();

Results:

10000 iterations:
1) PHP, ordinary concatenation: ~6 ms
2) PHP, using StringBuilder: ~5 ms
3) C#, ordinary concatenation: ~520 ms
4) C#, using StringBuilder: ~1 ms

100000 iterations:
1) PHP, ordinary concatenation: ~63 ms
2) PHP, using StringBuilder: ~555 ms
3) C#, ordinary concatenation: ~91000 ms // !!!
4) C#, using StringBuilder: ~17 ms

I know what you're talking about. I just created this simple class to emulate the Java StringBuilder class.

class StringBuilder {

  private $str = array();

  public function __construct() { }

  public function append($str) {
    $this->str[] = $str;
  }

  public function toString() {
    return implode($this->str);
  }

}

PHP strings are mutable. You can change specific characters like this:

$string = 'abc';
$string[2] = 'a'; // $string equals 'aba'
$string[3] = 'd'; // $string equals 'abad'
$string[5] = 'e'; // $string equals 'abad e' (fills character(s) in between with spaces)

And you can append characters to a string like this:

$string .= 'a';
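One related point worth illustrating: although a string can be modified in place, assigning it to another variable behaves like a copy, so changing one variable does not affect the other. A small sketch (the variable names are illustrative):

```php
<?php
$original = 'abc';

// Assignment gives $copy its own value; later edits do not touch $original.
$copy = $original;
$copy[0] = 'X';   // overwrite the first character of the copy
$copy .= 'd';     // append to the copy

echo $original, "\n"; // abc
echo $copy, "\n";     // Xbcd
```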

Yes, they do. For example, if you want to echo a couple of strings together, use

echo $str1, $str2, $str3;

instead of

echo $str1 . $str2 . $str3;

to get it a little faster.

I wrote the code at the end of this post to test the different forms of string concatenation and they really are all almost exactly equal in both memory and time footprints.

The two primary methods I used are concatenating strings onto each other, and filling an array with strings and then imploding them. I did 500 string additions with a 1MB string in PHP 5.6 (so the result is a 500MB string). At every iteration of the test, all memory and time footprints were very close (at ~$IterationNumber*1MB). The runtimes of the two tests were 50.398 seconds and 50.843 seconds respectively, which is most likely within an acceptable margin of error.

Garbage collection of strings that are no longer referenced seems to be pretty immediate, even without ever leaving the scope. Since the strings are mutable, no extra memory is really required after the fact.

HOWEVER, the following tests showed that there is a difference in peak memory usage WHILE the strings are being concatenated.

$OneMB=str_repeat('x', 1024*1024);
$Final=$OneMB.$OneMB.$OneMB.$OneMB.$OneMB;
print memory_get_peak_usage();

Result=10,806,800 bytes (~10MB w/o the initial PHP memory footprint)

$OneMB=str_repeat('x', 1024*1024);
$Final=implode('', Array($OneMB, $OneMB, $OneMB, $OneMB, $OneMB));
print memory_get_peak_usage();

Result=6,613,320 bytes (~6MB w/o the initial PHP memory footprint)

So there is in fact a difference that could be significant in very very large string concatenations memory-wise (I have run into such examples when creating very large data sets or SQL queries).

But even this fact is disputable depending upon the data. For example, concatenating 1 character at a time onto a string to reach 50 million bytes (so 50 million iterations) used a peak of 50,322,512 bytes (~48MB) and took 5.97 seconds. The array method, by contrast, used 7,337,107,176 bytes (~6.8GB) to create the array in 12.1 seconds, and then took an extra 4.32 seconds to combine the strings from the array.

Anywho... the below is the benchmark code I mentioned at the beginning which shows the methods are pretty much equal. It outputs a pretty HTML table.

<?php
//Please note, for the recursion test to go beyond 256, xdebug.max_nesting_level needs to be raised. You also may need to update your memory_limit depending on the number of iterations

//Output the start memory
print 'Start: '.memory_get_usage()."B<br><br>Below test results are in MB<br>";

//Our 1MB string
global $OneMB, $NumIterations;
$OneMB=str_repeat('x', 1024*1024);
$NumIterations=500;

//Run the tests
$ConcatTest=RunTest('ConcatTest');
$ImplodeTest=RunTest('ImplodeTest');
$RecurseTest=RunTest('RecurseTest');

//Output the results in a table
OutputResults(
  Array('ConcatTest', 'ImplodeTest', 'RecurseTest'),
  Array($ConcatTest, $ImplodeTest, $RecurseTest)
);

//Start a test run by initializing the array that will hold the results and manipulating those results after the test is complete
function RunTest($TestName)
{
  $CurrentTestNums=Array();
  $TestStartMem=memory_get_usage();
  $StartTime=microtime(true);
  RunTestReal($TestName, $CurrentTestNums, $StrLen);
  $CurrentTestNums[]=memory_get_usage();

  //Subtract $TestStartMem from all other numbers
  foreach($CurrentTestNums as &$Num)
    $Num-=$TestStartMem;
  unset($Num);

  $CurrentTestNums[]=$StrLen;
  $CurrentTestNums[]=microtime(true)-$StartTime;

  return $CurrentTestNums;
}

//Initialize the test and store the memory allocated at the end of the test, with the result
function RunTestReal($TestName, &$CurrentTestNums, &$StrLen)
{
  $R=$TestName($CurrentTestNums);
  $CurrentTestNums[]=memory_get_usage();
  $StrLen=strlen($R);
}

//Concatenate 1MB string over and over onto a single string
function ConcatTest(&$CurrentTestNums)
{
  global $OneMB, $NumIterations;
  $Result='';
  for($i=0;$i<$NumIterations;$i++)
  {
    $Result.=$OneMB;
    $CurrentTestNums[]=memory_get_usage();
  }
  return $Result;
}

//Create an array of 1MB strings and then join w/ an implode
function ImplodeTest(&$CurrentTestNums)
{
  global $OneMB, $NumIterations;
  $Result=Array();
  for($i=0;$i<$NumIterations;$i++)
  {
    $Result[]=$OneMB;
    $CurrentTestNums[]=memory_get_usage();
  }
  return implode('', $Result);
}

//Recursively add strings onto each other
function RecurseTest(&$CurrentTestNums, $TestNum=0)
{
  global $OneMB, $NumIterations;
  if($TestNum==$NumIterations)
    return '';

  $NewStr=RecurseTest($CurrentTestNums, $TestNum+1).$OneMB;
  $CurrentTestNums[]=memory_get_usage();
  return $NewStr;
}

//Output the results in a table
function OutputResults($TestNames, $TestResults)
{
  global $NumIterations;
  print '<table border=1 cellspacing=0 cellpadding=2><tr><th>Test Name</th><th>'.implode('</th><th>', $TestNames).'</th></tr>';
  $FinalNames=Array('Final Result', 'Clean');
  for($i=0;$i<$NumIterations+2;$i++)
  {
    $TestName=($i<$NumIterations ? $i : $FinalNames[$i-$NumIterations]);
    print "<tr><th>$TestName</th>";
    foreach($TestResults as $TR)
      printf('<td>%07.4f</td>', $TR[$i]/1024/1024);
    print '</tr>';
  }

  //Other result numbers
  print '<tr><th>Final String Size</th>';
  foreach($TestResults as $TR)
    printf('<td>%d</td>', $TR[$NumIterations+2]);
  print '</tr><tr><th>Runtime</th>';
  foreach($TestResults as $TR)
    printf('<td>%s</td>', $TR[$NumIterations+3]);
  print '</tr></table>';
}
?>

Firstly, if you don't need the strings to be concatenated, don't do it: it will always be quicker to do

echo $a,$b,$c;

than

echo $a . $b . $c;

However, at least in PHP5, string concatenation is really quite fast, especially if there's only one reference to a given string. I guess the interpreter uses a StringBuilder-like technique internally.

If you're placing variable values within PHP strings, I understand that it's slightly quicker to use in-line variable inclusion (that's not its official name; I can't remember what that is):

$aString = 'oranges';
$compareString = "comparing apples to {$aString}!";
echo $compareString;
// Output: comparing apples to oranges!

The string must be inside double quotes for this to work. It also works for array members, for example:

echo "You requested page id {$_POST['id']}";
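As a further sketch of the same braced syntax (the $user and $point values are invented for the example), nested array elements and object properties can be embedded the same way:

```php
<?php
// Hypothetical sample data, used only to demonstrate the syntax.
$user = array('name' => 'Ann', 'langs' => array('PHP', 'C#'));
$point = new stdClass();
$point->x = 3;

// Braces let the parser pick up array keys, nested indexes and properties.
$msg = "User {$user['name']} knows {$user['langs'][0]}; x = {$point->x}";

echo $msg, "\n"; // User Ann knows PHP; x = 3
```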

There is no such limitation in PHP. PHP can concatenate strings with the dot (.) operator:

$a="hello ";
$b="world";
echo $a.$b;

This outputs "hello world".

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow