The output of a PHP script has little to do with the memory actually used by the PHP (plus web server) process. The overhead varies with the loaded modules and the server configuration; basically, in your setup each request spawns a `suphp` instance, and even if it is dynamically linked (it is, isn't it?), lots of internal structures will be duplicated. I've found the Apache PHP module to be thriftier in its memory usage (but that too depends on the loaded modules, often not even PHP modules; so check your configuration if you want to jump ship).
You can use the PHP memory reporting functions (`memory_get_usage()` and `memory_get_peak_usage()`) to better evaluate the memory footprint of your requests during their lifetime, not only at the end, and see which operations consume the most memory. Freeing unneeded objects with `unset()` is often beneficial.
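As a sketch (the exact numbers depend on your PHP build and version), you can bracket an expensive step with `memory_get_usage()` snapshots and free the large intermediate structure as soon as you have reduced it to what you need:

```php
<?php
// Snapshot memory before and after a hypothetical expensive step.
$before = memory_get_usage();

// Expensive step: build a large intermediate array.
$rows = array();
for ($i = 0; $i < 100000; $i++) {
    $rows[] = array('id' => $i, 'value' => $i * 2);
}
$afterBuild = memory_get_usage();

// Reduce the data to the result you actually need...
$total = 0;
foreach ($rows as $row) {
    $total += $row['value'];
}
// ...then free the big array; it is no longer reachable and is reclaimed.
unset($rows);
$afterFree = memory_get_usage();

// memory_get_peak_usage() shows the high-water mark for the request.
printf("build: %d bytes, after unset: %d bytes, peak: %d bytes\n",
    $afterBuild - $before, $afterFree - $before, memory_get_peak_usage());
```

Running this shows the usage dropping back down after the `unset()`, while the peak records how high it climbed in between.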
You can run the requests one after the other and save the intermediate results to a disk file to reduce the overall memory footprint. The last call can then collect the results and send them along with one of the passthru functions.
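A minimal sketch of that scheme (the file naming and the `saveColumn`/`sendResults` helpers are hypothetical): each partial request serializes its result to a temp file, and the final call streams the pieces back with `readfile()` instead of assembling one big string in memory:

```php
<?php
// Each partial request stores its result on disk instead of in memory.
function saveColumn($requestId, $column, $data) {
    $path = sys_get_temp_dir() . "/result-{$requestId}-{$column}.json";
    file_put_contents($path, json_encode($data));
}

// The final call collects the pieces and streams them to the client.
function sendResults($requestId, array $columns) {
    header('Content-Type: application/json');
    echo '{';
    $first = true;
    foreach ($columns as $column) {
        if (!$first) {
            echo ',';
        }
        $first = false;
        echo json_encode($column) . ':';
        // readfile() copies the file straight to the output stream
        // without loading it into a PHP string.
        readfile(sys_get_temp_dir() . "/result-{$requestId}-{$column}.json");
    }
    echo '}';
}

saveColumn('demo', 'column1', array('a' => 1));
saveColumn('demo', 'column2', array('b' => 2));
sendResults('demo', array('column1', 'column2'));
```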
Disabling output buffering with `ob_end_clean()` may help reduce the memory usage of the server process (even if it can do nothing for the PHP part). In your case I'm not sure it will work, seeing as the process is actually an independent `suphp` binary, but you can always try and see what happens.
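One simple way to try it, assuming the configuration may have stacked more than one buffer:

```php
<?php
// Fill a buffer, then drop every active output buffer so subsequent
// output goes straight to the SAPI instead of accumulating in PHP
// memory. Use ob_end_flush() instead if buffered content must be sent.
ob_start();
echo "this output is buffered and will be discarded";

while (ob_get_level() > 0) {
    ob_end_clean();
}
```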
Merging requests in a large JSON object
If several calls use the same input parameters, or "mergeable" parameters, you can try to merge the calls into a single one, which will probably also make everything that much faster: instead of returning one object, you return several objects rolled into one:
```php
header('Content-Type: application/json');
die(json_encode(array(
    'column1' => 'Output for the call to column1',
    'column2' => ...
)));
```
and in jQuery you get all of them at once and dispatch each one where it needs to go:

```javascript
function loadPage() {
    $.get('/test-big.php', {
        parcol1: 'parameter for the code generating column 1'
    }, function(data, textStatus, jqXHR) {
        $('#column_1').append($('<div>').html(data.column1));
        $('#column_2').append($('<div>').html(data.column2));
        // ...
    });
}
```
Even if the total time to satisfy all requests is longer, and there is then a longer delay before "something happens on the page", I feel this approach is almost always preferable, for several reasons:
- it is a single HTTP connection, saving the connection overhead. With HTTP pipelining (or keep-alive) this is no longer a very big advantage, but it is still an advantage.
- the compression functions normally used by HTTP servers have a "warm-up curve" and don't compress, say, the first KB as well as the second and subsequent ones, especially when there are many repetitions. Sending five 2-KB responses might turn each into 0.5 KB, for a total of 2.5 KB; but in a single 10-KB response the first KB might become 0.3 KB, the second 0.2 KB, and so on (0.3 + 0.2 + 0.18 + 0.18 + ... ≈ 2 KB), which is a 20% saving at no cost.
- unless the requests are completely uncorrelated and hit totally different areas of your application, which is very unlikely, they will all incur the same setup costs (database connections, filters, buffers, loading classes, caching...). Incurring those costs only once instead of several times is a sweet deal.
- you use far fewer resources: not only RAM (one process overhead instead of many), but one DB connection, one cache connection, and so on and so forth. It's true that the total resource usage of several separate requests is not strictly proportional to their number, because some will have terminated before others even start, but with a single merged request you are sure to pay those fixed costs exactly once.
- if the requests are strongly correlated (e.g. you ask for the total of one stock column to display, say, a semaphore graphic, and the total of another for a different semaphore...), then you can re-engineer the SQL query (or whatever the data source is) to return all the information at once. Instead of

```sql
SELECT opening, closing FROM nyse WHERE st='XXXX';
```

for twenty values of XXXX, you can run

```sql
SELECT st, opening, closing FROM nyse WHERE st IN ('XXXX1','XXXX2',...);
```

and replace twenty queries with a single one that is only marginally slower than the first. This can't always be done, but it happens often enough to be worth keeping in mind. By appropriately crafting the jQuery code, you might even be able to dispatch all the answers with a single `.each` loop (e.g. if each target DIV has an ID identical to the stock code, you might send a JSON object containing, say, `'#XXXX1.opening' => 1.23, '#XXXX1.closing' => 1.24, ...`).
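As a sketch of that last idea (the `nyse` table, the `buildQuotePayload` helper, and the selector scheme are hypothetical; `$rows` stands in for the rows fetched from the merged `IN (...)` query with your DB layer of choice):

```php
<?php
// Build one JSON object keyed by the jQuery selectors that will
// receive each value, so the client can dispatch with a single loop.
function buildQuotePayload(array $rows) {
    $payload = array();
    foreach ($rows as $row) {
        // '#XXXX1.opening' targets the element with id XXXX1 and
        // class opening, matching the selector scheme above.
        $payload['#' . $row['st'] . '.opening'] = $row['opening'];
        $payload['#' . $row['st'] . '.closing'] = $row['closing'];
    }
    return $payload;
}

header('Content-Type: application/json');
echo json_encode(buildQuotePayload(array(
    array('st' => 'XXXX1', 'opening' => 1.23, 'closing' => 1.24),
    array('st' => 'XXXX2', 'opening' => 4.56, 'closing' => 4.50),
)));
```

On the client side, the whole payload can then be dispatched with one loop: `$.each(data, function(selector, value) { $(selector).text(value); });`.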