Question


Update: the HttpRequestPool class provides a solution. Many thanks to those who pointed this out.

A brief tutorial can be found at: http://www.phptutorial.info/?HttpRequestPool-construct


Problem

I'd like to make concurrent/parallel/simultaneous HTTP requests in PHP. I'd like to avoid consecutive requests as:

  • a set of requests will take too long to complete; the more requests, the longer
  • the timeout of one request midway through the set may prevent later requests from being made (if the script has an execution time limit)

I have managed to find details for making simultaneous HTTP requests in PHP with cURL; however, I'd like to explicitly use PHP's HTTP functions if at all possible.

Specifically, I need to POST data concurrently to a set of URLs. The URLs to which data are posted are beyond my control; they are user-set.

I don't mind if I need to wait for all requests to finish before the responses can be processed. If I set a timeout of 30 seconds on each request and requests are made concurrently, I know I must wait a maximum of 30 seconds (perhaps a little more) for all requests to complete.
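
For reference, I'd expect the per-request timeout to be settable through the extension's options array, along these lines (untested; 'timeout' and 'connecttimeout' are the option names as I understand them):

<?php
// hypothetical: cap connection setup and total transfer time at 30 seconds each
$request_1->setOptions(array('connecttimeout' => 30, 'timeout' => 30));
?>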

I can find no details of how this might be achieved. However, I did recently notice a mention in the PHP manual of PHP5+ being able to handle concurrent HTTP requests - I intended to make a note of it at the time, forgot, and cannot find it again.

Single request example (works fine)

<?php
// a single POST request: build it, attach the body, send and block until done
$request_1 = new HttpRequest($url_1, HTTP_METH_POST);
$request_1->setRawPostData($dataSet_1);
$request_1->send();
?>

Concurrent request example (incomplete, clearly)

<?php
$request_1 = new HttpRequest($url_1, HTTP_METH_POST);
$request_1->setRawPostData($dataSet_1);

$request_2 = new HttpRequest($url_2, HTTP_METH_POST);
$request_2->setRawPostData($dataSet_2);

// ...

$request_N = new HttpRequest($url_N, HTTP_METH_POST);
$request_N->setRawPostData($dataSet_N);

// Do something to send() all requests at the same time
?>

Any thoughts would be most appreciated!

Clarification 1: I'd like to stick to the PECL HTTP functions as:

  • they offer a nice OOP interface
  • they're used extensively in the application in question and sticking to what's already in use should be beneficial from a maintenance perspective
  • I generally have to write fewer lines of code to make an HTTP request using the PECL HTTP functions than with cURL; fewer lines of code should also be beneficial from a maintenance perspective

Clarification 2: I realise PHP's HTTP functions aren't built in and perhaps I worded things wrongly there, which I shall correct. I have no concerns about people having to install extra stuff - this is not an application that is to be distributed, it's a web app with a server to itself.

Clarification 3: I'd be perfectly happy if someone authoritatively states that the PECL HTTP extension cannot do this.


Solution

I'm pretty sure HttpRequestPool is what you're looking for.

To elaborate a little: you could use forking to achieve what you're looking for, but that seems unnecessarily complex and not very useful in an HTML context. While I haven't tested it, this code should do it:

// let $requests be an array of HttpRequest objects to send
$pool = new HttpRequestPool();
foreach ($requests as $request) {
  $pool->attach($request);
}
$pool->send(); // sends all attached requests concurrently

foreach ($pool as $request) {
  // do stuff with each completed request, e.g. $request->getResponseBody()
}
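
Tying this back to the question's POST scenario, an untested sketch might look like this (assuming the PECL HTTP 1.x API, with $urls and $dataSets standing in for the user-supplied values):

$urls     = array($url_1, $url_2 /* , ... */);
$dataSets = array($dataSet_1, $dataSet_2 /* , ... */);

$pool = new HttpRequestPool();
foreach ($urls as $i => $url) {
  $request = new HttpRequest($url, HTTP_METH_POST);
  $request->setRawPostData($dataSets[$i]);
  $request->setOptions(array('timeout' => 30)); // per-request timeout
  $pool->attach($request);
}

$pool->send(); // blocks until every request has completed or timed out

foreach ($pool as $request) {
  echo $request->getUrl(), ': ', $request->getResponseCode(), "\n";
  $body = $request->getResponseBody();
  // process $body ...
}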

OTHER TIPS

Did you try HttpRequestPool (it's part of the PECL HTTP extension)? It looks like it pools up the request objects and works through them. I know I read somewhere that the extension supports simultaneous requests, and aside from the pool class I can't find anything either.

I once had to solve a similar problem: making multiple requests without accumulating the response times.

The solution ended up being a custom-built function that used non-blocking sockets. It works something like this:

$request_list = array(
  # address => HTTP request string
  #
  '127.0.0.1'   => "GET /index.html HTTP/1.1\r\nHost: website.com\r\n\r\n",
  '192.169.2.3' => "POST /form.dat HTTP/1.1\r\nHost: website.com\r\n...",
);

$socklist = array();

foreach ($request_list as $addr => $http_request) {
    # first, create a socket and fire the request off to every host
    $socklist[$addr] = socket_create(AF_INET, SOCK_STREAM, SOL_TCP);
    socket_set_nonblock($socklist[$addr]); # make operations asynchronous

    # a non-blocking connect usually "fails" with EINPROGRESS while the
    # connection completes in the background; that is not a real error
    if (!socket_connect($socklist[$addr], $addr, 80)
            && socket_last_error($socklist[$addr]) != SOCKET_EINPROGRESS) {
        trigger_error("Cannot connect to remote address");
    }

    # the HTTP request is sent to this host
    socket_send($socklist[$addr], $http_request, strlen($http_request), MSG_EOF);
}

$results = array();

foreach (array_keys($socklist) as $host_ip) {
    $results[$host_ip] = '';
    # now loop and read from this socket until it is exhausted
    # (a real implementation would use socket_select() and retry on EAGAIN;
    # omitted here for brevity)
    while (($str = socket_read($socklist[$host_ip], 512, PHP_NORMAL_READ)) != "") {
        # append to the response collected so far
        $results[$host_ip] .= $str;
    }
    # done reading this socket, close it
    socket_close($socklist[$host_ip]);
}
# $results now contains an array with the full response (including HTTP headers)
# of every connected host.

It's much faster since the chunked responses are fetched semi-in-parallel: socket_read() doesn't wait for the complete response, but returns as soon as there is data in the socket buffer.

You can wrap this in appropriate OOP interfaces. You will need to build the HTTP request string yourself and process the server response, of course.
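
For example, a request-string builder could look something like this (build_post_request is a hypothetical helper, not part of any library):

function build_post_request($host, $path, array $data) {
    $body = http_build_query($data); // urlencode the form fields
    return "POST $path HTTP/1.1\r\n"
         . "Host: $host\r\n"
         . "Content-Type: application/x-www-form-urlencoded\r\n"
         . "Content-Length: " . strlen($body) . "\r\n"
         . "Connection: close\r\n"
         . "\r\n"
         . $body;
}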

A friend pointed me to CurlObjects (http://trac.curlobjects.com/trac) recently, which I found quite useful for using curl_multi.

$curlbase = new CurlBase;
$curlbase->defaultOptions[CURLOPT_TIMEOUT] = 30;
$curlbase->add(new HttpPost($url,  array('name' => 'value', 'a' => 'b')));
$curlbase->add(new HttpPost($url2, array('name' => 'value', 'a' => 'b')));
$curlbase->add(new HttpPost($url3, array('name' => 'value', 'a' => 'b')));
$curlbase->perform();

foreach ($curlbase->requests as $request) {
    // ...
}

PHP's HTTP functions aren't built in either; they're a PECL extension. If your concern is people having to install extra stuff, both solutions have the same problem, and cURL is more likely to be installed already, I'd imagine, as it comes by default with every web host I've ever been on.

You could use pcntl_fork() to create a separate process for each request, then wait for them to end:

http://www.php.net/manual/en/function.pcntl-fork.php
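
An untested sketch of that approach (assuming $requests is an array of HttpRequest objects as in the question; note that each child gets its own copy of memory, so responses would have to be passed back via files, shared memory, or some other IPC):

$pids = array();
foreach ($requests as $request) {
    $pid = pcntl_fork();
    if ($pid == -1) {
        die('could not fork');
    } elseif ($pid == 0) {
        // child process: send this one request, then exit
        $request->send();
        exit(0);
    }
    $pids[] = $pid; // parent: remember the child's PID
}

// parent: wait for every child to finish
foreach ($pids as $pid) {
    pcntl_waitpid($pid, $status);
}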

Is there any reason you don't want to use cURL? The curl_multi_* functions would allow for multiple requests at the same time.
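
A rough, untested example of the curl_multi_* approach (where $urls is an array of target URLs and $dataSets the matching POST bodies, as in the question):

$mh      = curl_multi_init();
$handles = array();

foreach ($urls as $i => $url) {
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_POST, true);
    curl_setopt($ch, CURLOPT_POSTFIELDS, $dataSets[$i]);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_TIMEOUT, 30);
    curl_multi_add_handle($mh, $ch);
    $handles[] = $ch;
}

// drive all transfers until none are still running
$running = 0;
do {
    curl_multi_exec($mh, $running);
    curl_multi_select($mh); // sleep until there is activity; avoids busy-waiting
} while ($running > 0);

foreach ($handles as $ch) {
    $response = curl_multi_getcontent($ch);
    // process $response ...
    curl_multi_remove_handle($mh, $ch);
    curl_close($ch);
}
curl_multi_close($mh);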

Licensed under: CC-BY-SA with attribution