Have to call curl_close() twice before handle is closed and cookie jar can be read. Is this a bug?

StackOverflow https://stackoverflow.com/questions/22751581

  •  24-06-2023

Question

I've been banging my head against the wall for hours trying to understand why cURL's cookie jar file was empty when I tried reading it. I just discovered that my code works if I call curl_close() twice instead of once, and I'm wondering whether this is a bug in cURL.

Here's an example:

curl_close($chInfo['handle']);
var_dump(is_resource($chInfo['handle']));

That outputs boolean true. So, in other words, the handle isn't closed, despite the fact that I called curl_close().

My next thought was that maybe it takes some time for the handle to be closed, so I tried using sleep() for a few seconds after the curl_close() call, but there wasn't any difference.

Out of desperation, I tried copying the curl_close() line, like this:

curl_close($chInfo['handle']);
curl_close($chInfo['handle']);
var_dump(is_resource($chInfo['handle']));

That outputs boolean false, meaning the handle is closed, and I am able to read from the cookie jar file (cURL writes the cookies to the file when the handle is closed).

So what's going on here? This seems an awful lot like a bug!

EDIT: I can't post my full code (you wouldn't want to read it anyway!), but here is a simplified example (note that only one url is fetched in this example, whereas in my real code curl_multi is utilized to fetch many URLs simultaneously):

$curlOptions = array(
    CURLOPT_USERAGENT      => 'Mozilla/5.001 (windows; U; NT4.0; en-US; rv:1.0) Gecko/25250101',
    CURLOPT_CONNECTTIMEOUT => 5, // the number of seconds to wait while trying to connect.
    CURLOPT_TIMEOUT        => 5, // the maximum number of seconds to allow cURL functions to execute.
    CURLOPT_RETURNTRANSFER => 1, // TRUE to return the transfer as a string of the return value of curl_exec() instead of outputting it out directly.
    CURLOPT_FOLLOWLOCATION => 1,
    CURLOPT_MAXREDIRS      => 10,
    CURLOPT_AUTOREFERER    => 1,
    CURLOPT_REFERER        => null,
    CURLOPT_POST           => 0,  // GET request by default
    CURLOPT_POSTFIELDS     => '', // no POST data by default
    CURLINFO_HEADER_OUT    => 1, // allows the request header to be retrieved
    CURLOPT_HEADER         => 1, // returns the response header along with the page body
    CURLOPT_URL            => 'http://www.example.com/',
    CURLOPT_COOKIEJAR      => __DIR__ . '/cookie.txt',
    CURLOPT_COOKIEFILE     => __DIR__ . '/cookie.txt'
);


$ch = curl_init();
curl_setopt_array($ch, $curlOptions); // set the options for this handle

$mh = curl_multi_init();
$responses = array();
curl_multi_add_handle($mh, $ch); // add the handle to the curl_multi object

do
{
    $result   = curl_multi_exec($mh, $running);
    $activity = curl_multi_select($mh);    // blocks until there's activity on the curl_multi connection (in which case it returns a number > 0), or until 1 sec has passed

    while($chInfo = curl_multi_info_read($mh))
    {
        $chStatus = curl_getinfo($chInfo['handle']);

        if($chStatus['http_code'] == 200) // if the page was retrieved successfully
        {
            $response = curl_multi_getcontent($chInfo['handle']); // get the response

            curl_multi_remove_handle($mh, $chInfo['handle']); // remove the curl handle that was just completed
            curl_close($chInfo['handle']);                    // close the curl handle that was just completed (cookies are saved when the handle is closed?)
            curl_close($chInfo['handle']);

            var_dump(is_resource($chInfo['handle']));
        }
        else // request failed
        {
            echo 'Error: Request failed with http_code: ' . $chStatus['http_code'] . ', curl error: ' . curl_error($chInfo['handle']). PHP_EOL;
        }
    }
} while ($running > 0);

curl_multi_close($mh);

If you run the above code, the output will be

boolean false

Indicating that the handle is closed. However, if you remove the second call to curl_close(), then the output changes to

boolean true

Indicating the handle is not closed.

Solution

This is not really a bug, just the way it works. If you look at the PHP source code you can see what is happening.

First you open the handle with $ch = curl_init(); and, looking at the source in ext\curl\interface.c, you can see that internally it sets ch->uses = 0;

Then you call curl_multi_add_handle($mh, $ch); and, looking at ext\curl\multi.c, this function does ch->uses++; so at this point ch->uses == 1.

Finally, curl_close($chInfo['handle']);, again in ext\curl\interface.c, contains the following code:

if (ch->uses) {
    ch->uses--;
} else {
    zend_list_delete(Z_LVAL_P(zid));
}

So the first call to curl_close() only decrements ch->uses, and the second call actually closes the handle.

This internal counter is only incremented by curl_multi_add_handle or curl_copy_handle. So I guess the idea was for curl_multi_add_handle to use a copy of the handle and not the actual handle.
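
To make the counter logic above easier to follow, here is a minimal pure-PHP model of it. It is only a sketch of the behavior described in this answer, not the real ext/curl code: the HandleModel class and its method names are hypothetical, and it assumes PHP 7.4 or later for typed properties.

```php
<?php
// Toy model of the ch->uses logic described above. NOT the real ext/curl code;
// HandleModel, addToMulti() and close() are hypothetical names.
final class HandleModel
{
    public int $uses = 0;        // mirrors ch->uses, which curl_init() sets to 0
    public bool $closed = false; // true once the handle would really be freed

    public function addToMulti(): void
    {
        $this->uses++;           // what curl_multi_add_handle() is described as doing
    }

    public function close(): void
    {
        if ($this->uses > 0) {
            $this->uses--;       // first call: only the counter goes down
        } else {
            $this->closed = true; // stands in for zend_list_delete()
        }
    }
}

$h = new HandleModel();
$h->addToMulti();            // uses is now 1

$h->close();
$afterFirst = $h->closed;    // still open after the first close

$h->close();
$afterSecond = $h->closed;   // closed after the second close

var_dump($afterFirst, $afterSecond); // bool(false), then bool(true)
```

This matches the output in the question: one close leaves the handle open, two closes release it.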

Other tips

There is no issue here. When using multi-curl you don't need to call curl_close. Instead, you have to call curl_multi_remove_handle on each handle you used. So the curl_close calls in your code are redundant.

See examples of proper multi-curl flow here: 1, 2.
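
As a rough illustration of the flow this answer describes, the sketch below adds handles, drives the transfers, reads the content, and then removes each handle from the multi handle. The function name fetchAll() and its parameters are hypothetical, and it assumes the curl extension is loaded. Whether an extra curl_close() is still needed before the cookie jar is written is exactly the point the other answers discuss, so treat this as the remove-only variant rather than a definitive recipe.

```php
<?php
// Sketch only: fetchAll(), $urls and $cookieJar are hypothetical names.
function fetchAll(array $urls, string $cookieJar): array
{
    $mh = curl_multi_init();
    $handles = [];

    foreach ($urls as $key => $url) {
        $ch = curl_init($url);
        curl_setopt_array($ch, [
            CURLOPT_RETURNTRANSFER => true,
            CURLOPT_COOKIEJAR      => $cookieJar,
            CURLOPT_COOKIEFILE     => $cookieJar,
        ]);
        curl_multi_add_handle($mh, $ch);
        $handles[$key] = $ch;
    }

    // Drive all transfers until none are still running.
    do {
        $status = curl_multi_exec($mh, $running);
        if ($running > 0) {
            curl_multi_select($mh); // wait for activity instead of spinning
        }
    } while ($running > 0 && $status === CURLM_OK);

    $responses = [];
    foreach ($handles as $key => $ch) {
        $responses[$key] = curl_multi_getcontent($ch);
        curl_multi_remove_handle($mh, $ch); // detach each handle, as this answer suggests
    }
    curl_multi_close($mh);

    return $responses;
}
```

A call such as fetchAll(['home' => 'http://www.example.com/'], __DIR__ . '/cookie.txt') would return the response bodies keyed the same way as the input array.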

The handle is not closed inside the loop. After the loop you can remove the handles:

    curl_multi_remove_handle($mh, $ch1);
    /* this is not supposed to be required, but the remove sometimes fails to close the connection */
    curl_close($ch1); 
    curl_multi_remove_handle($mh, $ch2);
    curl_close($ch2);

If you set up your connections as an array, you can remove them in a separate loop after the main loop:

    /* init and add connection */
    foreach ($multi_urls as $i => $url) 
    {
        $ch[$i] = curl_init($url);
        curl_setopt($ch[$i], CURLOPT_RETURNTRANSFER, 1);
        curl_multi_add_handle ($mh, $ch[$i]);
    }

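    /* main loop: run the transfers until none are still active */
    do {
        curl_multi_exec($mh, $running);
        curl_multi_select($mh);
    } while ($running > 0);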

    /* remove and close connection */
    foreach($ch AS $i => $conn)
    { 
       curl_multi_remove_handle($mh, $ch[$i]);
       curl_close($ch[$i]);
    }

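One line that can look like a mistake after reading the code is

while($chInfo = curl_multi_info_read($mh))

but the single = is intentional: it assigns each completed-transfer message to $chInfo, and the loop ends when curl_multi_info_read() returns false. Changing it to

while($chInfo == curl_multi_info_read($mh))

would only compare the result with a $chInfo that has not been assigned yet, so the line should be left as it is.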
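
To see how an assignment inside a while condition behaves, here is a minimal sketch that needs no network access. nextMessage() and $queue are hypothetical stand-ins for curl_multi_info_read() and the multi handle's message queue. The loop stores each message in $msg and stops when the stand-in returns false.

```php
<?php
// $queue and nextMessage() are hypothetical stand-ins for the multi handle's
// message queue and curl_multi_info_read().
$queue = [['handle' => 'a'], ['handle' => 'b']];

function nextMessage(array &$queue)
{
    // Returns the next message, or false when the queue is empty.
    return $queue ? array_shift($queue) : false;
}

$seen = [];
while ($msg = nextMessage($queue)) { // assignment: one iteration per message
    $seen[] = $msg['handle'];
}

var_dump($seen); // the two handle names, in queue order
```
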
License: CC-BY-SA with attribution
Not affiliated with StackOverflow