Question

I have a problem where a remote web client on a slow connection fails to send a complete POST request with multipart/form-data content, yet PHP still uses the partially received data to populate the $_POST array. As a result, one value in $_POST can be incomplete and further values can be missing. I asked the same question on the Apache list first and was told that Apache doesn't buffer the request body and passes it to the PHP module as a giant blob.

Here is my sample POST request:

POST /test.php HTTP/1.0
Connection: close
Content-Length: 10000
Content-Type: multipart/form-data; boundary=ABCDEF

--ABCDEF
Content-Disposition: form-data; name="a"

A
--ABCDEF

You can see that Content-Length is 10000 bytes, but I send just one var a=A.

The PHP script is:

<?php print_r($_REQUEST); ?>

Web server waits for about 10 seconds for the rest of my request (but I don't send anything) and then returns this response:

HTTP/1.1 200 OK
Date: Wed, 27 Nov 2013 19:42:20 GMT
Server: Apache/2.2.22 (Debian)
X-Powered-By: PHP/5.4.4-14+deb7u3
Vary: Accept-Encoding
Content-Length: 23
Connection: close
Content-Type: text/html

Array
(
     [a] => A
)

So here is my question: How can I verify in PHP that the post request was received completely? $_SERVER['CONTENT_LENGTH'] would show 10000 from the request header, but is there a way to check the real content length received?


Solution

I assume the remote client is a browser rendering an HTML page. If not, let me know and I'll try to adapt my solution.

You can add a field such as <input type="hidden" name="complete"> as the last parameter of the form. In PHP, first check whether this parameter was sent by the client. If it was, you can be sure that you received the entire data.

I'm not sure whether the RFCs (for HTML and for HTTP) require the order of parameters to be preserved, but I tried several variations and the order was indeed kept.

A better solution would be to calculate a hash of the parameters on the client side and send it as another parameter, so you can be absolutely sure that you received the entire data. But this is starting to sound complicated...

Other tips

As far as I know there is no way to check if the size of received content matches the value of the Content-Length header when using multipart/form-data as Content-Type, because you cannot get hold of the raw content.

1) If you can change Content-Type (to application/x-www-form-urlencoded for example) you can read php://input, which will contain the raw content of the request. The size of php://input should match Content-Length (assuming the value of Content-Length is correct). If there's a match, you can still use $_POST to get the processed content (regular post data). See the PHP manual entry on php://input for details.

2) Or you can serialize the data on the client and send it as text/plain. The server can check the size the same way as described above. The server will need to unserialize the received content to be able to work with it. And if the client generates a hash of the serialized data and sends it along in a header (X-Content-Hash for example), the server can also generate a hash and check whether it matches the one in the header. In that case you won't need to check the size, and can be sure the content is correct.

3) If you cannot change Content-Type, you'll need something other than size to verify the content. The client could use an extra header (something like X-Form-Data-Fields) to list the fields/keys/names of the content it is sending. The server could then check whether all fields mentioned in the header are present in the content.

4) Another solution would be for the client to have a predefined key/value as last entry in the content. Something like:

--boundary
Content-Disposition: form-data; name="_final_field_"

TRUE
--boundary--

The server can check if that field is present in the content, if so the content must be complete.
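A minimal sketch of the server-side check for option 4, assuming PHP 7 or later. The helper name postHasFinalField() and the expected value "TRUE" are assumptions made for illustration; the field name is taken from the example above. It only works if the client really sends this field last.

```php
<?php
// Hypothetical helper: returns true only when the marker field arrived
// with its full expected value.
function postHasFinalField(
    array $post,
    string $name = '_final_field_',
    string $expected = 'TRUE'
): bool {
    // Comparing the value as well as the key guards against a marker
    // whose own value was cut short.
    return isset($post[$name]) && $post[$name] === $expected;
}

// Possible use at the top of the receiving script:
// if (!postHasFinalField($_POST)) {
//     http_response_code(400);
//     exit('Incomplete request');
// }
```

Rejecting the request early keeps partially received values out of the rest of the script.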

Update

When you need to pass binary data, you can't use option 1, but can still use option 2:

The client can base64 encode the binary entries, serialize the data (with any technique you like), generate a hash of the serialized data, send the hash as header and data as body. The server can generate a hash of the received content, check the hash with the one in the header (and report a mismatch), unserialize the content, base64 decode the binary entries.

This is a bit more work than plainly using multipart/form-data, but the server can verify with a 100% guarantee that the content is the same as what the client sent.
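The flow described in this update could look roughly like the sketch below, assuming PHP 7.1 or later. The function names packPayload() and unpackPayload(), the use of JSON as the serialization, and SHA-256 as the hash are all assumptions; the answer leaves those choices open. Non-binary fields are assumed to be valid UTF-8 so that json_encode() succeeds.

```php
<?php
// Client side of the sketch: base64-encode the binary entries, then serialize.
function packPayload(array $fields, array $binaryKeys): string
{
    foreach ($binaryKeys as $key) {
        if (isset($fields[$key])) {
            $fields[$key] = base64_encode($fields[$key]);
        }
    }
    return json_encode($fields);
}

// Server side: check the hash first, then unserialize and decode.
// Returns the decoded fields, or null when the body does not match the hash.
function unpackPayload(string $body, string $headerHash, array $binaryKeys): ?array
{
    if (!hash_equals(hash('sha256', $body), strtolower($headerHash))) {
        return null; // truncated or altered body
    }
    $fields = json_decode($body, true);
    if (!is_array($fields)) {
        return null;
    }
    foreach ($binaryKeys as $key) {
        if (isset($fields[$key])) {
            $fields[$key] = base64_decode($fields[$key], true);
        }
    }
    return $fields;
}
```

On the server, the body would come from file_get_contents('php://input') and the hash from $_SERVER['HTTP_X_CONTENT_HASH'], which is how the X-Content-Hash header suggested above is exposed to PHP.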

If you can change the enctype to

multipart/form-data-alternate

then you can check

strlen(file_get_contents('php://input'))

vs.

$_SERVER['CONTENT_LENGTH']

This is a known bug in PHP and needs to be fixed there - https://bugs.php.net/bug.php?id=61471
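The length comparison above can be wrapped in a small function, sketched here for PHP 7.1 or later. The name bodyLengthMatches() is hypothetical. The check depends on php://input actually being readable for the request, which is why the answer proposes a non-standard enctype.

```php
<?php
// Compare the number of body bytes actually received with the declared length.
// $declaredLength is a string (or null) because $_SERVER values are strings.
function bodyLengthMatches(string $rawBody, ?string $declaredLength): bool
{
    if ($declaredLength === null || !ctype_digit($declaredLength)) {
        return false; // no usable Content-Length header
    }
    return strlen($rawBody) === (int) $declaredLength;
}

// Possible use:
// $complete = bodyLengthMatches(
//     file_get_contents('php://input'),
//     $_SERVER['CONTENT_LENGTH'] ?? null
// );
```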

They probably get cut off by limits in Apache or PHP. I believe Apache also has a config variable for this.

Here are the PHP settings:

php.ini

post_max_size=20M
upload_max_filesize=20M

.htaccess

php_value post_max_size 20M
php_value upload_max_filesize 20M

Regarding form values that are completely missing due to connectivity issues, you can just check if they are set:

if (isset($_POST['key'])) {
    // value is set
} else {
    // value is missing, possibly because the connection was interrupted
}

For the large form data (such as an image upload) you could check the size of the received file using

$_FILES['key']['size']

A simple solution might use JavaScript to calculate the file size on the client side, and append that value to the form as a hidden input on form submission. You get the file size in JS using something like

var filesize = input.files[0].size;

Reference: JavaScript file upload size validation

Then on file upload, if the hidden form input's value matches the size of the uploaded file, the request was not interrupted by network connectivity issues.
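On the PHP side, the comparison could be sketched as follows (PHP 7 or later assumed). The function name uploadLooksComplete() and the hidden field name key_size in the usage comment are assumptions; the sketch also checks the upload error code, since PHP marks a partially received file with UPLOAD_ERR_PARTIAL.

```php
<?php
// $file is one entry of $_FILES; $expectedSize is the size the client reported.
function uploadLooksComplete(array $file, int $expectedSize): bool
{
    // Anything other than UPLOAD_ERR_OK (for example UPLOAD_ERR_PARTIAL)
    // is treated as a failed upload.
    if (!isset($file['error'], $file['size']) || $file['error'] !== UPLOAD_ERR_OK) {
        return false;
    }
    return (int) $file['size'] === $expectedSize;
}

// Possible use, with "key_size" as the hypothetical hidden input:
// $ok = uploadLooksComplete($_FILES['key'], (int) ($_POST['key_size'] ?? -1));
```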

Maybe you can verify the request with a validation variable instead of the length, e.g.:

// client
$clientVars = array('var1' => 'val1', 'otherVar' => 'some value');
ksort($clientVars);  // sort by key so both sides hash the values in the same order
$validVar = md5(implode('', $clientVars));
// http_build_query() URL-encodes the values (e.g. the space in 'some value')
$values = http_build_query($clientVars + array('validVar' => $validVar));
httpRequest($url, $values);  // httpRequest() stands for whatever HTTP client you use

// server
$validVar = isset($_POST['validVar']) ? $_POST['validVar'] : '';
unset($_POST['validVar']);
ksort($_POST);  // same key order as on the client
if (md5(implode('', $_POST)) === $validVar) {
    // completed POST, do something
} else {
    // not completed POST, log error and do something
}

I was also going to recommend using a hidden value, or hashing as MeNa mentions. (The issue there is that some algorithms are implemented differently across platforms, so your CRC32 in JS might differ from a CRC32 in PHP. But with some testing you should be able to find a compatible one.)

I'm going to suggest symmetric encryption, just because it's an option. (I don't believe it's faster than hashing.) Besides confidentiality, encryption can also offer integrity, provided an authenticated mode is used, i.e. is the received message the one that was sent.

Although stream ciphers are very fast, block ciphers like AES can be very fast as well, but this depends on your system, the languages you use, etc. (Here too, different implementations mean not all encryption is created equal.)

If you can't decrypt the message (or it gives a garbled mess), then the message was incomplete.

But seriously, use hashing. Hash the POST on the client, and on the server first check the length of the hash. A given hash algorithm always produces a fixed-length digest, so if the length doesn't match, it's wrong. Then hash the received POST and compare it with the hash sent by the client. If you do it over the full POST, in a specified order (so any reordering is undone), the overhead is minimal.

All this assumes you can't simply inspect the POST to see whether fields are missing, e.g. with isset(), length > 0, !empty()...

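I think what you are looking for is $HTTP_RAW_POST_DATA. It gives you the POST body that was actually received, so you can compare its length to $_SERVER['CONTENT_LENGTH']. Note that this variable was deprecated in PHP 5.6 and removed in PHP 7.0 in favor of php://input, and that it is not populated for multipart/form-data requests.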

I don't think it's possible to calculate original content size from $_REQUEST superglobal, at least for multipart/form-data requests.

I would add a custom header to your HTTP request containing a hash of all parameter=value pairs, to be checked server side. Headers will arrive for sure, so your hash header is always there. Be sure to join the parameters in the same order, otherwise the hash will be different. Also pay attention to encoding, which must be the same on client and server.

If you can configure Apache, you could add a vhost with mod_proxy, configured to proxy to another vhost on the same server. This should filter out incomplete requests. Note that you're using 2 sockets per request this way, so keep an eye on resource usage if you decide to go this way.

Another solution that might be useful: if the connection from the other side is slow, just remove the time limit for executing the POST.

set_time_limit(0);

Then the whole POST data should have time to arrive.

If computing the content length isn't reasonable, you could probably get away with signing the data sent by the client.

Using javascript, serialize the form data to a json string or equivalent in a reasonably sane manner (i.e. sort it as needed) before submitting. Hash this string using one or two reasonably fast algorithms (e.g. crc32, md5, sha1), and add this extra hash data to what is about to get sent as a signature.

On the server, strip this extra hash data from the $_POST request, and then redo the same work in PHP. Compare the hashes accordingly: nothing got lost in translation if the hashes match. (Use two hashes if you want to avoid the minuscule risk of getting false positives.)

I'd wager there's a reasonable means to do something similar for files, e.g. fetching their name and size in JS, and adding that additional information to the data that gets signed.

This is somewhat related to what some PHP frameworks do to avoid tampering with session data, when the latter gets managed and stored in client-side cookies, so you'll probably find some readily available code to do this in the latter context.
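The server half of this signing scheme might be sketched as below, assuming PHP 7 or later. The field name _signature and the helper names are hypothetical, and SHA-1 is just one of the algorithms mentioned above. The JSON flags are an attempt to match what JavaScript's JSON.stringify produces; whether both sides really emit byte-identical strings is an assumption to verify with your own data. Note that $_POST values are always strings, so the client would need to serialize numbers as strings too.

```php
<?php
// Sort keys at every nesting level so both sides serialize in the same order.
function sortRecursive(array $data): array
{
    ksort($data);
    foreach ($data as $key => $value) {
        if (is_array($value)) {
            $data[$key] = sortRecursive($value);
        }
    }
    return $data;
}

// Hash of the canonical (sorted, JSON-serialized) form of the fields.
function signFields(array $fields): string
{
    $json = json_encode(
        sortRecursive($fields),
        JSON_UNESCAPED_SLASHES | JSON_UNESCAPED_UNICODE
    );
    return sha1($json);
}

// Strip the signature from the received fields, recompute it, and compare.
function verifySignedPost(array $post, string $signatureField = '_signature'): bool
{
    if (!isset($post[$signatureField]) || !is_string($post[$signatureField])) {
        return false;
    }
    $received = $post[$signatureField];
    unset($post[$signatureField]);
    return hash_equals(signFields($post), $received);
}
```

A call such as verifySignedPost($_POST) would then gate the rest of the request handling.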


Original answer:

Insofar as I'm aware, the difference between sending a GET or a POST request more or less amounts to sending something like:

GET /script.php?var1=foo&var2=bar
headers

vs sending something like:

POST /script.php
headers

var1=foo&var2=bar              <-- content length is the length of this chunk

So for each part, you could calculate the length and check that vs the length advertised by the content-length header.

  • $_FILES entries have a handy size field which you can use directly.
  • For $_POST data, rebuild the query string that was sent and compute its length.

Points to be wary about:

  1. You need to know how the data is expected to be sent in some cases, e.g. var[]=foo&var[]=baz vs var[0]=foo&var[1]=baz
  2. You're dealing with the C-string length rather than the multibyte length in the latter case. (Though, I wouldn't be surprised to learn that an odd browser behaves inconsistently here and there.)
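For the $_POST part of the list above, rebuilding the query string could be sketched like this (PHP 7 or later assumed; urlencodedBodyLength() is a hypothetical name). As the first caveat warns, http_build_query() picks one particular encoding (indexed keys for arrays, "+" for spaces), so a mismatch with Content-Length may only reflect a different encoding on the client, not a truncated request.

```php
<?php
// Estimate the byte length of an application/x-www-form-urlencoded body
// by re-encoding the parsed fields.
function urlencodedBodyLength(array $post): int
{
    // The separator is passed explicitly so the result does not depend on
    // the arg_separator.output ini setting.
    return strlen(http_build_query($post, '', '&'));
}

// Possible use:
// $matches = urlencodedBodyLength($_POST) === (int) ($_SERVER['CONTENT_LENGTH'] ?? -1);
```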


License: CC-BY-SA with attribution
Not affiliated with StackOverflow