Question

I'm using the code below to simply decode and then re-encode a (relatively long) URL encoded string, without ever modifying the contents of the string:

<?php

$string = 'Data=%7B%22Data%22%3A%22%5B%7B%5C%22RoomId%5C%22%3A%5C%225241306%5C%22%2C%5C%22ControlName%5C%22%3A%5C%221%3A0%3Arate%5C%22%2C%5C%22Rate%5C%22%3A%5C%22RACK%20RATE%5C%22%2C%5C%22Allocation%5C%22%3A%5C%222%5C%22%2C%5C%22Status%5C%22%3A%5C%22a%5C%22%2C%5C%22MinStay%5C%22%3A%5C%221%5C%22%2C%5C%22ErrorControlId%5C%22%3A%5C%221%3A0%3Arate%5C%22%2C%5C%22AllocationDate%5C%22%3A%5C%2216%2F09%2F2013%2000%3A00%3A00%5C%22%7D%2C%7B%5C%22RoomId%5C%22%3A%5C%225241306%5C%22%2C%5C%22ControlName%5C%22%3A%5C%220%3A0%3Arate%5C%22%2C%5C%22Rate%5C%22%3A%5C%22RACK%20RATE%5C%22%2C%5C%22Allocation%5C%22%3A%5C%221%5C%22%2C%5C%22Status%5C%22%3A%5C%22a%5C%22%2C%5C%22MinStay%5C%22%3A%5C%221%5C%22%2C%5C%22ErrorControlId%5C%22%3A%5C%220%3A0%3Arate%5C%22%2C%5C%22AllocationDate%5C%22%3A%5C%2215%2F09%2F2013%2000%3A00%3A00%5C%22%7D%5D%22%2C%22IsWizard%22%3Afalse%2C%22InitialBindDate%22%3A%2215%2F09%2F13%22%2C%22EndBindDate%22%3A%2217%2F09%2F2013%22%7D';
echo $string;

$decoded = urldecode($string);
echo "<br><br>$decoded";

$encoded = urlencode($decoded);
echo "<br><br>$encoded";

?>

The original string is 930 chars long. After decoding and re-encoding, it's down to 924 chars. Why is this happening and how can I prevent it?

EDIT:

It should be noted that if I decode $encoded as such:

$decodedTwo = urldecode($encoded);
echo "<br><br>$decodedTwo";

Then I notice that both decoded strings are of the same length. But I need to know why the original encoded string and the re-encoded string are of different length and how I can prevent that.


Solution

The re-encoded string differs from the original in two ways:

= is re-encoded as %3D, making the string 2 characters longer (1 occurrence).

%20 is re-encoded as +, making the string 8 characters shorter (4 occurrences, 2 characters each).

The net difference is the 6 characters you are seeing.
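
The arithmetic can be checked with substr_count. This is a minimal sketch on a shortened, hypothetical sample (one = and two %20, not the full string from the question), and the helper name lengthChange is made up for illustration:

```php
<?php

// Shortened stand-in for the question's string (hypothetical sample):
// one "=" separator and two "%20" sequences.
$sample = 'Data=%7B%22Rate%22%3A%22RACK%20RATE%22%2C%22Date%22%3A%2216%2F09%2F2013%2000%3A00%3A00%22%7D';

// Hypothetical helper: predicts how the length changes after
// urlencode(urldecode(...)), assuming "=" and "%20" are the only
// sequences that come back in a different form.
function lengthChange(string $s): int
{
    $grow   = 2 * substr_count($s, '=');    // "="   (1 char)  becomes "%3D" (3 chars)
    $shrink = 2 * substr_count($s, '%20');  // "%20" (3 chars) becomes "+"   (1 char)
    return $grow - $shrink;
}

$roundTrip = urlencode(urldecode($sample));

echo lengthChange($sample), "\n";                 // predicted change in length
echo strlen($roundTrip) - strlen($sample), "\n";  // measured change in length
```

For the full string in the question the same count gives 2 * 1 - 2 * 4 = -6, which matches the drop from 930 to 924 characters.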

Doing a simple str_replace like

$encoded = str_replace(["%3D", "+"], ["=", "%20"], $encoded);

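That should resolve the problem in this case. Be aware that it would also turn any %3D inside the value back into =, so it is only safe while the = separator is the sole equals sign in the data. Both forms are valid representations of the same data, though, so I am curious as to why the length difference is a problem.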
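
If the goal is to get the original bytes back without patching the output afterwards, one option is to split off the key first and use rawurlencode, which writes a space as %20 rather than +. A minimal sketch, assuming the input is a single key=value pair as in the question; the helper name reencodePair and the shortened sample string are made up for illustration:

```php
<?php

// Hypothetical helper: decodes and re-encodes one "key=value" pair
// while leaving the "=" separator itself untouched.
function reencodePair(string $pair): string
{
    // Split on the first "=" only, so the separator is never encoded.
    [$key, $value] = explode('=', $pair, 2);

    // rawurlencode() writes a space as %20; urlencode() would write +.
    return rawurlencode(rawurldecode($key)) . '=' . rawurlencode(rawurldecode($value));
}

// Shortened stand-in for the string from the question (hypothetical sample).
$string = 'Data=%7B%22Rate%22%3A%22RACK%20RATE%22%7D';

$encoded = reencodePair($string);

var_dump($encoded === $string);                  // same bytes as the input for this sample
var_dump(strlen($encoded) === strlen($string));  // and therefore the same length
```

This only reproduces the input exactly when the original was produced by an encoder with the same rules (uppercase hex digits, every reserved character escaped), which appears to be the case for the string in the question.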

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow