Question

One of our clients requires that we send some transaction information to their APIs using SOAP. One of the fields needs to be trimmed to 30 characters, so we use mb_substr() as follows:

$params->Request->Description = mb_substr($title, 0, 30, 'UTF-8');

We instantiate the SoapClient object as follows:

$client = new SoapClient(
    $wsdlUri,
    array(
        'trace' => 1,
        'exceptions' => true,
        'cache_wsdl' => WSDL_CACHE_NONE,
        'soap_version' => SOAP_1_2,
        'encoding' => 'UTF-8'
    )
);

My understanding is that this will tell SoapClient that strings will be provided in UTF-8 format, and when we trim to 30 characters we are doing it to 30 UTF-8 characters, and not 30 bytes.
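To make the character/byte distinction concrete, here is a minimal sketch using one of the titles from this question. It assumes the mbstring extension is loaded and PHP 7 or later (for the "\u{...}" escape, used here only so the example does not depend on how the source file is saved).

```php
<?php
// Title from the question; "\u{00AE}" is the ® sign, which takes 2 bytes in UTF-8.
$title = "Call of Duty\u{00AE}: Ghosts Gold Edition";

// Trim to 30 characters, exactly as in the question.
$description = mb_substr($title, 0, 30, 'UTF-8');

echo mb_strlen($description, 'UTF-8'), "\n"; // 30 (characters)
echo strlen($description), "\n";             // 31 (bytes)
```

So a string that is exactly 30 characters long can still be longer than 30 bytes once it contains a multi-byte character.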

Sound of Contact - Möbius Slip is being sent as Sound of Contact - Möbius Slip. Call of Duty®: Ghosts Gold Edition is being sent as Call of Duty®: Ghosts Gold Edi. I can see with these that we have 31 characters here, which is why the remote service is rejecting the call. If the title is less than 31 characters then it goes through fine, even when characters are mangled by the encoding.

We know that $title is OK, because we send this (the whole thing) to other sources via SOAP with no problem; it is stored in the remote system and is displayed correctly. It's just this one web service that we're having a problem with. Am I doing something wrong when instantiating the SoapClient object? Am I using mb_substr() incorrectly? Is there something else that I've missed out?

This is an example of the XML that is being generated:

<?xml version="1.0" encoding="UTF-8"?>
<env:Envelope xmlns:env="http://www.w3.org/2003/05/soap-envelope" xmlns:ns1="http://ourclient/webservices/" xmlns:ns2="http://ourclient/webservices/method/" xmlns:ns3="http://www.w3.org/2005/08/addressing">
  <env:Header>
    <ns3:Action>http://ourclient/webservices/method/action</ns3:Action>
  </env:Header>
  <env:Body>
    <ns1:Method>
      <ns1:Request>
        <!-- CROPPED -->
        <ns1:Description>Call of Duty®: Ghosts Gold Edi</ns1:Description>
      </ns1:Request>
    </ns1:Method>
  </env:Body>
</env:Envelope>

Thanks


Solution

So it turns out that we were doing everything right on our side, but the API was rejecting strings containing multi-byte UTF-8 characters whenever the string was longer than 30 bytes. After lots of testing and back-and-forth with the service provider, we eventually established that the problem was actually at their end. They were telling us that they would accept 30 UTF-8 characters, but in actual fact the limit was 30 bytes of UTF-8-encoded text. The solution was to use the mb_strcut() function: http://php.net/manual/en/function.mb-strcut.php

$params->Request->Description = mb_strcut($title, 0, 30, mb_detect_encoding($title));

This will trim to 30 bytes (as a simple substr() would), but if the cut falls midway through a multi-byte character it backtracks to the start of that character, so it only returns complete characters totalling at most 30 bytes.
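As a rough comparison of the three functions, the sketch below uses a made-up input ($title here is hypothetical, not real data) in which a two-byte character straddles the 30-byte boundary. It assumes mbstring is loaded, PHP 7 or later, and that the input is known to be UTF-8, so the encoding is passed explicitly.

```php
<?php
// Hypothetical input: 29 ASCII bytes, then "é" (2 bytes in UTF-8), then more text.
$title = str_repeat('a', 29) . "\u{00E9}" . ' tail';

$byBytes = substr($title, 0, 30);             // plain byte cut: splits "é" in half
$byChars = mb_substr($title, 0, 30, 'UTF-8'); // 30 characters
$byCut   = mb_strcut($title, 0, 30, 'UTF-8'); // at most 30 bytes, whole characters only

var_dump(mb_check_encoding($byBytes, 'UTF-8')); // bool(false): invalid UTF-8
echo strlen($byChars), "\n";                    // 31
echo strlen($byCut), "\n";                      // 29
```

Passing 'UTF-8' directly rather than calling mb_detect_encoding() is a choice made for this sketch: mb_detect_encoding() can return false when it cannot identify the encoding, so if the input encoding is already known it may be simpler to state it.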

Since we deployed this solution we've had no further issues. The reason it looked like we were sending broken information was that the logs were being output badly (facepalm).

Licensed under: CC-BY-SA with attribution