Question

I am developing a server and a mobile client that communicate over HTTP. The server is written in Delphi 7 (it has to stay compatible with old code); the client is a mobile application written in XE6. The server sends the client a stream of data that contains strings, and I have a problem with their encoding.

On the server I try to pass strings in UTF8:

//Writes string to stream
procedure TStreamWrap.WriteString(Value: string);
var
  BytesCount: Longint;
  UTF8: string;
begin
  UTF8 := AnsiToUtf8(Value);
  BytesCount := Length(UTF8);

  WriteLongint(BytesCount); //It writes Longint to FStream: TStream

  if BytesCount > 0 then
    FStream.WriteBuffer(UTF8[1], BytesCount);
end;

Since this is Delphi 7, Value is a single-byte (ANSI) string.

On the client I read the string as UTF-8 and convert it to Unicode:

//Reads string from current position of stream
function TStreamWrap.ReadString: string;
var
  BytesCount: Longint;
  UTF8: String;
begin
  BytesCount := ReadLongint;
  if BytesCount = 0 then
    Result := ''
  else
  begin
    SetLength(UTF8, BytesCount);

    FStream.Read(Pointer(UTF8)^, BytesCount);

    Result := UTF8ToUnicodeString(UTF8);
  end;
end;

But it doesn't work: when I display the string with ShowMessage, the letters are wrong. How can I store a string in Delphi 7 and restore it in XE6 in the mobile app? Should I add a BOM at the beginning of the data representing the string?


Solution

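To read your UTF-8 encoded string in the mobile application, use a byte array and the TEncoding class. In XE6, string is a UTF-16 UnicodeString, so reading the raw UTF-8 bytes directly into a string variable, as your ReadString does, packs two bytes into each Char. Read into TBytes and decode instead, like this: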

function TStreamWrap.ReadString: string;
var
  ByteCount: Longint;
  Bytes: TBytes;
begin
  ByteCount := ReadLongint;
  if ByteCount = 0 then
  begin
    Result := '';
    Exit;
  end;

  SetLength(Bytes, ByteCount);
  // ReadBuffer raises an exception if fewer than ByteCount bytes are available
  FStream.ReadBuffer(Pointer(Bytes)^, ByteCount);
  // Decode the UTF-8 bytes into a UTF-16 string
  Result := TEncoding.UTF8.GetString(Bytes);
end;

This code does what you need in XE6, but it will not compile in Delphi 7 because it uses TEncoding. Conversely, your TStreamWrap.WriteString implementation does what you want in Delphi 7 but is broken in XE6.

It looks like you are using the same code base for both the Delphi 7 and Delphi XE6 versions, which means you may need conditional compilation to handle the differences in how the two versions treat text.

Personally I would do this by following the example of TEncoding. What you need is a function that converts a native Delphi string to a UTF-8 encoded byte array, and a corresponding function in the reverse direction.

So, let's consider the string to bytes function. I cannot remember whether or not Delphi 7 has a TBytes type. I suspect not. So let us define it:

{$IFNDEF UNICODE} // definitely use a better conditional than this in real code
type
  TBytes = array of Byte;
{$ENDIF}
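The comment in the snippet above asks for a better conditional. One possibility, offered only as a sketch, is to test whether TBytes is already declared instead of testing for a Unicode compiler. Declared() inside {$IF} has been available since Delphi 6. The unit name Utf8Types is made up for illustration, and the check only sees identifiers from units that appear in the uses clause before it.

```pascal
unit Utf8Types; // hypothetical unit name

interface

uses
  SysUtils; // on XE2 and later this relies on the project's "System" unit scope name

// Declare TBytes only when no unit used above already provides it.
{$IF not Declared(TBytes)}
type
  TBytes = array of Byte;
{$IFEND}

implementation

end.
```

Any unit that needs TBytes would then list Utf8Types after SysUtils in its uses clause.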

Then we can define our function:

function StringToUTF8Bytes(const s: string): TBytes;
{$IFDEF UNICODE}
begin
  Result := TEncoding.UTF8.GetBytes(s);
end;
{$ELSE}
var
  UTF8: UTF8String;
begin
  UTF8 := AnsiToUtf8(s);
  SetLength(Result, Length(UTF8));
  Move(Pointer(UTF8)^, Pointer(Result)^, Length(Result));
end;
{$ENDIF}

The function in the opposite direction should be trivial for you to produce.
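For reference, here is a minimal sketch of what that reverse function might look like. The name UTF8BytesToString and the unit name Utf8Decode are assumptions, not something from the original code, and the TBytes conditional is the same placeholder used above. Note that on Delphi 7 the result is an ANSI string, so characters outside the system code page cannot survive the conversion.

```pascal
unit Utf8Decode; // hypothetical unit name

interface

uses
  SysUtils; // on XE2 and later this relies on the project's "System" unit scope name

{$IFNDEF UNICODE} // same placeholder conditional as in the answer
type
  TBytes = array of Byte;
{$ENDIF}

// Hypothetical counterpart of StringToUTF8Bytes.
function UTF8BytesToString(const Bytes: TBytes): string;

implementation

function UTF8BytesToString(const Bytes: TBytes): string;
{$IFDEF UNICODE}
begin
  // Guard the empty case explicitly instead of relying on TEncoding.
  if Length(Bytes) = 0 then
    Result := ''
  else
    Result := TEncoding.UTF8.GetString(Bytes);
end;
{$ELSE}
var
  UTF8: UTF8String;
begin
  SetLength(UTF8, Length(Bytes));
  // Copy the raw bytes into an 8-bit string; a count of 0 copies nothing.
  Move(Pointer(Bytes)^, Pointer(UTF8)^, Length(Bytes));
  // Utf8ToAnsi converts to the system ANSI code page.
  Result := Utf8ToAnsi(UTF8);
end;
{$ENDIF}

end.
```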

Once you have encapsulated the differences in text-encoding handling between the two Delphi versions, you can write conditional-free code in the rest of your program. For example, you would code WriteString like this:

procedure TStreamWrap.WriteString(const Value: string);
var
  UTF8: TBytes;
  ByteCount: Longint;
begin
  UTF8 := StringToUTF8Bytes(Value);
  ByteCount := Length(UTF8);
  WriteLongint(ByteCount);
  if ByteCount > 0 then
    FStream.WriteBuffer(Pointer(UTF8)^, ByteCount);
end;

OTHER TIPS

Instead of

Utf8 : String;

Use

Utf8 : Utf8String;

on the client. The conversion is then automatic.

EDIT: Since the client is on a mobile platform, and Embarcadero has decided to eliminate the 8-bit strings in mobile compilers, the above won't work for this particular case. But in other cases where you have an 8-bit UTF-8 encoded string, the Utf8String can be used to seamlessly convert back and forth between UTF-8 and Unicode strings without the need to use explicit UTF-8 conversion functions. Just use it like

UnicodeStringVariable := Utf8StringVariable;

or

Utf8StringVariable := UnicodeStringVariable;

and the compiler will insert the appropriate conversion.
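A small console sketch of that behaviour, assuming a desktop Unicode compiler (Delphi 2009 or later, not the mobile compilers). The program and variable names are made up for illustration. The explicit casts perform the same conversion as plain assignment and only serve to avoid the compiler's implicit string cast warnings.

```pascal
program Utf8StringDemo; // hypothetical demo; desktop Unicode Delphi only

{$APPTYPE CONSOLE}

var
  Wide: string;        // UnicodeString (UTF-16) in Delphi 2009 and later
  Narrow: UTF8String;  // 8-bit string tagged with the UTF-8 code page
begin
  Wide := #$0416;              // a single UTF-16 code unit (U+0416)
  Narrow := UTF8String(Wide);  // compiler-generated UTF-16 -> UTF-8 conversion
  Writeln(Length(Wide));       // 1 (one Char)
  Writeln(Length(Narrow));     // 2 (bytes $D0 $96)
  Wide := string(Narrow);      // compiler-generated UTF-8 -> UTF-16 conversion
  Writeln(Wide = #$0416);      // TRUE
end.
```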

Licensed under: CC-BY-SA with attribution