Question

This code starts an HTTP server which listens for requests on port 8080. When compiled with Delphi 2009, the Chinese text is rendered correctly. With Free Pascal 2.6.0, however, the browser displays garbled characters (such as ä¸­æ–‡) instead of 中文.

What is the correct way to write Unicode / UTF-8 HTTP responses with Indy and Free Pascal?

program IdHTTPUnicode;

{$APPTYPE CONSOLE}

uses
  IdHTTPServer, IdCustomHTTPServer, IdContext, IdSocketHandle, IdGlobal,
  SysUtils;

type
  TMyServer = class (TIdHTTPServer)
  public
    procedure InitComponent; override;
    procedure DoCommandGet(AContext: TIdContext;
      ARequestInfo: TIdHTTPRequestInfo;
      AResponseInfo: TIdHTTPResponseInfo); override;
  end;

procedure Demo;
var
  Server: TMyServer;
begin
  Server := TMyServer.Create(nil);
  try
    try
      Server.Active := True;
    except
      on E: Exception do
      begin
        WriteLn(E.ClassName + ' ' + E.Message);
      end;
    end;
    WriteLn('Hit any key to terminate.');
    ReadLn;
  finally
    Server.Free;
  end;
end;

procedure TMyServer.InitComponent;
var
  Binding: TIdSocketHandle;
begin
  inherited;

  Bindings.Clear;
  Binding := Bindings.Add;
  Binding.IP := '127.0.0.1';
  Binding.Port := 8080;
  Binding.IPVersion := Id_IPv4;
end;

procedure TMyServer.DoCommandGet(AContext: TIdContext;
  ARequestInfo: TIdHTTPRequestInfo; AResponseInfo: TIdHTTPResponseInfo);
const
  UNI = '中文';
begin
  AResponseInfo.ContentText := '<html>' + UNI + '</html>';
  AResponseInfo.ContentType := 'text/html';
  AResponseInfo.CharSet := 'UTF-8';
end;

begin
  Demo;
end.

In the debugger, I can see that different code is executed in the method TIdIOHandler.Write, because STRING_IS_ANSI is defined for Free Pascal:

procedure TIdIOHandler.Write(const AOut: string; AByteEncoding: TIdTextEncoding = nil
  {$IFDEF STRING_IS_ANSI}; ASrcEncoding: TIdTextEncoding = nil{$ENDIF}
  );
begin
  if AOut <> '' then begin
    AByteEncoding := iif(AByteEncoding, FDefStringEncoding);
    {$IFDEF STRING_IS_ANSI}
    ASrcEncoding := iif(ASrcEncoding, FDefAnsiEncoding, encOSDefault);
    {$ENDIF}
    Write(
      ToBytes(AOut, -1, 1, AByteEncoding
        {$IFDEF STRING_IS_ANSI}, ASrcEncoding{$ENDIF}
        )
      );
  end;
end; 

Solution

FreePascal strings are not UTF-16 encoded like they are in Delphi 2009+. In FreePascal, and in Delphi 2007 and earlier, your code needs to take the actual string encoding into account. That is why Indy exposes additional Ansi-based parameters/properties for those platforms.

When TIdHTTPServer writes out the ContentText using TIdIOHandler.Write(), it does not supply the ASrcEncoding parameter on non-Unicode platforms, so Write() falls back to the TIdIOHandler.DefAnsiEncoding property and, if that is not set either, to the OS default encoding. You therefore have to set DefAnsiEncoding to let Write() know what the encoding of the ContentText is, e.g.:

procedure TMyServer.DoCommandGet(AContext: TIdContext;
  ARequestInfo: TIdHTTPRequestInfo; AResponseInfo: TIdHTTPResponseInfo);
const
  UNI: WideString = '中文';
begin
  AResponseInfo.ContentText := UTF8Encode('<html>' + UNI + '</html>');
  AResponseInfo.ContentType := 'text/html';

  // this tells TIdHTTPServer what to encode bytes to during socket transmission
  AResponseInfo.CharSet := 'utf-8';

  // this tells TIdHTTPServer what encoding the ContentText is using
  // so it can be decoded to Unicode prior to then being charset-encoded
  // for output. If the input and output encodings are the same, the
  // Ansi string data gets transmitted as-is without decoding/reencoding...
  AContext.Connection.IOHandler.DefAnsiEncoding := IndyUTF8Encoding;
end;

Or, more generically:

{$I IdCompilerDefines.inc}

procedure TMyServer.DoCommandGet(AContext: TIdContext;
  ARequestInfo: TIdHTTPRequestInfo; AResponseInfo: TIdHTTPResponseInfo);
const
  UNI{$IFNDEF STRING_IS_UNICODE}: WideString{$ENDIF} = '中文';
begin
  {$IFDEF STRING_IS_UNICODE}
  AResponseInfo.ContentText := '<html>' + UNI + '</html>';
  {$ELSE}
  AResponseInfo.ContentText := UTF8Encode('<html>' + UNI + '</html>');
  {$ENDIF}
  AResponseInfo.ContentType := 'text/html';
  AResponseInfo.CharSet := 'utf-8';
  {$IFNDEF STRING_IS_UNICODE}
  AContext.Connection.IOHandler.DefAnsiEncoding := IndyUTF8Encoding;
  {$ENDIF}
end;
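
To see why the unmodified code produces garbled output, the following is a minimal sketch of the double encoding that happens when UTF-8 bytes are treated as OS-default Ansi text and then charset-encoded to UTF-8 again. It does not use Indy at all. The function name ReencodeLatin1AsUtf8 and the type TOctets are hypothetical, and Latin-1 is only an assumed stand-in for the real OS default code page (Windows-1252, for example, differs from Latin-1 in the range 0x80 to 0x9F).

```pascal
program DoubleEncodingDemo;

{$IFDEF FPC}{$mode objfpc}{$H+}{$ENDIF}

uses
  SysUtils;

type
  // Hypothetical helper type for this sketch.
  TOctets = array of Byte;

const
  // UTF-8 encoding of U+4E2D U+6587 (the two characters from the question).
  Utf8Bytes: array[0..5] of Byte = ($E4, $B8, $AD, $E6, $96, $87);

// Treats every input byte as one Latin-1 character (an assumption standing in
// for the OS default code page) and encodes that character as UTF-8.
function ReencodeLatin1AsUtf8(const Src: array of Byte): TOctets;
var
  I, N: Integer;
begin
  SetLength(Result, 2 * Length(Src)); // worst case: two bytes per input byte
  N := 0;
  for I := 0 to High(Src) do
    if Src[I] < $80 then
    begin
      // ASCII is unchanged in UTF-8.
      Result[N] := Src[I];
      Inc(N);
    end
    else
    begin
      // Code points U+0080..U+00FF need a two-byte UTF-8 sequence.
      Result[N] := $C0 or (Src[I] shr 6);
      Result[N + 1] := $80 or (Src[I] and $3F);
      Inc(N, 2);
    end;
  SetLength(Result, N);
end;

var
  Wire: TOctets;
  I: Integer;
begin
  Wire := ReencodeLatin1AsUtf8(Utf8Bytes);
  // Print the bytes that would go out on the socket, in hex.
  for I := 0 to High(Wire) do
    Write(IntToHex(Wire[I], 2), ' ');
  WriteLn;
end.
```

The six input bytes become twelve output bytes, which a browser decoding UTF-8 shows as six unrelated Latin characters rather than two Chinese ones. Setting DefAnsiEncoding to UTF-8 avoids this because the input and output encodings then match.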

OTHER TIPS

In modern Free Pascal, strings are UTF-8 by default unless you have changed the compiler options.

Thus it seems that in iif(ASrcEncoding, FDefAnsiEncoding, encOSDefault); the value of encOSDefault is wrong. You could fix its detection in the Indy sources if you like, but I guess it would be better to set DefAnsiEncoding to UTF-8 explicitly. (The charset label itself is case-insensitive, so 'UTF-8' and 'utf-8' are equivalent.)

To be on the safe side, you can check for UTF-8 mode at the beginning of the program. Define some non-Latin string constant (like that Chinese text, or Greek, or Cyrillic, whatever) and check whether it is stored as UTF-8 or not: http://compaspascal.blogspot.ru/2009/03/utf-8-automatic-detection.html
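
A minimal sketch of such a startup check, written independently of the linked article: it compares the bytes the compiler stored for a known literal against the expected UTF-8 byte sequence. The names Probe and IsUtf8Probe are hypothetical, and the result depends on how the source file is saved and on any code page directives or compiler options in effect, so treat the outcome as a diagnostic rather than a guarantee.

```pascal
program CheckUtf8Literal;

{$IFDEF FPC}{$mode objfpc}{$H+}{$ENDIF}

const
  // Hypothetical probe constant; any non-ASCII text would do.
  Probe = '中文';

// Returns True when S holds exactly the UTF-8 bytes of U+4E2D U+6587.
function IsUtf8Probe(const S: AnsiString): Boolean;
const
  Expected: array[0..5] of Byte = ($E4, $B8, $AD, $E6, $96, $87);
var
  I: Integer;
begin
  Result := Length(S) = Length(Expected);
  if not Result then
    Exit;
  // Compare byte by byte; AnsiString indexes start at 1.
  for I := 0 to High(Expected) do
    if Ord(S[I + 1]) <> Expected[I] then
    begin
      Result := False;
      Exit;
    end;
end;

begin
  if IsUtf8Probe(Probe) then
    WriteLn('String literals are stored as UTF-8.')
  else
    WriteLn('String literals are NOT stored as UTF-8.');
end.
```

If the check reports UTF-8, telling Indy that the Ansi source encoding is UTF-8 (as in the solution above) matches what the strings actually contain.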

However, overall I think you could try to find a library that cares about FPC and Linux more than Indy does. Indy seems to me to be stagnating and close to abandoned, even on Delphi. Maybe Synopse mORMot (look for the DataSnap performance tests article) can help you, or one of the libraries that comes with the CodeTyphon distribution.

Licensed under: CC-BY-SA with attribution