We are migrating from PHP 5.2 to PHP 5.3 and have run into a problem with gettext in the Windows builds of PHP 5.3 and later. gettext appears to always return UTF-8, and calls to bind_textdomain_codeset() have no effect when we try to change the character set.

See the following script:

<?php

print 'PHP_OS: ' . PHP_OS . "\n";
print 'php_version: ' . phpversion() . "\n\n";

$language = 'de_DE';

putenv( "LANG=$language" );
setlocale( LC_ALL, $language );

if ( strtoupper( substr( PHP_OS, 0, 3 ) ) === 'WIN' ) {
    $language = 'german';
    setlocale( LC_ALL, $language );
}

bindtextdomain( 'messages', dirname(__FILE__) . '/language' );

textdomain( 'messages' );

$translated = _( 'Overtime' );

printf ("Default encoding:           %X %X\n", ord($translated[0]), ord($translated[1]));

bind_textdomain_codeset('messages', 'ISO-8859-1');

$translated = _( 'Overtime' );
printf ("Encoding set to ISO-8859-1: %X %X\n", ord($translated[0]), ord($translated[1]));

bind_textdomain_codeset('messages', 'UTF-8');

$translated = _( 'Overtime' );
printf ("Encoding set to UTF-8:      %X %X\n", ord($translated[0]), ord($translated[1]));

Local directory structure for the language files:

language    
  de
    LC_MESSAGES
      messages.mo

messages.mo contains a single message translating "Overtime" to "Überstunden"

Results for PHP 5.2.9-2 under Windows and PHP 5.3.27 under Linux are as expected (0xDC is ISO-8859-1 U with umlaut, 0x62 is ISO-8859-1 b, and 0xC3 0x9C is the UTF-8 encoding of U with umlaut):

PHP_OS: WINNT
php_version: 5.2.9-2

Default encoding:           DC 62
Encoding set to ISO-8859-1: DC 62
Encoding set to UTF-8:      C3 9C

---------------------------------

PHP_OS: Linux
php_version: 5.3.27

Default encoding:           DC 62
Encoding set to ISO-8859-1: DC 62
Encoding set to UTF-8:      C3 9C

However, under the Windows build of PHP 5.3 (also tested with 5.4), the result is not as expected:

PHP_OS: WINNT
php_version: 5.3.28

Default encoding:           C3 9C
Encoding set to ISO-8859-1: C3 9C
Encoding set to UTF-8:      C3 9C

Output is UTF-8 by default and can't be changed by bind_textdomain_codeset.

We're currently using gettext and ISO-8859-1 throughout our app and want to run unit tests under Windows, but the Windows versions of PHP 5.3 and greater seem to be broken with respect to the encoding returned by gettext.


Solution

I've been unable to get gettext in the Windows PHP 5.3.28 build to work correctly, but I have a workaround that might be suitable for development environments.

Using the runkit extension (https://codeload.github.com/Crack/runkit-windows/zip/master), you can redefine the _() function so that it calls gettext() and then passes the result through utf8_decode(), which converts the UTF-8 string back to ISO-8859-1.
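
A rough sketch of what that redefinition might look like is below. It is untested against the Windows build in question. The helper name to_latin1() is made up for illustration, and the sketch assumes runkit is loaded with runkit.internal_override=1 set in php.ini, which runkit needs before it will modify built-in functions. It also assumes that redefining the _() alias leaves gettext() itself callable.

```php
<?php
// Hypothetical helper (not part of PHP or gettext): converts a UTF-8
// string to ISO-8859-1 using utf8_decode(), as the workaround describes.
function to_latin1($utf8)
{
    return utf8_decode($utf8);
}

// Assumption: runkit is installed and runkit.internal_override=1 is set.
// The guard keeps this script harmless where either extension is missing.
if (function_exists('runkit_function_redefine') && function_exists('gettext')) {
    // Redefine only the _() alias; the new body calls gettext() directly
    // and converts whatever it returns.
    runkit_function_redefine(
        '_',
        '$message',
        'return to_latin1(gettext($message));'
    );
}

// Sanity check of the conversion itself: "\xC3\x9C" is the UTF-8 encoding
// of U with umlaut, so the first byte after conversion should be DC.
printf("%X\n", ord(to_latin1("\xC3\x9Cberstunden")));
```

If runkit is not an option, calling a wrapper such as to_latin1(gettext($message)) directly from the application would achieve the same conversion, at the cost of touching every call site. Either way, the conversion should only be active on the affected Windows builds, otherwise already-correct ISO-8859-1 output would be mangled.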

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow