Is it correct to switch Perl's default IO to UTF-8 while using Plack and middlewares?

StackOverflow https://stackoverflow.com/questions/11012155

  •  14-06-2021

Question

Two starting points:

Is it correct to use

use uni::perl; # or any similar

in the PSGI application and/or in my modules?

uni::perl changes Perl's default IO to UTF-8, thus:

use open qw(:std :utf8);
binmode(STDIN,   ":utf8");
binmode(STDOUT,  ":utf8");
binmode(STDERR,  ":utf8");

Will doing so break something in Plack or its middlewares? Or is the only correct way to write Plack apps to encode/decode explicitly at each open, without the open pragma?


Solution

You really don't want to put STDIN/STDOUT into UTF-8 mode by default under Plack, because you don't know whether they will be used as binary data transports. For instance, if those filehandles are the FastCGI protocol connector, they will carry encoded binary structures, not UTF-8 text. They therefore must not have an encoding layer defined, or those binary structures will be mangled or rejected as invalid.
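
As a minimal sketch of the explicit alternative (not taken from the question; the greeting logic and the use of QUERY_STRING are illustrative assumptions), a PSGI app can leave the standard handles alone, decode only the input it treats as text, and encode the response body once on the way out:

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical PSGI app: a plain code reference, no global IO layers.
my $app = sub {
    my ($env) = @_;

    # QUERY_STRING arrives as bytes; decode only what is treated as text.
    # (Real code would URL-unescape first; skipped to keep the sketch short.)
    my $name = decode('UTF-8', $env->{QUERY_STRING} // '');
    $name = 'world' if $name eq '';

    # Build the response as characters, then encode once at the boundary.
    my $body = encode('UTF-8', "Hello, $name \x{263A}\n");

    return [
        200,
        [ 'Content-Type'   => 'text/plain; charset=utf-8',
          'Content-Length' => length($body) ],
        [ $body ],
    ];
};
```

In an app.psgi file the code reference would be the file's last expression. The body handed to the server is a byte string, so it is safe whatever transport sits behind STDIN/STDOUT.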

OTHER TIPS

On modern GNU/Linux systems you should completely switch to UTF-8 globally. This means setting

LANG="xx_YY.UTF-8"
PERL_UNICODE=SDAL
PERL5OPT=-Mutf8

in your /etc/environment, /etc/sysconfig/i18n, /etc/default/locale, or whatever your system's configuration file is. Because of a RHEL/CentOS bug, I symlinked /etc/environment to sysconfig/i18n.

Scripts that rely on binary input should call binmode on STDIN/STDOUT (and perhaps STDERR), use the open pragma, or be run with the -C0 option.
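
To illustrate why binary scripts need this, here is a small sketch (the payload bytes are arbitrary, and in-memory filehandles stand in for the real STDOUT) that writes the same bytes through a :utf8 layer and through :raw:

```perl
use strict;
use warnings;

# A script handling binary data would first clear any default layers.
binmode(STDIN);
binmode(STDOUT);

my $payload = "\xFF\x00\x80";   # three arbitrary non-text bytes

# Same bytes, written once through :utf8 and once through :raw.
open(my $text_fh, '>:utf8', \my $mangled) or die "open: $!";
print {$text_fh} $payload;
close $text_fh;

open(my $raw_fh, '>:raw', \my $intact) or die "open: $!";
print {$raw_fh} $payload;
close $raw_fh;

# Prints: utf8 layer: 5 bytes, raw: 3 bytes
printf "utf8 layer: %d bytes, raw: %d bytes\n",
    length($mangled), length($intact);
```

The :utf8 layer re-encodes each byte above 0x7F as a two-byte sequence, which is exactly the kind of mangling a binary protocol cannot tolerate.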

One remaining problem is that some DBD drivers, e.g. DBD::JDBC, are buggy, and you must set the utf8 flag on their results by hand:

use Encode qw/_utf8_on/;
# Flip the utf8 flag in place; assumes each string already holds valid UTF-8 bytes.
_utf8_on($_) for @strings;
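
A possibly safer variant, sketched here under the assumption that the driver hands back undecoded UTF-8 bytes (the sample strings are made up), is to decode the values instead of only flipping the flag, since decode validates the input and dies on malformed bytes:

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical values as a buggy driver might return them: UTF-8 bytes, flag off.
my @strings = ("caf\xC3\xA9", "na\xC3\xAFve");

# FB_CROAK makes decode die on invalid UTF-8; LEAVE_SRC keeps the input intact.
my @decoded = map {
    decode('UTF-8', $_, Encode::FB_CROAK() | Encode::LEAVE_SRC())
} @strings;

# Prints: 5 bytes became 4 characters
printf "%d bytes became %d characters\n",
    length($strings[0]), length($decoded[0]);
```

This must only be applied to values that are still bytes; decoding an already decoded string would corrupt it or die.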
Licensed under: CC-BY-SA with attribution