Question

I need to convert a raw email file to JSON and transfer it to a remote server via HTTP POST, using Perl on CentOS 6.5.

I have installed perl-JSON from the CentOS repository rather than from CPAN. The raw email file may contain international characters and is usually a few hundred KB in size.

As JSON doesn't support binary data by default, how should I embed a raw email file in a JSON structure and send it via HTTP POST?

The perl-JSON package uses perl-JSON-PP, which provides encode_json() and decode_json(); both work with UTF-8 encoded JSON text.

Is it safe to use these in my case, or do I have to put a base64-encoded string into the JSON?


Solution

I wouldn't worry much about whether the JSON message itself is UTF-8 encoded. The encoding of the message does not affect the encoding of the contents.

Let's assume we want to transmit a 256-byte string which contains each byte once:

use JSON;
my $string = join '', map chr, 0x00 .. 0xFF;

When we encode and decode the message, we end up with an equivalent string again:

my $message = encode_json { str => $string };
my $new_string = (decode_json $message)->{str};

$new_string eq $string or die "The strings aren't equal";

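While the strings compare equal, they do not necessarily have the same internal representation: the decoded string is typically stored with Perl's internal UTF-8 flag set, whereas the original is stored as plain bytes. We can restore the byte representation by “downgrading” the new string: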

# utf8::downgrade is always available; no "use utf8" is needed.
# This dies if $new_string contains characters above 0xFF,
# which cannot be represented as single bytes.
utf8::downgrade($new_string);
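
To see the difference between comparing equal and being stored identically, here is a minimal sketch that inspects the internal flag before and after the downgrade. It uses JSON::PP directly (`use JSON;` exports the same two functions), and the variable names `$before` and `$after` are only for illustration:

```perl
use strict;
use warnings;
use JSON::PP;   # "use JSON;" exports the same encode_json/decode_json

my $string     = join '', map chr, 0x00 .. 0xFF;
my $new_string = decode_json(encode_json({ str => $string }))->{str};

# utf8::is_utf8 reports the internal storage flag, not the logical contents.
my $before = utf8::is_utf8($new_string) ? 1 : 0;

# Switch the string back to one-byte-per-character storage.
utf8::downgrade($new_string);
my $after = utf8::is_utf8($new_string) ? 1 : 0;

printf "flag before: %d, flag after: %d, still equal: %s\n",
    $before, $after, ($new_string eq $string ? 'yes' : 'no');
```

The flag only describes how Perl stores the string; `eq` compares characters, so the comparison succeeds either way.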

How and why this works is perfectly well-defined, but armoring the message by using an ASCII-only encoding is admittedly preferable:

use MIME::Base64;
use JSON;

my $string = join '', map chr, 0x00 .. 0xFF;

my $message = encode_json { str => encode_base64 $string };
my $new_string = decode_base64 decode_json($message)->{str};

$string eq $new_string or die;
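
Applied to the original question, the base64 approach might look like the sketch below. This is only an outline under stated assumptions: the helper names `slurp_raw`, `build_payload` and `post_payload`, the JSON key `raw_email`, and the choice of LWP::UserAgent for the POST are all illustrative and not part of the original answer. JSON::PP is used directly; `use JSON;` exports the same functions.

```perl
use strict;
use warnings;
use JSON::PP;   # "use JSON;" exports the same encode_json/decode_json
use MIME::Base64 qw(encode_base64 decode_base64);

# Read a file as raw bytes; no character decoding layer is applied.
sub slurp_raw {
    my ($path) = @_;
    open my $fh, '<:raw', $path or die "Cannot open $path: $!";
    local $/;                      # slurp mode: read the whole file at once
    my $bytes = <$fh>;
    close $fh;
    return $bytes;
}

# Wrap the bytes in a JSON document. The key "raw_email" is hypothetical.
sub build_payload {
    my ($bytes) = @_;
    # The empty second argument suppresses the line breaks that
    # encode_base64 otherwise inserts every 76 characters.
    return encode_json({ raw_email => encode_base64($bytes, '') });
}

# Send the payload. LWP::UserAgent is loaded lazily, so the helpers
# above can be used without it being installed.
sub post_payload {
    my ($url, $json) = @_;
    require LWP::UserAgent;
    my $res = LWP::UserAgent->new(timeout => 30)->post(
        $url,
        'Content-Type' => 'application/json',
        Content        => $json,
    );
    die 'POST failed: ' . $res->status_line unless $res->is_success;
    return $res;
}

# Round trip with in-memory data, no network involved.
my $bytes   = join '', map chr, 0x00 .. 0xFF;
my $payload = build_payload($bytes);
my $back    = decode_base64(decode_json($payload)->{raw_email});
$back eq $bytes or die "Round trip failed";
```

A call such as `post_payload($url, build_payload(slurp_raw($file)))` would then send the file, with `$url` and `$file` supplied by the caller. The receiving side has to base64-decode the field to get the original bytes back.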
Licensed under: CC-BY-SA with attribution