Question

I want to read a string from a text file in Perl.

The method I used to read it is:

my $indPara = "C:\\Users\\Admin001\\Desktop\\paraText.txt";
open(INDPARA, '<', $indPara) || die "InDesign paraText not found at location: $!";
my $indesignPara = <INDPARA>;
close INDPARA;

When reading the text, I get an unknown Unicode character (&#65279 or &#xFEFF) at the start of the text.

The text file I am reading can be downloaded from the link below:

https://mega.co.nz/#!r1pAyAhA!VSnL2tbPWoTtThcCRoiogSxK4ok_0YvZSczs054w0uU

I am using Komodo IDE 8.5 and Perl 5.16.3.

Kindly give me some ideas for overcoming this.

Thanks in advance

Vimal


Solution 3

Thank you so much, guys, for your kind help and ideas. I found a way to clear this: just find and remove the character with s/\x{feff}//g; and it works!
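A minimal sketch of that substitution, assuming the string has already been decoded into Perl characters (on raw, undecoded bytes the BOM is a multi-byte sequence, so \x{FEFF} would not match). The sample string and the variable name $text are made up for illustration. Anchoring the pattern with ^ removes only a leading BOM instead of every U+FEFF in the text.

```perl
use strict;
use warnings;

# Hypothetical decoded text that starts with a BOM (U+FEFF).
my $text = "\x{FEFF}First paragraph of the file";

# Remove the BOM only if it is the very first character.
$text =~ s/^\x{FEFF}//;

# Show the code point of the first remaining character.
printf "first character: U+%04X\n", ord($text);
```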

OTHER TIPS

What you have there is a BOM (byte order mark). It is telling you that what you have is not a UTF-8 file; it is a UTF-16 (BE) file.

The BOM is not part of the data in the file, so you can safely skip past it and continue reading the data beyond it. However, you should not treat the data that you read from the file as UTF-8; you should treat it as UTF-16 (BE) and decode it appropriately.
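One way to check which encoding the BOM indicates, rather than assuming it, is to look at the first raw bytes before decoding. The sketch below is only an illustration: decode_by_bom is a hypothetical helper, the input is built in memory instead of being read from paraText.txt, and the fallback to UTF-8 when no BOM is found is an assumption. It does not distinguish a UTF-32LE BOM, which also begins with FF FE.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical helper: pick the encoding from the leading BOM bytes,
# remove the BOM, and return the encoding name plus the decoded text.
sub decode_by_bom {
    my ($bytes) = @_;
    my $encoding = 'UTF-8';    # assumed default when no BOM is present
    if    ($bytes =~ s/^\xEF\xBB\xBF//) { $encoding = 'UTF-8' }
    elsif ($bytes =~ s/^\xFE\xFF//)     { $encoding = 'UTF-16BE' }
    elsif ($bytes =~ s/^\xFF\xFE//)     { $encoding = 'UTF-16LE' }
    return ($encoding, decode($encoding, $bytes));
}

# Simulated file contents: a UTF-16BE BOM followed by "Hi" in UTF-16BE.
my $raw = "\xFE\xFF" . encode('UTF-16BE', 'Hi');

my ($encoding, $text) = decode_by_bom($raw);
print "$encoding: $text\n";
```

For a real file, the bytes would need to be read undecoded (for example through a handle opened with the '<:raw' mode) before being passed to the helper.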

If you have the entire string (e.g. the contents of $indesignPara) in $s, do:

use Encode;
$s = decode("UTF-16LE", $s, Encode::FB_QUIET);

but I am not sure that reading with <INDPARA> works for this. You could instead try "<:encoding(UTF-16LE)" as the mode argument (the second parameter) of a three-argument open, and then strip the first wide character, the BOM U+FEFF.
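A rough sketch of the encoding-layer approach, under the assumption that the file really is UTF-16LE with a BOM. To stay runnable it writes its own small sample file with File::Temp instead of using paraText.txt; the names $path, $in, $out and $line are illustrative. The :raw prefix is added so that no newline translation is applied to the 16-bit data before it is decoded.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Create a small UTF-16LE sample file with a BOM so the sketch is runnable.
my ($out, $path) = tempfile(UNLINK => 1);
binmode($out, ':raw:encoding(UTF-16LE)');
print {$out} "\x{FEFF}Sample paragraph\n";
close($out) or die "Cannot close $path: $!";

# Read it back, letting the I/O layer decode UTF-16LE while reading.
open(my $in, '<:raw:encoding(UTF-16LE)', $path)
    or die "Cannot open $path: $!";
my $line = <$in>;
close($in);

# The BOM arrives as the first decoded character; strip it.
$line =~ s/^\x{FEFF}//;
chomp $line;
print length($line), " characters read\n";
```

If the byte order is not known in advance, Encode's plain UTF-16 encoding (without the LE or BE suffix) is documented to pick the byte order from the BOM and drop it while decoding, which may be a simpler choice.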

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow