Question

I have a problem with sending the correct charset information via header in PHP.

In my code I created a CSV string which is ISO-8859-2 encoded. I am outputting this to the browser using the following code (simplified):

$csv = 'iso-8859-2 encoded string, Łukasz Szukała';

header('Content-Type: text/csv; charset=ISO-8859-2');
header('Content-Disposition: attachment; filename=report.csv');

echo $csv;

When I open the resulting file in an editor (I am using Notepad++), it is detected as ISO-8859-1 and hence displayed incorrectly.

Converting the string to UTF-8 works as expected:

$csv = 'iso-8859-2 encoded string, Łukasz Szukała';
$csv = iconv('ISO-8859-2', 'UTF-8', $csv);

header('Content-Type: text/csv; charset=UTF-8');
header('Content-Disposition: attachment; filename=report.csv');

echo $csv;

I can even omit the charset part completely in this case.

However, I need the file to be encoded as ISO-8859-2, and I fail to see why the header information does not lead to the file being detected as ISO-8859-2. I also tried different aliases as per IANA (http://www.iana.org/assignments/character-sets/character-sets.xhtml), but I cannot get it to work in any browser.

I would greatly appreciate any valuable input.


Solution

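You are doing everything correctly, except that you expect Notepad++ to detect the character set of the file. ISO-8859-2 is a single-byte character set: every byte maps to exactly one character, and the bytes themselves carry no marker saying which ISO-8859 variant was used to write them.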

The file that an editor (any editor) opens is just plain text; it does not contain the headers you sent to the browser. The editor can tell that the file uses a single-byte encoding, but it cannot detect which character set was intended, so it opens the file in the system default character set.
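
To illustrate why the editor has to guess, here is a minimal sketch, assuming the iconv extension is available (the question already uses it). The variable names are only illustrative. It takes one fixed byte sequence and interprets it under two single-byte character sets; the bytes never change, only the interpretation does.

```php
<?php
// In ISO-8859-2 the byte 0xA3 is "Ł"; in ISO-8859-1 the same byte is "£".
$bytes = "\xA3ukasz";

// Interpret the identical bytes under each character set,
// converting to UTF-8 so the results can be compared.
$asLatin2 = iconv('ISO-8859-2', 'UTF-8', $bytes); // UTF-8 for "Łukasz"
$asLatin1 = iconv('ISO-8859-1', 'UTF-8', $bytes); // UTF-8 for "£ukasz"

echo bin2hex($bytes), PHP_EOL;    // the raw bytes that end up in the file
echo bin2hex($asLatin2), PHP_EOL; // what an ISO-8859-2 reader sees
echo bin2hex($asLatin1), PHP_EOL; // what an ISO-8859-1 reader sees
```

Both interpretations are equally valid as far as the bytes are concerned, which is why an editor without the HTTP header falls back to its default.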

See my answer about encoding detection

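Your file still has the correct encoding; Notepad++ is just displaying it incorrectly. If you select ISO-8859-2 manually in the editor, it will render as expected. UTF-8 is another matter: its multi-byte sequences follow a distinctive pattern that an editor can recognize, so as far as I know it is usually detected and displayed correctly.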
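
The difference in detectability can be sketched as follows, assuming both the iconv and mbstring extensions are loaded (mbstring is an assumption; the question does not mention it). A UTF-8 string passes a structural validity check, while the ISO-8859-2 bytes for the same text do not, and that is the kind of signal an editor can use.

```php
<?php
// ISO-8859-2 bytes for "Łukasz Szukała" (0xA3 = Ł, 0xB3 = ł).
$latin2 = "\xA3ukasz Szuka\xB3a";

// The same text converted to UTF-8, as in the question.
$utf8 = iconv('ISO-8859-2', 'UTF-8', $latin2);

// UTF-8 has a strict byte structure, so validity can be checked.
var_dump(mb_check_encoding($utf8, 'UTF-8'));   // bool(true)
var_dump(mb_check_encoding($latin2, 'UTF-8')); // bool(false)
```

The check only shows that the ISO-8859-2 bytes are not UTF-8; it says nothing about which single-byte character set they belong to.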

Licensed under: CC-BY-SA with attribution