DISABLE ADBLOCK

ADBlock is blocking some content on the site

ADBlock errore

How can I output a UTF-8 CSV in PHP that Excel will read properly?

StackOverflow https://stackoverflow.com/questions/4348802

Question

I've got this very simple thing that just outputs some stuff in CSV format, but it's got to be UTF-8. I open this file in TextEdit or TextMate or Dreamweaver and it displays UTF-8 characters properly, but if I open it in Excel it's doing this silly íÄ kind of thing instead. Here's what I've got at the head of my document:

header("content-type:application/csv;charset=UTF-8");
header("Content-Disposition:attachment;filename=\"CHS.csv\"");

This all seems to have the desired effect except Excel (Mac, 2008) doesn't want to import it properly. There's no options in Excel for me to "open as UTF-8" or anything, so … I'm getting a little annoyed.

I can't seem to find any clear solutions to this anywhere, despite a lot of people having the same problem. The thing I see the most is to include the BOM, but I can't exactly figure out how to do that. As you can see above I'm just echoing this data, I'm not writing any file. I can do that if I need to, I'm just not because there doesn't seem like a need for it at this point. Any help?

Update: I tried echoing the BOM as echo pack("CCC", 0xef, 0xbb, 0xbf); which I just pulled from a site that was trying to detect the BOM. But Excel just appends those three characters to the very first cell when it imports, and still messes up the special characters.

Solution

To quote a Microsoft support engineer,

Excel for Mac does not currently support UTF-8

Update, 2017: This is true of all versions of Microsoft Excel for Mac before Office 2016. Newer versions (from Office 365) do now support UTF-8.

In order to output UTF-8 content that Excel both on Windows and OS X will be able to successfully read, you will need to do two things:

  1. Make sure that you convert your UTF-8 text to UTF-16LE

    mb_convert_encoding($csv, 'UTF-16LE', 'UTF-8');
    
  2. Make sure that you add the UTF-16LE byte order mark

    chr(255) . chr(254)
    

The next problem that appears only with Excel on OS X (but not Windows) will be when viewing a CSV file with comma separated values, Excel will render rows only with one row and all of the text along with the commas in the first row.

The way to avoid this is to use tabs as your separated value.

I used this function from the PHP comments (using tabs "\t" instead of commas) and it worked perfectly on OS X and Windows Excel.

Note that to fix an issue with an empty column as the end of a row, that I did have to change the line of code that says:

    $field_cnt = count($fields);

to

    $field_cnt = count($fields)-1;

As some of the other comments on this page say, other spreadsheet apps like OpenOffice Calc, Apple's own Numbers and Google Doc's Spreadsheet have no issues with UTF-8 files with commas.

See the table in this question for what works and doesn't work for Unicode CSV files in Excel


As a side note, I might add that if you are using Composer, you should have a look at adding League\Csv to your requires. League\Csv has a really nice API for building CSV files.

To use League\Csv with this method of creating CSV files, check out this example

OTHER TIPS

I have the same (or similar) problem.

In my case, if I add a BOM to the output, it works:

header('Content-Encoding: UTF-8');
header('Content-type: text/csv; charset=UTF-8');
header('Content-Disposition: attachment; filename=Customers_Export.csv');
echo "\xEF\xBB\xBF"; // UTF-8 BOM

I believe this is a pretty ugly hack, but it worked for me, at least for Excel 2007 Windows. Not sure it'll work on Mac.

Here is how I did it (that's to prompt browser to download the csv file):

header('Content-Description: File Transfer');
header('Content-Type: application/octet-stream');
header('Content-Disposition: attachment; filename=file.csv');
header('Content-Transfer-Encoding: binary');
header('Expires: 0');
header('Cache-Control: must-revalidate, post-check=0, pre-check=0');
header('Pragma: public');
echo "\xEF\xBB\xBF"; // UTF-8 BOM
echo $csv_file_content;
exit();

The only thing it fixed UTF8 encoding problem in CSV preview when you hit space bar on Mac.. but not in Excel Mac 2008... don't know why

I just dealt with the same problem, and came up with two solutions.

  1. Use the PHPExcel class as suggested by bpeterson76.

    • Using this class generates the most widely compatible file, I was able to generate a file from UTF-8 encoded data that opened fine in Excel 2008 Mac, Excel 2007 Windows, and Google Docs.
    • The biggest problem with using PHPExcel is that it's slow and uses a lot of memory, which isn't an issue for reasonably sized files, but if your Excel/CSV file has hundreds or thousands of rows, this library becomes unusable.
    • Here is a PHP method that will take some TSV data and output an Excel file to the browser, note that it uses the Excel5 Writer, which means the file should be compatible with older versions of Excel, but I no longer have access to any, so I cannot test them.

      function excel_export($tsv_data, $filename) {
          $export_data = preg_split("/\n/", $tsv_data);
          foreach($export_data as &$row) {
              $row = preg_split("/\t/", $row);
          }
      
          include("includes/PHPExcel.php");
          include('includes/PHPExcel/Writer/Excel5.php');
      
          $objPHPExcel = new PHPExcel();          
      
          $objPHPExcel->setActiveSheetIndex(0);
          $sheet = $objPHPExcel->getActiveSheet();
          $row = '1';
          $col = "A";
          foreach($export_data as $row_cells) {
              if(!is_array($row_cells)) { continue; }
                  foreach($row_cells as $cell) {
                      $sheet->setCellValue($col.$row, $cell);
                      $col++;
                  }
              $row += 1;
              $col = "A";
          }
      
          $objWriter = new PHPExcel_Writer_Excel5($objPHPExcel);
          header('Content-Type: application/vnd.ms-excel');
          header('Content-Disposition: attachment;filename="'.$filename.'.xls"');
          header('Cache-Control: max-age=0');
          $objWriter->save('php://output');   
          exit;   
      }
      
  2. Because of the efficiency issues with PHPExcel, I also had to figure out how to generate a UTF-8 & Excel compatible CSV or TSV file.

    • The best I could come up with was a file that was compatible with Excel 2008 Mac, and Excel 2007 PC, but not Google Docs, which is good enough for my application.
    • I found the solution here, specifically, this answer, but you should also read the accepted answer as it explains the problem.
    • Here is the PHP code I used, note that I am using tsv data (tabs as delimiters instead of commas):

      header ( 'HTTP/1.1 200 OK' );
      header ( 'Date: ' . date ( 'D M j G:i:s T Y' ) );
      header ( 'Last-Modified: ' . date ( 'D M j G:i:s T Y' ) );
      header ( 'Content-Type: application/vnd.ms-excel') ;
      header ( 'Content-Disposition: attachment;filename=export.csv' );
      print chr(255) . chr(254) . mb_convert_encoding($tsv_data, 'UTF-16LE', 'UTF-8');
      exit;
      

I was having the same issue and it was solved like below:

    header('Content-Encoding: UTF-8');
    header('Content-Type: text/csv; charset=utf-8' );
    header(sprintf( 'Content-Disposition: attachment; filename=my-csv-%s.csv', date( 'dmY-His' ) ) );
    header('Content-Transfer-Encoding: binary');
    header('Expires: 0');
    header('Cache-Control: must-revalidate, post-check=0, pre-check=0');
    header('Pragma: public');

    $df = fopen( 'php://output', 'w' );

    //This line is important:
    fputs( $df, "\xEF\xBB\xBF" ); // UTF-8 BOM !!!!!

    foreach ( $rows as $row ) {
        fputcsv( $df, $row );
    }
    fclose($df);
    exit();

Excel doesn't support UTF-8. You have to encode your UTF-8 text into UCS-2LE.

mb_convert_encoding($output, 'UCS-2LE', 'UTF-8');

To follow up on this:

It appears that the problem is simply with Excel on the Mac. It's not how I'm generating the files, because even generating CSVs from Excel is breaking them. I save as CSV, and reimport, and all the characters are messed up.

So … there doesn't appear to be a correct answer to this. Thanks for all the suggestions.

I would say that from all I've read, @Daniel Magliola's suggestion about the BOM would probably be the best answer for some other computer. But it still doesn't solve my problem.

In my case following works very nice to make CSV file with UTF-8 chars displayed correctly in Excel.

$out = fopen('php://output', 'w');
fprintf($out, chr(0xEF).chr(0xBB).chr(0xBF));
fputcsv($out, $some_csv_strings);

The 0xEF 0xBB 0xBF BOM header will let Excel know the correct encoding.

This works fine in excel for both Windows and also Mac OS.

Fix issues in excel that are not displaying characters containing diacritics, cyrillic letters, greek letter and currency symbols.

function writeCSV($filename, $headings, $data) {   

    //Use tab as field separator
    $newTab  = "\t";
    $newLine  = "\n";

    $fputcsv  =  count($headings) ? '"'. implode('"'.$newTab.'"', $headings).'"'.$newLine : '';

    // Loop over the * to export
    if (! empty($data)) {
      foreach($data as $item) {
        $fputcsv .= '"'. implode('"'.$newTab.'"', $item).'"'.$newLine;
      }
    }

    //Convert CSV to UTF-16
    $encoded_csv = mb_convert_encoding($fputcsv, 'UTF-16LE', 'UTF-8');

    // Output CSV-specific headers
    header('Set-Cookie: fileDownload=true; path=/'); //This cookie is needed in order to trigger the success window.
    header("Pragma: public");
    header("Expires: 0");
    header("Cache-Control: must-revalidate, post-check=0, pre-check=0");
    header("Cache-Control: private",false);
    header("Content-Type: application/octet-stream");
    header("Content-Disposition: attachment; filename=\"$filename.csv\";" );
    header("Content-Transfer-Encoding: binary");
    header('Content-Length: '. strlen($encoded_csv));
    echo chr(255) . chr(254) . $encoded_csv; //php array convert to csv/excel

    exit;
}

The CSV File musst include a Byte Order Mark.

Or as suggested and workaround just echo it with the HTTP body

Since UTF8 encoding doesn't play well with Excel. You can convert the data to another encoding type using iconv().

e.g.

iconv('UTF-8', 'ISO-8859-1//TRANSLIT', $value),

Add:

fprintf($file, chr(0xEF).chr(0xBB).chr(0xBF));

Or:

fprintf($file, "\xEF\xBB\xBF");

Before writing any content to CSV file.

Example:

<?php
$file = fopen( "file.csv", "w");
fprintf( $file, "\xEF\xBB\xBF");
fputcsv( $file, ["english", 122, "বাংলা"]);
fclose($file);

Converting already utf-8 encoded text by using mb_convert_encoding is not needed. Just add three characters in front of the original content:

$newContent = chr(239) . chr(187) . chr(191) . $originalContent

For me this resolved the problem of special characters in csv files.

As I investigated and I found that UTF-8 is not working well on MAC and Windows so I tried with Windows-1252 , it supports well on both of them but you must select type of encoding on ubuntu. Here is my code$valueToWrite = mb_convert_encoding($value, 'Windows-1252');

$response->headers->set('Content-Type', $mime . '; charset=Windows-1252');
    $response->headers->set('Pragma', 'public');
    $response->headers->set('Content-Endcoding','Windows-1252');
    $response->headers->set('Cache-Control', 'maxage=1');
    $response->headers->set('Content-Disposition', $dispositionHeader);
    echo "\xEF\xBB\xBF"; // UTF-8 BOM

**This is 100% works fine in excel for both Windows7,8,10 and also All Mac OS.**
//Fix issues in excel that are not displaying characters containing diacritics, cyrillic letters, Greek letter and currency symbols.

function generateCSVFile($filename, $headings, $data) {

    //Use tab as field separator
    $newTab  = "\t";
    $newLine  = "\n";

    $fputcsv  =  count($headings) ? '"'. implode('"'.$newTab.'"', $headings).'"'.$newLine : '';

    // Loop over the * to export
    if (! empty($data)) {
      foreach($data as $item) {
        $fputcsv .= '"'. implode('"'.$newTab.'"', $item).'"'.$newLine;
      }
    }

    //Convert CSV to UTF-16
    $encoded_csv = mb_convert_encoding($fputcsv, 'UTF-16LE', 'UTF-8');

    // Output CSV-specific headers
    header('Set-Cookie: fileDownload=true; path=/'); //This cookie is needed in order to trigger the success window.
    header("Pragma: public");
    header("Expires: 0");
    header("Cache-Control: must-revalidate, post-check=0, pre-check=0");
    header("Cache-Control: private",false);
    header("Content-Type: application/octet-stream");
    header("Content-Disposition: attachment; filename=\"$filename.csv\";" );
    header("Content-Transfer-Encoding: binary");
    header('Content-Length: '. strlen($encoded_csv));
    echo chr(255) . chr(254) . $encoded_csv; //php array convert to csv/excel
    exit;
}

How about just outputting for Excel itself? This is an excellent class that allows you to generate XLS files server-side. I use it frequently for clients who can't "figure out" csv's and so far have never had a complaint. It also allows some extra formatting (shading, rowheights, calculations, etc) that csv won't ever do.

you can convert your CSV String with iconv. for example:

$csvString = "Möckmühl;in Möckmühl ist die Hölle los\n";
file_put_contents('path/newTest.csv',iconv("UTF-8", "ISO-8859-1//TRANSLIT",$csvString) );

You have to use the encoding "Windows-1252".

header('Content-Encoding: Windows-1252');
header('Content-type: text/csv; charset=Windows-1252');
header("Content-Disposition: attachment; filename={$filename}");

Maybe you have to convert your strings:

private function convertToWindowsCharset($string) {
  $encoding = mb_detect_encoding($string);

  return iconv($encoding, "Windows-1252", $string);
}

You may append the 3 bytes to the file before exporting, it works for me . Before doing that system only work in Windows and HP -UX but failed in Linux.

FileOutputStream fStream = new FileOutputStream( f );
final byte[] bom = new byte[] { (byte) 0xEF, (byte) 0xBB, (byte) 0xBF };
OutputStreamWriter writer = new OutputStreamWriter( fStream, "UTF8" );
fStream.write( bom );

Have a UTF-8 BOM (3 bytes, hex EF BB BF) at the start of the file. Otherwise Excel will interpret the data according to your locale's default encoding (e.g. cp1252) instead of utf-8

Generating CSV file for Excel, how to have a newline inside a value

I'm on Mac, in my case I just had to specify the separator with "sep=;\n" and encode the file in UTF-16LE like this:

$data = "sep=;\n" .mb_convert_encoding($data, 'UTF-16LE', 'UTF-8');

For me none of the solution above worked. Below is what i did to resolve the issue: modify the value using this function in the PHP code:

$value = utf8_encode($value);

This output values properly in an excel sheet.

I had this same problem when I had an Excel VBA routine that imported data. Since CSV is a plain text format, I was working around this by programatically opening the data in a simple file editor like wordpad, and re-saving it as unicode text, or copying it to the clipboard from there and pasting it to Excel. If excel doesn't automatically parse the CSV into cells, this is easily remedied using the built in "Text to Columns" feature.

Does the problem still occur when you save it as a .txt file and them open that in excel with comma as a delimiter?

The problem might not be the encoding at all, it might just be that the file isn't a perfect CSV according to excel standards.

This Post is quite old, but after hours of trying I want to share my solution … maybe it helps someone dealing with Excel and Mac and CSV and stumbles over this threat. I am generating a csv dynamically as output from a Database with Excel Users in mind. (UTF-8 with BOM)

I tried a lot of iconv´s but couldn´t get german umlauts working in Mac Excel 2004. One solution: PHPExcel. It is great but for my project a bit too much. What works for me is creating the csv file and convert this csv file to xls with this PHPsnippet: csv2xls. the result xls works with excel german umlauts (ä,ö,Ü,...).

I just tried these headers and got Excel 2013 on a Windows 7 PC to import the CSV file with special characters correctly. The Byte Order Mark (BOM) was the final key that made it work.

    header('Content-Encoding: UTF-8');
    header('Content-type: text/csv; charset=UTF-8');
    header("Content-disposition: attachment; filename=filename.csv");
    header("Pragma: public");
    header("Expires: 0");
    echo "\xEF\xBB\xBF"; // UTF-8 BOM

Otherwise you may:

header("Content-type: application/x-download");
header("Content-Transfer-Encoding: binary");
header("Content-disposition: attachment; filename=".$fileName."");
header("Cache-control: private");

echo utf8_decode($output);

EASY solution for Mac Excel 2008: I struggled with this soo many times, but here was my easy fix: Open the .csv file in Textwrangler which should open your UTF-8 chars correctly. Now in the bottom status bar change the file format from "Unicode (UTF-8)" to "Western (ISO Latin 1)" and save the file. Now go to your Mac Excel 2008 and select File > Import > Select csv > Find your file > in File origin select "Windows (ANSI)" and voila the UTF-8 chars are showing correctly. At least it does for me...

I use this and it works

header('Content-Description: File Transfer');
header('Content-Type: text/csv; charset=UTF-16LE');
header('Content-Disposition: attachment; filename=file.csv');
header('Content-Transfer-Encoding: binary');
header('Expires: 0');
header('Cache-Control: must-revalidate, post-check=0, pre-check=0');
header('Pragma: public');
// output headers so that the file is downloaded rather than displayed
// create a file pointer connected to the output stream
$output = fopen('php://output', 'w');
fputs( $output, "\xEF\xBB\xBF" );
// output the column headings
fputcsv($output, array('Thông tin khách hàng đăng ký'));
// fetch the data
$setutf8 = "SET NAMES utf8";
$q = $conn->query($setutf8);
$setutf8c = "SET character_set_results = 'utf8', character_set_client =
'utf8', character_set_connection = 'utf8', character_set_database = 'utf8',
character_set_server = 'utf8'";
$qc = $conn->query($setutf8c);
$setutf9 = "SET CHARACTER SET utf8";
$q1 = $conn->query($setutf9);
$setutf7 = "SET COLLATION_CONNECTION = 'utf8_general_ci'";
$q2 = $conn->query($setutf7);
$sql = "SELECT id, name, email FROM myguests";
$rows = $conn->query($sql);
$arr1= array();
if ($rows->num_rows > 0) {
// output data of each row
while($row = $rows->fetch_assoc()) {
    $rcontent = " Name: " . $row["name"]. " - Email: " . $row["email"];  
    $arr1[]["title"] =  $rcontent;
}
} else {
     echo "0 results";
}
$conn->close();
// loop over the rows, outputting them
foreach($arr1 as $result1):
   fputcsv($output, $result1);
endforeach;

#output a UTF-8 CSV in PHP that Excel will read properly.

//Use tab as field separator   
$sep  = "\t";  
$eol  = "\n";    
$fputcsv  =  count($headings) ? '"'. implode('"'.$sep.'"', $headings).'"'.$eol : '';

//convert UTF-8 file without BOM to UTF-16LE for excel on mac
$fileUtf8String = file_get_contents("file.ext");
file_put_contents("file.ext", "\xFF\xFE" . mb_convert_encoding($fileUtf8String, "UTF-16LE", "UTF-8"));

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow