How can I use a different line termination for reads in PHP?
06-07-2019
Question
I'm trying to read a CSV file generated by Microsoft Excel on Linux.
The file has quoted multi-line columns (separated by 0x0A) and 0x0D0A line termination.
PHP on Linux uses 0x0A as the line terminator, so all the line-based tools (file, fgets, fgetcsv) think there are record breaks in the middle of the data cells.
Short of processing the file byte by byte, can I temporarily change PHP's end-of-line character (the PHP_EOL constant) so I can easily parse the file?
I think it can be done in Perl with "$/". Is there something similar in PHP?
I realize I can parse byte by byte, but I'm looking for a cleaner approach.
Solution
If conceptDawg's suggestion of auto_detect_line_endings doesn't work, I would recommend reading in the entire file via file_get_contents() and then calling explode() to break the file up into lines. You can pass whatever separator string you want to explode(), such as "\r\n".
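A minimal sketch of that approach, assuming the records really do end in "\r\n" and that a bare "\n" only ever appears inside quoted cells. The sample string and the variable names are made up for illustration; in practice $raw would come from file_get_contents().

```php
<?php
// Hypothetical sample standing in for the Excel export: records end in
// "\r\n", while the quoted cell contains a bare "\n".
$raw = "id,note\r\n1,\"first line\nsecond line\"\r\n2,plain\r\n";

// Drop the trailing terminator, then split on CRLF only, so the bare LF
// inside the quoted cell is left alone.
$records = explode("\r\n", rtrim($raw, "\r\n"));

// Parse each record on its own; str_getcsv() takes care of the quoting.
$rows = [];
foreach ($records as $record) {
    $rows[] = str_getcsv($record, ',', '"', '\\');
}
```

With the sample above, $rows holds three entries, and $rows[1][1] still contains the embedded line break. As noted for any whole-file approach, this only suits files that fit comfortably in memory.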
OTHER TIPS
You might try using the 'auto_detect_line_endings' run-time configuration option. According to the documentation, turning it on makes PHP work out the correct line endings automatically. From the docs:
When turned on, PHP will examine the data read by fgets() and file() to see if it is using Unix, MS-Dos or Macintosh line-ending conventions.
This enables PHP to interoperate with Macintosh systems, but defaults to Off, as there is a very small performance penalty when detecting the EOL conventions for the first line, and also because people using carriage-returns as item separators under Unix systems would experience non-backwards-compatible behaviour.
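A rough sketch of how the option could be switched on at run time. The temporary file and its CR-only contents are invented for illustration. Note that the setting is deprecated as of PHP 8.1, which is why the ini_set() calls are silenced with @ here; whether it helps with the CRLF file from the question is something you would need to test.

```php
<?php
// Hypothetical sample: a temporary file whose lines end in a bare CR
// (the old Macintosh convention mentioned in the docs).
$path = tempnam(sys_get_temp_dir(), 'eol');
file_put_contents($path, "alpha\rbeta\rgamma\r");

// Enable detection before opening the file so the stream picks it up.
// The @ hides the deprecation notice raised on PHP 8.1 and later.
$previous = @ini_set('auto_detect_line_endings', '1');

$lines = [];
$handle = fopen($path, 'r');
while (($line = fgets($handle)) !== false) {
    // Strip whichever terminator was detected.
    $lines[] = rtrim($line, "\r\n");
}
fclose($handle);

// Put the setting back and clean up the temporary file.
if ($previous !== false) {
    @ini_set('auto_detect_line_endings', $previous);
}
unlink($path);
```

Restoring the previous value keeps the change temporary, which matches what the question asks for.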
If that doesn't work, you could always read the entire file into memory (depending on the file size, this might not be feasible) and run preg_replace on the characters in question, replacing them with the "correct" characters.
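One way this could look, as a sketch: it assumes the embedded line breaks are bare "\n" characters and that replacing them with a space is acceptable for your data. The sample string is hypothetical and stands in for the result of file_get_contents().

```php
<?php
// Hypothetical sample: CRLF record terminators, bare LF inside a quoted cell.
$raw = "id,note\r\n1,\"first line\nsecond line\"\r\n2,plain\r\n";

// Replace every LF that is NOT preceded by CR (the in-cell breaks) with a space.
$flattened = preg_replace('/(?<!\r)\n/', ' ', $raw);

// Convert the remaining CRLF terminators to the LF that PHP on Linux expects.
$normalized = str_replace("\r\n", "\n", $flattened);

// Each record now sits on a single LF-terminated line.
$lines = explode("\n", rtrim($normalized, "\n"));
```

After this, $lines has one entry per record, so the usual line-based tools can take over. The trade-off is that the original line breaks inside cells are lost; substitute a different placeholder in the preg_replace() call if you need to restore them later.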