CSV writes Ñ into its code form instead of its actual form

https://stackoverflow.com/questions/20087009

02-08-2022

Question

I have a CSV file. I checked its encoding using this:

File.open('C:\path\to\file\myfile.txt').read.encoding

and it returned the encoding as:

=> #<Encoding:IBM437>

I'm reading this CSV per row -- stripping spaces and doing other stuff. After "cleansing" it, I push it to a new file. I'm doing it like this:

CSV.foreach(file_read, encoding: "IBM437:UTF-8") do |r|

  # some code

  CSV.open(file_appended, "a", col_sep: "|") do |csv|
    csv << r
  end
end

Now my problem is, inside the CSV I'm reading, there's a word with an accented character -- Ñ to be exact. This character is being appended to the new file as

\u2564

It's a problem because the accented character is a vital part of that word, and I want it to appear in the new file as-is.

Am I missing something? I tried the following source:destination encoding pairs, but to no avail:

ISO-8859-1:UTF8 (and vice versa)
ISO-8859-1:Windows-1252 (and vice versa)

Here is my Ruby version, in case it's relevant:

ruby 1.9.3p392 (2013-02-22) [i386-mingw32]

Thanks in advance!

Solution

The line below solved my problem:

Encoding.default_external = "iso-8859-1"

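It sets the encoding Ruby assumes by default for the files it reads to ISO-8859-1, which is the file's real encoding, so the Ñ character is interpreted correctly. The check in the question was misleading: read.encoding reports the encoding Ruby assumed (the default external encoding, IBM437 on that machine), not one detected from the file's contents. The byte 0xD1 is Ñ in ISO-8859-1 but the box-drawing character ╤ (U+2564) in IBM437, which is why \u2564 ended up in the output.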
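To see why the IBM437 label produced \u2564, here is a minimal sketch that takes the single byte 0xD1 and transcodes it to UTF-8 under both labels. The variable names are only for illustration and are not from the original code.

```ruby
# One raw byte, 0xD1, with no meaningful encoding attached yet.
raw = [0xD1].pack("C")

# Label the byte as ISO-8859-1, then transcode to UTF-8.
as_latin1 = raw.dup.force_encoding("ISO-8859-1").encode("UTF-8")

# Label the very same byte as IBM437, then transcode to UTF-8.
as_ibm437 = raw.dup.force_encoding("IBM437").encode("UTF-8")

puts format("ISO-8859-1 reading: U+%04X", as_latin1.ord)
puts format("IBM437 reading:     U+%04X", as_ibm437.ord)
```

The first line prints U+00D1 (Ñ) and the second prints U+2564, the character from the question. The bytes in the file never changed; only the label Ruby applied to them did.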

Credit goes to Darshan Computing's answer here. Just look for Update #2.
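If changing the process-wide default is undesirable, a possible alternative is to name the real source encoding on the read call and the target encoding on the write call. This is a sketch under assumptions, not the accepted fix: it assumes the input really is ISO-8859-1, it writes UTF-8, and the temporary paths and sample row are hypothetical stand-ins for the real files. It also opens the output once rather than once per row.

```ruby
require "csv"
require "tmpdir"

result = Dir.mktmpdir do |dir|
  file_read     = File.join(dir, "input.csv")   # hypothetical input path
  file_appended = File.join(dir, "output.txt")  # hypothetical output path

  # Sample input stored as ISO-8859-1 bytes, so Ñ is the single byte 0xD1.
  File.binwrite(file_read, " NI\u00D1O ,1\n".encode("ISO-8859-1"))

  # Open the output once, explicitly as UTF-8.
  CSV.open(file_appended, "a:UTF-8", col_sep: "|") do |csv|
    # Read as ISO-8859-1 and hand each row to the block as UTF-8.
    CSV.foreach(file_read, encoding: "ISO-8859-1:UTF-8") do |row|
      csv << row.map { |field| field.to_s.strip }
    end
  end

  File.read(file_appended, encoding: "UTF-8")
end

puts result.include?("NI\u00D1O")
```

Scoping the encoding to the two calls leaves other I/O in the program untouched, at the cost of repeating the encoding wherever the file is opened.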

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow