I am parsing a large CSV file and pushing the results into MongoDB.
The file is MaxMind's city database. It has all kinds of fun UTF-8 characters, and I am still getting (?) symbols in some city names. Here is how I am reading the file
(using the csv Node module):
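var fs = require('fs');
var path = require('path');
var csv = require('csv');

csv().from.stream(fs.createReadStream(path.join(__dirname, 'datafiles', 'cities.csv'), {
  flags: 'r',
  encoding: 'utf8' // the stream decodes the raw bytes as UTF-8
})).on('record', function (row, index) {
  // ... uninteresting code to add it to MongoDB
});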
What could I be doing wrong here?
I am getting things like this in MongoDB: Ch�teauguay, Canada
EDIT:
I tried using a different library (lazy) to read the file:
var fs = require('fs');
var path = require('path');
var lazy = require('lazy');

lazy(fs.createReadStream(path.join(__dirname, 'datafiles', 'cities.csv'), {
  flags: 'r',
  encoding: 'utf8',
  autoClose: true
}))
  .lines
  .map(String)
  .skip(1) // skips only the first of the two iptables header lines; skip(2) would drop both
  .map(function (line) {
    console.log(line);
  });
It produces the same bad results:
154252,"PA","03","Capellan�a","",8.3000,-80.5500,,
154220,"AR","01","Villa Espa�a","",-34.7667,-58.2000,,
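Since both libraries give the same output, one thing worth checking is whether the file is really UTF-8 at all. The sketch below is only a diagnostic, and it rests on an assumption I have not confirmed: that the file might be ISO-8859-1 (Latin-1) instead. The helper names looksLikeValidUtf8 and decodeGuessing are made up for this example and are not part of csv or lazy. It relies on the fact that Node turns any byte sequence that is invalid UTF-8 into U+FFFD, which is the � seen above.

```javascript
// Hypothetical helpers for diagnosis only; not part of the csv or lazy modules.

// Returns false if decoding as UTF-8 produced U+FFFD (the replacement character).
// Heuristic: a file that legitimately contains U+FFFD would also return false.
function looksLikeValidUtf8(buf) {
  return !buf.toString('utf8').includes('\uFFFD');
}

// Decodes as UTF-8 when the bytes are valid UTF-8, otherwise falls back to
// Latin-1. The Latin-1 fallback is an assumption about the file, not a fact.
function decodeGuessing(buf) {
  return looksLikeValidUtf8(buf) ? buf.toString('utf8') : buf.toString('latin1');
}

// "Château" as Latin-1 bytes: 0xE2 on its own is not a valid UTF-8 sequence.
var latin1Bytes = Buffer.from([0x43, 0x68, 0xE2, 0x74, 0x65, 0x61, 0x75]);

console.log(latin1Bytes.toString('utf8')); // Ch\uFFFDteau (shows as Ch�teau)
console.log(decodeGuessing(latin1Bytes));  // Château
```

If the check fails on bytes read from cities.csv without an encoding option (so that the stream yields Buffers), then passing encoding: 'utf8' to createReadStream would be the wrong setting for this file.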