Question

I am processing a lot of CSV files that contain people data, and occasionally names use non-ASCII characters like á, which all become � symbols in the DataTable. How do I prevent this problem? I just want to leave all the names as they are in the file without making any changes.

Thanks,

L

Solution

The most common reason for this is that the file is actually encoded in ISO-8859-1 but is being interpreted as UTF-8. For less common causes the same principle applies: something is in a different encoding than it claims to be.
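As a rough illustration of that mismatch, here is a minimal Python sketch (Python is used only for demonstration; the sample name is made up). It encodes a name as ISO-8859-1, then decodes the same bytes once as UTF-8 and once as ISO-8859-1:

```python
# A made-up sample name containing non-ASCII characters.
name = "José Álvarez"

# Bytes as they would sit in a file saved as ISO-8859-1.
raw = name.encode("iso-8859-1")

# Decoding those bytes as UTF-8 fails on the accented letters;
# errors="replace" substitutes U+FFFD (the � symbol) for each bad byte.
wrong = raw.decode("utf-8", errors="replace")

# Decoding with the encoding the bytes were actually written in
# recovers the original text unchanged.
right = raw.decode("iso-8859-1")

print(wrong)
print(right)
```

The bytes never change between the two calls. Only the encoding the reader assumes differs, which is why the fix belongs at the point where the file is read.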

OTHER TIPS

Change the character encoding in the database, or decode the data with the correct encoding when you read it from the DB.

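While processing, you need a Reader or something similar. I suggest you configure it with an explicit encoding, for example System.Text.UnicodeEncoding or UTF32Encoding. Whichever one you pick has to match the encoding the file was actually saved in.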
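Configuring the reader with an explicit encoding could look roughly like the sketch below, written in Python for illustration. The helper name `read_rows` and the candidate list are assumptions, not part of any library: it tries strict UTF-8 first and falls back to ISO-8859-1, which is a heuristic rather than real encoding detection.

```python
import csv
import os
import tempfile


def read_rows(path, encodings=("utf-8-sig", "iso-8859-1")):
    """Hypothetical helper: try each candidate encoding in order.

    Returns (rows, encoding_used). ISO-8859-1 accepts every byte value,
    so it only makes sense as the last candidate.
    """
    for enc in encodings:
        try:
            with open(path, newline="", encoding=enc) as f:
                return list(csv.reader(f)), enc
        except UnicodeDecodeError:
            continue  # wrong guess, try the next candidate
    raise ValueError("none of the candidate encodings could decode " + path)


# Demo: write a small CSV in ISO-8859-1, then read it back.
fd, path = tempfile.mkstemp(suffix=".csv")
os.close(fd)
with open(path, "w", newline="", encoding="iso-8859-1") as f:
    f.write("name,city\nJosé Álvarez,Málaga\n")

rows, used = read_rows(path)
os.remove(path)

print(used)
print(rows)
```

If you already know how the files were exported, passing that single encoding explicitly is more reliable than guessing.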

Licensed under: CC-BY-SA with attribution