Question

I have a C program that now needs to support UTF-8 characters. What must I know in order to do that? I've always heard how problematic it is to handle UTF-8 in a C/C++ environment. Why exactly is it problematic? How does a UTF-8 character differ from a usual C character, including its size? Can I do it in pure C, without any operating system help, and still keep it portable? What else should I have asked but didn't?

What I'm looking to implement is this: the characters are names with accents (like the French word résumé) that I need to read, put into a symbol table, and then search for and print from a file. It's part of my configuration file parsing (very much .ini-like).


Solution

There's an awesome article written by Joel Spolsky, one of the Stack Overflow creators.

The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!)

Apart from that, you might want to query some other Q&A's regarding this subject, like Handling special characters in C (UTF-8 encoding).

As cited in the aforementioned Q&A, Tips on Using Unicode with C/C++ might give you the basics.
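To make the "size" part of the question concrete, here is a minimal sketch in plain C, assuming the input is already valid UTF-8. In UTF-8 a character (code point) takes one to four bytes, so `strlen` reports bytes, not characters. The helper name `utf8_count` is made up for this illustration and is not from any of the linked articles.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: count code points in a NUL-terminated UTF-8 string.
   Continuation bytes have the bit pattern 10xxxxxx, so every byte that is
   NOT a continuation byte starts a new code point.
   Assumption: the input is valid UTF-8 (no validation is done here). */
static size_t utf8_count(const char *s)
{
    size_t n = 0;
    for (; *s != '\0'; s++) {
        if (((unsigned char)*s & 0xC0) != 0x80)
            n++;
    }
    return n;
}

/* Hypothetical helper: byte-wise key comparison for a symbol table.
   strcmp never sees a 0 byte inside a multi-byte UTF-8 sequence, so an
   ordinary char* key holds an accented name without any changes. */
static int same_name(const char *a, const char *b)
{
    return strcmp(a, b) == 0;
}
```

For the name from the question, written with explicit bytes as `"r\xC3\xA9sum\xC3\xA9"`, `strlen` gives 8 while `utf8_count` gives 6. For an .ini-style parser this usually means the existing `char *` code keeps working as long as it only splits on ASCII delimiters such as `=` or `[`. One caveat: a byte-wise comparison treats a precomposed é (U+00E9) and an e followed by a combining accent (U+0065 U+0301) as different names.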

OTHER TIPS

Two good links that I have used in the past:

The-Basics-of-UTF8

reading-unicode-utf-8-by-hand-in-c
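As a rough idea of what reading UTF-8 "by hand" involves, here is a small sketch of a decoder in portable C. It is not taken from the linked pages; the function name `utf8_decode` and its return convention are assumptions made for this example.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: decode one code point starting at s.
   On success, stores the code point in *cp and returns the number of
   bytes consumed (1 to 4). Returns 0 on malformed input.
   Note: a NUL byte decodes as code point 0 with length 1, so the caller
   should check for the end of the string before calling. */
static size_t utf8_decode(const char *s, uint32_t *cp)
{
    const unsigned char *p = (const unsigned char *)s;
    size_t len, i;
    uint32_t c;

    if (p[0] < 0x80) {                    /* 0xxxxxxx: plain ASCII */
        *cp = p[0];
        return 1;
    } else if ((p[0] & 0xE0) == 0xC0) {   /* 110xxxxx: 2-byte sequence */
        len = 2; c = p[0] & 0x1F;
    } else if ((p[0] & 0xF0) == 0xE0) {   /* 1110xxxx: 3-byte sequence */
        len = 3; c = p[0] & 0x0F;
    } else if ((p[0] & 0xF8) == 0xF0) {   /* 11110xxx: 4-byte sequence */
        len = 4; c = p[0] & 0x07;
    } else {
        return 0;                         /* stray continuation or invalid lead byte */
    }

    for (i = 1; i < len; i++) {
        /* Each following byte must be 10xxxxxx; this also stops at an
           early NUL terminator, so we never read past the string. */
        if ((p[i] & 0xC0) != 0x80)
            return 0;
        c = (c << 6) | (uint32_t)(p[i] & 0x3F);
    }

    /* Reject overlong encodings, UTF-16 surrogates, and values above U+10FFFF. */
    if ((len == 2 && c < 0x80) || (len == 3 && c < 0x800) ||
        (len == 4 && c < 0x10000) || c > 0x10FFFF ||
        (c >= 0xD800 && c <= 0xDFFF))
        return 0;

    *cp = c;
    return len;
}
```

A caller would typically loop over a buffer, advancing by the returned length and stopping on 0. For example, the two bytes `"\xC3\xA9"` decode to U+00E9 (é) with a length of 2.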

valter

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow