I have a problem with the NFD Unicode strings I get from the OS X filesystem.

For the umlaut "Ä", OS X gives me "A\xcc\x88", but I expect "\xc3\x84". The same function returns the expected bytes under Windows (it is a simple boost filesystem operation that lists a directory).

After searching for a while, I found out that Apple uses the NFD (decomposed) form for UTF-8 file names, while the rest of the world uses NFC (precomposed). I experimented with converting through NSString and with boost::locale::normalize, but without success.
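To make the difference between the two byte sequences concrete, here is a minimal sketch that decodes UTF-8 into code points. The helper name decodeUtf8 is hypothetical (it is not part of boost or any Apple API), and it assumes well-formed input without validating it.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper: decode a well-formed UTF-8 string into code points.
// Sketch only: malformed input is not detected or rejected.
std::vector<std::uint32_t> decodeUtf8(const std::string& s)
{
    std::vector<std::uint32_t> out;
    for (std::size_t i = 0; i < s.size();) {
        unsigned char c = static_cast<unsigned char>(s[i]);
        std::uint32_t cp;
        std::size_t extra;  // number of continuation bytes that follow
        if (c < 0x80)      { cp = c;        extra = 0; }
        else if (c < 0xE0) { cp = c & 0x1F; extra = 1; }
        else if (c < 0xF0) { cp = c & 0x0F; extra = 2; }
        else               { cp = c & 0x07; extra = 3; }
        ++i;
        // Each continuation byte contributes its low six bits.
        for (std::size_t k = 0; k < extra && i < s.size(); ++k, ++i) {
            cp = (cp << 6) | (static_cast<unsigned char>(s[i]) & 0x3F);
        }
        out.push_back(cp);
    }
    return out;
}
```

Calling decodeUtf8("A\xcc\x88") yields the two code points U+0041 (LATIN CAPITAL LETTER A) and U+0308 (COMBINING DIAERESIS), which is the NFD form. Calling decodeUtf8("\xc3\x84") yields the single code point U+00C4, which is the NFC form. Both render as "Ä", but the strings do not compare equal byte for byte.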

Does anybody know a way to do this conversion in C++? (I can use Cocoa through Objective-C if necessary.)

Afterwards I would like to have the result as a raw std::string in UTF-8 encoding.

Solution

This function returns the precomposed (NFC) form, using Core Foundation:

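#include <CoreFoundation/CoreFoundation.h>
#include <string>
#include <vector>

// Converts a UTF-8 file name (NFD on OS X) to its precomposed form (NFC).
// Returns the input unchanged if the conversion fails.
std::string precomposeFilename(const std::string& name)
{
   CFStringRef cfStringRef = CFStringCreateWithCString(kCFAllocatorDefault, name.c_str(), kCFStringEncodingUTF8);
   if (!cfStringRef) return name;   // name was not valid UTF-8

   CFMutableStringRef cfMutable = CFStringCreateMutableCopy(kCFAllocatorDefault, 0, cfStringRef);
   CFRelease(cfStringRef);
   if (!cfMutable) return name;

   CFStringNormalize(cfMutable, kCFStringNormalizationFormC);

   // Size the buffer for the worst case instead of assuming 255 bytes.
   CFIndex maxSize = CFStringGetMaximumSizeForEncoding(CFStringGetLength(cfMutable), kCFStringEncodingUTF8) + 1;
   std::vector<char> buffer(maxSize);
   Boolean ok = CFStringGetCString(cfMutable, buffer.data(), maxSize, kCFStringEncodingUTF8);

   CFRelease(cfMutable);

   return ok ? std::string(buffer.data()) : name;
}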

Other tips

NSString has the method - (NSString *)precomposedStringWithCanonicalMapping, which returns the string in NFC form, and a few related normalization methods. They look like they will help you.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow