Question

I have a string coming in from a web service that mixes Cyrillic and Latin (English) characters. When I split the sentence into an array of words and print the array with NSLog, it shows Unicode escapes in place of the Cyrillic letters. I want to know how to convert the Cyrillic characters into their readable Latin equivalents. For example:

NSString *sentence = @"The Tobе Elіte"; // the "е" in "Tobе" and the "і" in "Elіte" are Cyrillic

After putting each word in the string into an array, when printing I get this:

(
The,
"Tob\U0435",
"El\U0456te"
)

I need these transliterated to the Latin "Tobe" and "Elite". If I compare what I have now, the strings do not match:

if (![@"Tobe" isEqualToString:[array objectAtIndex:1]]) {
    // this branch runs: "Tobe" is not equal to "Tob\U0435"
}

I apologize if I explained this poorly; if you have any questions that would help clarify my problem, feel free to ask. I have tried several things to get this encoded as proper UTF-8. For example, this does not work:

NSMutableString *buffer = [string mutableCopy];
CFMutableStringRef bufferRef = (__bridge CFMutableStringRef)buffer;
CFStringTransform(bufferRef, NULL, kCFStringTransformToLatin, false);

Ultimately I need to search the array for matching words using NSPredicate, but the Cyrillic characters in the array prevent the words from matching. Any help is appreciated.


Solution

This works for me. The second transform strips any diacritics that the Latin transliteration leaves behind:

NSString *sentence = @"The Tobе Elіte";
NSMutableString *buffer = [sentence mutableCopy];
CFMutableStringRef bufferRef = (__bridge CFMutableStringRef)buffer;
CFStringTransform(bufferRef, NULL, kCFStringTransformToLatin, false);
CFStringTransform(bufferRef, NULL, kCFStringTransformStripDiacritics, false);
NSArray *arr = [buffer componentsSeparatedByString:@" "];
NSLog(@"%@", arr);

You can find more information here: http://nshipster.com/cfstringtransform/
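
Since the end goal in the question is an NSPredicate search, here is a minimal sketch of how the two transforms might be combined with a predicate. The helper name LatinizedString is hypothetical (it does not appear in the question or the answer), the code assumes ARC, and the exact transliteration of each letter depends on the system's transform tables, so treat it as an illustration, not a definitive implementation:

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not from the original answer): returns a copy of
// `input` transliterated to Latin, with diacritics stripped afterwards.
static NSString *LatinizedString(NSString *input) {
    NSMutableString *buffer = [input mutableCopy];
    CFMutableStringRef bufferRef = (__bridge CFMutableStringRef)buffer;
    CFStringTransform(bufferRef, NULL, kCFStringTransformToLatin, false);
    CFStringTransform(bufferRef, NULL, kCFStringTransformStripDiacritics, false);
    return [buffer copy];
}

int main(void) {
    @autoreleasepool {
        // Same sentence as in the question; \u0435 and \u0456 are the Cyrillic letters.
        NSString *sentence = @"The Tob\u0435 El\u0456te";

        // Transliterate the whole sentence first, then split it into words.
        NSArray<NSString *> *words =
            [LatinizedString(sentence) componentsSeparatedByString:@" "];

        // Exact match against a plain Latin search term.
        NSPredicate *predicate =
            [NSPredicate predicateWithFormat:@"SELF == %@", @"Tobe"];
        NSArray<NSString *> *matches =
            [words filteredArrayUsingPredicate:predicate];

        NSLog(@"words: %@", words);
        NSLog(@"matches: %@", matches);
    }
    return 0;
}
```

If the search term itself may contain Cyrillic look-alikes, passing it through the same helper before building the predicate keeps both sides of the comparison in the same form.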

Licensed under: CC-BY-SA with attribution