Question

I have a database that contains non-English words (Turkish, for those wondering), and an algorithm that compares user input against the database.

My problem is this: all the strings in my database are written with Turkish characters. Say the database holds the entry heyyö. When the user enters heyyo, the lookup fails, because the two are treated as different words.

My first thought was to add special cases: whenever a non-English character is found, treat it as matching its English counterpart (g with ğ, i with ı, and so on). But that means a lot of brute force.

How can I do this elegantly?

Oh, and the user types the input into a text field, in case that wasn't implied.

Solution

The removal of diacritics is called "folding." You can compare strings without regard to diacritics using the option NSDiacriticInsensitiveSearch.

BOOL same = [string compare:otherString options:NSDiacriticInsensitiveSearch] == NSOrderedSame;

You can similarly generate a folded string using stringByFoldingWithOptions:locale:.
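
As a rough illustration of the folding approach, here is a minimal sketch. The helper name FoldedKey is hypothetical (it is not part of Foundation), and combining case folding with diacritic folding is an assumption about what the lookup needs. Passing nil for the locale selects the system locale.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not a Foundation API): returns a copy of `s`
// with diacritics removed and case differences folded away.
static NSString *FoldedKey(NSString *s) {
    NSStringCompareOptions opts =
        NSDiacriticInsensitiveSearch | NSCaseInsensitiveSearch;
    // nil locale: fold using the system locale rather than the user's.
    return [s stringByFoldingWithOptions:opts locale:nil];
}
```

One way to use it is to store FoldedKey(entry) alongside each database entry and compare it against FoldedKey(userInput), so that FoldedKey(@"heyyö") and FoldedKey(@"heyyo") produce the same key.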

Note that this only removes diacritics. Characters can "seem" the same without being the same in many other ways. Turkish is notorious here: in Turkish, the lowercase of "I" is "ı" (LATIN SMALL LETTER DOTLESS I, U+0131), not "i", and the uppercase of "i" is "İ". The dotless "ı" is a letter in its own right rather than an "i" with a diacritic, so folding may not map it to "i". If you're dealing with Turkish in particular, you may have to account for this yourself.
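
If the dotless "ı" needs to match "i", one option is to map it explicitly after folding. This is only a sketch under that assumption: TurkishFoldedKey is a hypothetical name, and treating "ı" as equal to "i" is a choice for this particular lookup, not something Foundation decides for you.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not a Foundation API): folds case and diacritics,
// then maps the dotless "ı" (U+0131) to a plain "i".
static NSString *TurkishFoldedKey(NSString *s) {
    NSStringCompareOptions opts =
        NSDiacriticInsensitiveSearch | NSCaseInsensitiveSearch;
    NSString *folded = [s stringByFoldingWithOptions:opts locale:nil];
    // Explicit replacement; a no-op if the folding step already produced "i".
    return [folded stringByReplacingOccurrencesOfString:@"ı" withString:@"i"];
}
```

With this, a database entry such as ışık and a user input of isik are intended to produce the same key.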

Other tips

What you can do is something like this:

NSString *input = @"heyyö";
NSData *intermediaryDataForm = [input dataUsingEncoding:NSASCIIStringEncoding allowLossyConversion:YES];
NSString *output = [[NSString alloc] initWithData:intermediaryDataForm encoding:NSASCIIStringEncoding];

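Because the Turkish letters are not part of ASCII and the conversion is allowed to be lossy, 'ö' is changed to 'o' when the string is converted to NSData. Converting the data back to an NSString then gives you the plain-ASCII string. Note that a lossy conversion may also drop or alter characters that have no close ASCII equivalent, so check the result for the letters you care about.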

License: CC-BY-SA with attribution
Not affiliated with StackOverflow