Question

The Danish language has just three non-standard characters: å, ø and æ.

When I search my Core Data entity using the following predicates:

name CONTAINS[cd] "ø" // correct results
name CONTAINS[cd] "æ" // correct results
name CONTAINS[cd] "å" // wrong results - with 'å' and 'a'

The first two predicates work correctly, but the last one does not. It returns results containing either "å" or "a".

What is so special about this one letter?


Solution

I suggest you lowercase your query string and drop [cd] from the predicate. This is better for Core Data performance, and it also returns the correct results.

Working example:

NSArray *ar = @[@"å",@"a",@"åa"];
NSPredicate *predicate = [NSPredicate predicateWithFormat:@"self CONTAINS %@", @"å"];
NSArray *filteredArray = [ar filteredArrayUsingPredicate:predicate];
NSLog(@"Results: %@",filteredArray); // which returns (å,åa)

OTHER TIPS

This is the correct behaviour for a diacritic-insensitive search. If you specify the d option, Core Data runs a diacritic-insensitive search, which means that it ignores accents.
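To see the effect of each option in isolation, here is a minimal in-memory sketch. It filters a plain NSArray rather than a Core Data store, and the sample strings are made up for illustration, so treat it as a way to experiment rather than a statement of how the store behaves:

```objc
#import <Foundation/Foundation.h>

int main(void) {
    @autoreleasepool {
        // Hypothetical sample data, not taken from the question's store.
        NSArray<NSString *> *names = @[@"å", @"a", @"Å", @"b"];

        // [d] only: diacritic-insensitive, still case-sensitive.
        NSPredicate *diacriticInsensitive =
            [NSPredicate predicateWithFormat:@"SELF CONTAINS[d] %@", @"å"];

        // [c] only: case-insensitive, still diacritic-sensitive.
        NSPredicate *caseInsensitive =
            [NSPredicate predicateWithFormat:@"SELF CONTAINS[c] %@", @"å"];

        NSLog(@"[d]: %@", [names filteredArrayUsingPredicate:diacriticInsensitive]);
        NSLog(@"[c]: %@", [names filteredArrayUsingPredicate:caseInsensitive]);
    }
    return 0;
}
```

If you only need case-insensitivity, using [c] without d is one option to try before giving up the modifier entirely.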

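That character is 'special' because it has multiple Unicode representations: the single precomposed code point å (U+00E5), or a plain 'a' followed by the combining ring above (U+030A). Your search will therefore also produce different results depending on which Unicode form is saved in the store. By contrast, ø and æ have no canonical decomposition into a base letter plus a combining mark, which is presumably why they are not affected.

There is a nice explanation of that character in the Strings issue of objc.io: https://www.objc.io/issues/9-strings/unicode/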
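The two representations of å can be inspected directly. The sketch below is only an illustration using standard NSString normalization methods; the variable names are made up, and normalizing strings before saving or searching is a suggestion to evaluate, not something the original answers prescribe:

```objc
#import <Foundation/Foundation.h>

int main(void) {
    @autoreleasepool {
        NSString *precomposed = @"\u00E5";   // å as a single code point
        NSString *decomposed  = @"a\u030A";  // 'a' + combining ring above

        NSLog(@"precomposed length: %lu", (unsigned long)precomposed.length); // 1
        NSLog(@"decomposed length: %lu", (unsigned long)decomposed.length);   // 2

        // Bring both strings to the same (precomposed) normalization form.
        NSString *a = precomposed.precomposedStringWithCanonicalMapping;
        NSString *b = decomposed.precomposedStringWithCanonicalMapping;
        NSLog(@"equal after normalization: %d", [a isEqualToString:b]); // 1

        // Of the three Danish letters, only å splits into base letter + mark.
        for (NSString *letter in @[@"\u00E5", @"\u00F8", @"\u00E6"]) {
            NSString *split = letter.decomposedStringWithCanonicalMapping;
            NSLog(@"%@ decomposes to %lu UTF-16 unit(s)",
                  letter, (unsigned long)split.length);
        }
    }
    return 0;
}
```

Storing and querying with one consistent normalization form avoids results that depend on how a given string happened to be encoded.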

Licensed under: CC-BY-SA with attribution