Question

Does anyone know of a fast implementation of fuzzy search matching for Objective-c? (levenshtein-distance algorithm).

I found this: https://github.com/thetron/StringScore/blob/master/NSString%2BScore.m - but it's unfortunately pretty slow. I need to compare this to a about 200 strings, and it's continuous - per new keystroke typed.

Any ideas?

Was it helpful?

Solution

If NSString+Score does what you want but is too slow, you could start by speeding it up. Lines 23 to 28 in -scoreAgainst:fuzziness:options: are setup code that only need to be done once, not on every of the 200 compares. So pull that code out into a setup method and measure again.

Edit:

As an exercise, I forked StringScore, extracted the setup code and did minimal changes to get some performance improvement, then measured it. I used 1000 random words, grouped them in three each (e.g. "disrupted dot drinking"). For each of these groups I did the setup (as told in this original answer) and then compared the string to all of the 1000 groups. This takes around 11 seconds on my Core 2 Duo.

So comparing one word to 1000 takes about 11 ms. Now you only need 1 to 200, so it will be probably well under 10 ms. That should work for you?

(By the way, nearly half the time is still spent in rangeOfString: finding a single character; this can probably done much much faster, but I didn't want to get in the details of the algorithm.)

OTHER TIPS

I don't know about the algorithm you're referring implemented in Objective-C

Is there a reason you are not using the built in functionality of NSPredicate with CoreData. I have found this very fast searching more than 200 strings.

For example, given an NSString *searchText and a fetchedResultsController

NSPredicate * predicate = [NSPredicate predicateWithFormat:@"name CONTAINS[cd] %@", searchText];

self.filteredListContents = [[[self fetchedResultsController] fetchedObjects] filteredArrayUsingPredicate:predicate];

You can also use an NSPredicate on an NSArray, which I assume you have tried and found to be too slow.

From the apple documentation

NSMutableArray *array =
[NSMutableArray arrayWithObjects:@"Nick", @"Ben", @"Adam", @"Melissa", nil];

NSPredicate *bPredicate = [NSPredicate predicateWithFormat:@"SELF beginswith[c] 'a'"];

NSArray *beginWithB = [array filteredArrayUsingPredicate:bPredicate];
// beginWithB contains { @"Adam" }.

NSPredicate *sPredicate = [NSPredicate predicateWithFormat:@"SELF contains[c] 'e'"];

[array filterUsingPredicate:sPredicate];
// array now contains { @"Nick", @"Ben", @"Melissa" }

https://developer.apple.com/library/mac/#documentation/Cocoa/Conceptual/Predicates/Articles/pSyntax.html

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top