Question

I am wondering how would I be able to run a SQLite order by in this manner

select * from contacts order by jarowinkler(contacts.name,'john smith');

I know Android has a bottleneck with user defined functions, do I have an alternative?

Was it helpful?

Solution

Step #1: Do the query minus the ORDER BY portion

Step #2: Create a CursorWrapper that wraps your Cursor, calculates the Jaro-Winkler distance for each position, sorts the positions, then uses the sorted positions when overriding all methods that require a position (e.g., moveToPosition(), moveToNext()).

OTHER TIPS

Pre calculate string lengths and add them into separate column. Then sort entired table by that that length. Add indexes (if you can). Then add extra filters for example you don't want to compare "Srivastava Brahmaputra" to "John Smith". The length are out of wack by way too much so exclude these kind of comparison by length as a percentage of the total length. So if your word is 10 characters compare it only to words with 10+-2 or 10+-3 characters.

This way you will significantly reduce the number of times this algorithm needs to run.

Typically in the vocalbulary of 100 000 entries such filters reduce the number of comparisons to about 300. Unless you are doing a full blown record linkage and then I would wonder why use Android for that. You would still need to apply probabilistic methods for that and calculate scores and this is not a job for Android (at least not for now).

Also in MS SQL Server Jaro Winkler string distance wrapped into CLR function perform much better, since SQL Server doesn't supprt arays natively and much of the processing is around arrays. So implementation in T-SQL add too much overhead, but SQL-CLR works extremely fast.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top