Full-text search on Heroku: database and/or indexer selection?
25-10-2019
Question
I am looking to implement (free as in beer) full-text search in a small application on Heroku (minimal number of users, limited dataset). However, I am struggling to find a good pattern for doing so. One option is to stay within Xeround's 10 MB limit while that lasts (we may exceed it in the near future). The second is to somehow roll my own full-text search on top of MongoDB or CouchDB.
The documents in this application are archived emails from a mailing list that I wish to make searchable. There are approximately 10k such emails, plain text, roughly 700 bytes each (about 7 MB in total).
I would prefer fuzzy search capabilities, hence the push for Whoosh.
One requirement I should have mentioned earlier: it must be free.
I have not found any patterns for using Whoosh with MongoDB in a Python Flask application.
Can anyone provide more information on how to handle full-text search in a small Python application on Heroku?
Solution
I have not tried it, but http://tenderlove.github.com/texticle/ seems to imply that you can use PostgreSQL's native full-text search, provided your data fits within the space limits. The trouble with Whoosh is that it normally keeps its index in files on disk, so you are going to run into Heroku's rules on disk space and file persistence.
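Texticle itself is a Ruby library, so from Python you would issue the full-text query yourself. As a rough, untested sketch of what that might look like: the table name `emails`, its columns `id` and `body`, and the helper `build_search_query` are all assumptions made up for illustration, and the `%s` placeholders assume a driver such as psycopg2.

```python
def build_search_query(terms, limit=20):
    """Return (sql, params) for a ranked PostgreSQL full-text search.

    Assumes a hypothetical table `emails` with columns `id` and `body`.
    """
    sql = (
        "SELECT id, ts_rank(to_tsvector('english', body), query) AS rank "
        "FROM emails, plainto_tsquery('english', %s) AS query "
        "WHERE to_tsvector('english', body) @@ query "
        "ORDER BY rank DESC "
        "LIMIT %s"
    )
    # Parameters are passed separately so the driver handles quoting.
    return sql, (terms, limit)


sql, params = build_search_query("heroku deploy", limit=5)
# With a DB-API driver this would be run roughly as:
#   cursor.execute(sql, params)
#   ids = [row[0] for row in cursor.fetchall()]
```

With only about 10k short emails, computing `to_tsvector` at query time is probably acceptable. A stored `tsvector` column with a GIN index would be the usual next step if it turns out to be slow.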
The other option is to use one of the add-ons suggested in the Dev Center docs: http://devcenter.heroku.com/articles/full-text-search
As for patterns, you basically run the full-text search, get back the IDs (or data) of the matching records, and then query your data store (Mongo) for the full records based on those results. It's a manual process, but nothing too strange. If the search results don't need full records, you can usually get away with stashing the important fields alongside the full-text information, but that will increase the size of your full-text index.
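To make the two-step pattern concrete, here is a minimal sketch. It is not a real indexer: a plain dict stands in for the Mongo collection, a tiny in-memory inverted index stands in for the full-text engine, and every name in it (`EMAILS`, `build_index`, `search_ids`, `fetch_records`) is made up for illustration.

```python
import re
from collections import defaultdict

# Stand-in for the primary data store (e.g. a MongoDB collection keyed by _id).
EMAILS = {
    1: {"subject": "Deploy notes", "body": "Heroku deploy failed on Friday"},
    2: {"subject": "Search", "body": "Full text search with Whoosh"},
    3: {"subject": "Re: Search", "body": "Postgres full text search works on Heroku"},
}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Map each token to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, doc in docs.items():
        for token in tokenize(doc["subject"] + " " + doc["body"]):
            index[token].add(doc_id)
    return index


def search_ids(index, query):
    """Step 1: ask the full-text index for ids matching every query token."""
    tokens = tokenize(query)
    if not tokens:
        return []
    matches = set.intersection(*(index.get(t, set()) for t in tokens))
    return sorted(matches)


def fetch_records(docs, ids):
    """Step 2: load the full records from the data store by id.

    With pymongo this would be roughly:
        collection.find({"_id": {"$in": ids}})
    """
    return [docs[i] for i in ids if i in docs]


INDEX = build_index(EMAILS)
ids = search_ids(INDEX, "heroku search")
results = fetch_records(EMAILS, ids)
```

If the results page only needs, say, the subject line, you could store that in the index entry itself and skip step 2, at the cost of a larger index.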
OTHER TIPS