Django “Did you mean?” query
-
20-08-2019 - |
Question
I am writing a fairly simple Django application where users can enter string queries. The application will the search through the database for this string.
Entry.objects.filter(headline__contains=query)
This query is pretty strait forward but not really helpful to someone who isn't 100% sure what they are looking for. So I expanded the search.
from django.utils import stopwords
results = Entry.objects.filter(headline__contains=query)
if(!results):
query = strip_stopwords(query)
for(q in query.split(' ')):
results += Entry.objects.filter(headline__contains=q)
I would like to add some additional functionality to this. Searching for miss spelled words, plurals, common homophones (sound the same spelled differently), ect. I was just wondering if any of these things were built into Djangos query language. It isn't important enough for me to write a huge algorithm for I am really just looking for something built in.
Thanks in advance for all the answers.
Solution
djangos orm doesn't have this behavior out-of-box, but there are several projects that integrate django w/ search services like:
- sphinx (django-sphinx)
- solr, a lightweight version of lucene (djangosearch)
- lucene (django-search-lucene)
I cant speak to how well options #2 and #3 work, but I've used django-sphinx quite a lot, and am very happy with the results.
OTHER TIPS
You could try using python's difflib module.
>>> from difflib import get_close_matches
>>> get_close_matches('appel', ['ape', 'apple', 'peach', 'puppy'])
['apple', 'ape']
>>> import keyword
>>> get_close_matches('wheel', keyword.kwlist)
['while']
>>> get_close_matches('apple', keyword.kwlist)
[]
>>> get_close_matches('accept', keyword.kwlist)
['except']
Problem is that to use difflib one must build a list of words from the database. That can be expensive. Maybe if you cache the list of words and only rebuild it once in a while.
Some database systems support a search method to do what you want, like PostgreSQL's fuzzystrmatch
module. If that is your case you could try calling it.
edit:
For your new "requirement", well, you are out of luck. No, there is nothing built in inside django's query language.