Question

I have a Django project that uses SOLR for indexing.

I'm trying to do a substring search using Haystack's SearchQuerySet class.

For example, when a user searches for the term "ear", it should return the entry that has a field with the value: "Search". As you can see, "ear" is a SUBSTRING of "Search". (obviously :))

In other words, in a perfect Django world I would like something like:

SearchQuerySet().all().filter(some_field__contains_substring='ear')

In the haystack documentation for SearchQuerySet (https://django-haystack.readthedocs.org/en/latest/searchqueryset_api.html#field-lookups), it says that only the following FIELD LOOKUP types are supported:

  • contains
  • exact
  • gt, gte, lt, lte
  • in
  • startswith
  • range

I tried using __contains, but it behaves exactly like __exact, which looks up the exact word (the whole word) in a sentence, not a substring of a word.

I am confused, because such functionality is pretty basic, and I'm not sure whether I'm missing something or whether there is another way to approach this problem (using a regex, for example).

Thanks


Solution

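This can be done with an n-gram field such as EdgeNgramField:

some_field = indexes.EdgeNgramField()  # set model_attr, or define prepare_some_field(), to supply the value

Note that EdgeNgramField only indexes n-grams anchored at the start of each word, so it matches prefixes (for example "sea" for "Search"). For a match in the middle of a word, such as "ear", declare the field as indexes.NgramField() instead.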

Then, after updating the Solr schema and rebuilding the index, a partial match is a plain filter:

SearchQuerySet().all().filter(some_field='ear')
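
To see why the choice of n-gram field matters for this query, here is a small pure-Python sketch of the two tokenizing strategies. It does not use Haystack or Solr. The function names and the gram sizes (2 to 15 for edge n-grams, 3 to 15 for n-grams) are assumptions chosen for illustration; the real sizes depend on your Solr schema.

```python
def edge_ngrams(word, min_size=2, max_size=15):
    """Return the n-grams anchored at the start of the word (its prefixes)."""
    word = word.lower()
    upper = min(max_size, len(word))
    return {word[:n] for n in range(min_size, upper + 1)}


def ngrams(word, min_size=3, max_size=15):
    """Return every substring of the word whose length is within the size limits."""
    word = word.lower()
    upper = min(max_size, len(word))
    return {
        word[start:start + n]
        for n in range(min_size, upper + 1)
        for start in range(len(word) - n + 1)
    }


edge = edge_ngrams("Search")
full = ngrams("Search")

print(sorted(edge))                 # prefixes only
print("ear" in edge, "ear" in full)
```

In this sketch "ear" is absent from the edge n-grams of "Search" but present in the full n-grams, which is why a query for "ear" can only hit an index built from the latter.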

OTHER TIPS

It's a bug in haystack.

As you said, __contains is implemented exactly like __exact, so this functionality does not exist out of the box in Haystack.

The fix is awaiting merge here: https://github.com/django-haystack/django-haystack/issues/1041

Until a fixed release is available, you can work around it with a custom input type:

from haystack.inputs import BaseInput, Clean
from haystack.query import SearchQuerySet


class CustomContain(BaseInput):
    """
    An input type for making wildcard matches.
    """
    input_type_name = 'custom_contain'

    def prepare(self, query_obj):
        query_string = super(CustomContain, self).prepare(query_obj)
        query_string = query_obj.clean(query_string)

        exact_bits = [Clean(bit).prepare(query_obj) for bit in query_string.split(' ') if bit]
        query_string = u' '.join(exact_bits)

        return u'*{}*'.format(query_string)

# Usage:
SearchQuerySet().filter(content=CustomContain('searchcontentgoeshere'))
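
As a rough illustration of what the input type above produces, the sketch below reproduces only its string-building step in plain Python. It skips Haystack's cleaning of reserved characters, and the function name build_wildcard_query is hypothetical, so treat the result as an approximation of the final query string rather than exactly what Haystack sends to Solr.

```python
def build_wildcard_query(text):
    """Mimic CustomContain.prepare without cleaning: collapse spaces, then wrap in '*'."""
    bits = [bit for bit in text.split(' ') if bit]
    return u'*{}*'.format(u' '.join(bits))


print(build_wildcard_query('ear'))
print(build_wildcard_query('  django   haystack '))
```

For multi-word input the wildcards end up only at the outer edges of the whole string, so the individual words are not each wrapped in their own pair of asterisks.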
Licensed under: CC-BY-SA with attribution