Question

Haystack generates elasticsearch queries to get results from elasticsearch. The queries get prepended with a filter containing the following query:

"query": {
    "query_string": {
        "query": "django_ct:(customers.customer)"
     }
}

What is the meaning of the django_ct:(..) query? Is this a function that Haystack installs in Elasticsearch? Is it some caching magic? Can I get rid of this part altogether?

The reason why I'm asking is that I have to build a custom query to use an elasticsearch multi_field. In order to change the queries I want to understand first how haystack generates its own queries.


Solution

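Haystack stores a django_ct field on every document it indexes, holding the Django content type of the model as app_label.model_name (for example customers.customer). It uses that field to determine which model a hit belongs to and which models to search against. The django_ct:(...) clause is ordinary Lucene query_string syntax for "field django_ct has this value"; it is not a function installed in Elasticsearch and has nothing to do with caching. This is not really best practice, but it's how it's done in Haystack.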

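Basically, the code in Haystack looks something like this:

from django.contrib.contenttypes.models import ContentType

# django_ct is the value stored on the hit, e.g. "customers.customer"
app_name, model_name = django_ct.split('.')
ct = ContentType.objects.get_by_natural_key(app_name, model_name)
model = ct.model_class()
# do stuff with model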

So if you index more than one model in the same index, you really don't want to drop it when using Haystack: without that clause, a query would match documents from every indexed model.
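
To make the role of that clause concrete, here is a minimal sketch of how such a model filter could be assembled. The function name build_model_filter is hypothetical and this is not Haystack's actual code; it only reproduces the shape of the query shown in the question.

```python
def build_model_filter(model_labels):
    """Build a query_string clause limiting hits to the given models.

    model_labels are "app_label.model_name" strings, the same values
    stored in the django_ct field of each indexed document.
    """
    # Sort for a stable query, then OR the labels inside one field group.
    joined = " OR ".join(sorted(model_labels))
    return {"query_string": {"query": "django_ct:(%s)" % joined}}


if __name__ == "__main__":
    print(build_model_filter(["customers.customer"]))
```

With a single label this yields the clause from the question; with several labels the hits are limited to any of the listed models.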

I have a couple of other answers based on Elasticsearch here: "index analyzer vs query analyzer in haystack - elasticsearch?" and here: "Django Haystack Distinct Value for Field".

EDIT regarding multi-fields:

I've used Haystack and multi-fields in the past, so I'm not sure you need to write your own backend. The key is understanding how Haystack creates searches. As I said in one of the other posts, everything goes into query_string, and from there it creates a Lucene-based search string. Again, not really best practice.

So let's say you have a multi-field that looks like this:

            "some_field": {
                "type": "multi_field",
                "fields": {
                    "some_field_edgengram": {
                        "type": "string",
                        "index": "analyzed",
                        "index_analyzer": "autocomplete_index",
                        "search_analyzer": "autocomplete_search"
                    },
                    "some_field": {
                        "type": "string",
                        "index": "not_analyzed"
                    }
                }
            },

In Haystack, you can just search against some_field and some_field_edgengram directly.

For example, SearchQuerySet().filter(some_field="cat") and SearchQuerySet().filter(some_field_edgengram="cat") will both work, but the first will only match values that are exactly cat, while the second will match cat, cats, catlin, catch, etc., at least using my edge n-gram analyzers.
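
The difference between the two sub-fields can be illustrated with a small pure-Python simulation. It assumes the autocomplete_index analyzer emits lowercase edge n-grams (prefixes) of each whitespace-separated token and that autocomplete_search only lowercases the search term; the analyzer definitions are not shown above, so both are assumptions, and the helper names are hypothetical.

```python
def edge_ngrams(token, min_gram=1, max_gram=15):
    """Return the lowercase prefixes of token, from min_gram to max_gram chars."""
    token = token.lower()
    return [token[:n] for n in range(min_gram, min(len(token), max_gram) + 1)]


def matches_edgengram(indexed_value, search_term):
    """Simulate the analyzed sub-field: index prefixes, search the plain term."""
    indexed = {g for tok in indexed_value.split() for g in edge_ngrams(tok)}
    return search_term.lower() in indexed


def matches_not_analyzed(indexed_value, search_term):
    """Simulate the not_analyzed sub-field: the whole value must be equal."""
    return indexed_value == search_term


if __name__ == "__main__":
    for value in ["cat", "cats", "catch", "concat"]:
        print(value, matches_not_analyzed(value, "cat"), matches_edgengram(value, "cat"))
```

Under these assumptions only the exact value matches on the not_analyzed field, while every value that has a token starting with the search term matches on the edge n-gram field.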

However, just because you use Haystack for indexing and search doesn't mean you have to use it for 100% of your search solutions. In the past, I've used pyes in some areas of the app and Haystack in others, because Haystack lacked support for more advanced features and the query_string parsing was losing some of the finer-grained accuracy we were looking for.

In your case, you could get results from the search engine via elasticutils or python-elasticsearch directly for some of the more advanced searches, and use Haystack for the other, more routine searches.
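
As a rough sketch of that mixed approach, a hand-built query body could target the edge n-gram sub-field while still restricting hits to one model through django_ct. The function name, the full field path some_field.some_field_edgengram, and the index name "haystack" are assumptions for illustration; the exact query DSL also depends on your Elasticsearch version.

```python
def build_autocomplete_query(text, model_label="customers.customer"):
    """Build a raw query body that bypasses Haystack's query_string parsing."""
    return {
        "query": {
            "bool": {
                "must": [
                    # Search the analyzed sub-field of the multi_field.
                    {"match": {"some_field.some_field_edgengram": text}},
                    # Keep the restriction Haystack would add via django_ct.
                    {"term": {"django_ct": model_label}},
                ]
            }
        }
    }


if __name__ == "__main__":
    body = build_autocomplete_query("cat")
    print(body)
    # With python-elasticsearch installed and a node running, the body
    # could be sent along these lines (index name is an assumption):
    #   from elasticsearch import Elasticsearch
    #   hits = Elasticsearch().search(index="haystack", body=body)
```

Keeping the django_ct clause in the hand-written query means the results stay consistent with what Haystack returns for the same model.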

Licensed under: CC-BY-SA with attribution