Question

I am using django-haystack with Elasticsearch. Each of my indexed documents has a number of people's names and, for each person, the role they have on that document. For example:

Doc1:
   name='Bob',  role='Author'
   name='Jill', role='Editor'
   name='Joe',  role='Publisher'

Doc2:
   name='Jill',  role='Author'
   name='Phill', role='Editor'
   name='Janet', role='Contributor'

How would I set up my index to allow me to do the search: "find all documents where Jill is an Author"? In the above example, I would want it to return only Doc2 and not Doc1.

There are hundreds of different types of roles a person can have, so it isn't realistic to have an index field for each type. I thought about having a single index field joining the two together (e.g., name_role=indexes.CharField(...)), where each entry has a delimiter that I would parse (e.g., "Jill#Author"). But that seems ugly.

Are there any better ways to do this? I feel like ElasticSearch's nested type may be able to help, but I'm not sure.

Even though I'm using django-haystack, if there is an Elasticsearch-specific answer, I'd be happy to hear it.


Solution

Indeed, Elasticsearch's nested type is essential to get this working neatly. django-haystack does not support it by default (because it's ES-specific), but you can add the functionality by extending some of haystack's classes.

There is a blog post that explains this quite clearly (and a gist that can be forked).

Coincidentally, I wrote the post and @speedplane already found it, but hey.. ;-)
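
To make the difference concrete, here is a minimal sketch in plain Python, not the code from the blog post or gist. It first imitates how a regular (non-nested) array of objects is matched, where names and roles are effectively pooled per document, and then how a nested field matches each person as its own unit. It also builds the mapping and query bodies as plain dicts. The field names `people`, `name` and `role` and the helper names are assumptions made for illustration, and the `keyword` type assumes a reasonably recent Elasticsearch version.

```python
import json

# The two documents from the question, as lists of person objects.
DOCS = {
    "Doc1": [
        {"name": "Bob", "role": "Author"},
        {"name": "Jill", "role": "Editor"},
        {"name": "Joe", "role": "Publisher"},
    ],
    "Doc2": [
        {"name": "Jill", "role": "Author"},
        {"name": "Phill", "role": "Editor"},
        {"name": "Janet", "role": "Contributor"},
    ],
}


def matches_flattened(people, name, role):
    """Imitate a non-nested object array: names and roles are pooled
    per document, so the name/role pairing is lost."""
    names = {p["name"] for p in people}
    roles = {p["role"] for p in people}
    return name in names and role in roles


def matches_nested(people, name, role):
    """Imitate a nested field: both conditions must hold on the same person."""
    return any(p["name"] == name and p["role"] == role for p in people)


# Mapping body with a nested "people" field (field names are hypothetical).
PEOPLE_MAPPING = {
    "properties": {
        "people": {
            "type": "nested",
            "properties": {
                "name": {"type": "keyword"},
                "role": {"type": "keyword"},
            },
        }
    }
}


def person_role_query(name, role):
    """Build a nested query body: match documents that contain one person
    object with both the given name and the given role."""
    return {
        "query": {
            "nested": {
                "path": "people",
                "query": {
                    "bool": {
                        "must": [
                            {"term": {"people.name": name}},
                            {"term": {"people.role": role}},
                        ]
                    }
                },
            }
        }
    }


if __name__ == "__main__":
    flat = [d for d, p in DOCS.items() if matches_flattened(p, "Jill", "Author")]
    nested = [d for d, p in DOCS.items() if matches_nested(p, "Jill", "Author")]
    print(flat)    # ['Doc1', 'Doc2'] (Doc1 is a false positive)
    print(nested)  # ['Doc2']
    print(json.dumps(person_role_query("Jill", "Author"), indent=2))
```

The dicts would be sent to Elasticsearch as the mapping and search bodies. Wiring the nested field into haystack still requires extending its backend classes, as described above.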

Licensed under: CC-BY-SA with attribution