Question

I am using django-haystack with Elasticsearch. Each of my indexed documents has a number of people's names and, for each person, their role associated with the document. For example:

Doc1:
   name='Bob',  role='Author'
   name='Jill', role='Editor'
   name='Joe',  role='Publisher'

Doc2:
   name='Jill',  role='Author'
   name='Phill', role='Editor'
   name='Janet', role='Contributor'

How would I set up my index to allow me to do the search: "find all documents where Jill is an Author"? In the above example, I would want it to return only Doc2 and not Doc1.

There are hundreds of different types of roles a person can have, so it isn't realistic to have an index field for each type. I thought about having a single index field joining the two together (e.g., name_role=indexes.CharField(...)), where each entry has a delimiter that I would parse (e.g., "Jill#Author"). But that seems ugly.

Are there any better ways to do this? I feel like ElasticSearch's nested type may be able to help, but I'm not sure.

Even though I'm using django-haystack, if there is an Elasticsearch-specific answer, I'd be happy to hear it.

Answer

Indeed, Elasticsearch's nested type is what you need to get this working neatly. django-haystack does not support it by default (because it's ES-specific), but you can add support by extending some of haystack's classes.

There is a blog post that explains this quite clearly (and a gist that can be forked).

Coincidentally, I wrote the post and @speedplane already found it, but hey.. ;-)
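
To make the idea concrete without the haystack plumbing, here is a minimal sketch of what a nested mapping and a nested query could look like as plain Python dicts. The field name `people`, the `keyword` field type (Elasticsearch 5 and later) and the helper names are assumptions for illustration; they are not taken from the blog post or the gist. The two small matching functions only simulate, in pure Python, why a plain array of objects would also return Doc1 while a nested field would not.

```python
# Hypothetical mapping: "people" is an assumed field name, not from the post.
PEOPLE_MAPPING = {
    "properties": {
        "people": {
            "type": "nested",  # keeps each name/role pair together
            "properties": {
                "name": {"type": "keyword"},
                "role": {"type": "keyword"},
            },
        }
    }
}


def build_person_role_query(name, role):
    """Build a nested query body: name and role must match in the SAME person."""
    return {
        "query": {
            "nested": {
                "path": "people",
                "query": {
                    "bool": {
                        "must": [
                            {"term": {"people.name": name}},
                            {"term": {"people.role": role}},
                        ]
                    }
                },
            }
        }
    }


# The two documents from the question.
DOCS = {
    "Doc1": [("Bob", "Author"), ("Jill", "Editor"), ("Joe", "Publisher")],
    "Doc2": [("Jill", "Author"), ("Phill", "Editor"), ("Janet", "Contributor")],
}


def flattened_match(people, name, role):
    """Simulates a non-nested object array: names and roles are pooled per document."""
    names = {n for n, _ in people}
    roles = {r for _, r in people}
    return name in names and role in roles


def nested_match(people, name, role):
    """Simulates a nested field: both conditions must hold for one person."""
    return any(n == name and r == role for n, r in people)


flattened_hits = sorted(d for d, p in DOCS.items() if flattened_match(p, "Jill", "Author"))
nested_hits = sorted(d for d, p in DOCS.items() if nested_match(p, "Jill", "Author"))
```

With the documents from the question, the pooled (non-nested) match finds both Doc1 and Doc2, while the per-person match finds only Doc2, which is the behaviour the nested type gives you. The dict returned by `build_person_role_query("Jill", "Author")` is the kind of body you would send to Elasticsearch as a search request once the index has been created with a nested mapping.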

Licensed under: CC-BY-SA with attribution