Question

I am using ElasticSearch to power search in my application.

One of the things we need to do is to have a way of storing arbitrary relationships between different ElasticSearch types.

Suppose we have three types in the ElasticSearch index: "Customers", "Lifestyle" and "MovieGenres".

Let's say the user wants to associate multiple lifestyles with a customer. The user specifies the mapping in the application, which takes the docId of the "customer" and the docId of the "lifestyle" and stores the pair in a "Mapper" type in ElasticSearch, together with the association type as text ("customerToLifeStyle").

Similarly, an association between a customer and a movie genre inserts the pair of docIds for the customer and the movie genre, along with an association type text such as "customerToMovieGenre". The idea is to consolidate any kind of arbitrary relationship between types in this one "Mapper" type.
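As a rough illustration, the documents in this "Mapper" type might look like the sketch below. The field names (`source_id`, `target_id`, `association_type`) and the example ids are assumptions made up for this example, and nothing here talks to a real ElasticSearch cluster; it only builds the dicts that would be indexed.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class Mapping:
    """One document of the "Mapper" type (field names are hypothetical)."""
    source_id: str         # docId of the customer
    target_id: str         # docId of the lifestyle or movie genre
    association_type: str  # e.g. "customerToLifeStyle"


def to_document(mapping: Mapping) -> dict:
    """Return the plain dict that would be indexed as the document body."""
    return asdict(mapping)


# Made-up ids: one customer linked to two lifestyles and one movie genre.
mappings = [
    Mapping("cust-1", "life-7", "customerToLifeStyle"),
    Mapping("cust-1", "life-9", "customerToLifeStyle"),
    Mapping("cust-1", "genre-3", "customerToMovieGenre"),
]
documents = [to_document(m) for m in mappings]
```

Every relationship, whatever its kind, ends up with the same three fields; only the `association_type` text tells them apart.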

This may be a somewhat simplistic way of solving the problem, but does anybody see anything wrong with this approach?


Solution

Purely from a design perspective, I would also opt for a "Mapper" type that links two documents by storing their ids, rather than, for example, listing the ids of "MovieGenres" inside the "Customers" documents. However, I would introduce a separate "Mapper" type for each relationship (that is, a "CustomerToLifeStyle" type and a "CustomerToMovieGenre" type). The reason is that you might want to store additional data in the mapper documents that is specific to the relationship. Say you want to store how highly the customer rated the movies in a certain genre. That information makes no sense in a customer-to-lifestyle relationship. The more additional data you store in the mapper documents, the more the "CustomerToLifeStyle" data and the "CustomerToMovieGenre" data will diverge. Having a separate type for each relationship keeps your design clean.
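A minimal sketch of that idea, assuming hypothetical field names and a made-up `rating` field for the genre relationship (the answer only says that such a rating could be stored, not how):

```python
from dataclasses import dataclass, asdict, fields


@dataclass(frozen=True)
class CustomerToLifeStyle:
    """Link document for the customer-to-lifestyle relationship."""
    customer_id: str
    lifestyle_id: str


@dataclass(frozen=True)
class CustomerToMovieGenre:
    """Link document for the customer-to-movie-genre relationship."""
    customer_id: str
    movie_genre_id: str
    rating: float  # hypothetical field that only makes sense for genres


def field_names(cls) -> set:
    """Names of the fields a link type would carry in its documents."""
    return {f.name for f in fields(cls)}


lifestyle_link = asdict(CustomerToLifeStyle("cust-1", "life-7"))
genre_link = asdict(CustomerToMovieGenre("cust-1", "genre-3", rating=4.5))
```

With one type per relationship, the rating lives only where it applies, and the lifestyle links do not carry an empty or meaningless field.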

OTHER TIPS

If you want to model this kind of relationship in elasticsearch, you should use the parent/child relationship, which is one of the core concepts in elasticsearch:

http://www.elasticsearch.org/blog/managing-relations-inside-elasticsearch/

The advantage of this approach is that you can use standard ways to traverse relationships and build queries. With your own approach, you have to do the joins and lookups manually in your code.
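To make the "manual joins" point concrete, here is a sketch of the two-step lookup that application code would need with a custom mapper type. The in-memory list and dict stand in for search and get-by-id calls against the index; all names, ids and documents are invented for the example.

```python
from typing import Dict, List

# Stand-in for the documents of the custom "Mapper" type.
MAPPER_DOCS: List[dict] = [
    {"source_id": "cust-1", "target_id": "life-7",
     "association_type": "customerToLifeStyle"},
    {"source_id": "cust-1", "target_id": "life-9",
     "association_type": "customerToLifeStyle"},
    {"source_id": "cust-1", "target_id": "genre-3",
     "association_type": "customerToMovieGenre"},
]

# Stand-in for fetching the related documents by id.
DOCS_BY_ID: Dict[str, dict] = {
    "life-7": {"name": "Outdoors"},
    "life-9": {"name": "Vegan"},
    "genre-3": {"name": "Sci-Fi"},
}


def find_target_ids(customer_id: str, association_type: str) -> List[str]:
    """Step 1: look up the mapper documents for this customer and relation."""
    return [
        doc["target_id"]
        for doc in MAPPER_DOCS
        if doc["source_id"] == customer_id
        and doc["association_type"] == association_type
    ]


def related_documents(customer_id: str, association_type: str) -> List[dict]:
    """Step 2: fetch each related document by id (the manual join)."""
    ids = find_target_ids(customer_id, association_type)
    return [DOCS_BY_ID[i] for i in ids if i in DOCS_BY_ID]


lifestyles = related_documents("cust-1", "customerToLifeStyle")
```

Each relationship traversal costs two round trips in this scheme, and the join logic has to be maintained in the application rather than expressed in a query.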

Licensed under: CC-BY-SA with attribution