Question

I am trying to build a prototype of Elasticsearch as a Service. I have thought of two different approaches, and I'd like opinions on which implementation to prefer.

  1. One single installation of Elasticsearch, with a proxy layer on top to add user validation (HTTP basic authentication + a user account to validate the usage). This approach would be relatively straightforward. The main challenge would be to configure the cluster properly to handle the load, and to set up the permissions so that there are no data leaks between users and users don't have access to the cluster management APIs.

  2. Use Docker as a container and have one instance of Elasticsearch for each user. In this case the isolation would come from the Linux container (Docker). I'd still need to manage authentication.
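
As a rough illustration of approach 1, the authorization decision in the proxy layer could look something like the sketch below. It assumes each user owns the indices that start with a per-user prefix, which is a convention invented here for the example; `USERS`, `parse_basic_auth` and `is_allowed` are hypothetical names, and a real service would store hashed passwords rather than plain text.

```python
import base64
import hmac

# Hypothetical user store: username -> (password, index prefix owned by that user).
# Plain-text passwords are used only to keep the sketch short.
USERS = {"alice": ("s3cret", "alice-"), "bob": ("hunter2", "bob-")}


def parse_basic_auth(header):
    """Return (username, password) from an HTTP Basic Authorization header, or None."""
    if not header or not header.startswith("Basic "):
        return None
    try:
        decoded = base64.b64decode(header[len("Basic "):]).decode("utf-8")
    except ValueError:  # covers malformed base64 and invalid UTF-8
        return None
    username, sep, password = decoded.partition(":")
    return (username, password) if sep else None


def is_allowed(auth_header, path):
    """Decide whether the proxy should forward this request to Elasticsearch."""
    creds = parse_basic_auth(auth_header)
    if creds is None or creds[0] not in USERS:
        return False
    password, prefix = USERS[creds[0]]
    if not hmac.compare_digest(creds[1].encode(), password.encode()):
        return False
    first_segment = path.lstrip("/").split("/", 1)[0]
    # A first path segment starting with "_" (/_cluster, /_nodes, /_cat, ...)
    # is a cluster-level API; "," or "*" could address several tenants' indices.
    if first_segment.startswith("_") or "," in first_segment or "*" in first_segment:
        return False
    return first_segment.startswith(prefix)
```

With this rule, a request from `alice` to `/alice-logs/_search` would be forwarded, while `/bob-logs/_search` or `/_cluster/health` would be rejected before reaching the cluster.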

It would probably be good to implement both, play around, and see how things behave. Any opinions about the pros and cons of each approach?

Thanks!


Solution

Disclaimer: I am the founder of the Elasticsearch service provider Facetflow, which currently offers shared clusters.

I think that both approaches have merit, but they may suit different types of customers. Other SaaS providers, like the MongoDB provider MongoLab, essentially ended up offering both setups (although not using Docker).

So, pros and cons as I see them:

Shared Cluster

Most Elasticsearch as a Service providers operate this way.

Pros:

  1. Far more affordable for the majority of users just looking for good search and analytics.
  2. Simpler maintenance, with fewer clusters for you to monitor.
  3. Potentially fewer versions of Elasticsearch to integrate with. If you need to communicate with other systems (which you do) or write your own plugins (we did, for authentication, silos, entitlements, stats, etc.), fewer versions will be far easier to maintain.

Cons:

  1. Noisy neighbours have to be monitored and you have to scale and relocate indices to handle this.
  2. Users have to choose from a limited list of versions of Elasticsearch, usually a single version.
  3. Users don't get full cluster admin control.
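
To give an idea of what monitoring for noisy neighbours might involve, here is a minimal sketch that flags tenants taking up a disproportionate share of the cluster's load. The input is a hypothetical mapping from tenant name to some load figure (for example requests per minute) that you would have to collect yourself; `find_noisy_tenants` and the 50% threshold are assumptions for the example, not something Elasticsearch provides.

```python
def find_noisy_tenants(load_by_tenant, share_threshold=0.5):
    """Return names of tenants whose share of total load exceeds the threshold,
    heaviest first. load_by_tenant maps tenant name -> non-negative load figure."""
    total = sum(load_by_tenant.values())
    if total == 0:
        return []
    shares = {tenant: load / total for tenant, load in load_by_tenant.items()}
    noisy = [tenant for tenant, share in shares.items() if share > share_threshold]
    return sorted(noisy, key=lambda tenant: shares[tenant], reverse=True)
```

Tenants returned by a check like this would be the candidates for having their indices relocated or for being moved to a larger plan.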

Private Clusters using Docker

One provider that works this way is Found.

Pros:

  1. Users could potentially deploy a variety of versions of Elasticsearch.
  2. Users can have complete cluster admin access
  3. Noisy neighbours don't affect their cluster, so less manual intervention is needed from you.

Cons:

  1. Complex monitoring and support. If people can do whatever they want (such as shutting down the cluster over the API), you have to be clear about where your responsibility as a provider ends, and what wakes you up at night.
  2. Complex integration with multiple versions, see shared cluster pros.
  3. More expensive, since you have to allocate resources that might not always be used.
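
To make the per-user setup and its resource cost more concrete, here is a minimal sketch of how a `docker run` command for one user's container could be assembled. The container naming scheme, the memory limit and the image name are assumptions for illustration; `docker_run_command` is a hypothetical helper, and the only Elasticsearch-specific detail is its default HTTP port, 9200.

```python
def docker_run_command(user_id, host_port, image, memory="2g"):
    """Build the argument list for starting one user's Elasticsearch container.

    image is whatever Elasticsearch image (and version tag) you decide to offer;
    memory is reserved for the container whether or not the user needs it.
    """
    return [
        "docker", "run", "--detach",
        "--name", f"es-{user_id}",                   # hypothetical naming scheme
        "--memory", memory,                          # hard memory limit per user
        "--publish", f"127.0.0.1:{host_port}:9200",  # reachable only via the local proxy
        image,
    ]


# Example with placeholder values; the list can be passed to subprocess.run().
command = docker_run_command("alice", 9201, image="example/elasticsearch")
```

Binding the published port to 127.0.0.1 is one way to keep the authentication layer in front of every container, since each instance would otherwise be reachable directly.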
Licensed under: CC-BY-SA with attribution