Question

I am confused about the near-real-time search capabilities of Solr and Elasticsearch. Near-real-time search is often cited as one of the advantages Elasticsearch has over Solr. However, I have read Solr documentation saying that near-real-time search can also be done in Solr by using soft commits, at the cost of opening a new searcher. That way, a new document becomes visible within 1 second. In Elasticsearch, a refresh also makes a new document searchable within one second. Did I miss or misunderstand anything? Which one does near-real-time search better? Any answer would be appreciated. Thank you.

Solution

At the end of the day, they both use Lucene under the hood. Near-real-time search in Lucene means reopening the index reader; Elasticsearch calls this a refresh and exposes it through the refresh API.
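
To make the idea of "reopening the reader" concrete, here is a minimal toy sketch in Python. It is not Lucene's or Elasticsearch's real API: `ToyIndex`, its methods, and the sample document are hypothetical names made up for illustration. It only models the point that indexed documents stay invisible to search until a refresh happens.

```python
class ToyIndex:
    """Toy model of near-real-time search (hypothetical, not Lucene's API)."""

    def __init__(self):
        self._buffer = []      # documents indexed but not yet searchable
        self._searchable = []  # snapshot seen by the current "reader"

    def index(self, doc):
        # New documents only land in the write buffer.
        self._buffer.append(doc)

    def refresh(self):
        # "Reopen the reader": buffered documents become visible to search.
        self._searchable = self._searchable + self._buffer
        self._buffer = []

    def search(self, term):
        # Search only ever sees the last refreshed snapshot.
        return [d for d in self._searchable if term in d["text"]]


idx = ToyIndex()
idx.index({"id": 1, "text": "near real-time search"})
before = idx.search("search")  # document not visible yet
idx.refresh()
after = idx.search("search")   # document visible after the refresh
```

In this model, a periodic call to `refresh()` (for example once per second) plays the role that the automatic refresh plays in Elasticsearch or a soft commit plays in Solr.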

On the other hand, you also need to commit the Lucene index to get durability. A commit is expensive and cannot be done every second, which is why Elasticsearch has a transaction log. The transaction log is what makes Elasticsearch "kill -9 safe", and it also allows for real-time get.
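
The role of the transaction log can be sketched with another toy model. Again, this is an illustration under stated assumptions, not Elasticsearch's implementation: `ToyShard` and its methods are hypothetical, the "durable" parts are plain Python objects, and a crash is simulated by discarding the in-memory state.

```python
class ToyShard:
    """Toy model of commit + transaction log (hypothetical names throughout)."""

    def __init__(self):
        self.committed = {}  # stands in for committed segments (survive a crash)
        self.translog = []   # stands in for the on-disk log of uncommitted ops
        self.memory = {}     # uncommitted documents held in memory (lost on crash)

    def index(self, doc_id, doc):
        # Record the operation in the log before keeping it in memory.
        self.translog.append((doc_id, doc))
        self.memory[doc_id] = doc

    def get(self, doc_id):
        # "Real-time get": return the latest version by id, even if uncommitted.
        if doc_id in self.memory:
            return self.memory[doc_id]
        return self.committed.get(doc_id)

    def flush(self):
        # Expensive commit: persist everything, then start a fresh log.
        self.committed.update(self.memory)
        self.memory = {}
        self.translog = []

    def crash_and_recover(self):
        # Simulated "kill -9": memory is lost, then the log is replayed.
        self.memory = {}
        for doc_id, doc in self.translog:
            self.memory[doc_id] = doc


shard = ToyShard()
shard.index("1", {"title": "committed doc"})
shard.flush()
shard.index("2", {"title": "uncommitted doc"})
shard.crash_and_recover()
recovered = shard.get("2")  # still available thanks to the replayed log
```

The design point the sketch tries to capture is that the cheap append to the log happens on every write, while the expensive commit (`flush()` here) can happen rarely without risking data loss.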

But the best part to me is that in Elasticsearch the user doesn't have to worry much about refreshes and commits, as by default everything happens automatically under the hood. At the same time, there are APIs (refresh and flush) as well as settings that let power users change the default behaviour.
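
As a rough sketch of what using those APIs and settings can look like, the Python snippet below only builds the HTTP requests for the `_refresh` and `_flush` endpoints and for updating the `index.refresh_interval` setting; it does not send them. The base URL `http://localhost:9200`, the index name `my-index`, and the helper function names are assumptions chosen for illustration.

```python
import json
from urllib.request import Request

BASE_URL = "http://localhost:9200"  # assumed local node address
INDEX = "my-index"                  # hypothetical index name


def refresh_request():
    # Explicit refresh: make recently indexed documents searchable now.
    return Request(f"{BASE_URL}/{INDEX}/_refresh", method="POST")


def flush_request():
    # Explicit flush: trigger a Lucene commit for the index.
    return Request(f"{BASE_URL}/{INDEX}/_flush", method="POST")


def refresh_interval_request(interval):
    # Change how often the automatic refresh runs, e.g. "30s" (or "-1" to disable).
    body = json.dumps({"index": {"refresh_interval": interval}}).encode("utf-8")
    return Request(
        f"{BASE_URL}/{INDEX}/_settings",
        data=body,
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


req = refresh_interval_request("30s")
# Sending would be: urllib.request.urlopen(req), against a running cluster.
```

A longer refresh interval is a common choice during heavy bulk indexing, since each refresh has a cost; the interval can be set back afterwards.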

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow