I'm building a service that will allow users to search for other users who are nearby, based on GPS coordinates. I've tried using ElasticSearch's geo spatial indexes. When a user signs in, he submits his GPS location to an ElasticSearch geo index. Other users periodically poll ElasticSearch, querying for new documents that contain GPS coordinates within a few hundred meters.

The problem is that ElasticSearch either doesn't update its index fast enough, or it caches its results, making it unsuitable for retrieving real-time results. I've tried disabling the cache with index.cache.filter.max_size=-1 and passing "_cache=false" with every query. ElasticSearch still returns stale results when polling with the same query, and it can return stale results for up to a few minutes.

Any idea on what could be happening? Maybe it's because I'm keeping the same connection open during polling, and ElasticSearch caches results for each connection? Still, the results can be out of date with subsequent requests.

有帮助吗?

解决方案

Elasticsearch results don't become immediately available for search. They are accumulated in a buffer and become available only after operation called refresh. In other words, search is not real time, but "near real time" operation ("near" is because refresh is called every second by default). Please also note that get operation is real-time - you can get document immediately after it is indexed.

While you can force refresh process after each document or make it more often, it's not the best solution for your problem because very frequent refreshing can significantly reduce search and indexing performance. Instead, I would advise you to check Elasticsearch percolators, which were added exactly for the use cases such as yours.

许可以下: CC-BY-SA归因
不隶属于 StackOverflow
scroll top