Question

I'm running into a problem when performing several get_or_create requests against Elasticsearch. After it responds to the POST, Elasticsearch seems to need some time before the document is actually indexed, so a GET issued immediately afterwards returns no results.

This example reproduces the issue:

curl -XPOST 'http://localhost:9200/twitter/tweet/' -d '{
    "user" : "kimchy",
    "post_date" : "2009-11-15T14:12:12",
    "message" : "trying out Elastic Search"
}' && \
curl -XGET 'http://localhost:9200/twitter/tweet/_search' -d '{
    "query" : {
        "term" : { "user" : "kimchy" }
    }
}' && \
sleep 1 && \
curl -XGET 'http://localhost:9200/twitter/tweet/_search' -d '{
    "query" : {
        "term" : { "user" : "kimchy" }
    }
}'

The POST goes well:

{
    "ok": true,
    "_index": "twitter",
    "_type": "tweet",
    "_id": "yaLwtgSuQcWg5lzgFpuqHQ",
    "_version": 1
}

The first GET does not match any result:

{
    "took": 1,
    "timed_out": false,
    "_shards": {
        "total": 5,
        "successful": 5,
        "failed": 0
    },
    "hits": {
        "total": 0,
        "max_score": null,
        "hits": []
    }
}

After a brief pause, the second GET returns the result:

{
    "took": 1,
    "timed_out": false,
    "_shards": {
        "total": 5,
        "successful": 5,
        "failed": 0
    },
    "hits": {
        "total": 1,
        "max_score": 0.30685282,
        "hits": [{
            "_index": "twitter",
            "_type": "tweet",
            "_id": "yaLwtgSuQcWg5lzgFpuqHQ",
            "_score": 0.30685282,
            "_source": {
                "user": "kimchy",
                "post_date": "2009-11-15T14:12:12",
                "message": "trying out Elastic Search"
            }
        }]
    }
}

Is that behaviour normal?

Is there a way to get the result immediately, even if the response is slower?

Thanks!

Solution

Yes, this is normal. By default, Elasticsearch refreshes its indexes once per second, so a newly indexed document only shows up in search results after the next refresh.

If you need the index to refresh immediately, include refresh=true in the URL when inserting documents.

From the documentation:

refresh

To refresh the index immediately after the operation occurs, so that the document appears in search results immediately, the refresh parameter can be set to true. Setting this option to true should ONLY be done after careful thought and verification that it does not lead to poor performance, both from an indexing and a search standpoint. Note, getting a document using the get API is completely realtime.
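
As a rough sketch of what that looks like with the example from the question: the helper below posts a document with refresh=true appended to the URL. The function name index_and_refresh and the ES_URL variable are made up for illustration, and the Content-Type header is only there because newer Elasticsearch releases insist on it.

```shell
# Assumption: Elasticsearch listens on localhost:9200, as in the question.
ES_URL="${ES_URL:-http://localhost:9200}"

# Hypothetical helper: index a tweet and ask Elasticsearch to refresh the
# index before responding, so the next search can already see the document.
index_and_refresh() {   # usage: index_and_refresh JSON_BODY
    curl -s -XPOST "$ES_URL/twitter/tweet?refresh=true" \
         -H 'Content-Type: application/json' \
         -d "$1"
}
```

Calling index_and_refresh '{"user" : "kimchy"}' and then running the search from the question should return the hit without the sleep, at the cost of a slower indexing call.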

OTHER TIPS

If you need realtime access to documents you have just indexed, use the get API (http://www.elasticsearch.org/guide/reference/api/get/) rather than search. Search is not realtime; the get API is. So if you assign the document an ID yourself, you can retrieve it by that ID immediately with the get API.
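
A minimal sketch of that approach, reusing the twitter/tweet index and type from the question. The helper names put_tweet and get_tweet, the ES_URL variable, and the ID 1 in the usage note are illustrative assumptions, not anything Elasticsearch defines.

```shell
# Assumption: Elasticsearch listens on localhost:9200, as in the question.
ES_URL="${ES_URL:-http://localhost:9200}"

# Hypothetical helper: index a document under an ID chosen by the caller.
put_tweet() {   # usage: put_tweet ID JSON_BODY
    curl -s -XPUT "$ES_URL/twitter/tweet/$1" \
         -H 'Content-Type: application/json' \
         -d "$2"
}

# Hypothetical helper: fetch a document by ID through the get API,
# which does not depend on the periodic index refresh.
get_tweet() {   # usage: get_tweet ID
    curl -s -XGET "$ES_URL/twitter/tweet/$1"
}
```

For example, put_tweet 1 '{"user" : "kimchy"}' followed directly by get_tweet 1 should return the document, even though a search at that moment might still miss it.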

Another optimization is to turn off the index refresh_interval during a heavy import (such as a bulk load) and restore it when the import is done. After that, wait a few seconds and the documents should be searchable. You can also tune the refresh interval with the same setting, for example to refresh only every 15s if you do not need fresher results.
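
One way this could look, assuming the twitter index from the question: the interval is changed through the index settings endpoint, where "-1" disables periodic refresh and "1s" is the default. The helper names set_refresh_interval and refresh_index and the ES_URL variable are made up for this sketch.

```shell
# Assumption: Elasticsearch listens on localhost:9200, as in the question.
ES_URL="${ES_URL:-http://localhost:9200}"

# Hypothetical helper: change an index's refresh_interval via its settings.
set_refresh_interval() {   # usage: set_refresh_interval INDEX INTERVAL
    curl -s -XPUT "$ES_URL/$1/_settings" \
         -H 'Content-Type: application/json' \
         -d "{\"index\": {\"refresh_interval\": \"$2\"}}"
}

# Hypothetical helper: trigger one explicit refresh instead of sleeping.
refresh_index() {   # usage: refresh_index INDEX
    curl -s -XPOST "$ES_URL/$1/_refresh"
}
```

A heavy import would then be wrapped as: set_refresh_interval twitter -1, run the import, set_refresh_interval twitter 1s, and optionally refresh_index twitter. Passing 15s instead of 1s gives the less frequent refresh mentioned above.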

Licensed under: CC-BY-SA with attribution