Question

How can I write an Elasticsearch terms aggregation query that operates on the entire field value rather than on individual tokens? For example, I would like to aggregate by city name, but the following returns new, york, san, and francisco as separate buckets instead of new york and san francisco as expected.

curl -XPOST "http://localhost:9200/cities/_search" -d'
{
   "size": 0, 
   "aggs" : {
     "cities" : {
         "terms" : { 
            "field" : "city",
            "min_doc_count": 10
         }
     }
   }
}'

Solution

You should fix this in your mapping: add a not_analyzed field. You can make it a multi field if you also need the analyzed version.

"album": {
  "city": "string",
  "fields": {
    "raw": {
      "type": "string",
      "index": "not_analyzed"
    }
  }
}

Now create your aggregation on city.raw.
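
For example, reusing the index name and the query from the question (only the aggregation field changes), the request would look roughly like this:

curl -XPOST "http://localhost:9200/cities/_search" -d'
{
   "size": 0,
   "aggs" : {
     "cities" : {
         "terms" : {
            "field" : "city.raw",
            "min_doc_count": 10
         }
     }
   }
}'

Because city.raw is not analyzed, each bucket key is the full original value, such as new york or san francisco.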

Other tips

Update (2018-02-11): with newer versions you can use the .keyword suffix on the field you group by, according to this:

GET /bank/_search
{
  "size": 0,
  "aggs": {
    "group_by_state": {
      "terms": {
        "field": "state.keyword"
      }
    }
  }
}
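
The .keyword sub-field is available because, from Elasticsearch 5 onward, dynamic mapping indexes string values as a text field with a keyword sub-field. If you were to define that mapping explicitly, it would look roughly like the sketch below (Elasticsearch 7+ syntax without mapping types, using the bank index and state field from the example above):

PUT /bank
{
  "mappings": {
    "properties": {
      "state": {
        "type": "text",
        "fields": {
          "keyword": {
            "type": "keyword",
            "ignore_above": 256
          }
        }
      }
    }
  }
}

The text field supports full-text search, while the keyword sub-field stores the untouched value and is what the terms aggregation should target.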

This Elasticsearch doc suggests fixing this in the mapping (as the accepted answer does): either make the field not_analyzed, or add a raw sub-field that is not_analyzed and use that in aggregations.

There is no way around it: aggregations operate on the inverted index, and if the field is analyzed, the inverted index contains only the individual tokens, not the original field values.
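
You can see what actually ends up in the inverted index with the _analyze API. For a value like the one in the question, the standard analyzer splits it into lower-cased tokens:

GET /_analyze
{
  "analyzer": "standard",
  "text": "New York"
}

The response lists the tokens new and york; those tokens, not the original string, are what a terms aggregation on the analyzed field sees, which is why they show up as separate buckets.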

Licensed under: CC-BY-SA with attribution