Вопрос

I have some customer documents that I want to be retrieved using ElasticSearch based on where the customers come from (country field is IN an array of countries).

[
  {
    "name": "A1",
    "address": {
      "street": "1 Downing Street"
      "country": {
        "code": "GB",
        "name": "United Kingdom"
      }
    }
  },
  {
    "name": "A2",
    "address": {
      "street": "25 Gormut Street"
      "country": {
        "code": "FR",
        "name": "France"
      }
    }
  },
  {
    "name": "A3",
    "address": {
      "street": "Bonjour Street"
      "country": {
        "code": "FR",
        "name": "France"
      }
    }
  }
]

Now, I have another an array in my Python code:

["DE", "FR", "IT"]

I'd like to obtain the two documents, A2 and A3.

How would I write this in PyES/Query DSL? Am I supposed to be using an ExistsFilter or a TermQuery for this. ExistsFilter seems to only check whether the field exists or not, but doesn't care about the value.

Это было полезно?

Решение

In NoSQL-type document stores, all you get back is the document, not parts of the document.

Your requirement: "I'd like to obtain the two documents, A2 and A3." implies that you need to index each of those documents separately, not as an array inside another "parent" document.

If you need to match values of the parent document alongside country then you need to denormalize your data and store those values from the parent doc inside each sub-doc as well.

Once you've done the above, then the query is easy. I'm assuming that the country field is mapped as:

country: { type: "string", index: "not_analyzed" }

To find docs with DE, you can do:

curl -XGET 'http://127.0.0.1:9200/_all/_search?pretty=1'  -d '
{
   "query" : {
      "constant_score" : {
         "filter" : {
            "term" : {
               "country" : "DE"
            }
         }
      }
   }
}
'

To find docs with either DE or FR:

curl -XGET 'http://127.0.0.1:9200/_all/_search?pretty=1'  -d '
{
   "query" : {
      "constant_score" : {
         "filter" : {
            "terms" : {
               "country" : [
                  "DE",
                  "FR"
               ]
            }
         }
      }
   }
}
'

To combine the above with some other query terms:

curl -XGET 'http://127.0.0.1:9200/_all/_search?pretty=1'  -d '
{
   "query" : {
      "filtered" : {
         "filter" : {
            "terms" : {
               "country" : [
                  "DE",
                  "FR"
               ]
            }
         },
         "query" : {
            "text" : {
               "address.street" : "bonjour"
            }
         }
      }
   }
}
'

Also see this answer for an explanation of how arrays of objects can be tricky, because of the way they are flattened:

Is it possible to sort nested documents in ElasticSearch?

Лицензировано под: CC-BY-SA с атрибуция
Не связан с StackOverflow
scroll top