Question

I'm new to Elasticsearch and to the NoSQL paradigm. I've been following the ES tutorial, but there is one thing I couldn't get to work.

In the following code (I'm using PyES to interact with ES) I create a single document with a nested field (subjects) that contains another nested field (concepts).

from pyes import *

conn = ES('127.0.0.1:9200')  # Use HTTP

# Delete and Create a new index.
conn.indices.delete_index("documents-index")
conn.create_index("documents-index")

# Create a single document.
document = {
    "docid": 123456789,
    "title": "This is the doc title.",
    "description": "This is the doc description.",
    "datepublished": 2005,
    "author": ["Joe", "John", "Charles"],
    "subjects": [{
                    "subjectname": 'subject1',
                    "subjectid": [210, 311, 1012, 784, 568],
                    "subjectkey": 2,
                    "concepts": [
                                    {"name": "concept1", "score": 75},
                                    {"name": "concept2", "score": 55}
                                  ]
                },
                {
                    "subjectname": 'subject2',
                    "subjectid": [111, 300, 141, 457, 748],
                    "subjectkey": 0,
                    "concepts": [
                                    {"name": "concept3", "score": 88},
                                    {"name": "concept4", "score": 55},
                                    {"name": "concept5", "score": 66}
                                  ]
                }],
    }


# Define the nested elements.
mapping1 = {
            'subjects': {
                'type': 'nested'
            }
        }
mapping2 = {
            'concepts': {
                'type': 'nested'
            }
        }
conn.put_mapping("document", {'properties': mapping1}, ["documents-index"])
conn.put_mapping("subjects", {'properties': mapping2}, ["documents-index"])


# Insert document in 'documents-index' index.
conn.index(document, "documents-index", "document", 1)

# Refresh connection to make queries.
conn.refresh()

I'm able to query the subjects nested field:

query1 = {
    "nested": {
        "path": "subjects",
        "score_mode": "avg",
        "query": {
            "bool": {
                "must": [
                    {
                        "text": {"subjects.subjectname": "subject1"}
                    },
                    {
                        "range": {"subjects.subjectkey": {"gt": 1}}
                    }
                ]
            }
        }
    }
}


results = conn.search(query=query1)
for r in results:
    print r  # as expected, it returns the entire document.

but I can't figure out how to query based on the concepts nested field.

The ES documentation states that:

Multi level nesting is automatically supported, and detected, resulting in an inner nested query to automatically match the relevant nesting level (and not root) if it exists within another nested query.

So I tried to build a query with the following format:

query2 = {
        "nested": {
            "path": "concepts",
            "score_mode": "avg",
            "query": {
                "bool": {
                    "must": [
                        {
                            "text": {"concepts.name": "concept1"}
                        },
                        {
                           "range": {"concepts.score": {"gt": 0}}
                        }
                    ]
                }
            }
        }
}

which returned 0 results.

I can't figure out what is missing, and I haven't found any example of a query based on two levels of nesting.


Solution

OK, after trying a ton of combinations, I finally got it working with the following query:

query3 = {
    "nested": {
        "path": "subjects",
        "score_mode": "avg",
        "query": {
            "bool": {
                "must": [
                    {
                        "text": {"subjects.concepts.name": "concept1"}
                    }
                ]
            }
        }
    }
}

So the nested path attribute (subjects) stays the same no matter how deep the nested attribute is, and in the query definition I used the attribute's full path (subjects.concepts.name).
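The pattern above (a fixed top-level path plus the field's full path) can be wrapped in a small helper. This is only a sketch: nested_query is a hypothetical helper name, not part of PyES, and it keeps the "text" query type used in query3, which may need to be changed depending on the Elasticsearch version in use.

```python
def nested_query(path, field, value, score_mode="avg"):
    """Build a nested query dict (hypothetical helper, not a PyES API).

    `path` is the top-level nested path, `field` is given relative to it;
    the query itself uses the full path "<path>.<field>".
    """
    full_field = "%s.%s" % (path, field)
    return {
        "nested": {
            "path": path,
            "score_mode": score_mode,
            "query": {
                "bool": {
                    "must": [
                        {"text": {full_field: value}}
                    ]
                }
            }
        }
    }


# Rebuilds the same structure as query3 above.
query3 = nested_query("subjects", "concepts.name", "concept1")
```

The resulting dict can be passed to conn.search(query=query3) in the same way as the hand-written query.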

Other tips

Shot in the dark since I haven't tried this personally, but have you tried the fully qualified path to Concepts?

query2 = {
       "nested": {
           "path": "subjects.concepts",
           "score_mode": "avg",
           "query": {
               "bool": {
                   "must": [
                       {
                           "text": {"subjects.concepts.name": "concept1"}
                       },
                       {
                          "range": {"subjects.concepts.score": {"gt": 0}}
                       }
                   ]
               }
           }
       }
    } 

I have a question about JCJS's answer: why shouldn't your mapping look like this?

mapping = {
    "subjects": {
        "type": "nested",
        "properties": {
            "concepts": {
                "type": "nested"
            }
        }
    }
}

I suspect that defining two separate type mappings does not work and that the concepts data ends up flattened. I think the inner nested field should be declared inside the properties of the outer nested field.

Finally, with this mapping the nested query should look like this:
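To show why flattened data matters, here is a plain-Python simulation with no Elasticsearch involved. It is a simplified model under the assumption that a non-nested array of objects is indexed as independent lists of values per field; matches_flattened and matches_nested are hypothetical names used only for this illustration.

```python
# One subject taken from the document in the question.
subject = {
    "concepts": [
        {"name": "concept1", "score": 75},
        {"name": "concept2", "score": 55},
    ]
}


def matches_flattened(subject, name, score):
    # Flattened view: each field becomes its own list of values, so the
    # link between a name and its score is lost.
    names = [c["name"] for c in subject["concepts"]]
    scores = [c["score"] for c in subject["concepts"]]
    return name in names and score in scores


def matches_nested(subject, name, score):
    # Nested view: both conditions must hold on the same concept object.
    return any(
        c["name"] == name and c["score"] == score
        for c in subject["concepts"]
    )


# "concept1" has score 75, not 55, so only the flattened view matches.
flat_hit = matches_flattened(subject, "concept1", 55)
nested_hit = matches_nested(subject, "concept1", 55)
```

In this model flat_hit is True and nested_hit is False, which is the kind of false match that a nested mapping is meant to prevent.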

{
    "query": {
        "nested": {
            "path": "subjects.concepts",
            "query": {
                "term": {
                    "name": {
                        "value": "concept1"
                    }
                }
            }
        }
    }
}

It is vital to use the full path for the path attribute, but the term key can be either the full path or a relative path.
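Another form, which seems to be what the "inner nested query" wording in the documentation quoted in the question refers to, is to wrap one nested query inside another. The following is only a sketch, assuming the nested-in-nested mapping above; it has not been run against a live cluster, and the "match" query type is an assumption about the Elasticsearch version in use (the question uses "text").

```python
import json

# Inner query: targets the second nesting level using its full path.
inner = {
    "nested": {
        "path": "subjects.concepts",
        "query": {
            "bool": {
                "must": [
                    {"match": {"subjects.concepts.name": "concept1"}},
                    {"range": {"subjects.concepts.score": {"gt": 0}}},
                ]
            }
        },
    }
}

# Outer query: targets the first nesting level and wraps the inner one.
body = {
    "query": {
        "nested": {
            "path": "subjects",
            "query": inner,
        }
    }
}

# JSON form of the request body, e.g. for sending with an HTTP client.
request_json = json.dumps(body, indent=4, sort_keys=True)
```

Both conditions here must hold on the same concept object, which is the point of querying at the subjects.concepts level instead of only at subjects.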

Licensed under: CC-BY-SA with attribution