Question

I am trying to use the Elasticsearch routing mapping to speed up some queries, but I am not getting the expected result set (I am not worried about query performance just yet).

I am using Elastic to set up my mapping:

    $index->create(array('number_of_shards' => 4,
            'number_of_replicas' => 1,
            'mappings'=>array("country"=>array("_routing"=>array("path"=>"countrycode"))),
            'analysis' => array(
                    'analyzer' => array(
                            'indexAnalyzer' => array(
                                    'type' => 'keyword',
                                    'tokenizer' => 'nGram',
                                    'filter' => array('shingle')
                            ),
                            'searchAnalyzer' => array(
                                    'type' => 'keyword',
                                    'tokenizer' => 'nGram',
                                    'filter' => array('shingle')
                            )
                    )
            )  ), true);

If I understand correctly, what should happen is that each result should now have a field called "countrycode" with the value of "country" in it.

The results of _mapping look like this:

    {"postcode": {
        "postcode": {
            "properties": {
                "area1": {"type": "string"},
                "area2": {"type": "string"},
                "city": {"type": "string", "include_in_all": true},
                "country": {"type": "string"},
                "country_iso": {"type": "string"},
                "country_name": {"type": "string"},
                "id": {"type": "string"},
                "lat": {"type": "string"},
                "lng": {"type": "string"},
                "location": {"type": "geo_point"},
                "region1": {"type": "string"},
                "region2": {"type": "string"},
                "region3": {"type": "string"},
                "region4": {"type": "string"},
                "state_abr": {"type": "string"},
                "zip": {"type": "string", "include_in_all": true}
            }
        },
        "country": {
            "_routing": {"path": "countrycode"},
            "properties": {}
        }
    }}

Once all the data is in the index if I run this command:

http://localhost:9200/postcode/_search?pretty=true&q=country:au

it responds with 15740 total items.

What I expected is that if I run the query like this:

http://localhost:9200/postcode/_search?routing=au&pretty=true

it would respond with the same 15740 results.

Instead it returns 120617 results, including results where country is not au.

I did note that the number of shards in the results went from 4 to 1, so something is working.

I was expecting the result set to contain an item called "countrycode" (from the routing mapping), but it doesn't.

So at this point I thought my understanding of routing was wrong. Perhaps all routing does is tell Elasticsearch which shard to look in, but not what to look for? In other words, if other country codes happen to land in that same shard, will those queries as written just bring back every record in that shard?

So I tried the query again, this time adding some info to it.

 http://localhost:9200/postcode/_search?routing=AU&pretty=true&q=country:AU 

I thought by doing this it would force the query into giving me just the AU place names, but this time it gave me only 3936 results

So I am not quite sure what I have done wrong. The examples I have read show the queries changing from needing a filter to just using match_all{}, which I would have thought would only bring back documents matching the au country code.

Thanks for your help in getting this to work correctly.

I almost have this working: it now gives me the correct number of results in a single shard. However, the index creation is not working quite right; it ignores my number_of_shards setting, and possibly other settings too.

    $index = $client->getIndex($indexname);
    $index->create(array('mappings'=>array("$indexname"=>array("_routing"=>array("required"=>true))),
            'number_of_shards' => 6,
            'number_of_replicas' => 1,
            'analysis' => array(
                    'analyzer' => array(
                            'indexAnalyzer' => array(
                                    'type' => 'keyword',
                                    'tokenizer' => 'nGram',
                                    'filter' => array('shingle')
                            ),
                            'searchAnalyzer' => array(
                                    'type' => 'keyword',
                                    'tokenizer' => 'nGram',
                                    'filter' => array('shingle')
                            )
                    )
            )  ), true); 

Solution

I can at least help you with more info on where to look:

http://localhost:9200/postcode/_search?routing=au&pretty=true

That query does indeed translate into "give me all documents on the shard where documents for country:AU should be sent."

Routing is just that, routing ... it doesn't filter your results for you.
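
To make that concrete, here is a minimal Python sketch of the idea. It is a toy model, not Elasticsearch's real implementation: the `shard_for` helper, the MD5-based hash, and the sample documents are all assumptions made for illustration. The only point it relies on is that a routing value is hashed to pick one shard, and that different routing values can end up on the same shard.

```python
import hashlib
from collections import defaultdict

NUM_SHARDS = 4  # matches number_of_shards in the question


def shard_for(routing: str, num_shards: int = NUM_SHARDS) -> int:
    """Toy stand-in for shard selection: hash(routing) % num_shards.

    Elasticsearch uses its own hash function; MD5 is only an assumption here.
    """
    digest = hashlib.md5(routing.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % num_shards


# Hypothetical documents, routed by their country code at index time.
docs = [{"country": code, "zip": f"{code}-{i}"}
        for code in ["au", "nz", "us", "gb", "de", "fr"]
        for i in range(3)]

shards = defaultdict(list)
for doc in docs:
    shards[shard_for(doc["country"])].append(doc)

# Searching with routing=au only narrows the search to one shard...
routed_only = shards[shard_for("au")]
# ...a query or filter is still needed to keep just the "au" documents.
routed_and_filtered = [d for d in routed_only if d["country"] == "au"]

print(sorted({d["country"] for d in routed_only}))
print(len(routed_and_filtered))
```

With six country codes and four shards, at least two codes must share a shard, so a routed-only result can contain other countries, while the filtered one cannot.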

Also, I noticed you're mixing your "au"s and your "AU"s, which might mix things up too.

You should try setting required on your routing element to true, to make sure that your documents are actually stored with routing information when being indexed.
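
Why this matters can be sketched with a small toy model in Python. The assumption here (not something the question's output proves) is that when no routing value is stored, documents are spread across shards by a hash of their id. A search with `routing=au` then looks at a single shard and sees only a fraction of the AU documents. The hash function and the document ids are made up for illustration.

```python
import hashlib

NUM_SHARDS = 4
AU_DOCS = 15740  # total AU hits reported in the question


def shard_for(key: str, num_shards: int = NUM_SHARDS) -> int:
    """Toy shard selection; MD5 is a stand-in, not Elasticsearch's real hash."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % num_shards


# Case 1: no routing stored, so each document is placed by its (hypothetical) id.
by_id = [0] * NUM_SHARDS
for i in range(AU_DOCS):
    by_id[shard_for(f"doc-{i}")] += 1

# Case 2: routing stored, so every AU document is placed by the same value.
by_routing = [0] * NUM_SHARDS
for _ in range(AU_DOCS):
    by_routing[shard_for("au")] += 1

# What a search restricted to the "au" shard would be able to see.
seen_without_routing = by_id[shard_for("au")]
seen_with_routing = by_routing[shard_for("au")]
print(seen_without_routing, seen_with_routing)
```

In the first case the routed search sees roughly a quarter of the documents, which is at least in the same range as the 3936 hits reported in the question (15740 / 4 = 3935). In the second case it sees all 15740. Whether that is what actually happened in your index is a guess, but it is worth checking.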

Actually, to make sure your documents are indexed with proper routing, explicitly set the routing value to lowercase(countrycode) when indexing documents, and use the same lowercase value when searching. See if that helps any.
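
As a rough illustration in Python, the helpers below build index and search URLs that always carry the same lowercased routing value, and add the `country` query so the results are also filtered. The host, index name, and the `routing` and `q` parameters are taken from the URLs in the question; the type name `postcode` comes from the mapping output. The helper names are hypothetical, and the exact indexing URL layout depends on your Elasticsearch version, so treat that part as an assumption.

```python
from urllib.parse import urlencode

BASE = "http://localhost:9200/postcode"  # host and index from the question


def routing_key(countrycode: str) -> str:
    """Normalize the routing value so "AU" and "au" map to the same shard."""
    return countrycode.strip().lower()


def index_url(doc_id: str, countrycode: str) -> str:
    """URL for indexing one document with an explicit routing value.

    Assumes the older /index/type/id layout with a "postcode" type.
    """
    params = urlencode({"routing": routing_key(countrycode)})
    return f"{BASE}/postcode/{doc_id}?{params}"


def search_url(countrycode: str) -> str:
    """URL that routes to one shard and still filters by country."""
    key = routing_key(countrycode)
    params = urlencode({"routing": key, "q": f"country:{key}", "pretty": "true"})
    return f"{BASE}/_search?{params}"


print(index_url("1", "AU"))
print(search_url("AU"))
```

Funnelling every routing value through one normalizing function is a simple way to keep index-time and search-time routing from drifting apart.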

For more information try reading this blog post:

http://www.elasticsearch.org/blog/customizing-your-document-routing/

Hope this helps :)

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow