We're storing a `title` field in our index and want to use it for two purposes:
- We're analyzing with an ngram filter so we can provide autocomplete and instant results
- We want to be able to list results using an ASC sort on the `title` field rather than by score.
The index/filter/analyzer is defined like so:
    array(
        'number_of_shards' => $this->shards,
        'number_of_replicas' => $this->replicas,
        'analysis' => array(
            'filter' => array(
                'nGram_filter' => array(
                    'type' => 'nGram',
                    'min_gram' => 2,
                    'max_gram' => 20,
                    'token_chars' => array('letter', 'digit', 'punctuation', 'symbol')
                )
            ),
            'analyzer' => array(
                'index_analyzer' => array(
                    'type' => 'custom',
                    'tokenizer' => 'whitespace',
                    'char_filter' => 'html_strip',
                    'filter' => array('lowercase', 'asciifolding', 'nGram_filter')
                ),
                'search_analyzer' => array(
                    'type' => 'custom',
                    'tokenizer' => 'whitespace',
                    'char_filter' => 'html_strip',
                    'filter' => array('lowercase', 'asciifolding')
                )
            )
        )
    ),
The problem we're experiencing is unpredictable results when we sort on the `title` field. After a little searching, we found this at the end of the sort documentation page for Elasticsearch (http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-request-sort.html#_memory_considerations):
For string based types, the field sorted on should not be analyzed / tokenized.
How can we both analyze the field and sort on it later? Do we need to store the field twice, with one copy using `not_analyzed`, in order to sort? Since `_source` also stores the `title` value in its original state, can that not be used for sorting?
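
To make the "store it twice" option concrete, here is a rough sketch of what we imagine it would look like. It assumes the multi-field `fields` syntax and the `index_analyzer` mapping parameter from Elasticsearch 1.x. The type name `item`, the sub-field name `raw`, and the query text are placeholders of ours. We have not verified that this is the recommended approach.

```php
<?php
// Sketch only: 'item' (type name) and 'raw' (sub-field name) are made-up placeholders.
// Assumes Elasticsearch 1.x multi-field syntax ("fields") and the "index_analyzer" parameter.
$mapping = array(
    'item' => array(
        'properties' => array(
            'title' => array(
                'type' => 'string',
                // Analyzed copy: ngrams at index time, plain tokens at search time.
                'index_analyzer' => 'index_analyzer',
                'search_analyzer' => 'search_analyzer',
                'fields' => array(
                    // Untokenized copy of the same value, intended only for sorting.
                    'raw' => array(
                        'type' => 'string',
                        'index' => 'not_analyzed'
                    )
                )
            )
        )
    )
);

// Search against the analyzed field, but sort on the untokenized sub-field.
$searchBody = array(
    'query' => array(
        'match' => array('title' => 'some text') // placeholder query text
    ),
    'sort' => array(
        array('title.raw' => array('order' => 'asc'))
    )
);
```

The idea is that `title` keeps serving autocomplete, while `title.raw` holds a single unmodified term per document, so an ASC sort would have exactly one value to compare.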