I'm looking at using Elasticsearch to provide the search functions of our site.

I've been experimenting with it but can't get the porterStem token filter to work (so that a search for "fight" also matches "fights" and "fighting").

Here's a rundown of my input.

curl -XPUT localhost:9200/local/ -d'
index :                     
    analysis : 
        analyzer : 
            stemming : 
                type : custom 
                tokenizer : standard 
                filter : [standard, lowercase, stop, porterStem] 
'

curl -XPUT localhost:9200/local/_mapping -d'{"properties": { "title" : { "analyzer" : "stemming", "type" : "string" }}}'

curl -XPUT localhost:9200/local/article/1 -d'{"title": "Fight for your life"}'
curl -XPUT localhost:9200/local/article/2 -d'{"title": "Fighting for your life"}'
curl -XPUT localhost:9200/local/article/3 -d'{"title": "My dad fought a dog"}'
curl -XPUT localhost:9200/local/article/4 -d'{"title": "Bruno fights Tyson tomorrow"}'

However, running a search for 'fight' only matches the first entry, the one that contains the exact term.

curl -XGET localhost:9200/local/_search?q=fight

The correct settings appear to have been applied, but stemming doesn't seem to work.

  "indices" : {
    "local" : {
      "aliases" : [ ],
      "settings" : {
        "index.analysis.analyzer.stemming.type" : "custom",
        "index.analysis.analyzer.stemming.tokenizer" : "standard",
        "index.analysis.analyzer.stemming.filter.1" : "lowercase",
        "index.analysis.analyzer.stemming.filter.0" : "standard",
        "index.analysis.analyzer.stemming.filter.3" : "porterStem",
        "index.analysis.analyzer.stemming.filter.2" : "stop",
        "index.number_of_shards" : "5",
        "index.number_of_replicas" : "1"
      },

Has anyone got this functionality up and running who could point me in the right direction?


Solution

There is an example configuration for custom analyzers, using the snowball stemmer, in the question "Why ElasticSearch is not finding my term".
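
For reference, here is a rough sketch of what such a setup might look like for the index in the question. It is not taken from the linked answer. It assumes an older Elasticsearch release (the "string" field type and the "article" type come from the question), that the "local" index does not exist yet, and that the built-in "snowball" token filter (English by default) is an acceptable substitute for porterStem. The variable names ES, INDEX_BODY and QUERY_URL are just illustrative. The query names the title field explicitly because, as far as I can tell, a bare q=fight on these versions is run against the _all field, which uses the default analyzer rather than the one mapped on title.

```shell
# Assumption: an older Elasticsearch node is listening on localhost:9200.
ES="localhost:9200"

# Analysis settings and the "title" mapping, sent together when the index is created.
# "stemming" is the analyzer name from the question; "snowball" replaces porterStem.
INDEX_BODY='{
  "settings": {
    "analysis": {
      "analyzer": {
        "stemming": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": ["lowercase", "stop", "snowball"]
        }
      }
    }
  },
  "mappings": {
    "article": {
      "properties": {
        "title": { "type": "string", "analyzer": "stemming" }
      }
    }
  }
}'

# Field-qualified query, so the analyzer mapped on "title" is applied to the search term.
QUERY_URL="$ES/local/_search?q=title:fight"

# Create the index (assumes it does not already exist).
curl -s -XPUT "$ES/local" -d "$INDEX_BODY" || echo "index creation failed (is Elasticsearch running on $ES?)"

# Index the four articles from the question here, then search the title field.
curl -s -XGET "$QUERY_URL" || echo "search failed (is Elasticsearch running on $ES?)"
```

With a setup along these lines, "Fight", "Fighting" and "fights" should all reduce to the same stem. "fought" will still not match, since suffix-stripping stemmers such as Porter and Snowball do not handle irregular verb forms.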

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow