Question

I can't get MAC address search to return proper results for partial searches (half an octet). If I search for the exact MAC address I get results, but if I search for a partial address such as "00:19:9" I get nothing until I complete the octet.

Can anyone point out which mapping I should use to index it, or what kind of search query I should use?

curl -XDELETE http://localhost:9200/ap-test
curl -XPUT http://localhost:9200/ap-test

curl -XPUT http://localhost:9200/ap-test/devices/1 -d '
{
  "user" : "James Earl",
  "macaddr" : "00:19:92:00:71:80"
}'

curl -XPUT http://localhost:9200/ap-test/devices/2 -d '
{
  "user" : "Earl",
  "macaddr" : "00:19:92:00:71:82"
}'

curl -XPUT http://localhost:9200/ap-test/devices/3 -d '
{
  "user" : "James Edward",
  "macaddr" : "11:19:92:00:71:80"
}'

curl -XPOST 'http://localhost:9200/ap-test/_refresh'
curl -XGET http://localhost:9200/ap-test/devices/_mapping?pretty

When I try to find exact matches, I get them correctly:

curl -XPOST http://localhost:9200/ap-test/devices/_search -d '
{
    "query" : {
        "query_string" : {
            "query":"\"00\\:19\\:92\\:00\\:71\\:80\""
        }
    }
}'

# RETURNS:

{
  "took": 6,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 1,
    "max_score": 0.57534903,
    "hits": [
      {
        "_index": "ap-test",
        "_type": "devices",
        "_id": "1",
        "_score": 0.57534903,
        "_source": {
          "user": "James Earl",
          "macaddr": "00:19:92:00:71:80"
        }
      }
    ]
  }
}

HOWEVER, I need to be able to match partial MAC address searches like this:

curl -XPOST http://localhost:9200/ap-test/devices/_search -d '
{
    "query" : {
        "query_string" : {
            "query":"\"00\\:19\\:9\""
        }
    }
}'

# RETURNS 0 instead of returning 2 of them 

{
  "took": 1,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 0,
    "max_score": null,
    "hits": []
  }
}

SO, What mapping should I use? Is there a better query string to accomplish this? BTW, what's the difference between using 'query_string' and 'text'?


Solution 2

After some research I found an easier way to make it work.

Elasticsearch queries can be confusing at times because there are so many options:

  • query_string: a full-fledged search syntax with a myriad of options and wildcard support.
  • match: simpler, and doesn't require wildcard characters or other "advanced" features. It works well in search boxes because the chance of it failing on user input is very small, if not nonexistent.

That said, the query below (match_phrase_prefix, a phrase-prefix variant of match) is the one that worked best in most cases and didn't require a custom mapping.

curl -XPOST http://localhost:9200/ap-test/devices/_search -d '
{
    "query" : {
        "match_phrase_prefix" : {
            "_all" : "00:19:92:00:71:8"
        }
    }
}'
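
As a rough illustration of why this works without a custom mapping, here is a small shell model of phrase-prefix matching. It assumes the address is tokenized on colons, that every query token except the last must match a whole token in order, and that the last one only needs to be a prefix. The `phrase_prefix_match` helper is a made-up name for this sketch. The real query also searches the user field through _all and limits how many prefix expansions it tries, which this ignores.

```shell
#!/bin/sh
# Simplified model of match_phrase_prefix on colon-separated tokens.
# `phrase_prefix_match` is a hypothetical helper, not an Elasticsearch command.
# Tokens are written space-separated with a leading space, so a substring
# check forces whole-token matches for all but the last query token.
phrase_prefix_match() {
    doc=" $(printf '%s' "$1" | tr ':' ' ')"
    query=" $(printf '%s' "$2" | tr ':' ' ')"
    case "$doc" in
        *"$query"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Check the three sample documents against the partial address.
for mac in "00:19:92:00:71:80" "00:19:92:00:71:82" "11:19:92:00:71:80"; do
    if phrase_prefix_match "$mac" "00:19:9"; then
        echo "match:    $mac"
    else
        echo "no match: $mac"
    fi
done
```

Under these assumptions the first two addresses match "00:19:9" and the third does not, which is the result the question was after.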

OTHER TIPS

It looks like you haven't defined a mapping at all, which means Elasticsearch will guess the datatypes from your documents and use the default mappings.

The field macaddr will be recognised as a string, and the standard string analyzer will be used. This analyzer breaks up the string on whitespace and punctuation, leaving you with tokens consisting of pairs of digits, e.g. "00:19:92:00:71:80" gets tokenized to 00 19 92 00 71 80. The same tokenization happens when you search, so "00:19:9" becomes 00 19 9, and the token 9 does not match any indexed token.
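
The effect can be sketched in shell. This only approximates the standard analyzer by splitting on colons, and `tokenize` is a helper name invented for the illustration, not an Elasticsearch command.

```shell
#!/bin/sh
# Approximation of the default analysis for a MAC address:
# split on ":" and emit one token per line.
# `tokenize` is a hypothetical helper for illustration only.
tokenize() {
    printf '%s\n' "$1" | tr ':' '\n'
}

echo "indexed tokens:"
tokenize "00:19:92:00:71:80"

echo "query tokens:"
# The last token here is "9", which is not one of the indexed tokens above.
tokenize "00:19:9"
```

Because terms are compared whole, the query token 9 never equals the indexed token 92, so the partial search comes back empty until the octet is completed.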

What you want is to define an analyzer which turns "00:19:92:00:71:80" into the tokens 00 00: 00:1 00:19 etc...
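
To make the target token list concrete, here is a minimal shell sketch that prints the prefixes an edge n-gram tokenizer with min_gram 2 and max_gram 17 would be expected to emit for one address. The `edge_ngrams` function is a hypothetical helper written for this example; it does not call Elasticsearch.

```shell
#!/bin/sh
# Emit the edge n-grams (leading prefixes) of a string, one per line,
# from `min` to `max` characters long.
# `edge_ngrams` is a hypothetical helper, not part of Elasticsearch.
edge_ngrams() {
    value=$1
    min=$2
    max=$3
    len=${#value}
    # Never ask for a prefix longer than the value itself.
    if [ "$max" -gt "$len" ]; then
        max=$len
    fi
    i=$min
    while [ "$i" -le "$max" ]; do
        printf '%s\n' "$value" | cut -c "1-$i"
        i=$((i + 1))
    done
}

# Prefixes of length 2..17; "00:19:9" is one of them.
edge_ngrams "00:19:92:00:71:80" 2 17
```

Since the partial address "00:19:9" is itself one of the indexed prefixes, a search that keeps the query as a single unsplit token can match it directly.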

Try this:

curl -XPUT http://localhost:9200/ap-test  -d '
{
    "settings" : {
        "analysis" : {
            "analyzer" : {
                "my_edge_ngram_analyzer" : {
                    "tokenizer" : "my_edge_ngram_tokenizer"
                }
            },
            "tokenizer" : {
                "my_edge_ngram_tokenizer" : {
                    "type" : "edgeNGram",
                    "min_gram" : "2",
                    "max_gram" : "17"
                }
            }
        }
    }
}'

curl -XPUT http://localhost:9200/ap-test/devices/_mapping  -d '
{
    "devices": {
        "properties": {
            "user": {
                "type": "string"
            },
            "macaddr": {
                "type": "string",
                "index_analyzer" : "my_edge_ngram_analyzer",
                "search_analyzer": "keyword"
            }
        }
    }
}'

Put the documents as before, then search with the query specifically aimed at the field:

curl -XPOST http://localhost:9200/ap-test/devices/_search -d '
{
    "query" : {
        "query_string" : {
            "query":"\"00\\:19\\:92\\:00\\:71\\:80\"",
            "fields": ["macaddr", "user"]
        }
    }
}'

As for your last question, the text query is deprecated; it was renamed to match.

Good luck!

Licensed under: CC-BY-SA with attribution