Question

Thanks in advance. I'll describe the situation first and give the solution at the end.

I have a collection of 2M documents with the following mapping:

{
   "image": {
      "properties": {
         "timestamp": {
            "type": "date",
            "format": "dateOptionalTime"
         },
         "title": {
            "type": "string"
         },
         "url": {
            "type": "string"
         }
      }
   }
}

I have a webpage which paginates through all the documents with the following search:

{
  "from": STARTING_POSITION_NUMBER,
  "size": 15,
  "sort": [
    { "_id": { "order": "desc" } }
  ],
  "query": {
    "match_all": {}
  }
}

A hit looks like this (note that the _id value is a hash of the URL, to prevent duplicate documents):

 {
    "_index": "images",
    "_type": "image",
    "_id": "2a750a4817bd1600",
    "_score": null,
    "_source": {
       "url": "http://test.test/test.jpg",
       "timestamp": "2014-02-13T17:01:40.442307",
       "title": "Test image!"
    },
    "sort": [
       null
    ]
 }

This works pretty well. The only problem is that the documents appear sorted chronologically (the oldest documents on the first page, the most recently indexed ones on the last page), but I want them to appear in a random order. For example, page 10 should always show the same N documents, but they don't have to be sorted by date.

I thought of something like sorting all the documents by their hash, which is effectively random but deterministic. How could I do it?
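To make the idea concrete, here is a minimal sketch in plain Python (no Elasticsearch involved) of ordering items by a hash of their URL and slicing out a page. The names `hash_key` and `page_of`, the choice of SHA-1, and the 0-based page numbering are assumptions made for this illustration only; the hash actually used for `_id` is not specified above.

```python
import hashlib


def hash_key(url: str) -> str:
    """Return a hex digest of the URL; used only as a sort key (SHA-1 is an assumption)."""
    return hashlib.sha1(url.encode("utf-8")).hexdigest()


def page_of(urls, page, size=15):
    """Return the given 0-based page of `urls`, ordered by the hash of each URL.

    The order looks shuffled relative to insertion order, but it is the same
    on every call because it depends only on the URLs themselves.
    """
    ordered = sorted(urls, key=hash_key)
    start = page * size
    return ordered[start:start + size]


if __name__ == "__main__":
    sample = ["http://test.test/%d.jpg" % i for i in range(40)]
    print(page_of(sample, page=1, size=5))
```

Because the sort key depends only on each URL, the same page comes back no matter in which order the documents were indexed.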

I've searched the docs, and the sorting API only seems to sort the results, not the full index. If I don't find a solution, I will pick documents randomly and index them in a separate collection.

Thank you.


Solution

I solved it using the following search:

{
  "from": STARTING_POSITION_NUMBER,
  "size": 15,
  "query": {
    "function_score": {
      "random_score": {
        "seed": 1
      }
    }
  }
}

Thanks to David from the Elasticsearch mailing list for pointing out the function score with random scoring.
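For anyone generating this request from application code, here is a small Python sketch that builds the same body for a given page. The helper name `random_page_query` and the 0-based page numbering are my own assumptions, and the sketch only constructs the JSON without sending it anywhere. Keeping the seed fixed across requests is what should make a given page show the same documents each time; depending on your Elasticsearch version, `random_score` may expect additional parameters, so check the documentation for the version you run.

```python
import json

PAGE_SIZE = 15  # same page size as in the search above


def random_page_query(page: int, seed: int = 1, size: int = PAGE_SIZE) -> dict:
    """Build the search body for a 0-based page number (hypothetical helper).

    "from" is the offset of the first hit on the page; the fixed seed keeps
    the pseudo-random ordering stable between requests.
    """
    if page < 0:
        raise ValueError("page must be >= 0")
    return {
        "from": page * size,
        "size": size,
        "query": {
            "function_score": {
                "random_score": {"seed": seed}
            }
        },
    }


if __name__ == "__main__":
    # Print the body for page 10, ready to be sent to the _search endpoint.
    print(json.dumps(random_page_query(10), indent=2))
```

Using a different seed gives a different, but again stable, ordering.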

License: CC-BY-SA with attribution
Not affiliated with StackOverflow