Question

A simplified version of my document has the following structure:

doc:

{
   "date": "2014-04-16T17:13:00",
   "key": "de5cefc56ff51c33351459b88d42ca9f828445c0"
}

I would like to group my documents by key and get the latest date and the number of documents for each key, something like:

{ "Last": "2014-04-16T16:00:00", "Count": 10 }

My idea is to build a map/reduce view and query it with group set to true.
This is what I have tried so far. I get the correct count, but not the correct dates.

map

function (doc, meta) {
  if(doc.type =="doc")
      emit(doc.key, doc.date);
}

reduce

function(key, values, rereduce) {
   var result = {
    Last: 0,
    Count: 0
  };

   if (rereduce) {
       for (var i = 0; i < values.length; i++) {
           result.Count += values[i].Count;
           result.Last = values[i].Last;
         }

   } else {
       result.Count = values.length;
       result.Last = values[0]
   }
  return result;
}

Solution

You're not comparing dates. Couchbase sorts view rows by the emitted key, not by the value, so the values passed to your reduce function are not ordered by date and you have to find the latest one yourself. In the rereduce branch it would probably look like:

result.Last = values[i].Last > result.Last ? values[i].Last : result.Last;

On the first pass (rereduce is false) values is an array as well, so taking values[0] does not necessarily give you the latest date. For that reason I don't think your reduce function will always be correct.
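To make the advice above concrete, here is a minimal sketch of a reduce function that tracks both the count and the latest date. It assumes the map function emits the date as an ISO 8601 string in one consistent format (as in the question), so that plain string comparison matches chronological order. Last starts as an empty string rather than 0, because comparing a date string with the number 0 is always false. The name reduceLastAndCount is hypothetical and only there so the snippet can run on its own; in a view you would use an anonymous function with the same body.

```javascript
// Sketch only: reduceLastAndCount is a hypothetical name used for local testing.
function reduceLastAndCount(key, values, rereduce) {
  var result = { Last: "", Count: 0 };
  for (var i = 0; i < values.length; i++) {
    // On rereduce each value is a partial result {Last, Count};
    // otherwise it is a raw date string emitted by the map function.
    var last = rereduce ? values[i].Last : values[i];
    var count = rereduce ? values[i].Count : 1;
    if (last > result.Last) {
      result.Last = last;
    }
    result.Count += count;
  }
  return result;
}

// Simulate two partial reduces followed by one rereduce.
var partA = reduceLastAndCount(null, ["2014-04-16T15:22:00", "2014-04-16T17:13:00"], false);
var partB = reduceLastAndCount(null, ["2014-04-16T16:00:00"], false);
var total = reduceLastAndCount(null, [partA, partB], true);
console.log(JSON.stringify(total)); // {"Last":"2014-04-16T17:13:00","Count":3}
```

Both branches go through the same loop, so the result does not depend on how the values happen to be split between reduce and rereduce calls.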

Here is an example of a reduce function of mine that filters the documents and keeps only the one with the newest date. It may be helpful, or you can try to adapt it (it looks close to the reduce function you want; you just need to add the count somewhere).

function (k, v, r) {
  if (r) {
    if (v.length > 1) {
      var m = v[0].Date;
      var mid = 0;
      for (var i = 1; i < v.length; i++) {
        if (v[i].Date > m) {
          m = v[i].Date;
          mid = i;
        }
      }
      return v[mid];
    } else {
      return v[0] || v;
    }
  }
  if (v.length > 1) {
    var m = v[0].Date;
    var mid = 0;
    for (var i = 1; i < v.length; i++) {
      if (v[i].Date > m) {
        m = v[i].Date;
        mid = i;
      }
    }
    return v[mid];
  } else {
    return v[0] || v;
  }
}

UPD: Here is an example of what that reduce function does. The input data (values) for the function will look like this (I've used plain numbers instead of date strings to keep it short):

[{Date:1},{Date:3},{Date:8},{Date:2},{Date:4},{Date:7},{Date:5}]

On the first step rereduce will be false, so we need to find the largest date in the array, and the function will return

Object {Date: 8}.

Note that this function may be called only once, but it may also be called on several servers in the cluster or on several branches of the B-tree inside a single Couchbase instance.

Then, on the next step (if there were several machines in the cluster or several "branches"), the function is called again with the rereduce argument set to true.

The incoming data will be [{Date:8},{Date:10},{Date:3}], where {Date:8} came from the reduce on one server (or branch) and the other dates came from other servers (or branches).

So we need to do exactly the same on these new values to find the largest one.
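The two phases described above can be replayed outside Couchbase with a few lines of plain JavaScript. This is only an illustration of the walkthrough: newest is a hypothetical helper that does the same "pick the largest Date" scan as the reduce function, and the inputs are the example arrays from the text.

```javascript
// Hypothetical helper: returns the element with the largest Date field.
function newest(values) {
  var best = values[0];
  for (var i = 1; i < values.length; i++) {
    if (values[i].Date > best.Date) {
      best = values[i];
    }
  }
  return best;
}

// First phase (rereduce is false): raw values from one server or branch.
var firstPass = newest([{Date: 1}, {Date: 3}, {Date: 8}, {Date: 2}, {Date: 4}, {Date: 7}, {Date: 5}]);

// Second phase (rereduce is true): partial results from several servers or branches.
var secondPass = newest([firstPass, {Date: 10}, {Date: 3}]);

console.log(firstPass.Date);  // 8
console.log(secondPass.Date); // 10
```

Because the output of the first phase has the same shape as its input, the same scan works unchanged in the second phase.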

Answering your question from the comments: I don't remember why I used the same code for reduce and rereduce, because it was a long time ago (when Couchbase 2.0 was in developer preview). Maybe Couchbase had some bugs, or I was just trying to understand how rereduce works. But I remember that it did not work at that time without the if (r) {..} branch.

You can try placing return v; in different parts of my reduce function or yours to see what it receives on each reduce phase. It's best to try it yourself once to understand what actually happens there.

Other tips

I forgot to mention that I have many documents for the same key. In fact, for each key I can have many documents (each with a message, as here):

{
   "date": "2014-04-16T17:13:00",
   "key": "de5cefc56ff51c33351459b88d42ca9f828445c0",
   "message": "message1"
}

{
   "date": "2014-04-16T15:22:00",
   "key": "de5cefc56ff51c33351459b88d42ca9f828445c0",
   "message": "message2"
}

Another way to deal with the problem is to do it in the map function:

function (doc, meta) {
  var count = 0;
  var last = '';
  if (doc.type == "doc") {
    // var k keeps the loop variable from leaking into the global scope
    for (var k in doc.message) {
      count += 1;
      last = doc.date > last ? doc.date : last;
    }
    emit(doc.key, {'Count': count, 'Last': last});
  }
}

I found this simpler and it does the job in my case.

Licensed under: CC-BY-SA with attribution