Question

w I have some data in my mongoDB as given below.

enter image description here

Now I want to retrieve the data like this.

Group by DATE, ENVIRONMENT, TCID

For the MAX TIME

Select DATE, ENVIRONMENT, TCID, STATUS

Order by DATE, TCID

I tried something like this..

{
    $group : 
    {
      _id: { DATE: "$DATE", TCID: "$TCID", ENVIRONMENT: "$ENVIRONMENT" }
      DATE: { $last : "$DATE" }
      TIME: { $last : "$TIME" }
      ENVIRONMENT: { $last : "$ENVIRONMENT" }
      TCID: { $last : "$TCID" }
      STATUS: { $last : "$STATUS" }
    }


},
{
    $sort: 
    { 
        DATE : 1,  TCID:1
    }
}

I am using the above query for the time being. it seems to help a bit. It just simply gives the last inserted row (assuming that would have the MAX TIME for that DATE). However, it might not be always correct.

So, I would like to know how to check the MAX TIME.

DATA TYPE of all columns are String.

Please help!!

Was it helpful?

Solution

Usually strings are not the best things to work with as values, but fortunately all your strings are "lexcially" ordered so they will sort properly.

The correct form of your aggregation statement would then be this:

db.collection.aggregate([
    { "$sort": { "DATE": 1, "TIME": 1 } },
    { "$group": {
         "_id": {
             "DATE": "$DATE", 
             "TCID": "$TCID", 
             "ENVIRONMENT": "$ENVIRONMENT"
         },
         "OTIME": { "$last": "$TIME" },
         "OSTATUS": { "$last": "$STATUS" },
    }},
    { "$project": {
        "_id": 0,
        "DATE": "$_id.DATE",
        "TIME": "$OTIME",
        "ENVIRONMENT": "$_id.ENVIRONMENT",
        "TCID": "$_id.TCID",
        "STATUS": "$OSTATUS"
    }},
    { "$sort": { "DATE": 1, "TIME": 1 } }
])

So let's look at the point where this differs from what you tried:

  • First there is a $sort before you group. Normally this would be the case when you are going to use $last because you want to find the "boundaries" with the working set in that order. If you were certain that the documents would already be in that order then you could skip this.

  • Next is the $group stage where you are correct in using $last as what you want after you are certain that "TIME" will be the greatest value on this boundary, you most importantly want the "last" value for "STATUS". Mostly note how the projected fields are not being duplicated.

  • Then you have $project which is like the "SELECT" clause in SQL. So here you combine the required fields and "transform" the fields from the grouping _id into the actual names you want.

  • Finally you $sort again from fields still in the pipeline from the last "project". This is just making sure you get the results in the order that you want.

Note: The funny "O" prefixed names from the "grouping" is a little aggregation trick to make the "projection" appear in the order you specified. If you used the same names, the aggregation engine considers those fields to already be in the result document and leaves them at the top of the field list. So that is another little trick you can make use of later.

OTHER TIPS

db.mycol.aggregate([
    {
      $group:
      {
       _id:{DATE:'$DATE', TCID:'$TCID', ENVIRONMENT:'$ENVIRONMENT'},
       DATE:{$addToSet:'$DATE'},
       TIME:{$addToSet:'$TIME'},
       ENVIRONMENT:{$addToSet:'$ENVIRONMENT'},
       TCID:{$addToSet:'$TCID'},
       STATUS:{$addToSet:'$STATUS'},
       TIME:{$max:'$TIME'}
     }
    },
    {$sort:{DATE: 1,TCID: 1}}
])
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top