Question

I have a report document that has an images array in it. Inside each image is a thumbnails array that tracks the thumbnails I have generated for the image.

{
  "_id" : ObjectId("536021ba9b319f5195000004"),
  "images": [{
    "name": "some_image.jpg",
    "width": 1200,
    "height": 1200,
    "thumbnails": [{
      "name": "some_image_150.jpg",
      "size": 150
    }]  
  }]
}

Now let's say I want to generate the following thumbnail sizes:

[150, 320, 800]

Is there a way that I can query the reports collection and get back all reports that do not have all the proper thumbnails generated? I.e. in the example above, the query would return the example document, as it has an image that does not contain all the proper thumbnails.

The best case would be to get only the images for which not all thumbnails have been generated, i.e. if a report has 2 images and one image has all the thumbnails and the other does not, it would be nice to get just the image that needs thumbnails.

I have tried messing around with aggregation to unwind the images array, so that I can match against things. However, I am running into a problem when trying to match the numbers in my array [150, 320, 800] against the documents in thumbnails.

EDIT:

I also need to determine whether a thumbnail should not be created, i.e. if my original image is 350x350, then I should not create an 800x800 thumbnail. So even though 800 is in my thumbnail set, I need to take only the values from that set that are less than the original image's width and height.


Solution 2

You want the negated case of the $all operator, which you get by combining it with $not:

db.collection.find({
    "images.thumbnails.size": {
        "$not": { "$all": [ 150, 320, 800 ] }
    }
})
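
To make clear what that condition is testing, here is a minimal plain-JavaScript sketch that mimics it on in-memory objects. It does not talk to MongoDB, and lacksSomeSize is a hypothetical helper name used only for illustration. The point it shows is that the dotted path looks at the sizes of all images in a document together:

```javascript
// Sizes taken from the question.
const requiredSizes = [150, 320, 800];

// Hypothetical helper: mimics the document-level $not/$all test.
// Returns true when the report, taken as a whole, lacks at least one size.
function lacksSomeSize(report, sizes) {
  // "images.thumbnails.size" gathers the sizes of every image into one set.
  const present = new Set(
    (report.images || []).flatMap(img =>
      (img.thumbnails || []).map(t => t.size)
    )
  );
  return !sizes.every(s => present.has(s));
}

// The example document from the question: only the 150 thumbnail exists.
const exampleReport = {
  images: [{
    name: "some_image.jpg", width: 1200, height: 1200,
    thumbnails: [{ name: "some_image_150.jpg", size: 150 }]
  }]
};
console.log(lacksSomeSize(exampleReport, requiredSizes)); // true

// Two incomplete images whose sizes together cover the whole set.
const splitReport = {
  images: [
    { name: "a.jpg", thumbnails: [{ size: 150 }] },
    { name: "b.jpg", thumbnails: [{ size: 320 }, { size: 800 }] }
  ]
};
console.log(lacksSomeSize(splitReport, requiredSizes)); // false
```

The second case is the one to watch for: neither image is complete, yet the document-level test does not flag the report, because the sizes are checked across the whole document.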

Possibly in versions prior to MongoDB 2.6 you might need to adapt this due to a change in the logic for $all, but the same condition can be constructed with $or and $ne, since a document fails the $all test exactly when at least one of the sizes is absent:

db.collection.find({
    "$or": [
        { "images.thumbnails.size": { "$ne": 150 } },
        { "images.thumbnails.size": { "$ne": 320 } },
        { "images.thumbnails.size": { "$ne": 800 } }
    ]
})

Note, of course, that these statements match "documents" and not the elements of your "images" array, and that the dotted path tests the sizes from all images in a document together. To actually filter the array elements you need to apply the condition in aggregate:

db.collection.aggregate([
    // Match the documents meeting the conditions
    { "$match": {
        "images.thumbnails.size": {
            "$not": { "$all": [ 150, 320, 800 ] }
        }
    }},

    // Unwind the images array
    { "$unwind": "$images" },

    // Filter out any array elements that do not match
    { "$match": {
        "images.thumbnails.size": {
            "$not": { "$all": [ 150, 320, 800 ] }
        }
    }},


    // Optional: Projection re-shaping
    { "$project": {
        "_id": {
            "_id": "$_id",
            "images": {
                "name": "$images.name",
                "width": "$images.width",
                "height": "$images.height"
            }
        },
        "thumbs": "$images.thumbnails"
    }},

    // Optional: unwind the thumbnails
    { "$unwind": "$thumbs" },

    // Optional: group back only the sizes
    { "$group": {
        "_id": "$_id",
        "thumbs": { "$push": "$thumbs.size" }
    }},

    // Optional: Project with the difference on the set
    { "$project": {
        "_id": "$_id._id",
        "images": {
            "name": "$_id.images.name",
            "width": "$_id.images.width",
            "height": "$_id.images.height",
            "missingThumbs": { "$setDifference": [
                 [ 150, 320, 800 ],
                 "$thumbs"
            ]}
        }
    }},

    // Restore the images array
    { "$group": {
        "_id": "$_id",
        "images": { "$push": "$images" }
    }}

])

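So this uses $setDifference to take things a little further and tell you which of the "thumbnail sizes" you were testing for do not exist on each image. Those stages are optional because the operator is only available from MongoDB 2.6 onwards; on older versions, just remove the stages marked "Optional:" and the pipeline will still filter the "images" array entries. Be aware that $unwind drops an image whose "thumbnails" array is empty, so with the optional stages in place an image that has no thumbnails at all will not appear in the result.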
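
If the set difference is easier to picture outside the pipeline, this is roughly what it computes for a single image, written as a plain-JavaScript sketch. It runs on in-memory objects rather than in MongoDB, and missingSizes is a hypothetical helper name:

```javascript
// Hypothetical helper: the sizes from the wanted list that the image lacks,
// i.e. the difference between [150, 320, 800] and the sizes already present.
function missingSizes(image, sizes) {
  const have = new Set((image.thumbnails || []).map(t => t.size));
  return sizes.filter(s => !have.has(s));
}

// The image from the question's example document.
const exampleImage = {
  name: "some_image.jpg", width: 1200, height: 1200,
  thumbnails: [{ name: "some_image_150.jpg", size: 150 }]
};
console.log(missingSizes(exampleImage, [150, 320, 800])); // [ 320, 800 ]
```

Unlike the pipeline, this sketch also handles an image with an empty or absent thumbnails array, for which it returns every wanted size.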

You could also do the "difference" matching in versions prior to 2.6, but it is a fair bit more involved, so you may want to try to work that part out yourself.


As for your edit about only generating sizes that fit the original image, here is the full listing:

db.collection.aggregate([
    // Match the documents meeting the conditions
    { "$match": {
        "images.thumbnails.size": {
            "$not": { "$all": [ 150, 320, 800 ] }
        }
    }},

    // Unwind the images array
    { "$unwind": "$images" },

    // Filter out any array elements that do not match
    { "$match": {
        "images.thumbnails.size": {
            "$not": { "$all": [ 150, 320, 800 ] }
        }
    }},

    // Projection re-shaping
    { "$project": {
        "_id": {
            "_id": "$_id",
            "images": {
                "name": "$images.name",
                "width": "$images.width",
                "height": "$images.height"
            }
        },
        "thumbs": "$images.thumbnails"
    }},

    // unwind the thumbnails
    { "$unwind": "$thumbs" },

    // group back only the sizes
    { "$group": {
        "_id": "$_id",
        "thumbs": { "$push": "$thumbs.size" }
    }},

    // Project missingThumbs
    { "$project": {
        "missingThumbs": { "$setDifference": [
             [ 150, 320, 800 ],
             "$thumbs"
        ]}
    }},

    // Unwind the missing thumbs
    { "$unwind": "$missingThumbs" },

    // Project a size test
    { "$project": {
        "missingThumbs": 1,
        "larger": { "$gte": [ 
            "$_id.images.width",
            "$missingThumbs"
        ]}
    }},

    // Match the size test
    { "$match": { "larger": true }},

    // Group back the missing thumbs
    { "$group": { 
        "_id": "$_id",
        "missingThumbs": { "$push": "$missingThumbs" }
    }},

    // Project the images entry
    { "$project": {
        "_id": "$_id._id",
        "images": {
            "name": "$_id.images.name",
            "width": "$_id.images.width",
            "height": "$_id.images.height",
            "missingThumbs": "$missingThumbs"
        }
    }},

    // Restore the images array
    { "$group": {
        "_id": "$_id",
        "images": { "$push": "$images" }
    }}

])

Nothing is optional in here, as you clearly need those features to detect the thumbnails you don't already have. The additional stages compare each missing thumbnail size against the width of the image. Anything not detected as "larger" is excluded. Note that the test as written only looks at the width; add the same comparison against "$_id.images.height" if the height matters as well.
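
For comparison, the same decision can be made in application code once you have an image in hand. This is only a sketch under a few assumptions: thumbsToGenerate is a hypothetical helper name, a size is allowed when it is no larger than the smaller of width and height (matching the $gte test above, but applied to both dimensions as the question asks), and it runs on plain objects rather than in MongoDB:

```javascript
// Hypothetical helper: wanted sizes that are not yet generated and that
// fit inside the original image.
function thumbsToGenerate(image, sizes) {
  const have = new Set((image.thumbnails || []).map(t => t.size));
  // Assumption: a thumbnail must fit within both dimensions of the image.
  const limit = Math.min(image.width, image.height);
  return sizes.filter(s => s <= limit && !have.has(s));
}

// A 350x350 image must not get an 800 thumbnail.
const smallImage = {
  name: "small.jpg", width: 350, height: 350,
  thumbnails: [{ name: "small_150.jpg", size: 150 }]
};
console.log(thumbsToGenerate(smallImage, [150, 320, 800])); // [ 320 ]
```

An image for which this returns an empty array needs no work, which corresponds to the image being dropped by the "larger" match in the pipeline.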

OTHER TIPS

I believe the following should work fine for your scenario:

db.mycollection.find( {"images.thumbnails.size": {$not: {"$all": [150, 320, 800]}}  })

This should give you any document that does not have one or more of those thumbnail sizes.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow