Non strict behavior of $nin in mongodb
21-12-2019
Question
Is there a non-strict version of $nin in MongoDB? For example,
let's say that we have a model called User and a model called Task:
var TaskSchema = new Schema({
    user_array: [{user: Schema.ObjectId}],
});
A quick sample would be this:
task1: [user1, user2, user4, user7]
task2: [user2, user5, user7]
If I have a list of users
[user1, user7]
I want to select the task whose user_array overlaps least with that list, in this case task2. I know $nin strictly returns only the tasks that contain neither user1 nor user7, but I would like to know whether there is an operation where $nin is non-strict.
Alternatively, I could write a DP function to do this for me.
Any advice would be appreciated.
Thanks
Solution
Well, in MongoDB version 2.6 and upwards you have the $setIntersection and $size operators available, so you can perform an .aggregate() statement like this:
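db.collection.aggregate([
    // Count how many of the listed users appear in each document's array
    { "$project": {
        "user_array": 1,
        "size": { "$size": {
            "$setIntersection": [ "$user_array", [ "user1", "user7" ] ]
        }}
    }},
    // Drop documents that contain none of the listed users
    { "$match": { "size": { "$gte": 1 } } },
    // Fewest matches first
    { "$sort": { "size": 1 } },
    // Keep only the first (least overlapping) document
    { "$group": {
        "_id": null,
        "user_array": { "$first": "$user_array" }
    }}
])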
So those operators help to reduce the steps required to find the least matching document.
Basically, $setIntersection returns the elements of the array that also appear in the list it is being compared with, and the $size operator returns the length of that resulting array. The $match stage then filters out any documents in which none of the listed users were found.
Finally, you just sort and return the document with the fewest matches.
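To make the steps concrete, here is a minimal in-memory sketch of the same logic in plain JavaScript. It is not MongoDB API code: `leastOverlapping` is a hypothetical helper name, and the sample data assumes the users are stored as plain strings, as in the question's example.

```javascript
// In-memory sketch of the pipeline's logic; sample data from the question.
const tasks = [
  { _id: "task1", user_array: ["user1", "user2", "user4", "user7"] },
  { _id: "task2", user_array: ["user2", "user5", "user7"] },
];

// Hypothetical helper: returns the task with the fewest (but at least one)
// users in common with `wanted`, or null if no task overlaps at all.
function leastOverlapping(tasks, wanted) {
  const wantedSet = new Set(wanted);
  return tasks
    // like $project: size of the set intersection (duplicates count once)
    .map((task) => ({
      task,
      size: new Set(task.user_array.filter((u) => wantedSet.has(u))).size,
    }))
    // like $match: keep only tasks with at least one matching user
    .filter((entry) => entry.size >= 1)
    // like $sort ascending on size, then $first
    .sort((a, b) => a.size - b.size)
    .map((entry) => entry.task)[0] ?? null;
}

const result = leastOverlapping(tasks, ["user1", "user7"]);
console.log(result._id); // task2
```

Here task1 matches two of the listed users and task2 matches one, so task2 comes out first after sorting.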
But it can still be done in earlier versions with some more steps. Basically, your "non-strict" implementation becomes an $or condition, but of course you still need to count the matches:
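db.collection.aggregate([
    // Keep a copy of the full array in _id so it survives the $unwind
    { "$project": {
        "_id": {
            "_id": "$_id",
            "user_array": "$user_array"
        },
        "user_array": 1
    }},
    // One document per array element
    { "$unwind": "$user_array" },
    // Keep only the elements that are in the list
    { "$match": {
        "$or": [
            { "user_array": "user1" },
            { "user_array": "user7" }
        ]
    }},
    // Count the matches per original document
    { "$group": {
        "_id": "$_id",
        "size": { "$sum": 1 }
    }},
    { "$sort": { "size": 1 } },
    // Return the array of the document with the fewest matches
    { "$group": {
        "_id": null,
        "user_array": { "$first": "$_id.user_array" }
    }}
])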
And that would do the same thing.
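As a rough illustration of the unwind-and-count approach, the same steps can be sketched in plain JavaScript. Again, this is only an in-memory analogy, not MongoDB API code: `leastOverlappingUnwind` is a hypothetical name and the users are assumed to be plain strings.

```javascript
// In-memory sketch of the unwind/match/group steps; sample data from the question.
const tasks = [
  { _id: "task1", user_array: ["user1", "user2", "user4", "user7"] },
  { _id: "task2", user_array: ["user2", "user5", "user7"] },
];

// Hypothetical helper mirroring the pre-2.6 pipeline stage by stage.
function leastOverlappingUnwind(tasks, wanted) {
  // like $unwind: one row per (task, user) pair
  const rows = tasks.flatMap((task) =>
    task.user_array.map((user) => ({ task, user }))
  );
  // like $match with $or: keep rows whose user is in the list
  const matched = rows.filter((row) => wanted.includes(row.user));
  // like $group with $sum: count the matching rows per task
  const counts = new Map();
  for (const { task } of matched) {
    counts.set(task, (counts.get(task) || 0) + 1);
  }
  // like $sort ascending on the count, then $first
  const sorted = [...counts.entries()].sort((a, b) => a[1] - b[1]);
  return sorted.length > 0 ? sorted[0][0] : null;
}

console.log(leastOverlappingUnwind(tasks, ["user1", "user7"])._id); // task2
```

One difference worth noting: because this version counts rows after unwinding, a user that appears twice in the same user_array is counted twice, whereas $setIntersection treats the arrays as sets and counts it once.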