Update MongoDB field using value of another field
-
09-10-2019 - |
سؤال
In MongoDB, is it possible to update the value of a field using the value from another field? The equivalent SQL would be something like:
UPDATE Person SET Name = FirstName + ' ' + LastName
And the MongoDB pseudo-code would be:
db.person.update( {}, { $set : { name : firstName + ' ' + lastName } );
المحلول
The best way to do this is in version 4.2+ which allows using of aggregation pipeline in the update document and the updateOne
, updateMany
or update
collection method. Note that the latter has been deprecated in most if not all languages drivers.
MongoDB 4.2+
Version 4.2 also introduced the $set
pipeline stage operator which is an alias for $addFields
. I will use $set
here as it maps with what we are trying to achieve.
db.collection.<update method>(
{},
[
{"$set": {"name": { "$concat": ["$firstName", " ", "$lastName"]}}}
]
)
MongoDB 3.4+
In 3.4+ you can use $addFields
and the $out
aggregation pipeline operators.
db.collection.aggregate(
[
{ "$addFields": {
"name": { "$concat": [ "$firstName", " ", "$lastName" ] }
}},
{ "$out": "collection" }
]
)
Note that this does not update your collection but instead replace the existing collection or create a new one. Also for update operations that require "type casting" you will need client side processing, and depending on the operation, you may need to use the find()
method instead of the .aggreate()
method.
MongoDB 3.2 and 3.0
The way we do this is by $project
ing our documents and use the $concat
string aggregation operator to return the concatenated string.
we From there, you then iterate the cursor and use the $set
update operator to add the new field to your documents using bulk operations for maximum efficiency.
Aggregation query:
var cursor = db.collection.aggregate([
{ "$project": {
"name": { "$concat": [ "$firstName", " ", "$lastName" ] }
}}
])
MongoDB 3.2 or newer
from this, you need to use the bulkWrite
method.
var requests = [];
cursor.forEach(document => {
requests.push( {
'updateOne': {
'filter': { '_id': document._id },
'update': { '$set': { 'name': document.name } }
}
});
if (requests.length === 500) {
//Execute per 500 operations and re-init
db.collection.bulkWrite(requests);
requests = [];
}
});
if(requests.length > 0) {
db.collection.bulkWrite(requests);
}
MongoDB 2.6 and 3.0
From this version you need to use the now deprecated Bulk
API and its associated methods.
var bulk = db.collection.initializeUnorderedBulkOp();
var count = 0;
cursor.snapshot().forEach(function(document) {
bulk.find({ '_id': document._id }).updateOne( {
'$set': { 'name': document.name }
});
count++;
if(count%500 === 0) {
// Excecute per 500 operations and re-init
bulk.execute();
bulk = db.collection.initializeUnorderedBulkOp();
}
})
// clean up queues
if(count > 0) {
bulk.execute();
}
MongoDB 2.4
cursor["result"].forEach(function(document) {
db.collection.update(
{ "_id": document._id },
{ "$set": { "name": document.name } }
);
})
نصائح أخرى
You should iterate through. For your specific case:
db.person.find().snapshot().forEach(
function (elem) {
db.person.update(
{
_id: elem._id
},
{
$set: {
name: elem.firstname + ' ' + elem.lastname
}
}
);
}
);
Apparently there is a way to do this efficiently since MongoDB 3.4, see styvane's answer.
Obsolete answer below
You cannot refer to the document itself in an update (yet). You'll need to iterate through the documents and update each document using a function. See this answer for an example, or this one for server-side eval()
.
For a database with high activity, you may run into issues where your updates affect actively changing records and for this reason I recommend using snapshot()
db.person.find().snapshot().forEach( function (hombre) {
hombre.name = hombre.firstName + ' ' + hombre.lastName;
db.person.save(hombre);
});
http://docs.mongodb.org/manual/reference/method/cursor.snapshot/
I tried the above solution but I found it unsuitable for large amounts of data. I then discovered the stream feature:
MongoClient.connect("...", function(err, db){
var c = db.collection('yourCollection');
var s = c.find({/* your query */}).stream();
s.on('data', function(doc){
c.update({_id: doc._id}, {$set: {name : doc.firstName + ' ' + doc.lastName}}, function(err, result) { /* result == true? */} }
});
s.on('end', function(){
// stream can end before all your updates do if you have a lot
})
})
Regarding this answer, the snapshot function is deprecated in version 3.6, according to this update. So, on version 3.6 and above, it is possible to perform the operation this way:
db.person.find().forEach(
function (elem) {
db.person.update(
{
_id: elem._id
},
{
$set: {
name: elem.firstname + ' ' + elem.lastname
}
}
);
}
);
Here's what we came up with for copying one field to another for ~150_000 records. It took about 6 minutes, but is still significantly less resource intensive than it would have been to instantiate and iterate over the same number of ruby objects.
js_query = %({
$or : [
{
'settings.mobile_notifications' : { $exists : false },
'settings.mobile_admin_notifications' : { $exists : false }
}
]
})
js_for_each = %(function(user) {
if (!user.settings.hasOwnProperty('mobile_notifications')) {
user.settings.mobile_notifications = user.settings.email_notifications;
}
if (!user.settings.hasOwnProperty('mobile_admin_notifications')) {
user.settings.mobile_admin_notifications = user.settings.email_admin_notifications;
}
db.users.save(user);
})
js = "db.users.find(#{js_query}).forEach(#{js_for_each});"
Mongoid::Sessions.default.command('$eval' => js)
Starting Mongo 4.2
, db.collection.update()
can accept an aggregation pipeline, finally allowing the update/creation of a field based on another field:
// { firstName: "Hello", lastName: "World" }
db.collection.update(
{},
[{ $set: { name: { $concat: [ "$firstName", " ", "$lastName" ] } } }],
{ multi: true }
)
// { "firstName" : "Hello", "lastName" : "World", "name" : "Hello World" }
The first part
{}
is the match query, filtering which documents to update (in our case all documents).The second part
[{ $set: { name: { ... } }]
is the update aggregation pipeline (note the squared brackets signifying the use of an aggregation pipeline).$set
is a new aggregation operator and an alias of$addFields
.Don't forget
{ multi: true }
, otherwise only the first matching document will be updated.