That's the way mongoimport works. There's an existing feature request for merge imports, but for now you'll have to write your own import to provide merge behavior.
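As a rough illustration of what such a hand-written merge import has to do, the sketch below combines already-parsed rows from several files into one record per key. The function name `mergeRowsByKey` and the sample rows are hypothetical, and writing the merged records to MongoDB is left out; this is only a sketch of the merge step, not a complete importer.

```javascript
// Hypothetical helper: merge rows from several parsed CSV files by a key field.
// Later row sets add to (or overwrite) fields from earlier ones.
function mergeRowsByKey(rowSets, keyField) {
  const merged = new Map();
  for (const rows of rowSets) {
    for (const row of rows) {
      const key = row[keyField];
      // Start from the record collected so far, or an empty one.
      const existing = merged.get(key) || {};
      merged.set(key, Object.assign(existing, row));
    }
  }
  return Array.from(merged.values());
}

// Sample data mirroring the two CSV files from the question.
const first = [{ key: "key1", firstColumn: "first column", secondColumn: "second column" }];
const second = [{ key: "key1", thirdColumn: "third column" }];

console.log(JSON.stringify(mergeRowsByKey([first, second], "key")));
```

Each merged record could then be written with a single upsert, so no document is replaced by a later file.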
Mongoimport to merge/upsert fields
Question
I'm trying to import and merge multiple CSVs into mongo; however, documents are getting replaced rather than merged.
For example, if I have one.csv:
key1, first column, second column
and two.csv:
key1, third column
I would like to end up with:
key1, first column, second column, third column
But instead I'm getting:
key1,third column
Currently I'm using:
mongoimport.exe --type csv --file first.csv --fields key,firstColumn,secondColumn
mongoimport.exe --type csv --file second.csv --fields key,thirdColumn --upsert --upsertFields key
Solution
Other suggestions
Cross-collection workaround: import the second file into a dummy collection, then run forEach over that collection and use each resulting document to find and update the matching document in your target collection:
mongoimport.exe --collection mycoll --type csv --file first.csv --fields key,firstColumn,secondColumn
mongoimport.exe --collection dummy --type csv --file second.csv --fields key,third
db.dummy.find().forEach(function(doc) {db.mycoll.update({key:doc.key},{$set:{thirdcol:doc.third}})})
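The one-liner above copies a single named field. A minimal sketch of a more general variant follows, assuming you want every field of the staged document except the key and `_id` merged into the target. The helper name `buildMergeUpdate` is hypothetical, and the shell usage is shown only as a comment.

```javascript
// Hypothetical helper: build the filter and $set update for one staged document.
function buildMergeUpdate(doc, keyField) {
  var filter = {};
  filter[keyField] = doc[keyField];
  var fields = {};
  Object.keys(doc).forEach(function (name) {
    // Skip the key itself and the staging collection's own _id.
    if (name !== keyField && name !== "_id") {
      fields[name] = doc[name];
    }
  });
  return { filter: filter, update: { $set: fields } };
}

// In the mongo shell this could be used as follows (not run here):
//   db.dummy.find().forEach(function (doc) {
//     var op = buildMergeUpdate(doc, "key");
//     db.mycoll.update(op.filter, op.update);
//   });

var op = buildMergeUpdate({ _id: 1, key: "key1", third: "third column" }, "key");
console.log(JSON.stringify(op));
```

Note that a staged document containing only the key would produce an empty `$set`, which MongoDB rejects, so such documents would need to be skipped.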
That's correct: mongoimport --upsert replaces full documents. You can achieve your goal by importing into a temporary collection and using the following Gist.
Load the script to Mongo Shell and run:
mergeCollections("srcCollectionName", "destCollectionName", {}, ["thirdColl"]);
I just had a very similar problem. There is a node module for mongo, and jline is my command-line node tool for stream processing JSON lines. So:
echo '{"page":"index.html","hour":"2015-09-18T21:00:00Z","visitors":1001}' |\
jline-foreach \
'beg::dp=require("bluebird").promisifyAll(require("mongodb").MongoClient).connectAsync("mongodb://localhost:27017/nginx")' \
'dp.then(function(db){
updates = {}
updates["visitors.hour."+record.hour] = record.visitors;
db.collection("pagestats").update({_id:record.page},{$set:updates},{upsert:true});});' \
'end::dp.then(function(db){db.close()})'
In your case you'd have to convert from CSV to JSON lines first by piping it through jline-csv2jl. That converts each CSV line into a dictionary with names taken from the header.
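To make the CSV-to-JSON-lines step concrete, here is a naive sketch of that kind of conversion. It is not jline's implementation; the function name `csvToJsonLines` is hypothetical, and the parser assumes simple input with no quoted fields or embedded commas.

```javascript
// Naive CSV-to-JSON-lines conversion: the first line is the header, and each
// later line becomes one JSON object keyed by the header names.
// Assumes no quoted fields or embedded commas.
function csvToJsonLines(csvText) {
  const lines = csvText.split(/\r?\n/).filter(function (line) {
    return line.trim() !== "";
  });
  if (lines.length === 0) return [];
  const header = lines[0].split(",").map(function (name) {
    return name.trim();
  });
  return lines.slice(1).map(function (line) {
    const values = line.split(",");
    const record = {};
    header.forEach(function (name, i) {
      // Missing trailing values become empty strings.
      record[name] = values[i] === undefined ? "" : values[i].trim();
    });
    return JSON.stringify(record);
  });
}

const jsonLines = csvToJsonLines("key,thirdColumn\nkey1,third column\n");
console.log(jsonLines.join("\n"));
```

Each output line is then a standalone JSON record that a per-line tool can process.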
I have added this example to the manual: https://github.com/bitdivine/jline/blob/master/bin/foreach.md
I haven't used jline with promises much but so far it's OK.
Disclaimer: I am the author of jline.