Question

I'm currently in the process of transferring over a million records from SQL Server to MongoDB (into a completely different schema).

I've written a Java application that reads data from multiple SQL tables, combines it into a single new MongoDB document, and uploads it. It's very efficient and the application itself uses almost no resources.
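For context, the per-record flow looks roughly like the sketch below (table, column, and collection names are simplified placeholders, not our real schema, and I've shown two of the three queries):

```java
import com.mongodb.client.MongoCollection;
import org.bson.Document;

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.util.ArrayList;
import java.util.List;

public class RecordMigrator {

    // Reads related rows from a couple of SQL tables over JDBC, merges them
    // into one BSON document, and inserts it into MongoDB.
    public static void migrateOne(Connection sql,
                                  MongoCollection<Document> target,
                                  long customerId) throws Exception {
        Document doc = new Document("_id", customerId);

        // Query 1: the base record (placeholder table/columns)
        try (PreparedStatement ps = sql.prepareStatement(
                "SELECT name, email FROM customers WHERE id = ?")) {
            ps.setLong(1, customerId);
            try (ResultSet rs = ps.executeQuery()) {
                if (rs.next()) {
                    doc.append("name", rs.getString("name"))
                       .append("email", rs.getString("email"));
                }
            }
        }

        // Query 2: related rows folded into an embedded array (placeholder table)
        try (PreparedStatement ps = sql.prepareStatement(
                "SELECT total FROM orders WHERE customer_id = ?")) {
            ps.setLong(1, customerId);
            try (ResultSet rs = ps.executeQuery()) {
                List<Document> orders = new ArrayList<>();
                while (rs.next()) {
                    orders.add(new Document("total", rs.getDouble("total")));
                }
                doc.append("orders", orders);
            }
        }

        // Upload the assembled document to the new schema
        target.insertOne(doc);
    }
}
```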

However, since there are around a million records to transfer and we want to minimize downtime (24 hours of downtime during the transfer is unacceptable), we're forced to multi-thread the application. Running with about 200 threads (each doing 3 queries to compile a single BSON document), SQL Server quickly peaks at 100% CPU usage and blocks newly created threads.
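The threading setup is essentially a fixed pool of workers, each building one document at a time. Roughly (the ID source and connection/collection helpers are placeholders):

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class MigrationRunner {
    public static void main(String[] args) throws Exception {
        // ~200 worker threads, each running the 3-query assembly per record
        ExecutorService pool = Executors.newFixedThreadPool(200);

        for (long id : loadAllRecordIds()) {       // placeholder: however IDs are enumerated
            pool.submit(() -> {
                try {
                    RecordMigrator.migrateOne(borrowSqlConnection(), mongoCollection(), id);
                } catch (Exception e) {
                    e.printStackTrace();
                }
            });
        }

        pool.shutdown();
        pool.awaitTermination(24, TimeUnit.HOURS);
    }

    // Placeholders for brevity; in the real application these come from a
    // connection pool and the MongoDB client.
    static Iterable<Long> loadAllRecordIds() { throw new UnsupportedOperationException(); }
    static java.sql.Connection borrowSqlConnection() { throw new UnsupportedOperationException(); }
    static com.mongodb.client.MongoCollection<org.bson.Document> mongoCollection() { throw new UnsupportedOperationException(); }
}
```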

My question: What is the best way to prevent this high CPU use?

We already have indexes on everything that needs them, and under normal load the slowest SQL query completes in 0.0005 ms.

In my eyes, it's a waste to shard SQL Server when we're about to switch off of it, and reducing the number of threads really isn't an option. Could copying the entire tables into Mongo first and then transforming them into the new schema be a viable alternative? What about moving the data to SQLite and reading it all from there (to avoid the CPU bottleneck)?

Thanks for all the help!

No correct solution
