Question

If I move 300,000 rows from a table (either InnoDB or NDB) into an NDB table like so:

INSERT INTO ndbtable2
SELECT a, b,IFNULL(c,UTC_TIMESTAMP()),CASE WHEN b = 'X' THEN e ELSE d END,f
FROM ndbtable1;

Query OK, 308372 rows affected (5 min 12.59 sec)
Records: 308372  Duplicates: 0  Warnings: 0

ndb uses an increasingly huge amount of data memory until it completes. Before / peak / after as follows:

ndb_mgm -e "all report memoryusage"
Connected to Management Server at: fl-prd-mysql1:1186
Node 1: Data usage is 2%(5752 32K pages of total 262144)
Node 1: Index usage is 0%(2428 8K pages of total 262176)
Node 2: Data usage is 2%(5752 32K pages of total 262144)
Node 2: Index usage is 0%(2428 8K pages of total 262176)

Connected to Management Server at: fl-prd-mysql1:1186
Node 1: Data usage is 62%(164013 32K pages of total 262144)
Node 1: Index usage is 1%(3136 8K pages of total 262176)
Node 2: Data usage is 62%(164013 32K pages of total 262144)
Node 2: Index usage is 1%(3136 8K pages of total 262176)

Connected to Management Server at: fl-prd-mysql1:1186
Node 1: Data usage is 3%(10293 32K pages of total 262144)
Node 1: Index usage is 1%(4590 8K pages of total 262176)
Node 2: Data usage is 3%(10293 32K pages of total 262144)
Node 2: Index usage is 1%(4590 8K pages of total 262176)

If my math is right, the insert adds 10293 - 5752 = 4541 pages (142 MB), but memory spikes by 164013 - 5752 = 158261 pages (4945 MB).

Also, if I limit the insert to 50,000 rows, the before/after difference is only 3 MB, but the spike was 780 MB.
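The page arithmetic above can be double-checked directly; DataMemory is reported in 32 KB pages by "all report memoryusage", so the conversion is just a multiply:

```python
# Convert NDB DataMemory page counts (32 KB pages) to megabytes,
# using the page counts from the memoryusage output above.
PAGE_KB = 32  # DataMemory page size shown in the ndb_mgm report

def pages_to_mb(pages):
    """Return the size in MB of the given number of 32 KB pages."""
    return pages * PAGE_KB / 1024

net_mb = pages_to_mb(10293 - 5752)     # pages actually retained after the insert
peak_mb = pages_to_mb(164013 - 5752)   # transient pages consumed at the peak

print(f"net: {net_mb:.0f} MB, peak spike: {peak_mb:.0f} MB")
```

So the transaction transiently consumes roughly 35x the memory of the rows it actually stores.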

Obviously this will be a problem when ndb isn't empty... what is happening here?!


Solution

In the absence of an explanation, the manual does acknowledge the limitation as follows:

As noted elsewhere in this chapter, MySQL Cluster does not handle large transactions well; it is better to perform a number of small transactions with a few operations each than to attempt a single large transaction containing a great many operations. Among other considerations, large transactions require very large amounts of memory.

http://dev.mysql.com/doc/refman/5.5/en/mysql-cluster-limitations-transactions.html

"As noted elsewhere in this chapter" ... I have looked but yet to find anything specific.

FYI the alternatives are:

  1. Loop through iterations with a row LIMIT (say 10,000)
  2. If inserting into an empty table, consider LOAD DATA, or even use InnoDB and finally ALTER TABLE ... ENGINE = ndbcluster (neither is transactional)
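A minimal sketch of option 1, assuming ndbtable1 has an integer primary key `id` (not shown in the question) so each batch can resume where the previous one left off; each INSERT ... SELECT then commits as its own small transaction, keeping the per-transaction memory bounded:

```sql
-- Batched copy: each iteration is its own small transaction.
-- Assumes a hypothetical integer primary key `id` on ndbtable1.
DELIMITER //
CREATE PROCEDURE copy_in_batches()
BEGIN
  DECLARE last_id INT DEFAULT 0;
  DECLARE next_id INT DEFAULT 0;
  DECLARE done INT DEFAULT 0;
  WHILE done = 0 DO
    -- Find the upper key boundary of the next batch of 10,000 rows.
    SELECT IFNULL(MAX(id), last_id) INTO next_id
      FROM (SELECT id FROM ndbtable1
            WHERE id > last_id ORDER BY id LIMIT 10000) AS batch;
    IF next_id = last_id THEN
      SET done = 1;  -- no rows left
    ELSE
      INSERT INTO ndbtable2
      SELECT a, b, IFNULL(c, UTC_TIMESTAMP()),
             CASE WHEN b = 'X' THEN e ELSE d END, f
      FROM ndbtable1
      WHERE id > last_id AND id <= next_id;
      SET last_id = next_id;
    END IF;
  END WHILE;
END //
DELIMITER ;

CALL copy_in_batches();
```

The keyset (`id > last_id ... LIMIT 10000`) approach avoids the growing OFFSET cost of a plain LIMIT/OFFSET loop; the same loop could equally be driven from application code.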
Licensed under: CC-BY-SA with attribution