Question

I have prepared the following SQL statements to compare the performance of MyISAM, InnoDB, and TokuDB (each INSERT is executed 100,000 times):

MyISAM:

CREATE TABLE `testtable_myisam` (`id` bigint(20) NOT NULL AUTO_INCREMENT, `value1` INT DEFAULT NULL, `value2` INT DEFAULT NULL, PRIMARY KEY (`id`), KEY `index1` (`value1`)) ENGINE=MyISAM DEFAULT CHARSET=utf8;

INSERT INTO `testtable_myisam` (`value1`, `value2`) VALUES (FLOOR(RAND() * 1000), FLOOR(RAND() * 1000)); 

InnoDB:

CREATE TABLE `testtable_innodb` (`id` bigint(20) NOT NULL AUTO_INCREMENT, `value1` INT DEFAULT NULL, `value2` INT DEFAULT NULL, PRIMARY KEY (`id`), KEY `index1` (`value1`)) ENGINE=InnoDB DEFAULT CHARSET=utf8;

INSERT INTO `testtable_innodb` (`value1`, `value2`) VALUES (FLOOR(RAND() * 1000), FLOOR(RAND() * 1000));

TokuDB:

CREATE TABLE `testtable_tokudb` (`id` bigint(20) NOT NULL AUTO_INCREMENT, `value1` INT DEFAULT NULL, `value2` INT DEFAULT NULL, PRIMARY KEY (`id`), KEY `index1` (`value1`)) ENGINE=TokuDB DEFAULT CHARSET=utf8;

INSERT INTO `testtable_tokudb` (`value1`, `value2`) VALUES (FLOOR(RAND() * 1000), FLOOR(RAND() * 1000));

Initially, INSERTs on InnoDB were almost 50 times slower than on MyISAM, and INSERTs on TokuDB were about 40 times slower than on MyISAM.

Then I found that setting "innodb-flush-log-at-trx-commit=2" on InnoDB makes its INSERT performance similar to MyISAM's.

The question is, what should I do for TokuDB? I bet the poor INSERT performance of TokuDB is also caused by some improper setting, but I cannot figure out which one.

--------- UPDATE ---------

Thanks to tmcallaghan's comments, I have changed my setting to "tokudb_commit_sync=OFF", and the insert rate of TokuDB on a small dataset now seems reasonable (I will run the tests on a large dataset once I figure out the following problem).

However, the SELECT performance of TokuDB is still weird compared to MyISAM and InnoDB with the following SQL (where the ? is replaced with a different integer by my simulator):

SELECT id, value1, value2 FROM testtable_myisam WHERE value1=?; 
SELECT id, value1, value2 FROM testtable_innodb WHERE value1=?; 
SELECT id, value1, value2 FROM testtable_tokudb WHERE value1=?; 

On a dataset of one million rows, 10k SELECT statements take about 10 and 15 seconds on MyISAM and InnoDB respectively, but about 40 seconds on TokuDB.

Did I miss some other settings?

Thanks in advance!


Solution 2

Transactional engines are slower because they force the hard disk to confirm that it has written the data. For an HDD to write data, it has to position the head over the magnetic platter and stream the data. Each transaction means that the disk positions the head over the platter, writes the data, and tells the OS that it is there for sure.

The reason transactional engines do that is so they can conform to the D (durability) part of ACID. They ensure that the data you wanted written is, in fact, written permanently. MyISAM doesn't do that.

Thus, the speed of inserts is proportional to the number of input/output operations per second (IOPS) of the disk. That also means that if you wrap several queries in one transaction, you can exploit the write bandwidth of the drive. It also implies that drives with high IOPS commit transactions faster (SSDs, for example, have 40+ thousand IOPS, while mechanical drives range at about 250 - 300, but don't take my word for exact numbers).
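
As a back-of-the-envelope illustration of the point above, here is a minimal Python sketch that estimates how long a run of inserts takes if every commit costs exactly one synchronous disk operation. The function name and the one-operation-per-commit model are assumptions made for illustration, and the IOPS figures are the approximate ones quoted above, not measurements.

```python
def estimated_insert_seconds(rows, iops, rows_per_commit=1):
    """Rough estimate, assuming each commit costs one synchronous disk operation."""
    if rows_per_commit < 1:
        raise ValueError("rows_per_commit must be at least 1")
    commits = -(-rows // rows_per_commit)  # ceiling division
    return commits / iops


rows = 100_000
for label, iops in (("mechanical disk, ~250 IOPS", 250), ("SSD, ~40,000 IOPS", 40_000)):
    for batch in (1, 1_000):
        seconds = estimated_insert_seconds(rows, iops, batch)
        print(f"{label}, {batch} row(s) per commit: ~{seconds:g} s")
```

Under this simplified model, 100,000 single-row commits on a ~250 IOPS disk come out at roughly 400 seconds, while the same rows grouped 1,000 per commit need only 100 flushes. Real engines add other costs (index maintenance, log writes, caching), so treat the numbers as orders of magnitude only.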

Long story short, if you want really fast inserts using transactional engines - wrap multiple queries in a single transaction. All the "optimizations" you do are slightly violating the D part of ACID, because the engines will try to exploit various fast memories lying around that can be used as buffers. That means, if something goes wrong, such as you lose power - kiss your data goodbye.
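
To make "wrap multiple queries in a single transaction" concrete, here is a small Python sketch that generates the question's INSERT statement in batches, each wrapped in START TRANSACTION / COMMIT. The helper name and the batch size are hypothetical; the INSERT text and table name come from the question, and the generated statements would still have to be sent to the server by whatever client the benchmark uses.

```python
def batched_insert_script(table, total_rows, rows_per_transaction):
    """Return SQL statements that insert total_rows rows, committing once per batch."""
    if rows_per_transaction < 1:
        raise ValueError("rows_per_transaction must be at least 1")
    insert = (
        f"INSERT INTO `{table}` (`value1`, `value2`) "
        "VALUES (FLOOR(RAND() * 1000), FLOOR(RAND() * 1000));"
    )
    statements = []
    remaining = total_rows
    while remaining > 0:
        batch = min(rows_per_transaction, remaining)
        statements.append("START TRANSACTION;")
        statements.extend([insert] * batch)
        statements.append("COMMIT;")  # one durable flush per batch, not per row
        remaining -= batch
    return statements


# Example: 5 rows in transactions of at most 2 rows each gives 3 transactions.
print("\n".join(batched_insert_script("testtable_tokudb", 5, 2)))
```

With a batch size of, say, 1,000, the 100,000 inserts from the question would need only 100 commits, and durability is kept because every COMMIT is still flushed to disk.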

Also, the tests you ran are not very meaningful because they are small-scale. Both InnoDB and especially TokuDB are designed to hold hundreds of gigabytes of data and to offer linear performance.

OTHER TIPS

This doesn't sound like a very interesting test (100,000 rows is not a lot, and your insertions are not concurrent), but here is the setting you are looking for.

Issuing "set tokudb_commit_sync=0;" will turn off fsync() on commit operations. Note that there are no durability guarantees in this mode.

As I mentioned earlier, TokuDB's strength is indexing data that is significantly larger than RAM, and this test does not exercise that.

I have updated my.cnf to the settings below, and the overall performance now looks better.

For 10k SELECTs, MyISAM takes 4 seconds, InnoDB takes 5 seconds, and TokuDB takes 8 seconds. So can I conclude that, under the configuration below, TokuDB behaves similarly to (even if not necessarily better than) MyISAM and InnoDB?

I am also curious: there are lots of performance comparisons of InnoDB vs. TokuDB, but hardly any of MyISAM vs. TokuDB. Why?


tokudb_commit_sync=0

max_allowed_packet = 1M
table_open_cache = 128
read_buffer_size = 2M
read_rnd_buffer_size = 8M
myisam_sort_buffer_size = 64M
thread_cache_size = 8
query_cache_size = 32M
thread_concurrency = 8

innodb_flush_log_at_trx_commit=2
innodb_buffer_pool_size = 2G
innodb_additional_mem_pool_size = 20M
innodb_log_buffer_size = 8M
innodb_lock_wait_timeout = 50
Licensed under: CC-BY-SA with attribution