Question

I have a table with about 8.5m rows in it. The table is tokudb and it has the indexes described below. I'm experiencing dismal performance when trying to run update statements like the following:

 update retail.lw_item_discovery 
 set price = 'X', 
     prev_price = 'Y', 
     last_updated = '2016-04-13', 
     last_price_change = '2016-04-13' 
 where market = 'XX' 
   and sku = '123456'

It takes upwards of 40 seconds to perform this update. There are other updates like this happening frequently, but the I/O subsystem of this machine is not being stressed in the least (raided SSDs) and there is plenty of RAM available as well.

EXPLAIN yields:

+----+-------------+-------------------+------------+-------+------------------------------------------------------------+---------+---------+------+------+----------+------------------------------+
| id | select_type | table             | partitions | type  | possible_keys                                              | key     | key_len | ref  | rows | filtered | Extra                        |
+----+-------------+-------------------+------------+-------+------------------------------------------------------------+---------+---------+------+------+----------+------------------------------+
|  1 | UPDATE      | lw_item_discovery | NULL       | index | cl_unique_idx,cl_mkt_sku_upd_avail_idx,market_sku_item_idx | PRIMARY | 4       | NULL |  100 |   100.00 | Using where; Using temporary |
+----+-------------+-------------------+------------+-------+------------------------------------------------------------+---------+---------+------+------+----------+------------------------------+
1 row in set (0.00 sec)

Based on this - it's picking the PRIMARY index instead of one of the others which such as cl_unique_idx which have both columns in the where statement in the first two positions. So I'm stumped at why the planner is choosing the PRIMARY instead and causing the performance to be so poor. Below are a list of the indexes:

+-------------------+------------+--------------------------+--------------+-----------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table             | Non_unique | Key_name                 | Seq_in_index | Column_name     | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+-------------------+------------+--------------------------+--------------+-----------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| lw_item_discovery |          0 | PRIMARY                  |            1 | itd_id          | A         |          82 |     NULL | NULL   |      | BTREE      |         |               |
| lw_item_discovery |          0 | cl_unique_idx            |            1 | sku             | A         |          82 |     NULL | NULL   |      | BTREE      |         |               |
| lw_item_discovery |          0 | cl_unique_idx            |            2 | market          | A         |          82 |     NULL | NULL   |      | BTREE      |         |               |
| lw_item_discovery |          0 | cl_unique_idx            |            3 | upc             | A         |          82 |     NULL | NULL   |      | BTREE      |         |               |
| lw_item_discovery |          0 | cl_unique_idx            |            4 | model_num       | A         |          82 |     NULL | NULL   |      | BTREE      |         |               |
| lw_item_discovery |          0 | cl_unique_idx            |            5 | item_id         | A         |          82 |     NULL | NULL   | YES  | BTREE      |         |               |
| lw_item_discovery |          1 | update_idx               |            1 | last_updated    | A         |          82 |     NULL | NULL   |      | BTREE      |         |               |
| lw_item_discovery |          1 | update_idx               |            2 | market          | A         |          82 |     NULL | NULL   |      | BTREE      |         |               |
| lw_item_discovery |          1 | update_idx               |            3 | sku             | A         |          82 |     NULL | NULL   |      | BTREE      |         |               |
| lw_item_discovery |          1 | description_idc          |            1 | web_description | A         |          82 |     NULL | NULL   |      | BTREE      |         |               |
| lw_item_discovery |          1 | category_idx             |            1 | web_category    | A         |          82 |     NULL | NULL   |      | BTREE      |         |               |
| lw_item_discovery |          1 | category_idx             |            2 | upc             | A         |          82 |     NULL | NULL   |      | BTREE      |         |               |
| lw_item_discovery |          1 | category_idx             |            3 | sku             | A         |          82 |     NULL | NULL   |      | BTREE      |         |               |
| lw_item_discovery |          1 | upc_idx                  |            1 | upc             | A         |          82 |     NULL | NULL   |      | BTREE      |         |               |
| lw_item_discovery |          1 | item_id_idx              |            1 | item_id         | A         |          82 |     NULL | NULL   | YES  | BTREE      |         |               |
| lw_item_discovery |          1 | item_id_idx              |            2 | market          | A         |          82 |     NULL | NULL   |      | BTREE      |         |               |
| lw_item_discovery |          1 | item_id_idx              |            3 | available       | A         |          82 |     NULL | NULL   |      | BTREE      |         |               |
| lw_item_discovery |          1 | cl_mkt_sku_upd_avail_idx |            1 | sku             | A         |          82 |     NULL | NULL   |      | BTREE      |         |               |
| lw_item_discovery |          1 | cl_mkt_sku_upd_avail_idx |            2 | market          | A         |          82 |     NULL | NULL   |      | BTREE      |         |               |
| lw_item_discovery |          1 | cl_mkt_sku_upd_avail_idx |            3 | last_updated    | A         |          82 |     NULL | NULL   |      | BTREE      |         |               |
| lw_item_discovery |          1 | cl_mkt_sku_upd_avail_idx |            4 | available       | A         |          82 |     NULL | NULL   |      | BTREE      |         |               |
| lw_item_discovery |          1 | market_sku_item_idx      |            1 | market          | A         |          82 |     NULL | NULL   |      | BTREE      |         |               |
| lw_item_discovery |          1 | market_sku_item_idx      |            2 | sku             | A         |          82 |     NULL | NULL   |      | BTREE      |         |               |
| lw_item_discovery |          1 | market_sku_item_idx      |            3 | item_id         | A         |          82 |     NULL | NULL   | YES  | BTREE      |         |               |
+-------------------+------------+--------------------------+--------------+-----------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
24 rows in set (0.00 sec)

I've had to increase the tokudb_lock_timeout from 4sec to 40sec in order to not have a bunch of lock wait contention. Am I missing something here?

Table definition

`lw_item_discovery` (
  `item_id` bigint(20) unsigned DEFAULT '0',
  `chain` varchar(12) NOT NULL DEFAULT 'lowes',
  `market` varchar(4) NOT NULL DEFAULT '',
  `available` varchar(1) NOT NULL DEFAULT 'y',
  `last_updated` date NOT NULL DEFAULT '0000-00-00',
  `itd_id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `web_description` varchar(255) NOT NULL DEFAULT '',
  `model_num` varchar(100) NOT NULL DEFAULT '' COMMENT 'its only 1char cause its not currently used. Its here for consistency',
  `price` decimal(6,2) NOT NULL DEFAULT '0.00',
  `item_link_url` text NOT NULL,
  `item_img_url` text NOT NULL,
  `store_shopped` smallint(5) unsigned NOT NULL DEFAULT '0',
  `sku` varchar(32) NOT NULL DEFAULT '0',
  `upc` varchar(12) NOT NULL DEFAULT '',
  `web_category` varchar(255) NOT NULL DEFAULT '',
  `mfr` varchar(100) NOT NULL DEFAULT '',
  `class` tinyint(3) unsigned NOT NULL DEFAULT '0',
  `subclass` tinyint(3) unsigned NOT NULL DEFAULT '0',
  `first_found` date NOT NULL DEFAULT '0000-00-00' COMMENT 'first time it was seen in market',
  `last_price_change` date NOT NULL DEFAULT '0000-00-00' COMMENT 'the date of the last price change observed',
  `discontinued` varchar(1) NOT NULL DEFAULT 'n',
  `discontinued_date` date NOT NULL DEFAULT '0000-00-00',
  `prev_price` decimal(6,2) unsigned NOT NULL DEFAULT '0.00',
  `rating` decimal(4,2) NOT NULL DEFAULT '-1.00',
  `review_count` int(11) NOT NULL DEFAULT '-1',
  PRIMARY KEY (`itd_id`),
  UNIQUE KEY `cl_unique_idx` (`sku`,`market`,`upc`,`model_num`,`item_id`),
  KEY `update_idx` (`last_updated`,`market`,`sku`),
  KEY `description_idc` (`web_description`),
  KEY `category_idx` (`web_category`,`upc`,`sku`),
  KEY `upc_idx` (`upc`),
  KEY `item_id_idx` (`item_id`,`market`,`available`) USING BTREE,
  KEY `cl_mkt_sku_upd_avail_idx` (`sku`,`market`,`last_updated`,`available`),
  CLUSTERING KEY `market_sku_item_idx` (`market`,`sku`,`item_id`)
) ENGINE=TokuDB AUTO_INCREMENT=8858224 DEFAULT CHARSET=latin1

The number of rows updated should be 1-3 max for each update. The updates can happen at a rate of probably 1 per second to perhaps 3-4 perhaps up to several dozen per second generally.

This is on Percona Server 5.7.

Was it helpful?

Solution 2

Ultimately - a dump of the table and reload solved the weird cardinality and row count behavior. We tried using analyze table, but that did not fix the problem. George's answer is very well done, but unfortunately it wouldn't solve my problem.

OTHER TIPS

Unlike InnoDB, TokuDB historically did not automatically compute cardinality statistics. As a user you were required to manually run ANALYZE TABLE in order to calculate these values.

All tables and indices created prior to 5.6.27-76.0 would also not maintain accurate row counts. After 5.6.27-76.0, new tables and indices, and tables that had RECOUNT ROWS analysis, would all accurately track row counts. This is very important for cardinality metrics and particularly for cardinality with partitioned tables.

Please see the following documents that describe the analysis changes:

Prior to 5.7.11-4, automatic background analysis was disabled by default. From 5.7.11-4 onward, automatic background analysis is enabled by default when ~30% of the table had been changed (insert/update/delete). You can change this threshold and several other aspects of the analysis by manipulating the various system variables documented in the links above.

The reloading of your data into a server newer than 5.6.27-76.0 would have corrected the inaccurate row counts and the move to 5.7.11-4 would have enabled the automatic background analysis.

If you are going to use TokuDB, you should be sure of your reasons, TokuDB is not 'just better than InnoDB for all loads'. It has specific benefits and tradeoffs and use cases where it will not perform as well as InnoDB and in general is not as mature as InnoDB.

If you need compression, have a heavy insert load, slow storage, or if your data set greatly outsizes your available memory, TokuDB may be a good fit. If you need raw random point query performance, have heavy sequential deletes followed by covering queries, have large char/varchar/blobs (> 32K), have plenty of fast storage (although TokuDB can reduce flash wear), or have a small data set that is a small multiple of physical memory size, TokuDB is probably not for you.

I also now notice that you say you have only 100GB of data but 500GB of memory (with 100GB innodb buffer pool). This is a case where most/all of your data will fit into memory. InnoDB should be the clear performance winner here. TokuDB is not (yet) optimized for in-memory workloads and InnoDB will beat it almost 100% of the time in this situation. Now, if you had 100GB of memory and a TB of data & indexes, then TokuDB would be worth considering.

(I am a Software Engineer at Percona.)

Licensed under: CC-BY-SA with attribution
Not affiliated with dba.stackexchange
scroll top