Question

I have a large table with millions of records. I have to run a count(*) with a certain criterion, and there is no way I can avoid it.

count(*) with InnoDB is very expensive. I have tried various MySQL configurations, all in vain: I can't speed up the count. The application needs the result in less than 1 second because other dependent queries have to run afterwards.

No index seems to help because of the way InnoDB counts.

mysql> EXPLAIN SELECT count(*) FROM `callrequests` WHERE active_call = 1;
+----+-------------+--------------+-------+---------------+-------------+---------+------+---------+--------------------------+
| id | select_type | table        | type  | possible_keys | key         | key_len | ref  | rows    | Extra                    |
+----+-------------+--------------+-------+---------------+-------------+---------+------+---------+--------------------------+
|  1 | SIMPLE      | callrequests | index | NULL          | active_call | 6       | NULL | 5271135 | Using where; Using index |
+----+-------------+--------------+-------+---------------+-------------+---------+------+---------+--------------------------+

mysql> show index from callrequests;
+--------------+------------+------------------------------+--------------+--------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table        | Non_unique | Key_name                     | Seq_in_index | Column_name  | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+--------------+------------+------------------------------+--------------+--------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| callrequests |          0 | PRIMARY                      |            1 | id           | A         |     5271135 |     NULL | NULL   |      | BTREE      |         |               |
| callrequests |          0 | PRIMARY                      |            2 | campaign_id  | A         |     5271135 |     NULL | NULL   |      | BTREE      |         |               |
| callrequests |          0 | unique_contact               |            1 | campaign_id  | A         |        4849 |     NULL | NULL   |      | BTREE      |         |               |
| callrequests |          0 | unique_contact               |            2 | contact_id   | A         |     5271135 |     NULL | NULL   |      | BTREE      |         |               |
| callrequests |          0 | unique_contact               |            3 | contact      | A         |     5271135 |     NULL | NULL   |      | BTREE      |         |               |
| callrequests |          1 | fk_callrequest_campaign1_idx |            1 | campaign_id  | A         |          10 |     NULL | NULL   |      | BTREE      |         |               |
| callrequests |          1 | index4                       |            1 | campaign_id  | A         |        2506 |     NULL | NULL   |      | BTREE      |         |               |
| callrequests |          1 | index4                       |            2 | contact      | A         |     5271135 |     NULL | NULL   |      | BTREE      |         |               |
| callrequests |          1 | phonbook_id_index            |            1 | phonebook_id | A         |          10 |     NULL | NULL   |      | BTREE      |         |               |
| callrequests |          1 | dnc_group_id_index           |            1 | dnc_group_id | A         |           2 |     NULL | NULL   |      | BTREE      |         |               |
| callrequests |          1 | active_call                  |            1 | campaign_id  | A         |          12 |     NULL | NULL   |      | BTREE      |         |               |
| callrequests |          1 | active_call                  |            2 | active_call  | A         |          16 |     NULL | NULL   | YES  | BTREE      |         |               |
| callrequests |          1 | call_status                  |            1 | call_status  | A         |        2518 |     NULL | NULL   |      | BTREE      |         |               |
| callrequests |          1 | call_status                  |            2 | processed    | A         |        2518 |     NULL | NULL   |      | BTREE      |         |               |
| callrequests |          1 | call_status                  |            3 | active_call  | A         |        2518 |     NULL | NULL   | YES  | BTREE      |         |               |
+--------------+------------+------------------------------+--------------+--------------+-----------+-------------+----------+--------+------+------------+---------+---------------+

The server is a Xeon machine with 12 CPU cores and 64 GB RAM, dedicated to Percona Server 5.6.14-62.0.

My innodb_buffer_pool_size is 38 GB and all of the data sits in the InnoDB buffer pool.


Solution

Difference between InnoDB and MyISAM concerning counting

Please note that counting with a WHERE clause is not slower with InnoDB than it would be with MyISAM. Only a bare

SELECT COUNT(*) FROM table

can be computed faster with MyISAM, as this number is stored in MyISAM's table metadata.

If you have a query with a WHERE constraint, for example:

SELECT COUNT(*) FROM table WHERE active_call = 1

the query needs to access the table data in both storage engines and there should be no notable performance difference between MyISAM and InnoDB.

Concerning your specific problem

Note that your query does not use a suitable index. EXPLAIN reports type "index", which means the whole active_call index (about 5.27 million entries) is scanned. This is not because InnoDB "prefers" a full scan, but because no suitable index exists.

You have a combined index (campaign_id, active_call), but active_call is the second column of that index. As long as the first column is not constrained in the query, MySQL cannot seek directly to the matching entries and has to scan the whole index.

What you want for this simple count query is another index on the single column (active_call). The count should then run fast.
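As a minimal sketch of that suggestion, the statements below add a single-column index and re-check the plan. The index name idx_active_call_only is a hypothetical choice, not something from the question, and building the index on a table of this size may take a while.

```sql
-- Hypothetical index name; the column and table come from the question.
ALTER TABLE callrequests
  ADD INDEX idx_active_call_only (active_call);

-- Re-run EXPLAIN: the hope is that the plan now seeks into the new index
-- (for example type "ref") instead of scanning every index entry.
EXPLAIN SELECT COUNT(*) FROM callrequests WHERE active_call = 1;
```

Even with the new index, the count still has to read every matching entry, so the speedup depends on how many rows have active_call = 1.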

OTHER TIPS

I found a way to improve the performance of count(*):

SELECT COUNT(*) FROM table WHERE id > 0;

COUNT(*) for Innodb Tables - Percona Database Performance Blog https://www.percona.com/blog/2006/12/01/count-for-innodb-tables/

So if you have query like SELECT COUNT(*) FROM USER It will be much faster for MyISAM (MEMORY and some others) tables because they would simply read number of rows in the table from stored value. Innodb will however need to perform full table scan or full index scan because it does not have such counter, it also can’t be solved by simple single counter for Innodb tables as different transactions may see different number of rows in the table.

If you have query like SELECT COUNT(*) FROM IMAGE WHERE USER_ID=5 this query will be executed same way both for MyISAM and Innodb tables by performing index range scan. This can be faster or slower both for MyISAM and Innodb depending on various conditions.

So remember Innodb is not slow for ALL COUNT(*) queries but only for very specific case of COUNT(*) query without WHERE clause.

I found that COUNT(a secondary index) was fast as is, but I had to experiment to get MySQL to use the PRIMARY index.

I was able to use the PRIMARY index and get a significant speed increase without the id > 0 condition by using:

SELECT COUNT(*) AS count FROM _test_offset WHERE id IS NOT NULL OR id IS NULL;

I also had to write the query so that the column was referenced only once (for use in a prepared statement in a mysql-rewriter plugin), and found that this similar query works (at a cost of about 10% more time than the query above):

SELECT COUNT(*) AS count FROM _test_offset WHERE CASE WHEN id IS NOT NULL THEN TRUE ELSE TRUE END;

In MySQL 5.7 this works for secondary indices, nullable and non-nullable indices, and unique and non-unique indices. In my simple test case it reduced the time required by a factor of 27.
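One way to check which index each variant actually uses is to compare the EXPLAIN output, as in the rough sketch below. The FORCE INDEX hint in the last statement is an addition of mine, not something tested in this answer; it is standard MySQL index-hint syntax, but whether it gives the same speedup is an assumption to verify on your own data.

```sql
-- Plain count: check the "key" column to see which index the optimizer picks.
EXPLAIN SELECT COUNT(*) AS count FROM _test_offset;

-- The always-true predicate from this answer.
EXPLAIN SELECT COUNT(*) AS count FROM _test_offset
WHERE id IS NOT NULL OR id IS NULL;

-- Assumption: an explicit index hint as an alternative way to request PRIMARY.
EXPLAIN SELECT COUNT(*) AS count FROM _test_offset FORCE INDEX (PRIMARY);
```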

I would have put this as a comment under Mark Khor's answer (since it's heavily derivative), but I don't have sufficient rep.

I had a similar issue. I also had a single int autoincrement primary key column. So I got around the problem by doing this:

select max(id) from table

The suggestion of @mark-khor (WHERE id > 0) would also have worked for me, but I don't understand why it works, so I went with the max(id) value.
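Keep in mind that max(id) only matches the row count when the ids start at 1 and have no gaps, and that it cannot apply a WHERE condition. The small sketch below illustrates the difference on a throwaway table; the table name demo_ids is hypothetical and default auto-increment settings are assumed.

```sql
-- Hypothetical scratch table with an auto-increment primary key.
CREATE TABLE demo_ids (
  id INT AUTO_INCREMENT PRIMARY KEY
) ENGINE=InnoDB;

-- Three rows get ids 1, 2, 3 under default auto-increment settings.
INSERT INTO demo_ids VALUES (NULL), (NULL), (NULL);

-- Deleting a row leaves a gap in the id sequence.
DELETE FROM demo_ids WHERE id = 2;

-- MAX(id) still reports 3, while COUNT(*) reports the 2 remaining rows.
SELECT MAX(id) AS max_id, COUNT(*) AS row_count FROM demo_ids;

DROP TABLE demo_ids;
```

If rows are never deleted and inserts never fail or roll back, the two values stay in step and max(id) is a cheap approximation of the total.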

Licensed under: CC-BY-SA with attribution