Question

First, I want to know how to estimate the database size with respect to the biggest table it will contain. I have the following:

+----------+------------------+------+-----+---------+-------+
| Field    | Type             | Null | Key | Default | Extra |
+----------+------------------+------+-----+---------+-------+
| users_id | int(32) unsigned | NO   | MUL | NULL    |       |
| s        | binary(16)       | NO   | PRI | NULL    |       |
| t        | binary(16)       | NO   | PRI | NULL    |       |
| x        | binary(16)       | NO   | PRI | NULL    |       |
+----------+------------------+------+-----+---------+-------+

This is the only table that will be significant for the size; the other table just holds user data (id, user, pass, email), and I don't expect it to have more than 100 entries.

I expect around 7.61263 * 10^9 entries in the table above. I made this simple calculation: 7.61263 * 10^9 * (4 bytes + 16 bytes + 16 bytes + 16 bytes) ~= 395 GB. But I don't know how to take into account the overhead coming from the DBMS (indexes, database structure, ...).

How can I estimate the database size?

What about the speed and the stability of MySQL running with such a large table? Do I have to split the data over two or more databases in order to reduce the size?

Solution

Since you're asking about both size and speed, yes, Russell is right. You're going to have to actually fill a table with sample data and test representative queries to see how they perform.

For the size issues, you can't always calculate the exact size of the indexes, but see the MySQL documentation for Data Type Storage Requirements and the appropriate documentation for the storage engine you're using for information about row overhead and some guidance on estimating indexes.

(For instance, you're going to hit the 2^32 limit on rows in MyISAM, so you'll have to build with big-tables support.)

As for whether you'll need multiple databases: you shouldn't. If you need to spread it across multiple tables, you can always use partitioning, which should also help if you're pushing up against file-system limits.
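For illustration, a partitioned layout might look something like the sketch below. The table name is made up and the partition count is arbitrary; note that MySQL requires the partitioning columns to be part of every unique key, which is why this partitions on s (part of the primary key) rather than users_id.

    -- Sketch only: "big_table" is a placeholder name and 64 partitions is an
    -- arbitrary choice. KEY partitioning hashes the column value, spreading
    -- rows (and their index entries) across smaller physical pieces.
    ALTER TABLE big_table
        PARTITION BY KEY (s)
        PARTITIONS 64;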

OTHER TIPS

Write a simple loop that generates data and populates the table. Then you can answer all of those questions for yourself much more accurately and precisely. It takes minimal time to do the real-world test.
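As a minimal sketch, assuming the big table is named big_table, a stored procedure along these lines can fill it with random rows (the users_id range and the row count are placeholders; for very large counts, multi-row INSERTs or LOAD DATA INFILE will be much faster than a row-at-a-time loop):

    -- Sketch only: table name, users_id range, and row count are assumptions.
    DELIMITER $$
    CREATE PROCEDURE fill_big_table(IN n BIGINT)
    BEGIN
      DECLARE i BIGINT DEFAULT 0;
      WHILE i < n DO
        -- UNHEX(MD5(...)) produces 16 random-looking bytes per binary(16) column.
        INSERT INTO big_table (users_id, s, t, x)
        VALUES (FLOOR(1 + RAND() * 100),
                UNHEX(MD5(RAND())),
                UNHEX(MD5(RAND())),
                UNHEX(MD5(RAND())));
        SET i = i + 1;
      END WHILE;
    END$$
    DELIMITER ;

    CALL fill_big_table(10000000);   -- e.g. 10 million rows as a test sample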

For MyISAM, add 1 byte per row to the computation you gave; that takes care of the data. For each index, the math goes something like this (a rough worked example follows the list):

  • calculate the field sizes
  • add 6 bytes per row for the pointer into the data file (assuming the default pointer size)
  • multiply by, let's say, 1.5, to account for BTree overhead.
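As a rough illustration with the figures from the question, and assuming the only indexes are the PRIMARY KEY (s, t, x) and a secondary key on users_id (an assumption, since the DESC output doesn't show the full index layout):

    Data:           (52 + 1) bytes/row * 7.61e9 rows   ~= 403 GB
    PK index:       (48 + 6) * 1.5 = 81 bytes/row      ~= 617 GB
    users_id index: (4 + 6)  * 1.5 = 15 bytes/row      ~= 114 GB
    MyISAM total (data + indexes)                      ~= 1.1 TB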

For InnoDB, the math is much messier. The simple answer is to take the data+index size for MyISAM, then multiply by 2 or 3.
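Applying that rule of thumb to the rough MyISAM total above (~1.1 TB), an InnoDB version of the table would land somewhere around 2 to 3.5 TB, so plan disk capacity accordingly.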

Please use SHOW CREATE TABLE, not DESC -- I can't really see your indexes, and I suspect you have a big PK, which adversely impacts any secondary keys.
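For what it's worth, from the DESC output the definition presumably looks something like the guess below (the table name and storage engine are unknown); if so, the 48-byte composite primary key is exactly what would get copied into every secondary index entry under InnoDB.

    -- Guessed reconstruction from the DESC output; the real definition may differ.
    CREATE TABLE big_table (
      users_id INT UNSIGNED NOT NULL,
      s        BINARY(16)   NOT NULL,
      t        BINARY(16)   NOT NULL,
      x        BINARY(16)   NOT NULL,
      PRIMARY KEY (s, t, x),
      KEY (users_id)
    );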

Licensed under: CC-BY-SA with attribution
Not affiliated with dba.stackexchange