Question

I have done a fair amount of research on this, but I am still not satisfied, so I prefer to ask directly.

I'm designing the schema of an application database that is expected to hold tens of millions of records.

Some of these are frequently updated, and some are very rarely updated. Hence, I decided to split the data across different tables according to how often it changes, and JOIN them in queries.

So, is this better than going with a single table (no JOINs)?

Assume it is a Rent-A-Car management system.

Here are the tables:


DESC cars;
+-----------------+---------------------+------+-----+---------+-------+
| Field           | Type                | Null | Key | Default | Extra |
+-----------------+---------------------+------+-----+---------+-------+
| id              | bigint(20) unsigned | NO   | PRI | NULL    | ai    |
| brand           | varchar(40)         | NO   |     | NULL    |       |
| engine_capacity | tinyint(4)          | NO   |     | NULL    |       |
| model           | tinyint(4)          | NO   |     | NULL    |       |
| serialnumber    | bigint(20)          | NO   | UNI | NULL    |       |
+-----------------+---------------------+------+-----+---------+-------+

and

DESC cars_data;
+---------------+----------------------+------+-----+---------+-------+
| Field         | Type                 | Null | Key | Default | Extra |
+---------------+----------------------+------+-----+---------+-------+
| carid         | bigint(20) unsigned  | NO   | PRI | NULL    | ai    |
| car_status    | enum('1','2','3')    | NO   | MUL | 1       |       |
| base_location | point                | NO   | MUL | NULL    |       |
| driver_id     | bigint(20) unsigned  | NO   | MUL | NULL    |       |
+---------------+----------------------+------+-----+---------+-------+

Both of those tables are expected to be very rarely updated, so I used MyISAM for them, for the following reasons:

  • Need to store Point data and use Spatial Index on them.
  • Saving disk space (since MyISAM tables are smaller in size).
  • Easier to maintain (in my experience; is that right?).

Here is the very frequently updated table:

DESC cars_extra;
+---------------+---------------------+------+-----+---------+-------+
| Field         | Type                | Null | Key | Default | Extra |
+---------------+---------------------+------+-----+---------+-------+
| carid         | bigint(20) unsigned | NO   | PRI | NULL    | ai    |
| odometer_km   | tinyint(6) unsigned | NO   |     | NULL    |       |
| last_maint    | bigint(20) unsigned | NO   |     | NULL    |       |
| last_update   | bigint(20) unsigned | NO   |     | NULL    |       |
+---------------+---------------------+------+-----+---------+-------+

I used InnoDB for this last table for one main reason:

  • Avoid table-level locks during updates.

Advice and general notes would be more than appreciated :-)

Also, are joins usually a pro or a con (assuming, of course, properly indexed keys)?

Solution

Simply, use InnoDB for everything. MyISAM is going away in the next release. In almost all cases, InnoDB out-performs MyISAM. About the only advantage with MyISAM is disk space.

With millions of rows, you will appreciate InnoDB's automatic recovery after any crash.

It is rarely wise to make two tables be 1:1. Your suggestion for that is based on one table being constant; the other being frequently modified. But I would suggest that "frequent" is 'many times a second', not 'a few times a day'.

Don't design the schema without sketching out the SELECTs. You have POINT and SPATIAL, but what will you do with such?
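
As a minimal sketch of what such a SELECT might look like, the query below looks for cars based inside a rectangular area. It uses the cars_data table from the question; the coordinates are made up, and treating car_status '1' as "available" is an assumption, since the question does not say what the values mean. MBRContains is the kind of predicate a SPATIAL index can serve.

```sql
-- Hypothetical query: cars whose base_location falls inside a bounding box.
-- The polygon coordinates are illustrative only.
-- Assumption: car_status '1' means "available".
SELECT carid,
       car_status,
       ST_AsText(base_location) AS location
FROM cars_data
WHERE MBRContains(
          ST_GeomFromText('POLYGON((10 10, 10 20, 20 20, 20 10, 10 10))'),
          base_location)
  AND car_status = '1';
```

If no query of this shape is planned, the POINT column and its SPATIAL index may not be needed at all.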

Glancing at the tables...

Use SHOW CREATE TABLE; it is more descriptive than DESCRIBE.

Don't use BIGINT (8 bytes) unless INT (4 bytes) won't suffice. INT UNSIGNED goes up to about 4.29 billion; you would need to hire nearly all the adults in the world to exceed that.

JOIN can be pro or con. But it is rarely a deciding factor.

Use utf8mb4 from the start. (DESCRIBE omits that bit of info.)

Use AUTO_INCREMENT only if there is not a 'reasonable' column (or combination of columns) that make up a 'natural' PRIMARY KEY.

TINYINT UNSIGNED is limited to 255, not enough for odometer. MEDIUMINT UNSIGNED goes up to 16M.
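
For reference, here are the unsigned integer ranges behind the sizing advice in this answer, with a sketch of how the odometer column could be widened. The table and column names come from the question; the ALTER statement is only an illustration of one possible fix.

```sql
-- Unsigned integer types in MySQL: storage size and maximum value
--   TINYINT   UNSIGNED   1 byte    255
--   SMALLINT  UNSIGNED   2 bytes   65,535
--   MEDIUMINT UNSIGNED   3 bytes   16,777,215
--   INT       UNSIGNED   4 bytes   4,294,967,295
--   BIGINT    UNSIGNED   8 bytes   18,446,744,073,709,551,615

-- Widen the odometer column so it can hold realistic kilometre readings.
ALTER TABLE cars_extra
    MODIFY odometer_km MEDIUMINT UNSIGNED NOT NULL;
```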

Do not PARTITION without a good reason for it.
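
Pulling the points above together, a single-table version might look like the sketch below. Column names come from the question. Merging all three tables, shrinking the ids to INT UNSIGNED, and keeping the surrogate id are assumptions, not something the question settles. It also assumes MySQL 5.7 or later, where InnoDB supports SPATIAL indexes, so the POINT column no longer forces MyISAM.

```sql
-- Hypothetical merged table replacing cars, cars_data and cars_extra.
CREATE TABLE cars (
    id              INT UNSIGNED       NOT NULL AUTO_INCREMENT,
    brand           VARCHAR(40)        NOT NULL,
    engine_capacity TINYINT            NOT NULL,
    model           TINYINT            NOT NULL,
    serialnumber    BIGINT             NOT NULL,
    car_status      ENUM('1','2','3')  NOT NULL DEFAULT '1',
    base_location   POINT              NOT NULL,
    driver_id       INT UNSIGNED       NOT NULL,
    odometer_km     MEDIUMINT UNSIGNED NOT NULL,  -- up to 16,777,215 km
    last_maint      BIGINT UNSIGNED    NOT NULL,  -- type kept from the question
    last_update     BIGINT UNSIGNED    NOT NULL,  -- type kept from the question
    PRIMARY KEY (id),
    UNIQUE KEY (serialnumber),
    KEY (car_status),
    KEY (driver_id),
    SPATIAL KEY (base_location)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;
```

Running SHOW CREATE TABLE cars; afterwards shows the engine, character set, and index definitions that DESCRIBE leaves out.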

Licensed under: CC-BY-SA with attribution
Not affiliated with dba.stackexchange