Question

Currently I have one table, and it is getting populated very fast. I have 50 devices, and I gather data from each device every 30 seconds. Therefore, after we add 10,000 devices, they would generate about 876,000,000 records per month, which is a lot!

INSERT INTO unit_data
(`id`,`dt`,`id_unit`,`data1`,`data2`,
`ip`,`unique_id`,`loc_age`,`reason_code`,
`data3`,`data4`,`Odo`,`event_time_gmt_unix`,
`switches`,`on_off`,`data5`)

Here are my keys and relationships:

  PRIMARY KEY (`id`),
  UNIQUE KEY `id_unit_data_UNIQUE` (`id`),
  KEY `fk_gp2` (`id_unit`),
  KEY `unit_dt_id` (`dt`,`id_unit`),
  KEY `unit_id_dt` (`id_unit`,`dt`),
  CONSTRAINT `fk_gp2` FOREIGN KEY (`id_unit`) REFERENCES `unit` (`id_unit`) ON DELETE CASCADE ON UPDATE CASCADE
) ENGINE=InnoDB AUTO_INCREMENT=1049392 DEFAULT CHARSET=utf8$$

I need to run some fairly complex queries and reports, and when I do, our system stops responding and hits the execution timeout (this is with 2 million+ records).

I need to rethink and re-implement the database structure. Currently I am considering one of the following:

  • Create new table for each unit
  • Create new table for each unit for each month

What would you suggest?


Solution

Creating new tables is a nice idea, but you don't need to implement it yourself: MySQL already has such a tool. Search for "MySQL partitioning". I recommend it because you don't need to change your queries; MySQL takes care of that itself. Just add a PARTITION BY clause to your CREATE TABLE statement.
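For example, the table could be partitioned by month on the `dt` column. This is only a sketch: the column types and partition boundaries are assumptions, MySQL requires every unique key to include the partitioning column (so the primary key becomes `(id, dt)`), and partitioned InnoDB tables cannot carry FOREIGN KEY constraints, so `fk_gp2` would have to be dropped or enforced in the application:

```sql
-- Sketch: monthly RANGE partitions on dt (boundaries illustrative).
-- Note: the partitioning column must be part of every unique key,
-- and partitioned InnoDB tables do not support foreign keys.
CREATE TABLE unit_data (
  id INT NOT NULL AUTO_INCREMENT,
  dt DATETIME NOT NULL,
  id_unit INT NOT NULL,
  -- ... remaining data columns from the question ...
  PRIMARY KEY (id, dt),
  KEY unit_id_dt (id_unit, dt)
) ENGINE=InnoDB DEFAULT CHARSET=utf8
PARTITION BY RANGE (TO_DAYS(dt)) (
  PARTITION p2012_01 VALUES LESS THAN (TO_DAYS('2012-02-01')),
  PARTITION p2012_02 VALUES LESS THAN (TO_DAYS('2012-03-01')),
  PARTITION pmax     VALUES LESS THAN MAXVALUE
);
```

A nice side effect of monthly partitions is that old data can be removed with `ALTER TABLE ... DROP PARTITION`, which is far cheaper than a huge `DELETE`.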

One more trick for you: I assume you are gathering a lot of data into one big table and also occasionally selecting from it. But inserting many new rows causes the table to be locked (unavailable for selects) and its indexes to be rebuilt (I'm sure your table is indexed). In my current project I'm doing something similar, and I advise you to do the following:

1) Create a clone of your BIG table. It should have the same structure as the BIG table, with one difference: the clone has no indexes.

2) When you receive data from your devices, put it into the clone table.

3) Write a robot agent that moves records from the small table into the big table every hour or every day. The exact interval is up to you, but the best choice is one that keeps the small table small enough for a full scan (remember, it isn't indexed).

4) When you want to perform a SELECT query, run it against both tables: the indexed BIG table, which is fast enough because nobody inserts into it (only the robot does, occasionally), and a full scan of the small table, which is also fast enough because you keep it small.

5) The robot should wake up during a quiet period, perhaps at night.
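The steps above might be sketched as follows. The staging table name, the `id` column type, the schedule, and the abridged column lists are all assumptions for illustration:

```sql
-- 1) Index-free staging clone that receives the raw device inserts.
--    Dropping the id column also removes the AUTO_INCREMENT primary key
--    and its unique index; the secondary keys are dropped explicitly.
CREATE TABLE unit_data_staging LIKE unit_data;
ALTER TABLE unit_data_staging
  DROP COLUMN id,
  DROP KEY fk_gp2,
  DROP KEY unit_dt_id,
  DROP KEY unit_id_dt;

-- 3+5) Robot agent as a MySQL event that flushes once a day at a quiet
--      hour (requires event_scheduler = ON; run with a custom DELIMITER
--      in the mysql client; column lists abridged).
CREATE EVENT flush_unit_data_staging
ON SCHEDULE EVERY 1 DAY STARTS '2012-01-01 03:00:00'
DO
BEGIN
  INSERT INTO unit_data (dt, id_unit, data1 /* , ... */)
  SELECT dt, id_unit, data1 /* , ... */ FROM unit_data_staging;
  TRUNCATE TABLE unit_data_staging;
END;

-- 4) Reads combine both tables.
SELECT id_unit, dt, data1 FROM unit_data
WHERE id_unit = 42 AND dt >= '2012-01-01'
UNION ALL
SELECT id_unit, dt, data1 FROM unit_data_staging
WHERE id_unit = 42 AND dt >= '2012-01-01';
```

Note that a production version of the flush would need to guard against rows arriving between the copy and the truncate, for example by renaming the staging table to a swap table first and flushing from that.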

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow