Question

There is a MySQL database with 5 tables: A, BIG, C, D, E. Their relative growth rates are about 1/100/1/1/1. The table BIG has an insert/read/update request ratio of about 1/10/2. The inserts and updates 'cannot fail'.

The data in table 'BIG':

  1. is critical on the day it is created (from the ACID point of view, the AC, atomicity and consistency, is very important); after 2 days its criticality gets smaller and smaller.
  2. provides the basis for statistical information stored in some other tables (F, G, ...). There are some "data pumps" that read the data from BIG and write it to F and G. The data pumps read about 100 rows from BIG and write about 1 row to F, 1 to G, etc. After that operation the rows in BIG can be removed.
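To make the "data pump" idea concrete, here is a minimal sketch of one pump step: read up to 100 consumed rows from BIG, write one summary row to F, and delete the source rows, all in a single transaction. Everything specific here is an assumption for illustration: the column names (`wait_seconds`, `consumed`, `request_count`, `avg_wait_seconds`), the choice of statistic, and the use of Python's built-in `sqlite3` as a stand-in for MySQL or Postgres.

```python
import sqlite3


def pump(conn, batch_size=100):
    """Summarise up to batch_size consumed rows of BIG into one row of F, then delete them."""
    with conn:  # one transaction: commit on success, roll back on any error
        rows = conn.execute(
            "SELECT id, wait_seconds FROM big WHERE consumed = 1 ORDER BY id LIMIT ?",
            (batch_size,),
        ).fetchall()
        if not rows:
            return 0
        total_wait = sum(wait for _, wait in rows)
        conn.execute(
            "INSERT INTO f (request_count, avg_wait_seconds) VALUES (?, ?)",
            (len(rows), total_wait / len(rows)),
        )
        # Delete exactly the rows that were summarised, nothing else.
        conn.executemany("DELETE FROM big WHERE id = ?", [(row_id,) for row_id, _ in rows])
    return len(rows)


# Hypothetical schema and sample data, only to exercise the sketch.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE big (id INTEGER PRIMARY KEY, wait_seconds REAL, consumed INTEGER)")
conn.execute("CREATE TABLE f (request_count INTEGER, avg_wait_seconds REAL)")
conn.executemany(
    "INSERT INTO big (wait_seconds, consumed) VALUES (?, 1)",
    [(float(i),) for i in range(250)],
)
conn.commit()

# Run the pump until there is nothing left to summarise.
while pump(conn):
    pass
```

Keeping the insert into F and the delete from BIG in one transaction means a failure part-way through leaves BIG untouched, so no row is summarised twice or lost.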

More numbers: for the table BIG I expect about +2k rows per day on the peak day and about +0.5k rows per day on average. The growth is cyclical (i.e. mon=+0.5k,...,wed=+0.5k,sat=+1k,sun=+2k, mon=+0.5k,...), and for this reason I would run the data cleanup once per week (i.e. on Mondays).

Description of the data: they are basically user requests that need to be served 'live' (within 1 hour at most). There is no need to store served requests after they are marked as consumed; I just need to compute some stats on them (maybe after a few days, but there is no hurry on that).

Deployment information: the deployment is on Heroku and I would use MySQL (said to be good at reads) or Postgres (said to be good at updates?). Any advice on that too?

What would be a good option to manage the scalability of the DB effectively? Is the data pump a good solution?

I was thinking of an in-memory table for BIG, but that is said to provide a good read rate (as a cache would). What about the inserts and updates? Are there any other options?


Solution

Please provide some sizes. For example, how big would BIG be when it is time to remove the rows? How big are the other tables?

MEMORY is not necessarily better than InnoDB. It may be slower because MEMORY uses table-level locking while InnoDB uses row-level locking. It may also hurt overall performance by taking RAM away from the buffer_pool for BIG, thereby slowing down other things.

Yes, having a "staging" table is a practical way to do certain things. For really high-speed staging, ping-ponging between two tables may be desirable.
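As a rough illustration of ping-ponging, the sketch below keeps two identically shaped tables: writers always insert into `big`, and a periodic job swaps in the empty spare, then processes and clears the table that was just swapped out. The table names `big_spare` and `big_full`, the `payload` column, and the function name are hypothetical, and Python's built-in `sqlite3` stands in for the real server. In MySQL the swap would normally be a single multi-table `RENAME TABLE` statement rather than two renames inside a transaction.

```python
import sqlite3

# isolation_level=None puts the connection in autocommit mode,
# so transactions are controlled with explicit BEGIN/COMMIT.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE big (id INTEGER PRIMARY KEY, payload TEXT)")
conn.execute("CREATE TABLE big_spare (id INTEGER PRIMARY KEY, payload TEXT)")


def swap_and_drain(conn):
    """Swap the filled table for the empty spare, then return and clear the old rows."""
    # Step 1: swap the tables so new inserts land in the empty one.
    conn.execute("BEGIN")
    conn.execute("ALTER TABLE big RENAME TO big_full")
    conn.execute("ALTER TABLE big_spare RENAME TO big")
    conn.execute("COMMIT")
    # Step 2: process the swapped-out rows at leisure (here they are just returned).
    payloads = [p for (p,) in conn.execute("SELECT payload FROM big_full ORDER BY id")]
    # Step 3: empty the old table and make it the spare for the next swap.
    conn.execute("DELETE FROM big_full")
    conn.execute("ALTER TABLE big_full RENAME TO big_spare")
    return payloads


conn.executemany("INSERT INTO big (payload) VALUES (?)", [("req-1",), ("req-2",)])
drained = swap_and_drain(conn)
conn.execute("INSERT INTO big (payload) VALUES ('req-3')")  # goes to the fresh table
```

Note that in this sketch each table numbers its own `id` values, so ids can repeat across swaps; a real schema would need a key that stays unique across both tables.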

PARTITION is unlikely to be useful in what you have described so far.

Please describe what kind of data you have and why it needs AC or ACID (if it is not obvious from the data).

Edit

Thanks. One row a minute? Things don't get exciting until 100 per second. Based on that, I would expect either MySQL or Postgres to consider this to be a "tiny" database.
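The arithmetic behind that comparison, as a small sketch. It uses the figures from the question (2k rows on the peak day, 0.5k on average, an insert/read/update ratio of 1/10/2) and assumes, for simplicity, that the load is spread evenly over the day; real traffic will be burstier.

```python
MINUTES_PER_DAY = 24 * 60
SECONDS_PER_DAY = MINUTES_PER_DAY * 60

peak_rows_per_day = 2_000        # from the question: peak day
average_rows_per_day = 500       # from the question: average day
requests_per_insert = 1 + 10 + 2 # insert/read/update ratio 1/10/2
busy_rows_per_second = 100       # the "exciting" threshold mentioned above

peak_rows_per_minute = peak_rows_per_day / MINUTES_PER_DAY
average_rows_per_minute = average_rows_per_day / MINUTES_PER_DAY
peak_requests_per_second = peak_rows_per_day * requests_per_insert / SECONDS_PER_DAY
busy_rows_per_day = busy_rows_per_second * SECONDS_PER_DAY
headroom = busy_rows_per_day / peak_rows_per_day

print(f"peak inserts per minute:    {peak_rows_per_minute:.2f}")
print(f"average inserts per minute: {average_rows_per_minute:.2f}")
print(f"peak requests per second:   {peak_requests_per_second:.2f}")
print(f"rows per day at 100/s:      {busy_rows_per_day:,}")
print(f"headroom factor:            {headroom:,.0f}x")
```

Under these assumptions the peak day works out to about 1.4 inserts per minute, and 100 inserts per second would be 8,640,000 rows per day, roughly 4,320 times the stated peak.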

I recommend doing it in whatever way is easiest for you. I do not foresee any scaling/performance problems, at least not for the near future. (I am assuming adequate INDEXes on the tables and reasonable queries.) I'm still vague on what the "pump" is, but it sounds unnecessary.

Licensed under: CC-BY-SA with attribution
Not affiliated with dba.stackexchange