Question

I have two options:

1) creating a sub-table for each user and storing his individual content

2) creating a few tables and storing the data of all users in them.

For instance:

1) 100,000 tables each with 1000 rows

2) 50 Tables each with 2,000,000 rows

I want to know which route is the better and more efficient one.

Context: think of Facebook, with millions of users and their posts, photos, and tags. Is all this information kept in a few giant tables shared by all users, or does each user get his own sub-tables?


Solution

These are some pros and cons of the two approaches in MySQL.

1. Many small tables.

Cons:

  • More tables used concurrently means more open file descriptors are needed (see MySQL's `table_open_cache` and `open_files_limit` settings)
  • A database with 100,000 tables is a mess.

Pros:

  • Small tables mean small indexes. A small index can be loaded entirely into memory, which means your queries will run faster.
  • Also, because the indexes are small, data manipulation such as inserts will run faster.

2. Few big tables

Cons:

  • A huge table implies very big indexes. If an index cannot be loaded entirely into memory, most queries will be very slow.

Pros:

  • The database (and also your code) is clear and easy to maintain.
  • You can use partitioning if your tables become too big (see MySQL's table partitioning feature).

From my experience, a table of two million rows (I have worked with tables of 70 million rows) is not a performance problem in MySQL, as long as you can keep all your active indexes in memory.

If you will have many concurrent users, I suggest you evaluate other technologies, such as Elasticsearch, which seem to fit this kind of scenario better.

OTHER TIPS

Creating a table for each user is the worst design possible. It is one of the first things you are taught in a database design class.

A table is a strong logical component of a database, and hence the RDBMS uses it for many maintenance tasks. E.g., it is customary to set up table file space, limitations, quotas, log space, transaction space, index tree space, and many other things per table. If every table gets its own file to put data in, you will get big round-trip times when joining tables, among other operations.

When you create many tables, you will have a really BIG maintenance overhead. You will also be denying the very nature of relational databases. And just suppose you are adding records to the database: do you create a new table each time? That would make your code considerably harder to write.
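To make the code-complexity point concrete, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for MySQL (the table and function names are made up for illustration). With per-user tables, the table name cannot be a bound parameter, so every statement has to be built as a string; with one shared table, a single parameterized statement serves every user:

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Per-user tables: the table name must be spliced into the SQL string,
# and a new table may need to be created on every write.
def add_post_per_user_table(user_id, post):
    table = f"posts_user_{user_id}"  # hypothetical naming scheme
    conn.execute(f"CREATE TABLE IF NOT EXISTS {table} (post TEXT)")
    conn.execute(f"INSERT INTO {table} (post) VALUES (?)", (post,))

# One shared table: one schema, one parameterized statement for all users.
conn.execute("CREATE TABLE posts (user_id INTEGER, post TEXT)")

def add_post_shared_table(user_id, post):
    conn.execute("INSERT INTO posts (user_id, post) VALUES (?, ?)",
                 (user_id, post))

add_post_per_user_table(42, "hello")
add_post_shared_table(42, "hello")
```

Note that the per-user version not only complicates the code, it also forbids proper parameter binding for the table name, which is exactly the kind of dynamic SQL you normally try to avoid.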

But then again, you could try and see for yourself.

You should leverage the power of MySQL indexes, which will basically give you something similar to having one table per user.

Creating one table called user_data, indexed on user_id, will (in broad strokes) transform your queries that have a WHERE clause on user_id, like this one:

SELECT picture FROM user_data WHERE user_id = INT

Into:

  • Look in the index to find the rows of user_data where user_id = INT
  • Then, for that batch of rows, load the value of picture

By doing that, MySQL will not search through all the rows of user_data, but only through the relevant ones found in the index.
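The two steps above can be verified in a runnable sketch, again using SQLite in place of MySQL (the table, column, and index names are assumed for illustration; SQLite's EXPLAIN QUERY PLAN plays the role of MySQL's EXPLAIN here):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE user_data (user_id INTEGER, picture TEXT)")
conn.execute("CREATE INDEX idx_user_id ON user_data (user_id)")

# 100 users with 10 pictures each, all in one shared table.
conn.executemany(
    "INSERT INTO user_data VALUES (?, ?)",
    [(u, f"pic_{u}_{n}.jpg") for u in range(100) for n in range(10)])

# The WHERE clause on the indexed column lets the engine jump straight
# to the matching rows instead of scanning the whole table.
rows = conn.execute(
    "SELECT picture FROM user_data WHERE user_id = ?", (42,)).fetchall()

# The query plan confirms the index is used: the detail column of the
# plan mentions idx_user_id rather than a full table scan.
plan = conn.execute(
    "EXPLAIN QUERY PLAN "
    "SELECT picture FROM user_data WHERE user_id = ?", (42,)).fetchall()
```

Only user 42's ten rows come back, and the plan shows an index search, which is the "one table per user" effect achieved with a single shared table.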

Licensed under: CC-BY-SA with attribution