Question

I currently have 26 tables in a MySQL database. My boss wants me to recreate those 26 tables whenever we have a new client, appending some sort of client abbreviation to the new tables. So, for example, there would be company1~system~users, company2~system~users, and so on.

I'd rather just add a table to the database that keeps track of our clients, with an auto-incrementing 11-digit INT primary key, and reference it in the other 26 tables instead, so we're not cluttering up the database with 4000 tables if we have 200 clients.
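A minimal sketch of that shared-table design (using SQLite here for portability; the `clients` and `system_users` names and columns are hypothetical, not the actual schema):

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# One row per client; every other table references clients.id.
conn.execute("""
    CREATE TABLE clients (
        id   INTEGER PRIMARY KEY AUTOINCREMENT,
        name TEXT NOT NULL UNIQUE
    )
""")

# Instead of company1~system~users, company2~system~users, ...,
# a single shared table carries a client_id column.
conn.execute("""
    CREATE TABLE system_users (
        id        INTEGER PRIMARY KEY AUTOINCREMENT,
        client_id INTEGER NOT NULL REFERENCES clients(id),
        username  TEXT NOT NULL
    )
""")

conn.execute("INSERT INTO clients (name) VALUES ('company1'), ('company2')")
conn.executemany(
    "INSERT INTO system_users (client_id, username) VALUES (?, ?)",
    [(1, "alice"), (1, "bob"), (2, "carol")],
)

# All of company1's users come from one parameterized query.
rows = conn.execute(
    "SELECT username FROM system_users WHERE client_id = ? ORDER BY username",
    (1,),
).fetchall()
print([r[0] for r in rows])
```

The same DDL translates directly to MySQL with `INT AUTO_INCREMENT` and an explicit `FOREIGN KEY` clause.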

I think his fear is that if we go with the method I'd prefer, it will take MySQL noticeably longer to perform queries, because clients with anywhere between 2,000 and 5,000 records each will be sharing tables. For example, searching for users that belong to company1 in a table called system~users with 1,500,000 records would be slower than searching for users in a table called company1~system~users with 2,000 records. I actually think it would be slower for MySQL to search if each client has its own set of tables (that's 26 per client).

Which method is actually slower?


Solution

Why not just create a database for each company? Then you don't even need to construct dynamic table names when building your queries. It's a far sounder solution, and it keeps client data more separated, so any inter-dependency will be more obvious.

The above works best when the application layers are also separate so you can provide each instance with a different set of database login credentials.

If that isn't the case, it might work fine or be awkward, depending on your installation, what platform you're using, and so on.

Appending a company name is a hack, but it can be made to work, I guess.

Having a client ID in records is also a common approach. I wouldn't necessarily worry about 1.5 million records from a performance point of view, as long as the tables are appropriately indexed; this isn't a huge number of records. Plus, the company ID criterion should limit results fairly well anyway.

OTHER TIPS

200 clients * 5,000 records [in any given table] is small by database standards. Provided you add the client ID as the first column of most indexes, the scheme you suggest should not introduce any noticeable performance degradation.
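To illustrate why leading the index with the client ID works (again sketched with SQLite; the table, index, and row counts are made up for the demo), a composite index on `(client_id, username)` lets per-client queries touch only that client's slice of the index:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE system_users (client_id INTEGER, username TEXT)")

# 5 clients x 1,000 users each -- tiny by database standards.
conn.executemany(
    "INSERT INTO system_users VALUES (?, ?)",
    [(c, f"user{u}") for c in range(1, 6) for u in range(1000)],
)

# client_id leads the composite index, so a WHERE client_id = ?
# filter can seek straight to one client's rows.
conn.execute(
    "CREATE INDEX idx_client_user ON system_users (client_id, username)"
)

rows = conn.execute(
    "SELECT username FROM system_users WHERE client_id = ?", (3,)
).fetchall()

# The query plan confirms the index is used rather than a full scan.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT username FROM system_users WHERE client_id = ?",
    (3,),
).fetchall()
print(len(rows), plan)
```

In MySQL the equivalent check is `EXPLAIN SELECT ...`, which should show the composite index in the `key` column.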

However, it may be worthwhile to keep customers' data separated for other reasons. If it makes sense for the application, you could handle each client's data in a separate database.

If you have appropriate indexes on the company identifier, performance shouldn't be a problem.

As for appending an ID to the table names, why not consider creating a separate database instance instead? Changing the table names would require additional coding (to dynamically generate the table names in every query), whereas using different instances is straightforward.

Keep all your customers' data in the same set of tables until you have a really good reason not to.

From what I understand of the above, your data set is tiny and will fit in RAM; this makes performance tuning largely unnecessary in practice, so you won't need to worry about things like index clustering.

Do the simplest thing that could work. Just keep them in one set of tables.

Thank you everyone for your wonderful and helpful suggestions. The general consensus seems to be that I should either separate each client's set of tables into its own database, or let clients share one set of tables with appropriate indexes on company identifiers.

Your solution sounds best, and as far as performance goes, hardware is cheap. Trying to maintain 4,000 tables or 200 databases is likely to cost more than the one-time hit of a nicer CPU with more RAM. It sounds like your boss is trying to do the engineering for you, and that's no good.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow