Question

Suppose I have to build a networked application for an organisation, say a bank. A member can have different kinds of accounts, like loan accounts, GDCS accounts, deposit accounts, etc. To store the data I thought about two approaches, taking the deposit account as an example. (A member can make a deposit on any day.)

1. Storing the deposit details of every member in a single table, with member_id as a field.
2. Storing the deposit details of each member in a separate table named member_id_deposits.

In case 1 there will be more than one record with the same member_id, so there is data redundancy, since member_id is repeated. In case 2 there is no such redundancy, as all of a member's deposits from different days are stored in that member's own table. But in this case, if there are 100,000 members, there will be 100,000 tables.
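For concreteness, here is a minimal sketch of the two layouts using SQLite from Python (the table and column names are just placeholders made up for illustration, not a real schema):

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Approach 1: one deposits table for all members; member_id is just a column,
# so a member with several deposits produces several rows with the same id.
conn.execute("""
    CREATE TABLE deposits (
        deposit_id   INTEGER PRIMARY KEY,
        member_id    INTEGER NOT NULL,
        deposit_date TEXT    NOT NULL,
        amount       REAL    NOT NULL
    )
""")
conn.executemany(
    "INSERT INTO deposits (member_id, deposit_date, amount) VALUES (?, ?, ?)",
    [(42, "2013-01-05", 100.0), (42, "2013-01-12", 250.0), (7, "2013-01-12", 80.0)],
)

# Approach 2: one table per member, with the member id baked into the table name.
for member_id in (42, 7):
    conn.execute(f"""
        CREATE TABLE member_{member_id}_deposits (
            deposit_id   INTEGER PRIMARY KEY,
            deposit_date TEXT NOT NULL,
            amount       REAL NOT NULL
        )
    """)
```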

So which approach should be followed: the one with fewer tables, or the one that reduces redundancy but has a very large number of tables?

I know a primary concern in database design is decreasing redundancy, so from that point of view the second design is better. But it has a lot of tables. Is there any problem with having a very large number of tables? Is there a limit on the maximum number of tables that can be stored in a database? Is a database with a large number of tables slow to query?


Solution

Why would anyone think that databases designed to support tens or hundreds of gigabytes of data would perform better on zillions of small tables rather than one large table?

There is only one instance that I can readily think of where splitting customer data among different tables (and ultimately databases) is desirable. That is when it is an explicit requirement of the problem at hand. For instance, a law firm might have to store client data in different places because that is legally necessary. The investment side of a bank might have to store data in a different place from the rest of the bank, to prevent access.

What are downsides of having lots of tables? Here are some that I can think of:

  1. It is not free to find the right table for a given customer. It may take essentially no overhead to access one of one hundred tables in a database. It might take the database a bit of time to access one of one hundred thousand.
  2. Your database may not allow that many tables.
  3. If the tables are small, the pages are likely to be sparsely filled, drastically increasing the overhead of storing the data.
  4. Any information needed "across the tables" (say, "What is the average balance per member?") becomes so difficult to answer that no one will even bother to ask such questions (see the sketch after this list).
  5. All the queries need to be dynamic, meaning that you lose the ability to error-check the queries as they are being written.
  6. In most databases, dynamic queries that change the table name need to be recompiled, so you will have compilation overhead on each query.
  7. Maintenance is a nightmare, because you have to change zillions of tables. On the other hand, this would discourage additional development, which might make the application more stable ;).
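To make points 4 and 5 concrete, continuing the hypothetical SQLite sketch from the question (an illustration, not production code): with a single table, a cross-member question is one static query; with a table per member, the SQL has to be assembled at runtime and the results combined by hand.

```python
# Single table: one static query, which the database (and your tooling) can
# check up front, answers the question for every member at once.
avg_per_member = conn.execute(
    "SELECT member_id, AVG(amount) FROM deposits GROUP BY member_id"
).fetchall()

# One table per member: the SQL must be built dynamically per member, run
# 100,000 times for 100,000 members, and the results merged in application code.
averages = {}
for member_id in (42, 7):
    row = conn.execute(
        f"SELECT AVG(amount) FROM member_{member_id}_deposits"
    ).fetchone()
    averages[member_id] = row[0]
```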

I'm sure there are additional reasons that other people will think of. In short, databases are designed for large tables. Normalization is not about eliminating every repeated value; it is about eliminating redundant copies of data, and a repeated member_id is just a key, not a redundant copy. Use databases the way they are designed to be used.
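One way to see that last distinction, again with made-up names: a repeated member_id is a foreign key reference, whereas copying descriptive member data into every deposit row is the kind of redundancy normalization actually removes.

```python
# Normalized: each fact about a member (name, address, ...) is stored exactly
# once, in a members table; deposit rows refer to it through member_id.
conn.execute("""
    CREATE TABLE members (
        member_id INTEGER PRIMARY KEY,
        name      TEXT NOT NULL
    )
""")

# Denormalized: the member's name is copied into every deposit row, so the same
# fact is stored many times and can drift out of sync. That duplication, not a
# repeated key value, is what normalization is meant to eliminate.
conn.execute("""
    CREATE TABLE deposits_denormalized (
        deposit_id   INTEGER PRIMARY KEY,
        member_id    INTEGER NOT NULL,
        member_name  TEXT    NOT NULL,
        deposit_date TEXT    NOT NULL,
        amount       REAL    NOT NULL
    )
""")
```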

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow