Which DB Junction approach is more efficient in this scenario? [closed]

https://stackoverflow.com/questions/9932622

27-05-2021

Question

To avoid soft deletes, I am creating a recycle bin database that the main database will link to through junction tables. Here are two possible junction approaches, and I was hoping for some input on which would be more efficient.

For simplicity, let's say there are two tables, Order and Invoice (and each invoice has only one order).

Order
-----
OrderId
InvoiceId
Description
Date
NumberOfStuffOrdered

Invoice
-------
InvoiceId
Description
Price
Tax
Shipping

For a junction of these tables to the recycle bin, I was unsure which approach to take.

Approach 1:

DeletedOrder
------------
DeletedOrderId
OrderId
RecycleBinId
Date
Reason

DeletedInvoice
--------------
DeletedInvoiceId
InvoiceId
RecycleBinId
Date
Reason

Approach 2:

DeletedRecords
--------------
DeletedRecordsId
RecordPrimaryKeyId
RecycleBinId
RecordType
Date
Reason

Although Approach 1 will take more table space in the database, it will have fewer rows per table and should keep query times fast as the system matures. Approach 2 avoids having to create an extra deleted table for each table in the database, but its single table will keep growing as the system matures and may become slow to query.

Which one will be more efficient overall, or is there a better way to approach this?

Solution

It depends on how much you need to retain, and how you will be using it. If you need to record all the details of your invoices and orders (NumberOfStuffOrdered, Tax, etc.) the specific delete tables are necessary. If you merely need to record the fact that the row once existed (what you have now: Id, type, Date[Deleted], Reason), we cycle back to "it depends".
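
As a rough illustration of the first case, here is a minimal sketch of a type-specific delete table that keeps every column of the original row. It uses Python's built-in sqlite3 module and a single in-memory database purely for brevity; the DeletedDate column, the omission of RecycleBinId, and the helper names make_db and archive_and_delete_order are assumptions for this example, not part of the question's schema.

```python
import sqlite3


def make_db():
    """Create an in-memory database with Order and a full-detail DeletedOrder."""
    conn = sqlite3.connect(":memory:")
    conn.executescript('''
        CREATE TABLE "Order" (              -- quoted: ORDER is an SQL keyword
            OrderId INTEGER PRIMARY KEY,
            InvoiceId INTEGER,
            Description TEXT,
            Date TEXT,
            NumberOfStuffOrdered INTEGER
        );
        CREATE TABLE DeletedOrder (
            DeletedOrderId INTEGER PRIMARY KEY,
            OrderId INTEGER,
            InvoiceId INTEGER,
            Description TEXT,
            Date TEXT,
            NumberOfStuffOrdered INTEGER,
            DeletedDate TEXT,               -- assumed name for the deletion date
            Reason TEXT
        );
    ''')
    return conn


def archive_and_delete_order(conn, order_id, reason):
    """Copy the whole Order row into DeletedOrder, then remove it from Order."""
    with conn:  # one transaction: both statements commit, or neither does
        conn.execute(
            'INSERT INTO DeletedOrder '
            '(OrderId, InvoiceId, Description, Date, NumberOfStuffOrdered, '
            ' DeletedDate, Reason) '
            'SELECT OrderId, InvoiceId, Description, Date, NumberOfStuffOrdered, '
            "       date('now'), ? "
            'FROM "Order" WHERE OrderId = ?',
            (reason, order_id),
        )
        conn.execute('DELETE FROM "Order" WHERE OrderId = ?', (order_id,))
```

Because the deleted row's details travel with it, nothing has to be looked up in the main database later; the cost is one such table per source table.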

If no one's really going to use the data, and you just need a record that it existed on the off-chance of an IRS audit some day, then a single table should be adequate. (The analogy is the warehouse filled with boxes of forms going back 70 years: it'll take time, but you'll find it eventually.) However, if you are going to access this data regularly to run reports, do data mining, or the like, then you're best off designing tables to support those processes: normalized, star schemas, or whatever's useful.

Generally, I suspect a big table with a few indexes supporting frequent queries should suffice, unless good performance is critical.
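
A minimal sketch of that single-table option, again using Python's sqlite3 module with an in-memory database: the table follows Approach 2 from the question, while the composite index, its name, the column types, and the helpers make_recycle_bin and find_deleted are assumptions chosen for illustration. The index targets one plausible frequent query, "was this record of this type deleted, and why?".

```python
import sqlite3


def make_recycle_bin():
    """Create the single DeletedRecords table plus one supporting index."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE DeletedRecords (
            DeletedRecordsId INTEGER PRIMARY KEY,
            RecordPrimaryKeyId INTEGER NOT NULL,
            RecycleBinId INTEGER,
            RecordType TEXT NOT NULL,       -- e.g. 'Order' or 'Invoice'
            Date TEXT,
            Reason TEXT
        );
        -- Assumed index: supports lookups by record type and original key.
        CREATE INDEX IX_DeletedRecords_Type_Key
            ON DeletedRecords (RecordType, RecordPrimaryKeyId);
    """)
    return conn


def find_deleted(conn, record_type, record_id):
    """Return (Date, Reason) rows for one deleted record of a given type."""
    return conn.execute(
        "SELECT Date, Reason FROM DeletedRecords "
        "WHERE RecordType = ? AND RecordPrimaryKeyId = ?",
        (record_type, record_id),
    ).fetchall()
```

Which columns to index depends on the queries actually run; if most lookups filter by date or by RecycleBinId instead, the index would need to lead with those columns.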

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow