Question

What are the pros/cons of de-normalizing an enterprise application database because it will make writing reports easier?

Pro - designing reports in SSRS will probably be "easier" since no joins will be necessary.

Con - developing/maintaining the app to handle de-normalized data will become more difficult due to duplication of data and synchronization.

Others?


Solution

Denormalization for the sake of reports is Bad, m'kay.

Creating views or a denormalized data warehouse is good.

Views have solved most of my reporting-related needs. Data warehouses are great when users will be generating reports almost constantly or when your views start to slow down.
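To make the view approach concrete, here is a minimal sketch using Python's built-in sqlite3 module rather than SQL Server. The `customers` and `orders` tables and the `report_orders` view are hypothetical names invented for the illustration; the point is only that the report query reads the view without writing any joins, while the base tables stay normalized.

```python
import sqlite3

# Hypothetical schema: two normalized tables plus one reporting view.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(id),
        total REAL NOT NULL
    );
    INSERT INTO customers VALUES (1, 'Acme'), (2, 'Globex');
    INSERT INTO orders VALUES (1, 1, 100.0), (2, 1, 50.0), (3, 2, 75.0);

    -- The view hides the join; nothing is duplicated on disk.
    CREATE VIEW report_orders AS
    SELECT o.id AS order_id, c.name AS customer_name, o.total
    FROM orders AS o
    JOIN customers AS c ON c.id = o.customer_id;
""")

# The report query itself contains no join.
rows = conn.execute(
    "SELECT customer_name, SUM(total) FROM report_orders "
    "GROUP BY customer_name ORDER BY customer_name"
).fetchall()
print(rows)  # [('Acme', 150.0), ('Globex', 75.0)]
```

Because the view is evaluated at query time, a change to `customers.name` shows up in the next report with no synchronization code.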

This is why you want to normalize your database:

  1. To free the collection of relations from undesirable insertion, update and deletion dependencies;
  2. To reduce the need for restructuring the collection of relations as new types of data are introduced, and thus increase the life span of application programs;
  3. To make the relational model more informative to users;
  4. To make the collection of relations neutral to the query statistics, where these statistics are liable to change as time goes by.

—E.F. Codd, "Further Normalization of the Data Base Relational Model" via wikipedia

OTHER TIPS

The only time you should consider denormalization is when the time it takes the report to generate is not acceptable. Denormalization will cause consistency issues that are sometimes impossible to detect, especially in large datasets.
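As a small illustration of the kind of consistency issue meant here, the sketch below uses Python's sqlite3 module with a hypothetical flattened table, `orders_flat`, that copies the customer name onto every order. One code path that updates only some of the copies is enough to leave the data contradicting itself.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical denormalized table: customer_name is repeated per order.
    CREATE TABLE orders_flat (
        order_id INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL,
        customer_name TEXT NOT NULL,
        total REAL NOT NULL
    );
    INSERT INTO orders_flat VALUES
        (1, 1, 'Acme', 100.0),
        (2, 1, 'Acme', 50.0);
""")

# A careless update renames the customer on one order only.
conn.execute(
    "UPDATE orders_flat SET customer_name = 'Acme Corp' WHERE order_id = 1"
)

# Find customers that now appear under more than one name.
conflicts = conn.execute("""
    SELECT customer_id, COUNT(DISTINCT customer_name)
    FROM orders_flat
    GROUP BY customer_id
    HAVING COUNT(DISTINCT customer_name) > 1
""").fetchall()
print(conflicts)  # [(1, 2)]
```

In a normalized schema the name lives in one row, so this particular anomaly cannot occur.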

Don't denormalize just to get rid of complexity in reporting; it can cause huge problems in the rest of the application. Either you don't enforce the rules, resulting in bad data, or you do enforce them, and then inserts, deletes, and updates can be seriously slowed for everyone, not just the two or three people who run reports.

If the reports truly can't run well, then create a data warehouse that is denormalized and populate it in a nightly or weekly feed. The kind of reports that typically need this do not generally care if the data is up to the minute, as they are usually monthly, quarterly, or annual reports that process (and especially aggregate) large amounts of data after the fact.
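A scheduled feed like that can be sketched as a single rebuild step. The example below uses Python's sqlite3 module and keeps everything in one in-memory database for brevity; in practice the reporting table would usually live in a separate database. The table names and the `refresh_report_table` function are assumptions made up for the illustration, and the scheduling itself (cron, SQL Server Agent, and so on) is left out.

```python
import sqlite3

def refresh_report_table(conn):
    """Rebuild the denormalized reporting table from the normalized tables."""
    with conn:  # one transaction: commit on success, roll back on error
        conn.execute("DELETE FROM report_sales")
        conn.execute("""
            INSERT INTO report_sales (customer_name, order_count, total_sales)
            SELECT c.name, COUNT(o.id), COALESCE(SUM(o.total), 0)
            FROM customers AS c
            LEFT JOIN orders AS o ON o.customer_id = c.id
            GROUP BY c.id, c.name
        """)

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(id),
        total REAL NOT NULL
    );
    CREATE TABLE report_sales (
        customer_name TEXT NOT NULL,
        order_count INTEGER NOT NULL,
        total_sales REAL NOT NULL
    );
    INSERT INTO customers VALUES (1, 'Acme'), (2, 'Globex');
    INSERT INTO orders VALUES (1, 1, 100.0), (2, 1, 50.0), (3, 2, 75.0);
""")

query = "SELECT total_sales FROM report_sales WHERE customer_name = 'Globex'"

refresh_report_table(conn)  # the "nightly" run

# A new order arrives after the feed has run.
conn.execute("INSERT INTO orders VALUES (4, 2, 25.0)")
conn.commit()
stale = conn.execute(query).fetchone()[0]  # still the old snapshot

refresh_report_table(conn)  # the next scheduled run
fresh = conn.execute(query).fetchone()[0]
print(stale, fresh)  # 75.0 100.0
```

The gap between `stale` and `fresh` is exactly the delay the reports have to tolerate; the application tables are never touched by the reporting layout.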

You can do both: keep the normalized database for the applications, then create a denormalized database for reports, and create an application that regularly copies data from one database to the other.

After all, reports don't always need the latest data. Most of the time you can simply run an update on the reporting database every hour, or even only once a day.

Beyond the data warehouse and view solutions provided in other answers, which are good in some ways: if you are willing to sacrifice some write performance to get data that is current to the last second, but still want a normalized database, you could use a materialized view with fast refresh on commit in Oracle, or an indexed view (a view with a unique clustered index) in SQL Server.
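SQLite has neither materialized views nor indexed views, so the sketch below is not the Oracle or SQL Server feature itself. It swaps in triggers that keep a summary table in step with every write, purely to show the trade-off: writes do extra work so that reads see current, precomputed data. The table and trigger names are hypothetical, and updates to existing orders are not handled to keep the example short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL,
        total REAL NOT NULL
    );

    -- Stand-in for the materialized/indexed view: a precomputed summary.
    CREATE TABLE customer_totals (
        customer_id INTEGER PRIMARY KEY,
        total_sales REAL NOT NULL
    );

    -- Every insert pays a little extra to keep the summary current.
    CREATE TRIGGER orders_after_insert AFTER INSERT ON orders
    BEGIN
        INSERT OR IGNORE INTO customer_totals (customer_id, total_sales)
        VALUES (NEW.customer_id, 0);
        UPDATE customer_totals
        SET total_sales = total_sales + NEW.total
        WHERE customer_id = NEW.customer_id;
    END;

    -- Deletes subtract what the removed row contributed.
    CREATE TRIGGER orders_after_delete AFTER DELETE ON orders
    BEGIN
        UPDATE customer_totals
        SET total_sales = total_sales - OLD.total
        WHERE customer_id = OLD.customer_id;
    END;
""")

def totals(connection):
    """Read the precomputed summary; no aggregation happens at query time."""
    return connection.execute(
        "SELECT customer_id, total_sales FROM customer_totals "
        "ORDER BY customer_id"
    ).fetchall()

conn.execute("INSERT INTO orders VALUES (1, 1, 100.0), (2, 1, 50.0), (3, 2, 75.0)")
after_insert = totals(conn)

conn.execute("DELETE FROM orders WHERE id = 2")
after_delete = totals(conn)
print(after_insert, after_delete)
```

With a real materialized or indexed view the database engine maintains the summary for you; the cost still lands on the write path, which is the performance sacrifice mentioned above.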

Another con is that the data is likely not to be real-time, since it takes some time to move the data from a normalized form to a denormalized one. If someone wants the report to be current to the very second it was requested, that can be tough to do in this situation.

If this duplicates the synchronization point in the original post, sorry; I didn't quite see it that way.

Licensed under: CC-BY-SA with attribution