Question

We are just beginning to put together a data warehouse that will be useful for our reporting requirements, bringing disparate data sources together.

Reviewing the potential uses of the data once it is brought together, we have found some scenarios where our transactional processing systems could usefully reference this data. The data would obviously be out of date and optimised for reads, but in some scenarios that is fine for the application's purposes, and it would reduce the load on core servers.

My question is this: is it considered bad design for a transactional system to access data stored in a data warehouse? The primary purpose of our warehouse is reporting, which makes me question whether we should allow non-reporting systems to read the data. My instincts guide me away from allowing applications to read and display the data. Are there any good reasons to listen to them?


Solution

There is nothing wrong with having your OLTP systems access DW data and, in fact, as systems evolve, you will see the line between transactional and informational systems blur.

I also wouldn't worry too much about data structures, so long as you come up with something that works. 3NF might be the answer, but accessing highly summarized data from a multidimensional database might also be a good solution, depending on the problem you are trying to solve.

One last thing to consider is the type of data you are trying to get out of the data warehouse. Is it summarized transactions (e.g. average sale amount) or more like shared dimensional data (e.g. customer name and address)? If the latter, you might want to consider combining a master data management strategy with your data warehouse strategy.

One more thing: try to figure out why you are hesitant to share data between these databases. Is it something you can put your finger on, or is it really just because you've been trained by our industry to think that they need to be separate? Remember, in the end, our jobs are not really to build data warehouses and business intelligence systems; they are to solve business problems in reliable, pragmatic, cost-effective ways.

Other tips

There's nothing fundamentally wrong with making the warehouse a hub for application data consumers as well as analytical data consumers. Here are some points to think about though.

You'll need a technical solution that supports the required level of availability, transaction isolation and consistency for both workloads. For example, can you ensure that the application won't starve analytical queries of resources, and vice versa? Can you make data available to the applications in a consistent and timely manner even during warehouse loads? It's unwise to assume that you'll always be able to load the warehouse out of hours, even if you think you can do that today.
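One way to picture the "consistent even during warehouse loads" requirement is a build-then-swap snapshot: the load prepares a complete new copy, and readers only ever see the previous or the new copy, never a half-loaded one. The sketch below is only an in-memory illustration of that idea; `SnapshotStore`, `publish` and `read` are hypothetical names, and a real warehouse would rely on the database's own isolation or partition-swapping features instead.

```python
import threading
from types import MappingProxyType


class SnapshotStore:
    """Hypothetical read-side store: readers always see one complete load."""

    def __init__(self):
        self._lock = threading.Lock()
        self._version = 0
        self._snapshot = MappingProxyType({})

    def publish(self, rows):
        # Build the new read-only snapshot fully before exposing it.
        fresh = MappingProxyType(dict(rows))
        with self._lock:
            # The swap is the only step readers can observe.
            self._snapshot = fresh
            self._version += 1

    def read(self):
        # Return a matching (version, snapshot) pair.
        with self._lock:
            return self._version, self._snapshot
```

An application that called `read()` before a load keeps working against its old snapshot while the next `publish()` runs, which is the behaviour you would want from whatever mechanism the warehouse platform actually provides.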

Make sure your warehouse is well-normalized (meaning at least Boyce-Codd Normal Form, and ideally 5th Normal Form or something close to it). That's good advice for any warehouse, but perhaps especially if you need to support non-analytical queries.

Do your apps need to update the warehouse? If so then you need to consider how that integrates with the rest of the ETL process.

Consider whether to give the app a data mart of its own. That may well be the safest option to start with.
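As a rough illustration of an app-specific data mart, the sketch below refreshes a small summary table from warehouse tables using Python's built-in `sqlite3` module. The table and column names (`dw_customer`, `dw_sales`, `app_customer_summary`) are invented for the example, and in practice the mart would more likely live in a separate database or schema populated by your ETL tooling.

```python
import sqlite3


def refresh_app_mart(conn):
    """Refresh a hypothetical app-facing mart from warehouse tables."""
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS app_customer_summary (
            customer_id INTEGER PRIMARY KEY,
            name TEXT,
            avg_sale_amount REAL
        )
        """
    )
    # The DELETE and INSERT run in one transaction: committed together
    # on success, rolled back together if either statement fails.
    with conn:
        conn.execute("DELETE FROM app_customer_summary")
        conn.execute(
            """
            INSERT INTO app_customer_summary (customer_id, name, avg_sale_amount)
            SELECT c.customer_id, c.name, AVG(s.amount)
            FROM dw_customer AS c
            JOIN dw_sales AS s ON s.customer_id = c.customer_id
            GROUP BY c.customer_id, c.name
            """
        )
```

The application then queries only `app_customer_summary`, so its reads are decoupled from the structure and load schedule of the main warehouse tables.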

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow