Question

I have a Kimball-style DW (facts and dimensions in star models, with no late-arriving fact rows or columns, and no columns changing in dimensions except expiry as part of Type 2 slowly changing dimensions) with heavy daily processing to insert and update rows (on new dates) and monthly and daily reporting processes. The fact tables are partitioned by date for easy roll-off of old data.

I understand that WITH(NOLOCK) can cause uncommitted data to be read; however, I also do not wish to take any locks that would cause the ETL processes to fail or block.

In all cases, when we are reading from the DW, we are reading from fact tables for a date which will not change (the fact tables are partitioned by date) and dimension tables which will not have attributes changing for the facts they are linked to.

So, are there any disadvantages? Perhaps in the execution plans, or in the operation of such SELECT-only queries running in parallel off the same tables?


Solution

As long as it's all no-update data there's no harm, but I'd be surprised if there's much benefit either. I'd say it's worth a try. The worst that will happen is that you'll get incomplete and/or inconsistent data if you are in the middle of a batch insert, but you can decide if that invalidates anything useful.
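For reference, this is a minimal sketch of the kind of hinted, read-only query being proposed. The fact and dimension names (FactSales, DimDate, DimCustomer, DateKey) are made up for illustration, not taken from the question:

-- Sketch of a SELECT-only report query with NOLOCK applied to every table.
-- Table and column names are illustrative only.
SELECT  d.CalendarMonth,
        c.CustomerSegment,
        SUM(f.SalesAmount) AS TotalSales
FROM    dbo.FactSales   AS f WITH (NOLOCK)
JOIN    dbo.DimDate     AS d WITH (NOLOCK) ON d.DateKey     = f.DateKey
JOIN    dbo.DimCustomer AS c WITH (NOLOCK) ON c.CustomerKey = f.CustomerKey
WHERE   d.CalendarDate < CAST(GETDATE() AS date)   -- only closed-off dates
GROUP BY d.CalendarMonth, c.CustomerSegment;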

OTHER TIPS

This is what you probably need:

ALTER DATABASE AdventureWorks SET READ_COMMITTED_SNAPSHOT ON;

ALTER DATABASE AdventureWorks SET ALLOW_SNAPSHOT_ISOLATION ON;

Then go ahead and use

SET TRANSACTION ISOLATION LEVEL READ COMMITTED

in your queries. According to BOL:

The behavior of READ COMMITTED depends on the setting of the READ_COMMITTED_SNAPSHOT database option:

If READ_COMMITTED_SNAPSHOT is set to OFF (the default), the Database Engine uses shared locks to prevent other transactions from modifying rows while the current transaction is running a read operation. The shared locks also block the statement from reading rows modified by other transactions until the other transaction is completed. The shared lock type determines when it will be released. Row locks are released before the next row is processed. Page locks are released when the next page is read, and table locks are released when the statement finishes.

If READ_COMMITTED_SNAPSHOT is set to ON, the Database Engine uses row versioning to present each statement with a transactionally consistent snapshot of the data as it existed at the start of the statement. Locks are not used to protect the data from updates by other transactions.
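To make that concrete, here is a hedged sketch of how a reporting session could use these settings once they are enabled; the table name is a placeholder, not from the question:

-- With READ_COMMITTED_SNAPSHOT ON, plain READ COMMITTED reads row versions
-- instead of taking shared locks, so the query itself needs no hints.
SET TRANSACTION ISOLATION LEVEL READ COMMITTED;

SELECT COUNT(*) FROM dbo.FactSales;   -- reads committed versions, does not block the ETL

-- Alternatively, with ALLOW_SNAPSHOT_ISOLATION ON, a whole report can see one
-- consistent point-in-time view across several statements:
SET TRANSACTION ISOLATION LEVEL SNAPSHOT;
BEGIN TRANSACTION;
    SELECT COUNT(*)         FROM dbo.FactSales;
    SELECT SUM(SalesAmount) FROM dbo.FactSales;
COMMIT TRANSACTION;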

Hope this helps. Raj

Have you considered creating a DATABASE SNAPSHOT of your DW and running your reports off it?
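If you try that, the syntax looks roughly like this; the database name, logical file name, and path below are assumptions for illustration only:

-- A database snapshot is a read-only, point-in-time view of the source database.
-- NAME must be the logical name of the source database's data file.
CREATE DATABASE AdventureWorksDW_Reporting
ON ( NAME = AdventureWorksDW_Data,
     FILENAME = 'D:\Snapshots\AdventureWorksDW_Reporting.ss' )
AS SNAPSHOT OF AdventureWorksDW;

-- Reports then query the snapshot instead of the live warehouse:
-- SELECT ... FROM AdventureWorksDW_Reporting.dbo.FactSales ...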

Yes. Your SQL will be far less readable, and you will inevitably miss some NOLOCK hints, because the NOLOCK approach requires the hint on every table in every SELECT.

You can get the same thing by setting the isolation level

SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED
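A hedged sketch of that session-level alternative (table names are again placeholders): once set, it applies to every table in every subsequent statement on the connection, so there are no hints to forget.

-- Set once per connection; no per-table hints required afterwards.
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;

SELECT  d.CalendarMonth, SUM(f.SalesAmount) AS TotalSales
FROM    dbo.FactSales AS f
JOIN    dbo.DimDate   AS d ON d.DateKey = f.DateKey
GROUP BY d.CalendarMonth;

-- Optional: return to the default when done.
SET TRANSACTION ISOLATION LEVEL READ COMMITTED;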

In the end you get roughly a 10% performance boost (sorry, I'm too lazy to look up the article for it, but it's out there).

I'd say a 10% gain isn't worth reducing readability.

If making the whole database read-only is possible, then this is a better option. You'll get read-uncommitted performance without having to modify all your code.

ALTER DATABASE AdventureWorks SET READ_ONLY;
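Since the question mentions heavy daily loads, one pattern (a sketch only, assuming the ETL runs in a dedicated window) is to flip the flag around the load:

-- Before the ETL window: allow writes again.
ALTER DATABASE AdventureWorks SET READ_WRITE WITH ROLLBACK IMMEDIATE;

-- ... run the daily insert/update ETL here ...

-- After the ETL window: back to read-only for the reporting day.
ALTER DATABASE AdventureWorks SET READ_ONLY WITH ROLLBACK IMMEDIATE;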

NOLOCK performs a ‘dirty read’ (incidentally, READ UNCOMMITTED does the same thing as NOLOCK). If the database is being updated as you read, there is a danger that you will get inconsistent data back. The only options are to either accept locking, and hence blocking, or to pick one of the two new isolation levels offered in SQL 2005 onwards discussed here.
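To illustrate the dirty-read risk, here is a hedged two-session sketch (the table name is a placeholder); session 2 can read a value that session 1 later rolls back:

-- Session 1: an ETL batch that has not committed yet.
BEGIN TRANSACTION;
UPDATE dbo.FactSales SET SalesAmount = SalesAmount * 2 WHERE DateKey = 20240101;
-- (no COMMIT yet)

-- Session 2: a NOLOCK / READ UNCOMMITTED report running at the same time.
SELECT SUM(SalesAmount)
FROM   dbo.FactSales WITH (NOLOCK)
WHERE  DateKey = 20240101;    -- returns the doubled, uncommitted value

-- Session 1: the batch fails and rolls back; the report already used data that never existed.
ROLLBACK TRANSACTION;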

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow