Question

I'm in the business of making websites and applications that are not mission critical (i.e. not banking software, space flight, intensive-care monitoring applications, etc.). You get the idea.

So, with that massive disclaimer, is it bad to use the NOLOCK hint in some SQL statements? A number of years ago, a fellow SQL administrator suggested that I use NOLOCK if I'm happy with a "dirty read", which would give me a bit more performance out of my system because each read doesn't lock the table/row/whatever.

I was also told that it's a great solution if I'm experiencing deadlocks. So, I followed that advice for a few years, until a SQL guru who was helping me with some random code noticed all the NOLOCKs in my SQL. He politely scolded me and tried to explain why it's not a good thing, and I sorta got lost. The essence of his explanation was: 'it's a band-aid solution to a more serious problem, especially if you're experiencing deadlocking. As such, fix the root of the problem'.

I did some googling recently about it and came across this post.

So, can some SQL DB guru senseis please enlighten me?


Solution

With the NOLOCK hint, the transaction isolation level for the SELECT statement is READ UNCOMMITTED. This means the query may see dirty and inconsistent data.
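
For illustration, here is a minimal sketch (the table and column names are made up) of the two equivalent ways to get that behaviour: a per-table hint, or setting the isolation level for the whole session.

    -- Per-table hint: only this reference to dbo.Orders is read without shared locks
    SELECT OrderId, Status
    FROM dbo.Orders WITH (NOLOCK)
    WHERE CustomerId = 42;

    -- Session-level equivalent: every subsequent SELECT in this session does dirty reads
    SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;

    SELECT OrderId, Status
    FROM dbo.Orders
    WHERE CustomerId = 42;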

This is not a good idea to apply as a rule. Even if this dirty-read behavior is OK for your mission-critical web-based application, a NOLOCK scan can cause error 601 ("Could not continue scan with NOLOCK due to data movement"), which terminates the query because the lack of locking protection lets data move out from under the scan.

I suggest reading When Snapshot Isolation Helps and When It Hurts; MSDN recommends using READ COMMITTED SNAPSHOT rather than SNAPSHOT under most circumstances.
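
If you take that advice, turning on row versioning for the default READ COMMITTED level is a one-off database setting. A minimal sketch, assuming a database called MyDb (the WITH ROLLBACK IMMEDIATE clause kicks other connections out so the option can actually be applied):

    -- Readers see the last committed version of a row instead of blocking on writers
    ALTER DATABASE MyDb SET READ_COMMITTED_SNAPSHOT ON WITH ROLLBACK IMMEDIATE;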

OTHER TIPS

Prior to working on Stack Overflow, I was against NOLOCK on the principle that you could potentially perform a SELECT with NOLOCK and get back results with data that may be out of date or inconsistent. A factor to think about is how many records may be inserted/updated at the same time another process may be selecting data from the same table. If this happens a lot, then there's a high probability of deadlocks unless you use a database mode such as READ COMMITTED SNAPSHOT.

I have since changed my perspective on the use of NOLOCK after witnessing how it can improve SELECT performance as well as eliminate deadlocks on a massively loaded SQL Server. There are times that you may not care that your data isn't exactly 100% committed and you need results back quickly even though they may be out of date.

Ask yourself a question when thinking of using NOLOCK:

Does my query include a table that has a high number of INSERT/UPDATE commands and do I care if the data returned from a query may be missing these changes at a given moment?

If the answer is no, then use NOLOCK to improve performance.
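
As a sketch of the kind of query where the answer tends to be "no" (the tables here are invented), think of a widget that reads a hot table and neither needs to block the writers nor be blocked by them:

    -- "Recent orders" widget against a heavily inserted table; rows still in
    -- flight may be missing or uncommitted, and that is acceptable here
    SELECT TOP (20) o.OrderId, o.PlacedAt, c.Name
    FROM dbo.Orders AS o WITH (NOLOCK)
    JOIN dbo.Customers AS c WITH (NOLOCK)
        ON c.CustomerId = o.CustomerId
    ORDER BY o.PlacedAt DESC;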


I just performed a quick search for the NOLOCK keyword within the code base for Stack Overflow and found 138 instances, so we use it in quite a few places.

If you don't care about dirty reads (i.e. in a predominantly READ situation), then NOLOCK is fine.

BUT, be aware that the majority of locking problems are due to not having the 'correct' indexes for your query workload (assuming the hardware is up to the task).

And the guru's explanation was correct. It is usually a band-aid solution to a more serious problem.

Edit: I'm definitely not suggesting that NOLOCK should be used. I guess I should have made that clear. (I would only ever use it in extreme circumstances where I had analysed that it was OK.) As an example, a while back I worked on some T-SQL that had been sprinkled with NOLOCK to try to alleviate locking problems. I removed them all, implemented the correct indexes, and ALL of the deadlocks went away.
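
For what it's worth, a hedged sketch of what "implementing the correct indexes" often looks like in that situation (the names are invented): a covering nonclustered index, so the read is satisfied from the index and stops colliding with writers on the hot base rows.

    -- Covering index for a query that filters on Status and returns a few columns;
    -- the SELECT no longer scans the whole table to find its rows
    CREATE NONCLUSTERED INDEX IX_Orders_Status
        ON dbo.Orders (Status, PlacedAt)
        INCLUDE (CustomerId, Total);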

I doubt it was a "guru" who'd had any experience with high traffic...

Websites are usually "dirty" by the time the person is viewing the completely loaded page. Consider a form that loads from the database and then saves the data once it's been edited. It's idiotic the way people go on about dirty reads being such a no-no.

That said, if you have a number of layers building on your selects, you could be building in a dangerous redundancy. If you're dealing in money or status scenarios, then you need not only transactional data read/writes, but a proper concurrency solution (something most "gurus" don't bother with).

On the other hand, if you have an advanced product search for a website (i.e. something that likely won't be cached and will be a little intensive) and you've ever built a site with more than a few concurrent users (phenomenal how many "experts" haven't), it is ridiculous to bottleneck every other process behind it.

Know what it means and use it when appropriate. Your database will almost always be your main bottleneck these days, and being smart about using NOLOCK can save you thousands in infrastructure.

EDIT: It's not just deadlocks it helps with; it's also how long you are going to make everybody else wait until you're finished, or vice versa.


None of the answers is wrong, but they may be a little confusing.

  • When querying single values/rows it's always bad practice to use NOLOCK: you probably never want to display incorrect information, or maybe even take action on incorrect data.
  • When displaying rough statistical information, NOLOCK can be very useful. Take SO as an example: it would be nonsense to take locks to read the exact number of views of a question, or the exact number of questions for a tag. Nobody cares if you incorrectly state 3360 questions tagged with "sql-server" now and, because of a transaction rollback, 3359 questions one second later. (See the sketch after this list.)
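
A minimal sketch of that second case (the table is an invented stand-in, not SO's real schema):

    -- Approximate count for a tag page; off by a few is fine, blocking is not
    SELECT COUNT(*) AS QuestionCount
    FROM dbo.PostTags WITH (NOLOCK)
    WHERE TagName = N'sql-server';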

As a professional developer I'd say it depends, but I definitely follow GATS and OMG Ponies' advice. Know what you are doing, know when it helps and when it hurts, and

read hints and other poor ideas

which might help you understand SQL Server more deeply. I generally follow the rule that SQL hints are EVIL, but unfortunately I use them every now and then when I get fed up with trying to force SQL Server to do things... But these are rare cases.

luke

When app support wanted to answer ad-hoc queries from the production server using SSMS (queries that weren't catered for via reporting), I requested they use NOLOCK. That way the 'main' business is not affected.

I agree with some comments about the NOLOCK hint, and especially with those saying "use it when it's appropriate". If the application is written poorly and uses concurrency in an inappropriate way, that may cause lock escalation. Highly transactional tables also get locked all the time due to their nature. Good index coverage won't help with retrieving the data, but setting the ISOLATION LEVEL to READ UNCOMMITTED will. I also believe that using the NOLOCK hint is safe in many cases where the nature of the changes is predictable. For example, in manufacturing, when jobs with travellers go through different processes with lots of inserts of measurements, you can safely execute a query against the finished job with the NOLOCK hint and avoid colliding with other sessions that put PROMOTED or EXCLUSIVE locks on the table/page. The data you access in this case is static, but it may reside in a very transactional table with hundreds of millions of records and thousands of updates/inserts per minute. Cheers

I believe that it is virtually never correct to use nolock.

If you are reading a single row, then the correct index means that you won't need NOLOCK as individual row actions are completed quickly.

If you are reading many rows for anything other than temporary display, and you care about being able to repeat the result or defend the number produced, then NOLOCK is not appropriate.

NOLOCK is a surrogate tag for "I don't care if this answer contains duplicate rows, rows which are deleted, or rows which were never inserted to begin with because of a rollback".

Errors which are possible under NOLOCK:

  • Rows which match are not returned at all.
  • Single rows are returned multiple times (including multiple instances of the same primary key).
  • Rows which do not match are returned.

Any action which can cause a page split while the NOLOCK select is running can cause these things to occur. Almost any action (even a delete) can cause a page split.

Therefore: if you "know" that the row won't be changed while you are running, don't use nolock, as an index will allow efficient retrieval.

If you suspect the row can change while the query is running, and you care about accuracy, don't use nolock.

If you are considering NOLOCK because of deadlocks, examine the query plan structure for unexpected table scans, trace the deadlocks and see why they occur. NOLOCK around writes can mean that queries which previously deadlocked will potentially both write the wrong answer.
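
If you do trace the deadlocks, trace flag 1222 writes a detailed deadlock report to the SQL Server error log (newer versions also capture deadlock graphs in the system_health Extended Events session). A minimal sketch:

    -- Instance-wide: log detailed deadlock information to the error log
    DBCC TRACEON (1222, -1);

    -- ... reproduce the deadlock and read the error log, then switch it back off
    DBCC TRACEOFF (1222, -1);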

The better solutions, when possible are:

  • Replicate your data (using log-replication) to a reporting database.
  • Use SAN snapshots and mount a consistent version of the DB
  • Use a database which has a better fundamental transaction isolation level

The SNAPSHOT transaction isolation level was created because MS was losing sales to Oracle. Oracle uses undo/redo logs to avoid this problem. Postgres uses MVCC. In the future MS's Hekaton will use MVCC, but that's years away from being production ready.
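
For reference, a hedged sketch of enabling and using the SNAPSHOT level discussed above (the database name is a placeholder). Readers get a consistent, committed view without dirty reads, paid for with row-version storage in tempdb:

    -- One-off database setting
    ALTER DATABASE MyDb SET ALLOW_SNAPSHOT_ISOLATION ON;

    -- Per-session / per-transaction use
    SET TRANSACTION ISOLATION LEVEL SNAPSHOT;

    BEGIN TRANSACTION;
        SELECT OrderId, Status
        FROM dbo.Orders
        WHERE CustomerId = 42;
    COMMIT;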

NOLOCK is often exploited as a magic way to speed up database reads, but I try to avoid using it wherever possible.

The result set can contain rows that have not yet been committed, that are often later rolled back.

The query can fail with an error, or the result set can be empty, be missing rows, or display the same row multiple times.

This is because other transactions are moving data at the same time you're reading it.

READ COMMITTED adds an additional issue where data is corrupted within a single column where multiple users change the same cell simultaneously.

In real life, where you encounter systems that are already written and where adding indexes to tables drastically slows down the loading of a 14-gig data table, you are sometimes forced to use WITH NOLOCK on your reports and end-of-month processing so that the aggregate functions (sum, count, etc.) do not take row, page, and table locks and deteriorate the overall performance. It's easy to say "in a new system, never use WITH NOLOCK and use indexes instead", but adding indexes severely downgrades data loading. And when I'm then told "well, alter the code base to delete the indexes, then bulk load, then recreate the indexes", that's all well and good if you are developing a new system, but not when you have a system already in place.
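
For reference, the disable/load/rebuild pattern being described looks roughly like the sketch below (table and index names invented). Disabling a nonclustered index keeps its definition in place, so nothing in the code base has to literally delete and recreate it; whether that is workable on a system already in place is, as said above, another matter.

    -- Disable nonclustered indexes that slow the bulk load down
    ALTER INDEX IX_FactSales_LoadDate ON dbo.FactSales DISABLE;

    -- ... bulk load runs here (BULK INSERT / bcp / SSIS) ...

    -- Rebuilding re-enables the index and repopulates it in one pass
    ALTER INDEX IX_FactSales_LoadDate ON dbo.FactSales REBUILD;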

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow