In SQL Server, in layman's terms, how do you lock resources well enough to perform an INSERT-IF-NOT-EXISTS transaction?

https://dba.stackexchange.com/questions/273765

06-03-2021
Question

If someone asks how to perform an INSERT-IF-NOT-EXISTS operation in SQL Server, they'll typically get an answer like this back:

IF NOT EXISTS(SELECT 1 FROM [TheTable] WHERE [ColumnX] = @valX)
    INSERT [TheTable] ([ColumnX]) VALUES (@valX)

The problem I'm seeing with this is that in between the SELECT statement and the INSERT statement, the situation could change externally. Another process could insert the ColumnX value after the SELECT statement, but before the INSERT statement, resulting in an error being raised.

I've worked in software for a while, but am not a DB specialist, and when I search for an answer to this problem in SQL Server, the results I'm seeing are either irrelevant or quite difficult to really apply (because they're either answering a different question or are written in terms tailored to DB specialists).

So in layman's terms, how do you resolve this problem? I've gotten a little rusty with SQL lately, but I'd expect there to be a pragmatic locking mechanism for this (whether or not one actually exists). As a fallback, maybe error handling could detect whether the error raised matches this exact issue and ignore it in that specific case.

Preferably this doesn't involve just locking the whole table every time.

Solution

Option 1: Take out a lock that covers at least the range in the index where the row would exist.

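SET XACT_ABORT ON;  -- a run-time error aborts and rolls back the whole transaction
BEGIN TRAN;
-- UPDLOCK + HOLDLOCK: lock the key range where @valX would go and hold it until COMMIT
IF NOT EXISTS (SELECT 1 FROM [TheTable] WITH (UPDLOCK, HOLDLOCK) WHERE [ColumnX] = @valX)
    INSERT INTO [TheTable] ([ColumnX]) VALUES (@valX);
COMMIT;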

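HOLDLOCK gives serializable semantics: it locks the range between the existing keys in the index where the value would fit, and holds that lock until the transaction ends. This requires a suitable index; without one, it will lock the whole table. UPDLOCK reduces the probability of deadlocks in this pattern, because two concurrent queries can't take out the same range lock in the reading phase. Without it, both sessions could acquire shared range locks during the check and then each block the other's INSERT.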

Option 2: Add a unique constraint on ColumnX, attempt the insert regardless, and catch the error raised by the duplicate key violation.
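
A minimal sketch of what Option 2 could look like, assuming a unique constraint or unique index already exists on ColumnX. The variable type and sample value are placeholders for illustration. Error numbers 2627 (unique constraint violation) and 2601 (duplicate key in a unique index) are the two duplicate-key errors SQL Server raises; anything else is re-thrown.

```sql
-- Hypothetical type and value, for illustration only.
DECLARE @valX int = 42;

BEGIN TRY
    -- Attempt the insert without checking first; the unique constraint
    -- on [ColumnX] (assumed to exist) rejects duplicates.
    INSERT INTO [TheTable] ([ColumnX]) VALUES (@valX);
END TRY
BEGIN CATCH
    -- 2627: violation of a UNIQUE / PRIMARY KEY constraint
    -- 2601: duplicate key row in a unique index
    -- The row is already there, which is the desired end state, so ignore those.
    IF ERROR_NUMBER() NOT IN (2601, 2627)
        THROW;  -- any other error is unexpected: re-raise it to the caller
END CATCH;
```

No explicit lock hints are needed here, because the uniqueness check and the insert happen as a single statement.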

Given that Option 1 needs an index with ColumnX as its leading column anyway to meet your preference of not "locking the whole table every time", you might as well add one and define it as unique. The index will also speed up the existence check. With that in place, I would choose between Options 1 and 2 based on how frequently I expect attempts to insert duplicates.
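
As a rough sketch, such an index could be created like this. The index name is a made-up example, and the statement will fail if the table already contains duplicate ColumnX values.

```sql
-- [UQ_TheTable_ColumnX] is a hypothetical name; pick one that fits your naming scheme.
-- A unique index gives Option 1 a narrow key range to lock
-- and gives Option 2 the duplicate-key error to catch.
CREATE UNIQUE INDEX [UQ_TheTable_ColumnX]
    ON [TheTable] ([ColumnX]);
```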

Licensed under: CC-BY-SA with attribution

Not affiliated with dba.stackexchange