Question

Is it OK to swallow duplicate key violation exceptions for INSERTs, or should you check whether the record exists first?

So let's say that I have a table Photo with one field: PhotoName. I'm looking through a file directory to add items to Photo. In the process, it's possible that when I find a photoname, it might already be in the database. So there are two ways to go about this:

1) //Look to see if it exists before adding it. Only add it if it does not exist.

bool photoExists = SQLSELECTStatementToCheckIfThePhotoExists(photoName);
if(!photoExists)
  SQLCommandToInsertPhoto(photoName);

or 2) //Assume that it doesn't exist. If it does, catch and ignore.
try
{
  SQLCommandToInsertPhoto(photoName);
}
catch(DuplicateKeyException ex)
{
  //swallow it and continue on as if nothing happened.
}

On the one hand, I don't necessarily like the notion of just "swallowing" an exception, but on the other hand, try...catch uses only one call to the DB. This happens to be in SQL Server.


Solution

You should definitely not just "swallow" the exception. Instead, you should detect these duplicates up front and skip inserting them.

One method is to filter with WHERE NOT EXISTS on the key.

INSERT INTO TargetTable
SELECT
    S.KeyID,
    S.blah,
    S.blerg
FROM SourceTable AS S
WHERE NOT EXISTS (
    SELECT 1
    FROM TargetTable AS T
    WHERE S.KeyID = T.KeyID
    );

This method will allow you to INSERT only new rows into the table. It does not, of course, account for any matching you may need to do for an UPDATE. That is outside the scope of this question, but it should still be thought about. Most users could also use MERGE; I'll post an example of that when I get time.
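For reference, here is a minimal sketch of what a MERGE version of the same insert-only logic might look like. It reuses the TargetTable and SourceTable names from above and assumes KeyID is unique within SourceTable. The HOLDLOCK hint is an added assumption, commonly recommended to guard against concurrent inserts, and is not something the statement above used.

```sql
-- Insert-only MERGE: rows from SourceTable that have no match in TargetTable are added,
-- matching rows are left untouched (there is no WHEN MATCHED clause).
-- HOLDLOCK (an assumption here) keeps the key range locked for the duration of the statement.
MERGE INTO TargetTable WITH (HOLDLOCK) AS T
USING SourceTable AS S
    ON T.KeyID = S.KeyID
WHEN NOT MATCHED BY TARGET THEN
    INSERT (KeyID, blah, blerg)
    VALUES (S.KeyID, S.blah, S.blerg);
```

Note that MERGE must be terminated with a semicolon, and if SourceTable itself contains duplicate KeyID values the statement can still raise a key violation.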

OTHER TIPS

It can be very expensive to let SQL Server raise exceptions (even if you only swallow them) - see here and here.

So my suggestion is to check for the violation first, and only insert if you have to. However, I wouldn't separate these out into separate statements, especially in completely separate round-trips to the app, as you can have this scenario:

-- connection A, at 12:00:00.0000001:

SELECT 1 FROM dbo.[TABLE] WHERE [key] = 'x'; -- 0 rows returned

-- connection B, at 12:00:00.0000002:

SELECT 1 FROM dbo.[TABLE] WHERE [key] = 'x'; -- 0 rows returned

-- connection A, at 12:00:00.0000003:

INSERT dbo.[TABLE]([key]) VALUES('x'); -- succeeds

-- connection B, at 12:00:00.0000004:

INSERT dbo.[TABLE]([key]) VALUES('x'); -- fails

I would rather do this in a single INSERT ... WHERE NOT EXISTS statement, as @Zane's answer demonstrates, though I would add higher isolation on the SELECT portion (for example, via locking hints). Or you could use an INSTEAD OF INSERT trigger to just bail from the insert if a key violation is spotted (I wrote about this here). As an aside, I'd use extreme caution with MERGE - see this article for my reasoning and some other opinions, too.
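A minimal sketch of the single-statement approach with stricter locking on the existence check might look like this. The dbo.Photo table and PhotoName column come from the question; the nvarchar(260) type, the sample value, and the choice of the UPDLOCK and SERIALIZABLE hints are assumptions made for illustration.

```sql
-- Hypothetical parameter; in practice this would be supplied by the application.
DECLARE @PhotoName nvarchar(260) = N'beach.jpg';

-- Check and insert in one statement. UPDLOCK + SERIALIZABLE take a key-range lock
-- on the value being checked, so a second session running the same statement
-- waits instead of also seeing "no row" and then failing on the insert.
INSERT INTO dbo.Photo (PhotoName)
SELECT @PhotoName
WHERE NOT EXISTS (
    SELECT 1
    FROM dbo.Photo WITH (UPDLOCK, SERIALIZABLE)
    WHERE PhotoName = @PhotoName
);
```

The range lock is only cheap when PhotoName is indexed; without an index the hints can end up locking far more of the table than the single key.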

Answers using IF NOT EXISTS, EXCEPT, MERGE, etc. are not correct. They are not safe to use under concurrent load. Your code may work correctly for months and then suddenly one day: boom! You need something which guarantees an atomic action.

Let's review the current state, year 2022, while comparing MS SQL Server to some immediate competitors:

PostgreSQL: Native support: INSERT INTO ... ON CONFLICT DO NOTHING

MySQL: Native support: INSERT IGNORE INTO ...

MS SQL Server: well, hmm, no native support, and it quickly gets ugly. The closest thing to native support is the IGNORE_DUP_KEY option on the index, but since you set it on the index, you cannot apply it on a per-statement basis. You can run the statement inside a transaction and use SQL Server's lock hints, e.g. UPDLOCK and SERIALIZABLE, or a MERGE with the HOLDLOCK hint, but it feels a bit like you need a PhD in MS SQL Server, and I think there's too much of a risk of crafting the statement wrong.
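To make the IGNORE_DUP_KEY option concrete, here is a small sketch using the Photo table from the question. The column type, the index name UX_Photo_PhotoName, and the sample value are assumptions for illustration only.

```sql
-- Hypothetical definition of the table from the question.
CREATE TABLE dbo.Photo
(
    PhotoName nvarchar(260) NOT NULL
);

-- The option lives on the unique index, so it affects every INSERT into the table.
CREATE UNIQUE INDEX UX_Photo_PhotoName
    ON dbo.Photo (PhotoName)
    WITH (IGNORE_DUP_KEY = ON);

INSERT INTO dbo.Photo (PhotoName) VALUES (N'beach.jpg'); -- row is added
INSERT INTO dbo.Photo (PhotoName) VALUES (N'beach.jpg'); -- duplicate is discarded with a warning, not an error
```

This also shows the drawback mentioned above: once the index is created this way, no individual statement can opt back in to getting an error on duplicates.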

Conclusion for MS SQL Server: In the end doing a try-catch and ignoring a specific error isn't a bad idea at all. There may be solutions which perform slightly better if you need to perform a million inserts in one go, but that hasn't been my concern.
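If you go the try-catch route, one way to make sure you ignore only the duplicate key error and nothing else is to check the error number. The sketch below does this on the T-SQL side; the dbo.Photo table and sample value are assumptions taken from the question's scenario. Error 2627 is a PRIMARY KEY or UNIQUE constraint violation and 2601 is a duplicate key in a unique index.

```sql
-- Hypothetical parameter; in practice this would be supplied by the application.
DECLARE @PhotoName nvarchar(260) = N'beach.jpg';

BEGIN TRY
    INSERT INTO dbo.Photo (PhotoName) VALUES (@PhotoName);
END TRY
BEGIN CATCH
    -- Swallow only duplicate key errors (2627, 2601); re-raise everything else.
    IF ERROR_NUMBER() NOT IN (2627, 2601)
        THROW;
END CATCH;
```

The same idea applies in application code: catch the database exception, inspect its error number, and rethrow unless it is one of these two.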

YMMV. My basic point is that you need to pay attention to concurrency.

This article is quite informative, IMO.

Licensed under: CC-BY-SA with attribution