What are the performance differences of inserting a bunch of data with one INSERT versus a bunch of INSERTs in a transaction?

https://dba.stackexchange.com/questions/242700

06-02-2021

Question

I'm trying to understand the performance differences between something like

INSERT INTO Person (id, name) VALUES
(1, 'Kevin'),
(2, 'John'),
(3, 'Jane'),
...

and

BEGIN TRANSACTION;

INSERT INTO Person (id, name) VALUES (1, 'Kevin');
INSERT INTO Person (id, name) VALUES (2, 'John');
INSERT INTO Person (id, name) VALUES (3, 'Jane');
...

END TRANSACTION;

My understanding is that indexes are built temporarily during a transaction, but I'm not sure about that, and I don't know what other performance differences there are between the two.

Solution

Using a transaction eliminates the most expensive operation, the per-statement transaction log flush (the log is flushed once at commit instead), but other per-statement costs can still make a multi-row insert more efficient than single-row inserts.

In SQL Server, a multi-row insert may modify indexes and insert rows more efficiently, because the locking and latching the operation requires are taken on a statement-by-statement rather than a row-by-row basis. There is also some overhead for running each statement, as T-SQL is (by default) an interpreted language.
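The three approaches discussed here (one autocommitted statement per row, single-row statements inside one transaction, and multi-row statements) can be compared with a rough timing sketch. It is only an illustration under stated assumptions: it uses Python's standard `sqlite3` module and an on-disk SQLite file rather than SQL Server, and the helper names (`open_db`, `insert_autocommit`, `insert_in_transaction`, `insert_multi_row`, `time_strategy`) and the chunk size are hypothetical. Absolute timings depend heavily on the disk and on engine settings.

```python
import os
import sqlite3
import tempfile
import time

# Assumption: 100 rows per statement keeps the number of bound parameters
# comfortably below SQLite's per-statement limits.
ROWS_PER_STATEMENT = 100


def open_db(path):
    # isolation_level=None selects autocommit mode, so transactions exist
    # only where BEGIN/COMMIT are issued explicitly below.
    conn = sqlite3.connect(path, isolation_level=None)
    conn.execute("CREATE TABLE Person (id INTEGER PRIMARY KEY, name TEXT)")
    return conn


def insert_autocommit(conn, rows):
    # Every statement is its own transaction, so each one is committed
    # (and made durable) separately.
    for row in rows:
        conn.execute("INSERT INTO Person (id, name) VALUES (?, ?)", row)


def insert_in_transaction(conn, rows):
    # One commit for all rows, but still one statement per row.
    conn.execute("BEGIN")
    for row in rows:
        conn.execute("INSERT INTO Person (id, name) VALUES (?, ?)", row)
    conn.execute("COMMIT")


def insert_multi_row(conn, rows):
    # One commit and far fewer statements: each INSERT carries many rows.
    conn.execute("BEGIN")
    for start in range(0, len(rows), ROWS_PER_STATEMENT):
        chunk = rows[start:start + ROWS_PER_STATEMENT]
        placeholders = ", ".join(["(?, ?)"] * len(chunk))
        params = [value for row in chunk for value in row]
        conn.execute(
            f"INSERT INTO Person (id, name) VALUES {placeholders}", params
        )
    conn.execute("COMMIT")


def time_strategy(strategy, n=500):
    """Run one strategy against a fresh database file.

    Returns (elapsed_seconds, rows_inserted).
    """
    rows = [(i, f"name{i}") for i in range(1, n + 1)]
    with tempfile.TemporaryDirectory() as tmp:
        conn = open_db(os.path.join(tmp, "bench.db"))
        try:
            started = time.perf_counter()
            strategy(conn, rows)
            elapsed = time.perf_counter() - started
            count = conn.execute("SELECT COUNT(*) FROM Person").fetchone()[0]
        finally:
            conn.close()
    return elapsed, count


if __name__ == "__main__":
    for strategy in (insert_autocommit, insert_in_transaction, insert_multi_row):
        elapsed, count = time_strategy(strategy)
        print(f"{strategy.__name__}: {count} rows in {elapsed:.4f}s")
```

Only the relative ordering of the three timings is informative. To measure SQL Server itself, the same three statement shapes would need to be run against a real instance.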

Also, referential integrity constraints are checked and triggers are fired after each statement, not at the end of the transaction. So there can even be scenarios where inserting multiple rows in a single statement behaves differently from using multiple statements. For example:

use tempdb
go

-- Two rows whose tid values point at each other's id
drop table if exists t
create table t(id int primary key, tid int not null)
insert into t(id,tid) values (1,2),(2,1)
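The statement-level checking described above can be reproduced in a small sketch. It makes several assumptions: it uses Python's standard `sqlite3` module and SQLite rather than SQL Server, it adds a hypothetical self-referencing foreign key (`tid` must match some row's `id`) to a table shaped like `t`, and the names `new_table`, `run_in_transaction`, `SINGLE_ROW`, and `MULTI_ROW` are illustrative. SQLite, like SQL Server, checks an immediate foreign key at the end of each statement, so the same pair of rows is rejected when inserted one at a time and accepted when inserted together.

```python
import sqlite3

# The same two mutually referencing rows, written two ways.
SINGLE_ROW = [
    "INSERT INTO t (id, tid) VALUES (1, 2)",
    "INSERT INTO t (id, tid) VALUES (2, 1)",
]
MULTI_ROW = [
    "INSERT INTO t (id, tid) VALUES (1, 2), (2, 1)",
]


def new_table():
    # Autocommit mode: transactions are opened and closed explicitly.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    # SQLite enforces foreign keys only when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    # Assumed constraint for illustration: tid references this table's id.
    conn.execute(
        "CREATE TABLE t ("
        " id INTEGER PRIMARY KEY,"
        " tid INTEGER NOT NULL REFERENCES t(id))"
    )
    return conn


def run_in_transaction(statements):
    """Run the statements in one transaction and report what happened."""
    conn = new_table()
    try:
        conn.execute("BEGIN")
        for sql in statements:
            conn.execute(sql)
        conn.execute("COMMIT")
        return "succeeded"
    except sqlite3.IntegrityError as exc:
        # The failing statement is undone, but the transaction is still
        # open, so it has to be rolled back explicitly.
        conn.execute("ROLLBACK")
        return f"failed: {exc}"
    finally:
        conn.close()


if __name__ == "__main__":
    print("single-row statements:", run_in_transaction(SINGLE_ROW))
    print("multi-row statement:  ", run_in_transaction(MULTI_ROW))
```

With single-row statements, the first insert already violates the constraint when it completes, because row 2 does not exist yet. The multi-row statement is checked only once both rows are in place.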

Licensed under: CC-BY-SA with attribution

Not affiliated with dba.stackexchange