What is the idiomatic solution in SQL Server for reserving a block of ids for use in a bulk insert?
14-02-2021
Question
I have a table with an identity column and I want to reserve a block of ids which I can use for bulk inserting, whilst allowing inserts to still happen into that table.
Note this is part of a bulk insert of several tables, where those other tables relate to these ids via an FK. Therefore I need to block them out so I can prepare the relationships beforehand.
I've found a solution which works by taking a lock on the table in a transaction and then reseeding the identity (which is pretty fast). But it looks a bit hacky to me. Is there a generally accepted pattern for doing this? Here's the table:
create table dbo.test
(
    id bigint not null primary key identity(1,1),
    SomeColumn nvarchar(100) not null
)
Here's the code to block out (make room for) some ids:
declare @numRowsToMakeRoomFor int = 100
BEGIN TRANSACTION;
SELECT MAX(Id) FROM dbo.test WITH ( XLOCK, TABLOCK ) -- will exclusively lock the table whilst this tran is in progress,
--another instance of this query will not be able to pass this line until this instance commits
--get the next id in the block to reserve
DECLARE @firstId BIGINT = (SELECT IDENT_CURRENT( 'dbo.test' ) +1);
--calculate the block range
DECLARE @lastId BIGINT = @firstId + (@numRowsToMakeRoomFor -1);
--reseed the table
DBCC CHECKIDENT ('dbo.test',RESEED, @lastId);
COMMIT TRANSACTION;
select @firstId;
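For illustration, a block reserved this way could then be filled by inserting rows with explicit ids. This is only a minimal sketch: the staging table #staged and the hard-coded @firstId value are assumptions made for the example, not part of the real process.

```sql
-- Hypothetical staging data; in practice this would come from the batch process.
CREATE TABLE #staged
(
    RowNum int NOT NULL PRIMARY KEY,   -- 0-based offset within the reserved block
    SomeColumn nvarchar(100) NOT NULL
);
INSERT INTO #staged (RowNum, SomeColumn) VALUES (0, N'first'), (1, N'second');

DECLARE @firstId bigint = 1001;  -- assumed: the value returned by the reservation step

-- Explicit values for an identity column require IDENTITY_INSERT,
-- which can be ON for only one table per session at a time.
SET IDENTITY_INSERT dbo.test ON;

-- A column list is mandatory while IDENTITY_INSERT is ON.
INSERT INTO dbo.test (id, SomeColumn)
SELECT @firstId + RowNum, SomeColumn
FROM #staged;

SET IDENTITY_INSERT dbo.test OFF;
```

Because every id is known before the insert, the rows for the related tables can be prepared with matching FK values in the same way.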
My code batch processes the data in chunks of about 1000 rows, and I have about a billion rows to insert in total. Everything is working fine and the database isn't the bottleneck. The batch processing itself is computationally expensive and requires me to add a couple of servers to run in parallel, so I need to accommodate more than one process "batch inserting" at the same time.
Solution
You can use the system stored procedure sp_sequence_get_range (introduced in SQL Server 2012).
To use it, you need to create a SEQUENCE object and use it as the column's default value instead of an IDENTITY column.
Here is an example:
CREATE SCHEMA Test ;
GO
CREATE SEQUENCE Test.RangeSeq
AS int
START WITH 1
INCREMENT BY 1
CACHE 10
;
CREATE TABLE Test.ProcessEvents
(
EventID int PRIMARY KEY CLUSTERED
DEFAULT (NEXT VALUE FOR Test.RangeSeq),
EventTime datetime NOT NULL DEFAULT (getdate()),
EventCode nvarchar(5) NOT NULL,
Description nvarchar(300) NULL
) ;
DECLARE
@range_first_value_output sql_variant ;
EXEC sp_sequence_get_range
@sequence_name = N'Test.RangeSeq'
, @range_size = 4
, @range_first_value = @range_first_value_output OUTPUT ;
SELECT @range_first_value_output;
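As a rough sketch of how the reserved range might then be used for a batch, the following also captures the last value of the range and converts the sql_variant output before computing explicit ids. The staging table #events and the range size of 1000 are assumptions for illustration, and the arithmetic relies on the sequence incrementing by 1 as defined above.

```sql
DECLARE @first sql_variant, @last sql_variant;

EXEC sys.sp_sequence_get_range
      @sequence_name     = N'Test.RangeSeq'
    , @range_size        = 1000
    , @range_first_value = @first OUTPUT
    , @range_last_value  = @last  OUTPUT;

-- The outputs are sql_variant; convert them to the sequence's data type (int here).
DECLARE @firstId int = CONVERT(int, @first);

-- Hypothetical staging table: RowNum is a 0-based offset within the reserved range.
CREATE TABLE #events
(
    RowNum int NOT NULL PRIMARY KEY,
    EventCode nvarchar(5) NOT NULL
);
INSERT INTO #events (RowNum, EventCode) VALUES (0, N'A'), (1, N'B');

-- Explicit EventID values override the column default,
-- so no IDENTITY_INSERT switch is needed with a sequence.
INSERT INTO Test.ProcessEvents (EventID, EventCode)
SELECT @firstId + RowNum, EventCode
FROM #events;

SELECT @first AS FirstReserved, @last AS LastReserved;
```

Each parallel process gets its own non-overlapping range from the procedure, and ordinary inserts that rely on the default keep working in the meantime. Any values in a range that are not used simply remain as gaps.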
Documentation: sp_sequence_get_range