Question

When I import data into a new table from a large Excel file and one record fails, nothing is imported. I think that's fine, because it satisfies the atomicity rule. However, when I fix the error in the source data and import again, the identity column does not start from 1 but from a much larger value.

For example

-- name holds at most 4 characters, so longer values are rejected
create table #test (id int identity(1,1), name varchar(4) default '');

-- fails: the value is too long for varchar(4)
insert into #test (name) values('1 insert will failed');
select ident_current('#test') as ident_current;
-- fails for the same reason
insert into #test (name) values('2 insert will failed');
select ident_current('#test') as ident_current;

-- succeeds: '3 OK' fits in varchar(4)
insert into #test (name) values('3 OK');
select ident_current('#test') as ident_current;

select * from #test;

drop table #test;

Result

id          name 
----------- ---- 
3           3 OK

Wikipedia describes the atomicity part of ACID as follows:

Atomicity

Atomicity requires that each transaction is "all or nothing": if one part of the transaction fails, the entire transaction fails, and the database state is left unchanged. An atomic system must guarantee atomicity in each and every situation, including power failures, errors, and crashes.

So it looks like SQL Server does not leave the database state (the identity value) unchanged when an insert fails. Does this break the ACID rule?

By the way, PostgreSQL does not advance the identity (serial) value when an insert fails. (Update: only sometimes, see comments. Do NOT rely on this.)

test=# create table AutoIncrementTest (id serial not null, name varchar(4));
NOTICE:  CREATE TABLE will create implicit sequence "autoincrementtest_id_seq" for serial column "autoincrementtest.id"
CREATE TABLE
test=# insert into autoincrementtest(name) values('12345');
ERROR:  value too long for type character varying(4)
test=# insert into autoincrementtest(name) values('12345');
ERROR:  value too long for type character varying(4)
test=# insert into autoincrementtest(name) values('1234');
INSERT 0 1
test=# select * from autoincrementtest;
 id | name
----+------
  1 | 1234

Solution

Since the identity value is not something that's physically stored in any part of the database that you can access, I disagree that this breaks atomicity. If you don't want to "break atomicity", or if you care about gaps (you shouldn't), there are other ways to do this (e.g. use a serializable transaction and take MAX(col)+1 for the new row).
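As a rough illustration of the MAX(col)+1 idea, here is a minimal sketch in Python. It uses the standard library's sqlite3 module as a stand-in, which is an assumption made only so the example runs on its own; it is not SQL Server. In SQL Server the same pattern would need the serializable isolation level mentioned above. The table layout, the CHECK constraint that mimics varchar(4), and the helper name insert_gap_free are all hypothetical.

```python
import sqlite3

def insert_gap_free(conn, name):
    """Insert `name` with id = MAX(id) + 1, all inside one transaction."""
    # BEGIN IMMEDIATE takes SQLite's write lock up front, so no other writer
    # can read the same MAX(id) before this transaction finishes.
    conn.execute("BEGIN IMMEDIATE")
    try:
        (next_id,) = conn.execute(
            "SELECT COALESCE(MAX(id), 0) + 1 FROM test"
        ).fetchone()
        conn.execute("INSERT INTO test (id, name) VALUES (?, ?)", (next_id, name))
        conn.execute("COMMIT")
        return next_id
    except sqlite3.Error:
        # A failed insert is rolled back; no id was stored, so none is consumed.
        conn.execute("ROLLBACK")
        raise

# isolation_level=None: the module does not open transactions on its own,
# so the explicit BEGIN/COMMIT/ROLLBACK above are the only ones issued.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute(
    # The CHECK constraint stands in for varchar(4) (assumption for this sketch).
    "CREATE TABLE test (id INTEGER PRIMARY KEY, name TEXT CHECK (length(name) <= 4))"
)

for name in ["1 insert will fail", "2 insert will fail", "3 OK"]:
    try:
        insert_gap_free(conn, name)
    except sqlite3.IntegrityError:
        pass  # row rejected, nothing stored

rows = conn.execute("SELECT id, name FROM test").fetchall()
print(rows)  # [(1, '3 OK')]
```

The trade-off is that every insert has to wait for the previous one to commit or roll back, which is exactly the serialization an identity column avoids.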

OTHER TIPS

Yes, it does, so don't rely on contiguous values in MS SQL Server.

I would suggest that relying on contiguous identity values per se, with any engine, is a brittle and naive approach. Gaps can always appear as the result of subsequent deletes.

I suppose this deviation from purist ACID compliance allows a performance optimization in MS SQL Server.

Atomicity guarantees, according to this formulation, that the database state is left unchanged. The question is what we mean by the database state.

As long as you understand that the SQL concept of an identity column neither claims nor guarantees that the values will be sequential, there is no issue. It does require rethinking what SQL guarantees for identity columns, but since we know that an insert can fail as in the case mentioned, the value was never really guaranteed to be the NEXT value.

Before the insert, the 'next' value of the identity column is only guaranteed to be greater than the current value - not that it is the next value. This is still the state afterwards.
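To make that invariant concrete, here is a toy model in Python. It is only a sketch of the idea and says nothing about how SQL Server implements identity columns internally; the ToyTable class and its attributes are hypothetical. The counter sits outside the state that a failed insert undoes, so it only ever moves forward.

```python
import itertools

class ToyTable:
    """Toy model: the id counter lives outside the rows that a failure undoes."""

    def __init__(self, max_len):
        self._ids = itertools.count(1)  # monotonically increasing generator
        self.last_id = 0                # last value handed out, used or not
        self.rows = []
        self.max_len = max_len

    def insert(self, name):
        # The id is reserved first and is never handed back.
        self.last_id = next(self._ids)
        if len(name) > self.max_len:
            # The row is rejected, but the counter has already advanced.
            raise ValueError("value too long")
        self.rows.append((self.last_id, name))

table = ToyTable(max_len=4)
for name in ["1 insert will fail", "2 insert will fail", "3 OK"]:
    before = table.last_id
    try:
        table.insert(name)
    except ValueError:
        pass
    # The only guarantee: the counter is greater than it was before.
    assert table.last_id > before

print(table.rows)  # [(3, '3 OK')]
```

In this model the stored rows are all-or-nothing, while the counter only promises "greater than before", which matches the result shown in the question.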

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow