Question

What is the best, DBMS-independent way of generating an ID number that will be used immediately in an INSERT statement, keeping the IDs roughly in sequence?


Solution

DBMS independent? That's a problem. The two most common methods are auto incrementing columns, and sequences, and most DBMSes do one or the other but not both. So the database independent way is to have another table with one column with one value that you lock, select, update, and unlock.

Usually I say "to hell with DBMS independence" and do it with sequences in PostgreSQL, or autoincrement columns in MySQL. For my purposes, supporting both is better than trying to find one way that works everywhere.

OTHER TIPS

If you can create a Globally Unique Identifier (GUID) in your chosen programming language - consider that as your id.

They are harder to work with when troubleshooting (it is much easier to type in a where condition that is an INT) but there are also some advantages. By assigning the GUID as your key locally, you can easily build parent-child record relationships without first having to save the parent to the database and retrieve the id. And since the GUID, by definition, is unique, you don't have to worry about incrementing your key on the server.
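As a minimal sketch of the parent-child point, assuming Python with its standard uuid and sqlite3 modules (the table names, column names, and the new_id helper are hypothetical, not from the answer above):

```python
import sqlite3
import uuid


def new_id() -> str:
    # Random (version 4) UUID generated on the client; no database round trip.
    return str(uuid.uuid4())


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE parent (id TEXT PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE child (id TEXT PRIMARY KEY, "
    "parent_id TEXT REFERENCES parent(id), name TEXT)"
)

# The keys exist before anything is saved, so the child rows can point at
# the parent without first inserting it and reading back a generated id.
parent_id = new_id()
children = [(new_id(), parent_id, f"child {i}") for i in range(3)]

with conn:  # commits on success, rolls back on error
    conn.execute("INSERT INTO parent (id, name) VALUES (?, ?)", (parent_id, "parent"))
    conn.executemany(
        "INSERT INTO child (id, parent_id, name) VALUES (?, ?, ?)", children
    )

linked = conn.execute(
    "SELECT COUNT(*) FROM child WHERE parent_id = ?", (parent_id,)
).fetchone()[0]
```

Note that random UUIDs carry no ordering, so this gives up the "roughly in sequence" requirement from the question.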

There is auto increment or sequence.

What is the point of this? It is the least of your worries.

How will you handle SQL itself? MySQL has LIMIT, SQL Server has TOP, Oracle has RANK.

Then there are a million other things like triggers, ALTER TABLE syntax, etc.

Yep, the obvious ways in raw SQL (and in my order of preference) are a) sequences b) auto-increment fields. The better, more modern, more DBMS-independent way is to not touch SQL at all, but to use a (good) ORM.

There's no universal way to do this. If there were, everyone would use it. SQL by definition abhors the idea - it's an antipattern for set-based logic (although a useful one, in many real-world cases).

The biggest problem you'd have trying to interpose an identity value from elsewhere is when a SQL statement involves several records, and several values must be generated simultaneously.

If you need it, then make it part of your selection requirements for a database to use with your application. Any serious DBMS product will provide its own mechanism to use, and it's easy enough to code around the differences in DML. The variations are pretty much all in the DDL.

I'd always go for the DB-specific solution, but if you really have to, the usual way of doing this is to implement your own sequence. Your RDBMS has to support transactions.

You create a sequence table which contains an int column and seed it with the first number. Your transaction logic then looks something like this:

begin transaction
update tblSeq set intID = intID + 1
select @myID = intID from tblSeq

insert into tblData (intID, ...) values (@myID, ...)
commit transaction

The transaction forces a write lock such that the next queued insert cannot update the tblSeq value before the record has been inserted into tblData. So long as all inserts go through this transaction, your generated IDs are in sequence.
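The pseudocode above could be sketched roughly as follows, assuming Python's standard sqlite3 module as a stand-in database. The insert_with_next_id helper and the payload column are hypothetical, and BEGIN IMMEDIATE is SQLite-specific; other databases take the write lock in their own way.

```python
import sqlite3

# isolation_level=None puts the connection in autocommit mode, so the
# explicit BEGIN / COMMIT statements below are the only transaction boundaries.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE tblSeq (intID INTEGER NOT NULL)")
conn.execute("INSERT INTO tblSeq (intID) VALUES (0)")  # seed value
conn.execute("CREATE TABLE tblData (intID INTEGER PRIMARY KEY, payload TEXT)")


def insert_with_next_id(conn: sqlite3.Connection, payload: str) -> int:
    # BEGIN IMMEDIATE acquires SQLite's write lock up front, so a second
    # writer waits until this transaction commits or rolls back.
    conn.execute("BEGIN IMMEDIATE")
    try:
        conn.execute("UPDATE tblSeq SET intID = intID + 1")
        (my_id,) = conn.execute("SELECT intID FROM tblSeq").fetchone()
        conn.execute(
            "INSERT INTO tblData (intID, payload) VALUES (?, ?)", (my_id, payload)
        )
        conn.execute("COMMIT")
        return my_id
    except Exception:
        # Undo the increment too, so a failed insert does not consume an id.
        conn.execute("ROLLBACK")
        raise


ids = [insert_with_next_id(conn, f"row {i}") for i in range(3)]
```

Because the increment and the insert share one transaction, a failed insert rolls the counter back as well, at the cost of serializing every insert through the single tblSeq row.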

Use an auto-incrementing id column.

Is there really a reason that they have to be in sequence? If you're just using it as an ID, then you should be able to use part of a UUID or the first couple of digits of md5(now()).

You could take the time and massage it. It'd be the equivalent of something like

DateTime.Now.Ticks

So it would be something like YYYYMMDDHHMMSSSS
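A rough Python equivalent of that idea, assuming the last two digits of the format are hundredths of a second (the answer does not say, and the time_based_id function name is hypothetical):

```python
from datetime import datetime


def time_based_id(now: datetime) -> int:
    # YYYYMMDDHHMMSS followed by two digits for hundredths of a second.
    hundredths = now.microsecond // 10_000
    return int(now.strftime("%Y%m%d%H%M%S") + f"{hundredths:02d}")


# A fixed timestamp keeps the example reproducible; real use would pass
# datetime.now().
example = time_based_id(datetime(2024, 1, 2, 3, 4, 5, 670_000))
```

IDs built this way sort by time, but two calls within the same hundredth of a second produce the same value, so on its own this does not guarantee uniqueness.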

It may be a bit of a lateral approach, but a good ORM-type library will probably be able to at least hide the differences. For example, in Ruby there is ActiveRecord (commonly used in, but not exclusively tied to, the Ruby on Rails web framework), which has Migrations. Within a table definition, which is declared in platform-agnostic code, implementation details such as datatypes, sequential id generation and index creation are pushed down below your vision.

I have transparently developed a schema on SQLite, then implemented it on MS SQL Server and later ported it to Oracle, without ever changing the code that generates my schema definition.

As I say, it may not be what you're looking for, but the easiest way to encapsulate what varies is to use a library that has already done the encapsulation for you.

With only SQL, the following could be one of the approaches:

  1. Create a table to contain the starting id for your needs
  2. When the application is deployed for the first time, it should read the value into its context.
  3. Thereafter, increment the id (in a thread-safe fashion) as required, and either:
     3.1 Write the id to the database (in a thread-safe fashion), so that it always holds the current value, or
     3.2 Don't write it to the database; just keep incrementing it in memory (in a thread-safe manner)
  4. If the server is going down for any reason, write the current id value to the database
  5. When the server is up again, it will pick up from where it left off last time.
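
The steps above (using option 3.2) might look something like this in Python, as a sketch only. The InMemoryIdGenerator class, the id_seed table, and the seed value 100 are all hypothetical, and sqlite3 stands in for whatever database is in use.

```python
import sqlite3
import threading


class InMemoryIdGenerator:
    """Hypothetical helper: reads the last id at startup, then counts in memory."""

    def __init__(self, conn: sqlite3.Connection) -> None:
        self._conn = conn
        self._lock = threading.Lock()
        # Step 2: read the stored value into the application's context.
        (self._current,) = conn.execute("SELECT last_id FROM id_seed").fetchone()

    def next_id(self) -> int:
        # Step 3.2: increment in memory only, guarded by a lock.
        with self._lock:
            self._current += 1
            return self._current

    def persist(self) -> None:
        # Step 4: write the current value back, e.g. during shutdown.
        with self._lock, self._conn:
            self._conn.execute("UPDATE id_seed SET last_id = ?", (self._current,))


# Step 1: a table holding the starting id.
conn = sqlite3.connect(":memory:", check_same_thread=False)
conn.execute("CREATE TABLE id_seed (last_id INTEGER NOT NULL)")
conn.execute("INSERT INTO id_seed (last_id) VALUES (100)")
conn.commit()

gen = InMemoryIdGenerator(conn)
first_batch = [gen.next_id() for _ in range(3)]
gen.persist()

# Step 5: a fresh instance (simulating a restart) picks up where it left off.
restarted = InMemoryIdGenerator(conn)
next_after_restart = restarted.next_id()
```

With option 3.2, a crash before persist() runs would cause ids to be reissued after the restart, and the scheme assumes only one application instance hands out ids; option 3.1 avoids the first problem at the cost of a write per id.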
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow