The main problem here is that the incremental cost of detecting collisions from the generated string, and try again, increases as you generate more and more strings (since you have to read all of those strings to make sure you didn't generate a duplicate). At the same time, the odds of hitting a duplicate goes up, meaning the bigger the table gets, the slower this process will get.
Why do you need to generate the unique string at runtime? Build them all in advance. This article and this post are about random numbers, but the basic concept is the same. You build up a set of unique strings and pull one off the stack when you need one. Your chance of collisions stays constant at 0% throughout the lifetime of the application (provided you build up a stack of enough unique values). Pay for the cost of collisions up front, in your own setup, instead of incrementally over time (and at the cost of a user waiting for those attempts to finally yield a unique number).
This will generate 100,000 unique 5-character strings, at the low, one-time cost of about 1 second (on my machine):
;WITH
a(a) AS
(
SELECT TOP (26) number + 65 FROM master..spt_values
WHERE type = N'P' ORDER BY number
),
b(a) AS
(
SELECT TOP (10) a FROM a ORDER BY NEWID()
)
SELECT DISTINCT CHAR(b.a) + CHAR(c.a) + CHAR(d.a) + CHAR(e.a) + CHAR(f.a)
FROM b, b AS c, b AS d, b AS e, b AS f;
That's not enough? You can generate about 1.12 million unique values by changing TOP (10)
to TOP (20)
. This took 18 seconds. Still not enough? TOP (24)
will give you just under 8 million in about 2 minutes. It will get exponentially more expensive as you generate more strings, because that DISTINCT
has to do the same duplicate checking you want to do every single time you add a customer.
So, create a table:
CREATE TABLE dbo.StringStack
(
ID INT IDENTITY(1,1) PRIMARY KEY,
String CHAR(5) NOT NULL UNIQUE
);
Insert that set:
;WITH
a(a) AS
(
SELECT TOP (26) number + 65 FROM master..spt_values
WHERE type = N'P' ORDER BY number
),
b(a) AS
(
SELECT TOP (10) a FROM a ORDER BY NEWID()
)
INSERT dbo.StringStack(String)
SELECT DISTINCT CHAR(b.a) + CHAR(c.a) + CHAR(d.a) + CHAR(e.a) + CHAR(f.a)
FROM b, b AS c, b AS d, b AS e, b AS f;
And then just create a procedure that pops one off the stack when you need it:
CREATE PROCEDURE dbo.AddCustomer
@CustomerName VARCHAR(64) /* , other params */
AS
BEGIN
SET NOCOUNT ON;
DELETE TOP (1) dbo.StringStack
OUTPUT deleted.String, @CustomerName /* , other params */
INTO dbo.Customers(CustomerID, CustomerName /*, ...other columns... */);
END
GO
No silly looping, no needing to check if the CustomerID
you generated just exists, etc. The only additional thing you'll want to build is some type of check that notifies you when you're getting low.
As an aside, these are terrible identifiers for a CustomerID. What is wrong with a sequential surrogate key, like an IDENTITY column? How is a 5-digit random string with all this effort involved, any better than a unique number the system can generate for you much more easily?