سؤال

In an app, a user can initiate the generation a six-digit alphanumeric code (actually done via Web API). The code needs to be checked against a remote database to ensure it is unique. Other than for diagnostics or logging, the DB table is only used by the Web API and an Azure web job that cleans out expired codes.

In my tests, collisions occurred on average only one in ~24K times with a high of 85K and a low of 170. It is the latter that concerns me. Although the table is emptied of expired codes daily, at some point there will be a collision. Although unlikely, it could happen with very few records in the table.

It is worth noting that because of exterior requirement, the code is limited to six characters from AEFGHJKLPQRSTUWXYZ123456789.

A simplistic solution would be to keep running the generation/check until a unique code is created. In all likelihood, this would only mean 1-2 iterations. But there's no guarantee. And certainly it can't run unfettered.

The next step would be to put a limit - say 5 tries - through a simple loop. This may be enough of a solution. Despite concern that the user would be frustrated if it ultimately fails, if it does there's probably enough wrong with the overall system to ruin the user experience anyway.

I have good exponential backoff code so although this is not a network issue (also a concern), it may help to incorporate it into the retry scheme. In fact, that might be a nice catch-all solution. Given 5 increasing temporal delays in retrying, it would allow both enough chances for the logical code to exceed the chances of a collision and mitigate introducing more stress on network resources.

Any suggestions/observations would be appreciated.

FYI, this is the code used to generate codes:

    /// <summary>
/// Generates a randomized alphanumeric code
/// </summary>
/// <param name="strLength">e.g. 6 chars</param>
/// <param name="charSet">e.g. AEFGHJKLPQRSTUWXYZ123456789</param>
/// <returns>string of the specified length</returns>
public static string GenerateCode(int strLength, string charSet)
{
    var chars = charSet.ToCharArray();
    var data = new byte[1];
    var crypto = new RNGCryptoServiceProvider();
    crypto.GetNonZeroBytes(data);
    data = new byte[strLength];
    crypto.GetNonZeroBytes(data);
    var result = new StringBuilder(strLength);
    foreach (var b in data)
    {
        result.Append(chars[b % (chars.Length)]);
    }
    return result.ToString();
}
هل كانت مفيدة؟

المحلول

Lets think about your different six-digit codes. You can generate (26 (for alpha) + 10 (for numeric))^6 = 2176782336 different entries. Lets now step into the birthday paradoxon if you have sqrt(2176782336) = 46656 (36^3) elements already in your list, you will have a 50 percent chance to find a duplicate for each new try. Since you are filling the list on the server, it will be harder for the client to find a non-used-code.

If you will not reach 46656 elements within a day, you have very good chances, to use up to 4 tries. But if you come closer to the 46K-barrier, you will have to increase the retry count. If you want lesser probable collisons use more digits or more characters 25 for lower case (omit l) and 24 (Omit O and I) for upper case and 10 for numbers this will increase your 50 Percent barrier to 205379 (59^3).

on the server side you will experience the so called Coupon collectors problem Analogoue to the problem, that the clients are try to give you (the server) coupons you do not already have...

hth.

نصائح أخرى

You can use a table with an auto-incrementing id as the basis for your code. If need be, start the increments at 100,000 for 6 digits. Insert a row, get the id, and return it. It will be guaranteed to be unique.

Depending on your expected usage levels, you might consider tacking on a few random letters to the id. So that code '1234' becomes '1234AB'. This makes it more difficult to stumble upon other people's codes.

You can also implement the above as a lookup-table. Fill a table with suitable codes, and return the corresponding one based on the auto-incremented id.

مرخصة بموجب: CC-BY-SA مع الإسناد
لا تنتمي إلى softwareengineering.stackexchange
scroll top