Question

I'm designing an HTTP service with a capacity of up to 500 million requests per day (served by more than one independent machine).

For each request I have to generate a unique ID and return it to the user. The ID must be 100% unique within a window of 10 minutes. (1 day is preferred; globally unique IDs are ideal.) Generating the ID must not require any server-to-server communication.

Silly pseudo-session example:

Client: GET /foo

Server: Content-Type: text/xml

        <root>
            <id>ab9d1972-2844-11e0-86b2-000c29544403</id>
            <other_data/>
        </root>

In the previous generation of this HTTP service I used UUIDs.

I'm happy with UUIDs, but there is one problem: they are too long. At that number of requests, the extra size is noticeable as wasted disk space in the log files.

What is the best way to create a short but unique identifier? To make things worthwhile, I guess the algorithm should produce IDs at most half the length of a UUID while being unique for a whole day (for a 10-minute window they should be even shorter).

Ideally, the suggested algorithm would have a sane, lightweight, production-quality implementation in plain C.

Update: The generated ID should not require URI-encoding when passed in a GET request.


Solution

Give each machine a unique prefix. Give each machine a counter. To generate an ID, increment the counter, and append its value to the prefix.
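
A minimal sketch of the prefix-plus-counter idea in plain C. The bit layout (16-bit machine prefix, 48-bit counter), the names `id_gen`, `id_next` and `id_format`, and the hex output are assumptions made for illustration, not part of the answer above. The sketch is single-threaded and does not persist the counter, so a real service would need an atomic increment and some way to keep the counter from restarting at zero inside the uniqueness window.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Assumed layout: 16-bit machine prefix, 48-bit per-machine counter. */
#define ID_HEX_LEN 16

typedef struct {
    uint16_t machine_prefix; /* assigned once per machine, e.g. from config */
    uint64_t counter;        /* incremented for every generated ID */
} id_gen;

/* Returns the next 64-bit ID: prefix in the top 16 bits, counter below. */
static uint64_t id_next(id_gen *g)
{
    /* Keep the counter inside 48 bits so it never spills into the prefix. */
    g->counter = (g->counter + 1) & UINT64_C(0xFFFFFFFFFFFF);
    return ((uint64_t)g->machine_prefix << 48) | g->counter;
}

/* Writes the ID as 16 lowercase hex digits plus a terminating NUL. */
static void id_format(uint64_t id, char *buf, size_t buflen)
{
    assert(buflen > ID_HEX_LEN);
    snprintf(buf, buflen, "%016" PRIx64, id);
}
```

With prefix `0x00a1`, the first ID formats as `00a1000000000001`. That is 16 characters, against 36 for a textual UUID, it contains only hex digits (so no URI-encoding is needed), and a 48-bit counter has far more values than 500 million requests per day will use.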

If you want to obfuscate the IDs, encrypt them with a fixed key - a cipher is a reversible transformation, so applying it to unique values will produce unique values.
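
To illustrate why a reversible transformation preserves uniqueness, here is a toy 4-round Feistel network over 64-bit IDs. This is not a vetted cipher and not what the answer prescribes: the round keys, the round function and the names `obfuscate` and `deobfuscate` are all hypothetical, and a production system would use a real block cipher instead. The point is only that the mapping is a permutation, so two different inputs can never collide.

```c
#include <stdint.h>

/* Hypothetical round keys; a real deployment would use secret values. */
static const uint32_t ROUND_KEYS[4] = {
    0x9e3779b9u, 0x85ebca6bu, 0xc2b2ae35u, 0x27d4eb2fu
};

/* Round function: any deterministic mix works; it need not be invertible. */
static uint32_t round_fn(uint32_t half, uint32_t key)
{
    uint32_t x = half ^ key;
    x *= 0x45d9f3bu;
    x ^= x >> 16;
    x *= 0x45d9f3bu;
    x ^= x >> 16;
    return x;
}

/* Forward direction: each round maps (L, R) to (R, L ^ F(R, k)). */
static uint64_t obfuscate(uint64_t id)
{
    uint32_t left = (uint32_t)(id >> 32);
    uint32_t right = (uint32_t)id;
    for (int i = 0; i < 4; i++) {
        uint32_t next = left ^ round_fn(right, ROUND_KEYS[i]);
        left = right;
        right = next;
    }
    return ((uint64_t)left << 32) | right;
}

/* Inverse direction: undo the rounds in reverse key order. */
static uint64_t deobfuscate(uint64_t code)
{
    uint32_t left = (uint32_t)(code >> 32);
    uint32_t right = (uint32_t)code;
    for (int i = 3; i >= 0; i--) {
        uint32_t prev = right ^ round_fn(left, ROUND_KEYS[i]);
        right = left;
        left = prev;
    }
    return ((uint64_t)left << 32) | right;
}
```

Because `deobfuscate(obfuscate(x)) == x` for every 64-bit `x`, sequential counter values come out looking scattered but remain unique, and the output is still 64 bits wide, so the ID does not get longer.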

OTHER TIPS

A few thoughts:

  • 500 million requests a day. Really?
  • Use UUIDs.
  • If required, don't use HTTP (as that's the more significant overhead) and transfer the UUID in a binary form.
  • You need a certain number of bytes to guarantee that your server returns a truly unique ID.
  • How about using UDP?

Anyway, what the heck are you trying to do?

Licensed under: CC-BY-SA with attribution