Question

I'm working on an application that lets registered users create or upload content, and allows anonymous users to view that content and browse registered users' pages to find that content - this is very similar to how a site like Flickr, for example, allows people to browse its users' pages.

To do this, I need a way to identify the user in the anonymous HTTP GET request. A user should be able to type http://myapplication.com/browse/<userid>/<contentid> and get to the right page - should be unique, but mustn't be something like the user's email address, for privacy reasons.

Through Google App Engine, I can get the email address associated with the user, but like I said, I don't want to use that. I can have users of my application pick a unique user name when they register, but I would like to make that optional if at all possible, so that the registration process is as short as possible.

Another option is to generate some random cookie (a GUID?) during the registration process, and use that, I don't see an obvious way of guaranteeing uniqueness of such a cookie without a trip to the database.

Is there a way, given an App Engine user object, of getting a unique identifier for that object that can be used in this way?

I'm looking for a Python solution - I forgot that GAE also supports Java now. Still, I expect the techniques to be similar, regardless of the language.

Was it helpful?

Solution

Your timing is impeccable: Just yesterday, a new release of the SDK came out, with support for unique, permanent user IDs. They meet all the criteria you specified.

OTHER TIPS

I think you should distinguish between two types of users:

1) users that have logged in via Google Accounts or that have already registered on your site with a non-google e-mail address

2) users that opened your site for the first time and are not logged in in any way

For the second case, I can see no other way than to generate some random string (e.g. via uuid.uuid4() or from this user's session cookie key), as an anonymous user does not carry any unique information with himself.

For users that are logged in, however, you already have a unique identifier -- their e-mail address. I agree with your privacy concerns -- you shouldn't use it as an identifier. Instead, how about generating a string that seems random, but is in fact generated from the e-mail address? Hashing functions are perfect for this purpose. Example:

>>> import hashlib

>>> email = 'user@host.com'
>>> salt = 'SomeLongStringThatWillBeAppendedToEachEmail'

>>> key = hashlib.sha1('%s$%s' % (email, salt)).hexdigest()
>>> print key
f6cd3459f9a39c97635c652884b3e328f05be0f7

As hashlib.sha1 is not a random function, but for given data returns always the same result, but it is proven to be practically irreversible, you can safely present the hashed key on the website without compromising user's e-mail address. Also, you can safely assume that no two hashes of distinct e-mails will be the same (they can be, but probability of it happening is very, very small). For more information on hashing functions, consult the Wikipedia entry.

Do you mean session cookies?

Try http://code.google.com/p/gaeutilities/


What DzinX said. The only way to create an opaque key that can be authenticated without a database roundtrip is using encryption or a cryptographic hash.

Give the user a random number and hash it or encrypt it with a private key. You still run the (tiny) risk of collisions, but you can avoid this by touching the database on key creation, changing the random number in case of a collision. Make sure the random number is cryptographic, and add a long server-side random number to prevent chosen plaintext attacks.

You'll end up with a token like the Google Docs key, basically a signature proving the user is authenticated, which can be verified without touching the database.

However, given the pricing of GAE and the speed of bigtable, you're probably better off using a session ID if you really can't use Google's own authentication.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top