Question

I'm developing a web platform that may reach a few million users, and I need to store users' images and documents. I'm using Rackspace, and I now need to define how files are organized in the Cloud Files service. Rackspace allows up to 500,000 containers per account (reference page 17, paragraph 4.2.2), and in addition they suggest limiting each container to 500,000 objects (reference Best practice - Limit the Number of Objects in Your Container). What is the best practice for managing users' files?

One container per user doesn't seem to be a good solution because of the 500,000-container limit. Rackspace suggests using virtual containers, but I'm unsure how to use them.

Thanks in advance.

Solution

If you will only be interacting with the files via API calls, having 200,000 objects in a container is fine (in my experience; I haven't needed anything larger).

If you want to use the web interface for ANY TASKS AT ALL, you need far, far fewer than that. The web interface does not break contents up by folder, so if you have 30,000 objects, it will just paginate them and show them in alphabetical order. This is OK for containers with up to a few hundred objects, but beyond that the web interface is unusable.

If you have millions of users, you can use part of the user ID as a shard key to decide which bucket (container) to use. See http://docs.mongodb.org/manual/core/sharding-internals/#sharding-internals-shard-keys for information about choosing a shard key. It's written for Mongo users, but it applies here. The takeaway is to pick an attribute that distributes your users roughly evenly, so that no single bucket exceeds the maximum number of files you want per bucket.

One way is to use user IDs, which we can assign randomly, and shard based on the first digit. For this example, we'll use the UIDs 1234, 2234, 1123, and 2134. If you want to break files up by the first digit of the UID, you'd save the files for users 1234 and 1123 in the container "files_group_1" and the files for users 2234 and 2134 in the container "files_group_2".
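The first-digit scheme could be sketched roughly like this in Python. The function name container_for_user is made up for illustration, and the sketch only computes the container name; it does not call the Cloud Files API.

```python
def container_for_user(uid: int) -> str:
    """Return the container name for a user, sharding on the first UID digit."""
    first_digit = str(uid)[0]
    return f"files_group_{first_digit}"


# Group the example UIDs by the container they map to.
placement = {}
for uid in (1234, 2234, 1123, 2134):
    placement.setdefault(container_for_user(uid), []).append(uid)

print(placement)
# {'files_group_1': [1234, 1123], 'files_group_2': [2234, 2134]}
```

Note that sharding on a single leading digit gives at most ten containers, so it only works while the total number of files stays small.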

Before picking a shard key, think about how many files each user might store. If, for example, a user may store hundreds (or thousands) of files, you will want to shard on a finer-grained key than the first digit of the UID, so that the files are spread across more containers.
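To make that sizing concrete, here is a rough Python sketch. The figures (2,000,000 users, 100 files each) are assumptions for illustration only; the 200,000 objects-per-container target is the number I mentioned above. The helper names and the idea of hashing the UID to pick a container are my own suggestion, not something Rackspace prescribes.

```python
import hashlib


def containers_needed(users: int, files_per_user: int, max_objects: int) -> int:
    """Estimate how many containers keep each one under max_objects (ceiling division)."""
    total_objects = users * files_per_user
    return (total_objects + max_objects - 1) // max_objects


def hashed_container_for_user(uid: int, shard_count: int) -> str:
    """Map a UID to one of shard_count containers using a stable hash of the UID."""
    digest = hashlib.sha256(str(uid).encode("utf-8")).hexdigest()
    shard = int(digest, 16) % shard_count
    return f"files_group_{shard}"


# Assumed workload: 2,000,000 users with about 100 files each.
shard_count = containers_needed(2_000_000, 100, 200_000)
print(shard_count)  # 1000

# The same UID always maps to the same container.
print(hashed_container_for_user(1234, shard_count))
```

With those assumed numbers you'd need about 1,000 containers, far more than a first-digit key can give you, but still well under the 500,000-container limit. Keep in mind that changing the shard count later changes where existing users map, so pick a count with room to grow.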

Hope that helped.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow