Since you have tagged this with [azure] and [sql-server], I assume you are trying to do this in Windows Azure. If so, then a) partitioning by client is not necessarily a good idea, and b) SQL may not be the best (or complete) fit for your problem.
When building scalable architectures, the partitioning strategy shouldn't be based on something domain-specific like 'client', but rather on something more arbitrary that distributes load evenly. The reason is simple: unless clients have a reason to be kept separate, such as not wanting their data mixed with others', or different SLAs per client, then 'client' as a partition key may not give an optimal result. If 80% of your business is generated by a single client, you have not solved your problem, and you still have to maintain n separate databases for the marginal load of the other clients.
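One common 'arbitrary' scheme is to hash a stable identifier and take it modulo the number of partitions. A minimal sketch of the idea (the partition count and key format here are illustrative, not from the question):

```python
import hashlib

NUM_PARTITIONS = 16  # illustrative; size for expected load, not client count

def partition_for(record_id: str) -> int:
    """Map any record to a partition by hashing a stable key.

    Because the hash spreads keys uniformly, load stays balanced even
    if one client generates most of the traffic.
    """
    digest = hashlib.md5(record_id.encode("utf-8")).hexdigest()
    return int(digest, 16) % NUM_PARTITIONS

# A single heavy client's records still fan out across all partitions:
partitions = {partition_for(f"big-client/order/{i}") for i in range(1000)}
print(len(partitions))
```

Contrast this with partitioning by client, where the dominant client's entire load lands on one partition no matter how many partitions you provision.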
Five million database inserts per day is not a big number in absolute terms, but it may be for SQL Server hosted in Azure IaaS, or for Azure SQL Database, because of the performance of the underlying commodity hardware. Before deciding how to partition SQL, ask yourself two questions. First, what usage and performance characteristics do you want from the data? (Does it have to be immediately consistent? Can you process data asynchronously?) Second, have you mapped those characteristics against other data store technologies? Have you considered Table Storage (or non-MS solutions such as Redis)?
You may find, after trying out a few options, that:
- SQL is a good store for some of the data, some of the time.
- Much of the processing can be done asynchronously, so the peak performance of inserts is of little concern (and doing 5 mil inserts over a 24-hour period is not a problem).
- SQL may not be suited to long-term storage.
- Querying of older data can be done effectively using map-reduce, rather than SQL queries.
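To put the second bullet in numbers, spreading the inserts evenly over 24 hours makes the sustained rates look quite modest (a quick back-of-envelope calculation):

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

def avg_inserts_per_second(inserts_per_day: int) -> float:
    """Average sustained insert rate if load is smoothed over a full day."""
    return inserts_per_day / SECONDS_PER_DAY

print(round(avg_inserts_per_second(5_000_000), 1))   # ~57.9/s for 5 mil/day
print(round(avg_inserts_per_second(72_000_000), 1))  # ~833.3/s for 72 mil/day
```

A queue between collection and storage is what buys you this smoothing: the hot path absorbs bursts, and the store only ever sees the average rate.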
For example, I have an app that tracks vehicles at one-second intervals. It is targeted at 100,000 vehicles, but architected in such a way that it can scale up to millions without changing any code or databases. In the medium term, it has to cope with 72 mil inserts per day. All of this runs on a single Windows Azure SQL database that is less than 10GB, plus a whole bunch of table storage.

The reason this works is that although I want to archive all the data (72 mil rows), I don't need complex SQL query access to it, so it sits happily in table storage. What I store in SQL is a summary of the data. In my case, I am only interested in a vehicle's journey (start and end position, distance travelled, etc.), which means I only need two or three rows per vehicle per day in SQL, greatly reducing the load on the database.

Also, my bottleneck is in the collection of data, so I add the data immediately to a (Windows Azure) queue, and worry about summarising the data in a separate workload.
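The collect-then-summarise pattern described above can be sketched as follows. The in-memory `deque` stands in for the Windows Azure queue, and the field names and summary shape are illustrative, not taken from my actual app:

```python
from collections import deque

# Stand-in for the Windows Azure queue: ingestion just enqueues and returns,
# so collection is never blocked by SQL write throughput.
queue = deque()

def collect(vehicle_id, timestamp, lat, lon):
    """Hot path: one cheap append per one-second sample."""
    queue.append((vehicle_id, timestamp, lat, lon))

def summarise_journeys():
    """Separate workload: drain raw samples into an archive (table storage
    in the real system) and reduce them to a few SQL-bound summary rows."""
    archive, first_seen, last_seen = [], {}, {}
    while queue:
        vehicle_id, ts, lat, lon = queue.popleft()
        archive.append((vehicle_id, ts, lat, lon))       # bulk archive
        first_seen.setdefault(vehicle_id, (ts, lat, lon))
        last_seen[vehicle_id] = (ts, lat, lon)
    # Only these compact per-journey rows would go to SQL:
    summaries = [{"vehicle": v, "start": first_seen[v], "end": last_seen[v]}
                 for v in first_seen]
    return summaries, archive

for t in range(3):                       # three one-second samples
    collect("truck-1", t, 50.0 + t, 4.0)
summaries, raw = summarise_journeys()
print(len(raw), len(summaries))          # 3 raw rows archived, 1 summary row
```

The design point is the asymmetry: the archive grows by millions of rows per day, while SQL only receives a handful of summary rows per vehicle.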
This answer may be a bit long, but it is intended to get you thinking more carefully about your data model, rather than just about how to solve the problem with SQL. For more detail, have a look at the data model in CALM.