Question

How would you design a hosted web application? I'm looking at applications like Basecamp, Campaign Monitor, Freshbooks, etc., where users can sign up online and the application is hosted for them.

  1. Would you use 1 big database to store all your customers' data or would you handle data differently? Would you use more than 1 database? Would you make a database for each customer?
  2. Would you duplicate your code base for each signup/customer or would you use 1 codebase to handle all customers?
  3. Are there other design elements I should think about?
  4. Any web sites or books out there that talk about this?

Edit: I found an MSDN article that discussed multi-tenant Data Architecture: http://msdn.microsoft.com/en-us/library/aa479086.aspx#mlttntda_topic4

Solution

Refer to 37signals -- they are experts in this field and have a lot of articles where they answer community questions (many like yours should come up).

High Scalability - 37signals Architecture

Ask 37signals: How do you process credit cards?

Regarding the number of databases, here is David Heinemeier Hansson in What do you want to know?:

Some technical answers…

Lance, all our scheduled billing operations are automated. Anything short of that would drive us insane. It’s especially important to make sure that contingency handling is in place for failing credit cards. Last I looked, I believe 5% of our charges bounced thanks to credit cards that were expired, over the limit, or closed. Be sure to handle that gracefully.

We just use Authorize.net and a separate credit card application (tiny app developed in Rails and used by the other apps on the internal network through REST ) that keeps numbers secure.

Warren, we run free and pay accounts on the same database. It’s one database per application. One database per account is normally a really, really bad idea. Usually the data is fairly normalized, but we’re definitely not religious about it. I generally value my source code over my schema. So if I can get better/prettier source code by bending a schema, I’ll typically do that. But start from normalized and denormalize as performance or code structure demands it.

Jason, we use email for sms. All US carriers have a phone@carrier-gateway.com gateway.

Jake Good, ahh, the good ol’ “but does it scale” question. I answered that on a couple of years back. Nothing has changed for us since then. We manage millions and millions of dynamic requests every day without even resorting to much caching (most screens in most of our applications are different on a per-user basis, so traditional caching schemes are harder to apply).

There are many other Rails applications out there managing tens of millions of daily requests. All follow more or less the same Shared Nothing approach. All the techniques for scaling high and tall are out there. It’s hardly a turn-key solution, but anything that promises to be that is usually just full of it.

OTHER TIPS

If you're only talking about thousands of customers (vs hundreds of thousands or millions), then the difference between these database layouts is pretty minimal, unless you know you have tables that might have thousands of rows per customer or more. Then your design might change.

The normal setup for a relational datastore is to put a customer_id foreign key on most of your tables. Then show that data only to that customer, or to someone else the customer has explicitly granted permission to.
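To make the customer_id idea concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table names, the sample rows, and the `projects_for` helper are made up for illustration; the point is only that every query for tenant data goes through a `WHERE customer_id = ?` filter.

```python
import sqlite3


def make_db():
    """Build an in-memory database shared by all customers (illustrative schema)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE customers (
            id   INTEGER PRIMARY KEY,
            name TEXT NOT NULL
        );
        CREATE TABLE projects (
            id          INTEGER PRIMARY KEY,
            customer_id INTEGER NOT NULL REFERENCES customers(id),
            title       TEXT NOT NULL
        );
        -- Most lookups filter by customer, so index that column.
        CREATE INDEX projects_by_customer ON projects(customer_id);
    """)
    conn.executemany(
        "INSERT INTO customers (id, name) VALUES (?, ?)",
        [(1, "Acme"), (2, "Globex")],
    )
    conn.executemany(
        "INSERT INTO projects (customer_id, title) VALUES (?, ?)",
        [(1, "Website"), (1, "Intranet"), (2, "Launch")],
    )
    return conn


def projects_for(conn, customer_id):
    """Return project titles for one customer only; other tenants' rows never leave the db."""
    rows = conn.execute(
        "SELECT title FROM projects WHERE customer_id = ? ORDER BY id",
        (customer_id,),
    )
    return [title for (title,) in rows]


conn = make_db()
print(projects_for(conn, 1))  # ['Website', 'Intranet']
print(projects_for(conn, 2))  # ['Launch']
```

In a real app you would probably want the customer filter applied in one place (a base query or data-access layer) rather than remembered in every query, since a single forgotten WHERE clause leaks data across customers.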

Don't worry too much about RDBMS scaling issues until it looks like you might start having multiple millions of rows in one table. Then it might be time to investigate a distributed key/value store. But keep in mind that that sort of problem is the good kind of problem to have, because presumably it means that you're making a ton of cash.

i.e., cross the scaling bridge when you come to it. Design things to the best of your current ability, but otherwise, premature optimization is the root of all evil.

I work as a consultant to a number of SaaS apps, so I have seen different architectures. I'd recommend:

  1. One database for all customers. Be sure to design the db well so that the primary key for the user is your own unique ID. I've seen some messes where the design effectively (not actually, but it might as well have) made something like email or phone number the primary key. Also, don't end up throwing everything into a giant user table.

    1. You'll want to start tracking lots of user interaction behavior at some point. For that, you can use a NoSQL name-value store and just start throwing events into it for later analysis. Or, use something like MixPanel or KISSmetrics.

    2. Keep track of daily KPIs by writing rows to a KPI table that make it easy to query what has happened over time. Otherwise you'll end up wanting to ask questions of the db and find out that it's a giant query to do so.
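As a rough sketch of the KPI-table idea (and of keeping your own ID as the user primary key), something like the following could run once a day. The schema, the `record_daily_kpis` function, the dates, and the single "signups" metric are all assumptions for illustration, again using Python's built-in sqlite3 module.

```python
import sqlite3
from datetime import date


def record_daily_kpis(conn, day):
    """Summarize one day's raw rows into a single row of the KPI table."""
    day_str = day.isoformat()
    (signups,) = conn.execute(
        "SELECT COUNT(*) FROM users WHERE signup_date = ?", (day_str,)
    ).fetchone()
    # INSERT OR REPLACE keeps the job safe to re-run for the same day.
    conn.execute(
        "INSERT OR REPLACE INTO daily_kpis (day, signups) VALUES (?, ?)",
        (day_str, signups),
    )
    conn.commit()
    return signups


conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Surrogate id is the primary key; email is unique but is not the key.
    CREATE TABLE users (
        id          INTEGER PRIMARY KEY,
        email       TEXT NOT NULL UNIQUE,
        signup_date TEXT NOT NULL
    );
    CREATE TABLE daily_kpis (
        day     TEXT PRIMARY KEY,
        signups INTEGER NOT NULL
    );
""")
conn.executemany(
    "INSERT INTO users (email, signup_date) VALUES (?, ?)",
    [
        ("a@example.com", "2024-01-01"),
        ("b@example.com", "2024-01-01"),
        ("c@example.com", "2024-01-02"),
    ],
)

record_daily_kpis(conn, date(2024, 1, 1))
record_daily_kpis(conn, date(2024, 1, 2))

# Trend questions become a cheap read of the small summary table.
print(conn.execute("SELECT day, signups FROM daily_kpis ORDER BY day").fetchall())
# [('2024-01-01', 2), ('2024-01-02', 1)]
```

The summary table stays tiny (one row per day), so questions about trends over time do not have to scan the raw tables.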

A key advantage of having a single SQL db is that if your marketing person knows SQL (recommended!) then they can just query it directly. If you go the NoSQL route, then it's much harder, and then marketing starts making assumptions which are usually wrong, and you waste a lot of time going down the wrong path based on those assumptions.

Licensed under: CC-BY-SA with attribution