Question

I am about to dive headfirst into a SaaS project that I have put together a small team to produce. For the most part we already have everything planned out feature-wise, but I'm at the point where I need to begin thinking about database architecture and security concerns.

Are there any security concerns that differ from those of a normal web application project, given that there could be hundreds of users?

As far as the database goes, I've been reading posts all day about database design and I think I'm going to go with a single database for everything, but I'm wondering if that will get slow? Let's say you have a "clients" table with 30,000 records in it. Won't that be way slower to load than having a DB for each customer?

Also, should the signups and MY customer information go in the same database as the actual SaaS app? In other words, should a user who comes to my site and signs up for a free trial go into one database while all of his team, clients, etc. go into the actual product database, or should it all just be one database?

Any other considerations before diving into this? I'm going to be building it on the Codeigniter framework.


Solution

Here are some thoughts:

Security

You'll be getting a lot more attention from potential hackers with a larger userbase, but otherwise your security concerns should be much the same, for the most part (just that the chances and consequences of someone finding and exploiting a security flaw are much higher).

I don't know how much you already know about Codeigniter, but it has a number of security features built in. There's also the general security stuff you'd want for any site.

  • Consider enabling CI's cross-site scripting protection
  • Consider using CI's Database Sessions
  • Use CI's Active Record DB class properly to avoid SQL injection vulnerabilities
  • Ensure you are hashing and salting passwords (per-user salts are good; also ensure your hashing algorithm is secure, i.e. not MD5)
  • Also ensure anything else confidential is encrypted in some way or another - Codeigniter has a library for this, which is decent provided you have mcrypt support on your server
  • Ensure user input is being filtered to prevent XSS (cross site scripting) attacks. Codeigniter has a feature to attempt to filter these out, however it isn't implemented amazingly well
  • Make sure that permissions checks are watertight, so that there is no way a user can access information that doesn't belong to them
  • Implement some kind of limit on login attempts per IP per hour
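As an illustration of the hashing and salting point above, here is a minimal sketch in Python (not CodeIgniter code). It uses PBKDF2 from the standard library as one example of a deliberately slow hash; the function names and the iteration count are assumptions for the example, not values from this answer. In PHP itself, the built-in password_hash() and password_verify() functions cover the same ground.

```python
from __future__ import annotations

import hashlib
import hmac
import os

# Assumed work factor for illustration only; tune it for your own hardware.
PBKDF2_ITERATIONS = 200_000


def hash_password(password: str) -> tuple[bytes, bytes]:
    """Return (salt, digest), generating a fresh random salt for this user."""
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, PBKDF2_ITERATIONS
    )
    return salt, digest


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Recompute the digest with the stored salt and compare in constant time."""
    candidate = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, PBKDF2_ITERATIONS
    )
    return hmac.compare_digest(candidate, expected)
```

Both the salt and the digest would be stored alongside the user record. Because each user gets a different salt, two users with the same password end up with different stored digests.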

Performance

I don't have any cold, hard numbers to give you here, but definitely make sure all appropriate fields are indexed, as table scans can get really slow with larger row counts. A properly indexed table with 30,000 rows is small by database standards.
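To see the difference an index makes, here is a small sketch using Python's built-in sqlite3 module as a stand-in for the real database. The table layout, the account_id column and the index name are assumptions for the example. MySQL offers a similar EXPLAIN statement for inspecting how a query will be executed.

```python
import sqlite3


def query_plan(conn: sqlite3.Connection, sql: str, params: tuple) -> str:
    """Return SQLite's query plan for a statement as a single string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The last column of each plan row is the human-readable detail.
    return " | ".join(row[-1] for row in rows)


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE clients (id INTEGER PRIMARY KEY, account_id INTEGER, name TEXT)"
)
conn.executemany(
    "INSERT INTO clients (account_id, name) VALUES (?, ?)",
    [(i % 300, f"client {i}") for i in range(30_000)],
)

lookup = "SELECT name FROM clients WHERE account_id = ?"

# Without an index, the only option is to scan the whole table.
plan_before = query_plan(conn, lookup, (42,))

conn.execute("CREATE INDEX idx_clients_account ON clients (account_id)")

# With the index in place, the planner can search it instead.
plan_after = query_plan(conn, lookup, (42,))

print(plan_before)
print(plan_after)
```

The exact wording of the plan varies between SQLite versions, but the first plan reports a scan of the table and the second names the index.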

Avoid over-normalising your data. Sometimes it can be better to store the same data in multiple places. You then incur a performance hit when updating the data, but potentially save time on joins when reading it.

If there is ever the opportunity to use database triggers, consider it. It's often faster than having your web app send several queries to the DB to accomplish the same thing, because the extra work stays on the database server and saves round trips.
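A sketch of how a trigger can keep a denormalised value up to date, again using Python's sqlite3 module as a stand-in for the real database. The accounts and clients tables and the client_count column are hypothetical; MySQL trigger syntax differs slightly but follows the same idea.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE accounts (
        id INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        client_count INTEGER NOT NULL DEFAULT 0  -- denormalised copy of COUNT(*)
    );
    CREATE TABLE clients (
        id INTEGER PRIMARY KEY,
        account_id INTEGER NOT NULL REFERENCES accounts(id),
        name TEXT NOT NULL
    );
    -- The database keeps the counter in sync; the app sends one query per change.
    CREATE TRIGGER clients_after_insert AFTER INSERT ON clients
    BEGIN
        UPDATE accounts SET client_count = client_count + 1
        WHERE id = NEW.account_id;
    END;
    CREATE TRIGGER clients_after_delete AFTER DELETE ON clients
    BEGIN
        UPDATE accounts SET client_count = client_count - 1
        WHERE id = OLD.account_id;
    END;
    """
)

conn.execute("INSERT INTO accounts (id, name) VALUES (1, 'Acme')")
conn.executemany(
    "INSERT INTO clients (account_id, name) VALUES (1, ?)",
    [("a",), ("b",), ("c",)],
)
conn.execute("DELETE FROM clients WHERE name = 'b'")

# Reading the count no longer needs a join or an aggregate.
client_count = conn.execute(
    "SELECT client_count FROM accounts WHERE id = 1"
).fetchone()[0]
```

The trade-off is the one described above: writes to clients become slightly more expensive so that reads of the count become cheap.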

If you segregate the free trial accounts into another DB, it would help keep the "real customer data" separate from the trial data (and make it easier to, say, only perform automated backups on non-trial data). From an infrastructure standpoint, it would also make it easier if, down the line, you wanted to move the free trial and regular services onto separate systems.

Don't have too many databases, though. Remember that every time you push an update, you're going to need to ensure that every database schema is up to date and happy. If you have hundreds of client databases, any kind of non-automated administration on them could easily turn into a nightmare.
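If you do end up with several databases, schema updates are worth automating from day one. Below is a rough Python sketch of the idea, using in-memory SQLite databases as stand-ins for per-client databases. The migration list, the database names and the use of SQLite's user_version pragma as a version marker are all assumptions for the example; with MySQL you would typically record the version in a small table instead.

```python
import sqlite3

# Ordered schema changes; the 1-based position in the list is the schema version.
MIGRATIONS = [
    "CREATE TABLE clients (id INTEGER PRIMARY KEY, name TEXT NOT NULL)",
    "ALTER TABLE clients ADD COLUMN email TEXT",
]


def migrate(conn: sqlite3.Connection) -> int:
    """Apply the migrations this database has not seen yet; return its version."""
    current = conn.execute("PRAGMA user_version").fetchone()[0]
    for version, statement in enumerate(MIGRATIONS, start=1):
        if version > current:
            conn.execute(statement)
            # Record progress so a re-run skips what is already applied.
            conn.execute(f"PRAGMA user_version = {version}")
            conn.commit()
    return conn.execute("PRAGMA user_version").fetchone()[0]


# Hypothetical per-client databases.
databases = {
    name: sqlite3.connect(":memory:")
    for name in ("client_a", "client_b", "client_c")
}

# One loop brings every database to the latest schema.
versions = {name: migrate(conn) for name, conn in databases.items()}
```

Running migrate() again on an up-to-date database does nothing, which is what makes it safe to run against every database on each deployment.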

OTHER TIPS

One big thing to consider when using a single database for SaaS applications:

  • Add the accountId to all your tables.
  • Always use the accountid in all your queries to the database. Otherwise you may end up mixing up data between accounts.
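One way to make "always use the account id" hard to forget is to route every query through a small wrapper that is bound to a single account. The sketch below is Python with sqlite3, not CodeIgniter code; the AccountScope class, the clients table and the column names are hypothetical.

```python
import sqlite3


class AccountScope:
    """Wraps a connection so that every query is bound to one account id."""

    def __init__(self, conn: sqlite3.Connection, account_id: int):
        self._conn = conn
        self._account_id = account_id

    def add_client(self, name: str) -> int:
        cur = self._conn.execute(
            "INSERT INTO clients (account_id, name) VALUES (?, ?)",
            (self._account_id, name),
        )
        return cur.lastrowid

    def clients(self) -> list:
        return self._conn.execute(
            "SELECT id, name FROM clients WHERE account_id = ? ORDER BY id",
            (self._account_id,),
        ).fetchall()

    def get_client(self, client_id: int):
        # account_id is part of the WHERE clause even for primary-key lookups,
        # so guessing another account's client id returns nothing.
        return self._conn.execute(
            "SELECT id, name FROM clients WHERE account_id = ? AND id = ?",
            (self._account_id, client_id),
        ).fetchone()


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE clients ("
    "id INTEGER PRIMARY KEY, account_id INTEGER NOT NULL, name TEXT NOT NULL)"
)

acme = AccountScope(conn, account_id=1)
globex = AccountScope(conn, account_id=2)

acme_client_id = acme.add_client("Alice")
globex.add_client("Bob")
```

Here globex.get_client(acme_client_id) returns None, because the row belongs to account 1. The queries also use bound parameters throughout, which is what protects against SQL injection.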

Also, before deciding to have separate databases for trial accounts and paid accounts, think about what happens when a trial account converts over to paid account. They will definitely want to keep all their data. Are you going to move all of their data from trial db to paid db?

In my experience with SaaS, it's best to keep the data in the same database for ease of access. If you're using MySQL, there's a limit to how many tables the server can have open at any one time. If you're using a lot of databases with a lot of tables, you'll run into memory and I/O issues down the track. I remember one project where we had 1 database per client and we were hitting our memory limit on a 4GB server after 10 clients (30~40 tables each).

Like William said, you want to avoid over-normalisation if possible. You'll want to implement EAV (entity-attribute-value) tables where possible for object properties, for example on a users table where each client may have a different configuration per site. The same goes for user roles, user groups, etc. You'll definitely want to use EAV if you're building e-commerce functionality. For normal websites, 3NF is OK; most repetitive data is separated out and linked together with foreign keys so that you don't have tables with 20-30 columns. For SaaS applications you may need to step it up to BCNF or 4NF so that you're allowing additional application flexibility.
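For readers who haven't met EAV before, here is a minimal sketch of the layout in Python with sqlite3. The users and user_attributes tables, the column names and the helper functions are hypothetical; the point is only that per-client properties become rows rather than columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE users (
        id INTEGER PRIMARY KEY,
        account_id INTEGER NOT NULL,
        email TEXT NOT NULL
    );
    -- One row per (user, attribute): new properties need no schema change.
    CREATE TABLE user_attributes (
        user_id INTEGER NOT NULL REFERENCES users(id),
        attribute TEXT NOT NULL,
        value TEXT,
        PRIMARY KEY (user_id, attribute)
    );
    """
)


def set_attribute(conn, user_id: int, attribute: str, value: str) -> None:
    """Insert the attribute, or overwrite it if the user already has it."""
    conn.execute(
        "INSERT OR REPLACE INTO user_attributes (user_id, attribute, value) "
        "VALUES (?, ?, ?)",
        (user_id, attribute, value),
    )


def get_attributes(conn, user_id: int) -> dict:
    """Return all of a user's attributes as a plain dict."""
    rows = conn.execute(
        "SELECT attribute, value FROM user_attributes WHERE user_id = ?",
        (user_id,),
    )
    return dict(rows.fetchall())


conn.execute("INSERT INTO users (id, account_id, email) VALUES (1, 1, 'a@example.com')")
set_attribute(conn, 1, "timezone", "UTC")
set_attribute(conn, 1, "theme", "dark")
set_attribute(conn, 1, "theme", "light")  # overwrites the earlier value
attributes = get_attributes(conn, 1)
```

The flexibility comes at a cost: every value is stored as text, and filtering on several attributes at once needs one join per attribute.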

Site/Application Determination:

It's critically important to display the correct website for the URL used. The simplest approach is to keep the mapping in an XML/JSON file and, on each request, look up the URL in it. Cache it in memory as much as possible, because you'll find that you'll have a linear rise in traffic for each new site. If you've got 3 websites, you'll have 3 times as much traffic to the one application, as opposed to it being spread out across three different applications.
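A rough sketch of that lookup in Python. The JSON content, the host names and the function names are made up for the example, and the mapping is embedded as a string here where a real application would read it from a file.

```python
import json
from functools import lru_cache

# Stand-in for the contents of a sites.json file (hypothetical data).
SITES_JSON = """
{
  "www.joeblogs.com": {"site_id": 1, "theme": "blue"},
  "shop.example.com": {"site_id": 2, "theme": "green"}
}
"""


@lru_cache(maxsize=1)
def load_sites() -> dict:
    """Parse the mapping once; later calls reuse the in-memory result."""
    return json.loads(SITES_JSON)


def resolve_site(host_header: str):
    """Return the site config for a Host header, or None if it is unknown."""
    # Normalise case and drop any port suffix (IPv6 literals are not handled).
    host = host_header.strip().lower().split(":")[0]
    return load_sites().get(host)
```

For example, resolve_site("WWW.JoeBlogs.com:443") returns the entry for site 1, and an unrecognised host returns None so the application can respond with an error instead of guessing.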

Security:

It depends on the level of security you need. If you need an SSL certificate, you can get away with buying a shared server certificate if you're going to be hosting all sites from the same IP. If each site wants its own SSL certificate, then you're going to have to accommodate different IPs for sites. That's when you need to map URLs => IPs.

User-based security is just as important as displaying the correct site. You need to make sure that you're detecting injection-based attacks. You can leverage the built-in CI functions, but to be safe you should be doing your own filtering as well. Check that the integer really is an integer. You can add additional libraries to CI to achieve this. ACLs are very important in SaaS applications because you'll want to limit who sees what, and make sure that you're limiting each person to the content of their own site. It's easy to conceptualize but hard to implement, and should be considered the number one development task to get right. Design the ACL, test the design, develop a test case and, if it works, implement it.
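To make both points concrete, here is a small Python sketch of strict integer validation and a deny-by-default permission check. The role names, permission strings and the User class are assumptions for the example, not part of CodeIgniter.

```python
from dataclasses import dataclass


def parse_positive_int(raw) -> int:
    """Return raw as a positive int, or raise ValueError."""
    text = str(raw).strip()
    # Only plain ASCII digits pass: no signs, decimals or trailing text.
    if not (text.isascii() and text.isdigit()):
        raise ValueError(f"not a positive integer: {raw!r}")
    value = int(text)
    if value <= 0:
        raise ValueError(f"not a positive integer: {raw!r}")
    return value


# Hypothetical role-to-permission table.
ROLE_PERMISSIONS = {
    "admin": {"client.read", "client.write", "user.manage"},
    "staff": {"client.read", "client.write"},
    "viewer": {"client.read"},
}


@dataclass(frozen=True)
class User:
    id: int
    site_id: int
    role: str


def can(user: User, permission: str, resource_site_id: int) -> bool:
    """Deny by default: a different site or an unknown role means no access."""
    if user.site_id != resource_site_id:
        return False
    return permission in ROLE_PERMISSIONS.get(user.role, set())
```

A test case for the ACL can then be as simple as asserting that can() is False for every combination that should be denied, including a user from one site asking for a resource that belongs to another.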

Performance:

CI is pretty good for performance: it's lightweight, without much of a memory footprint. You'll want to use caching as much as possible - database, file, APC, Memcache, etc. Keep in mind that database caching can be faster than file caching, but it'll create more socket/TCP requests to the database server. If there's a bottleneck in the DB server, it'll affect site performance.

One thing I will recommend when it comes to performance: look at several different web servers and load test them on the hardware you'll be running. Good options are Apache, Nginx & LiteSpeed. I've used Apache & Nginx before; they both perform well under certain circumstances and, if you tune them appropriately, will be able to handle large amounts of traffic effortlessly. Keep in mind that Nginx configuration is different from Apache's, so .htaccess rules have to be rewritten in the Nginx format and saved into the site config. You'll also use php-fpm instead of suexec. If you do use Apache, make sure that the rewrite rules are saved to the site configuration file rather than in a .htaccess file in the root directory. The reason is that file stats create I/O requests, and a lot of I/O requests will eventually create a bottleneck.

Definitely use application-level caching where possible; just make sure that you plan ahead for cache keys so that you're fetching the correctly cached information for each client site. A good way is to hash the site name, or even use the URL as part of the key (e.g. www_joeblogs_com_somekeyasdasd). I know some CMS applications (Joomla) don't do this, and without it you'll end up returning data from another site!
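A minimal sketch of per-site cache keys in Python. The cache_key helper is hypothetical, and a plain dict stands in for whatever cache backend (APC, Memcache, etc.) is actually used.

```python
import hashlib


def cache_key(site_host: str, key: str) -> str:
    """Prefix a cache key with a short hash of the site's host name."""
    normalised = site_host.strip().lower()
    site_part = hashlib.sha256(normalised.encode("utf-8")).hexdigest()[:16]
    return f"{site_part}:{key}"


# A dict stands in for the real cache backend.
cache = {}
cache[cache_key("www.joeblogs.com", "homepage")] = "Joe's homepage"
cache[cache_key("shop.example.com", "homepage")] = "Shop homepage"

joe_homepage = cache[cache_key("WWW.JoeBlogs.com", "homepage")]
```

Because the site is part of every key, two sites caching the same logical item ("homepage") never overwrite or read each other's entries.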

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow