Question

I was wondering what you do when developing a new application in terms of estimating database size.

E.g. I am planning to launch a website, and I am having a hard time estimating what size I could expect my database to grow. I don't expect you to tell me what size my database will be, but I'd like to know if there are general principles in estimating this.

E.g. When Jeff developed StackOverflow, he (presumably) guesstimated his database size and growth.

My dilemma is that I am going for a hosted solution for my web application (its about cost at this stage), and preferably don't want to shoot myself in the foot by not purchasing enough SQL Server space (they charge a premium for this).

Was it helpful?

Solution

If you have a database schema, sizing is pretty straightforward ... it's just estimated rows * avg row size for each table * some factor for indexes * some other factor for overhead. Given the ridiculously low price of storage nowadays, sizing often isn't a problem unless you intend to have a very high traffic site (or are building an app for a large enterprise).

For my own sizing exercises, I've always created an excel spreadsheet listing:

  • col 1: each table that will grow
  • col 2: estimated column size in bytes
  • col 3: estimated # of rows (per year or max, depending on application)
  • col 4: index factor (I always set this to 2)
  • col 5: overhead factor (I always set this to 1.2)
  • col 6: total column (col 2 X 3 X 4 X 5)

The sum of col 6 (total column), plus the initial size of your database without growth tables, is your size estimate. You can get much more scientific, but this is my quick and dirty way.

OTHER TIPS

The cost of estimating is likely to be larger than the cost of the storage

Most hosting providers sell capacity by the ammount used at the end of each month, so just let it run

Determine:

  • how many visitors per day, V
  • how many records of each type will be created per visit, N1, N2, N3...
  • the size of each record type, S1, S2, S3...

EDIT: forgot index factor which a good rule of thumb is 2 times

Total growth per day = 2* V * (N1*S1 + N2*S2 + N3*S3 + ...)

My rules-of-thumb to follow are

  • how many users do I expect?
  • what content can they post?
  • how big is a user record?
  • how big is each content item a user can add?
  • how much will I be adding?
  • how long will those content items live? forever? just a couple weeks?

Multiply the user record size times the number of users; add the number of users times the content item size; multiply by two (for a convenient fudge factor).

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top