Question

We're working on an application that will serve thousands of users daily (90% of them active during working hours, using the system constantly throughout their workday). The main purpose of the system is to query multiple databases and combine their information into a single response for the user. Depending on user input, the query load could be around 500 queries per second for a system with 1000 users; 80% of those queries are reads.

Now, I did some profiling with SQL Server Profiler, and I see ~300 logical reads on average for the read queries (I haven't profiled the write queries yet). That would amount to roughly 150k logical reads per second for 1k users. The full production system is expected to have ~10k users.

How do I estimate the actual read load those databases will put on storage? I'm fairly sure the physical reads will amount to much less than the logical reads, but how do I estimate by how much? Of course, I can't do an actual run in the production environment, as it doesn't exist yet, and I need to tell the hardware guys how many IOPS we're going to need so that they know what to buy.

I tried the HP sizing tool suggested in previous answers, but it only recommends HP products, without actual performance estimates. Any insight is appreciated.

EDIT: The main read-only dataset (where most of the queries will go) is a few gigabytes on disk (on the order of 4 GB). This will probably significantly affect the logical-to-physical read ratio. Any insight into how to estimate this ratio?


Solution

Disk I/O demand varies tremendously based on many factors, including:

  • How much data is already in RAM
  • Structure of your schema (indexes, row width, data types, triggers, etc.)
  • Nature of your queries (joins, many single-row lookups vs. range scans, etc.)
  • Data access methodology (ORM vs. set-oriented, single command vs. batching)
  • Ratio of reads vs. writes
  • Disk (database, table, index) fragmentation status
  • Use of SSDs vs. rotating media

For those reasons, the best way to estimate production disk load is usually by building a small prototype and benchmarking it. Use a copy of production data if you can; otherwise, use a data generation tool to build a similarly sized DB.
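If you can't get production data, even crude synthetic data helps. As a rough sketch (the table, columns, and sizes below are made up; shape them to match your real schema and cardinalities), T-SQL can generate a few gigabytes of filler quickly:

-- Hypothetical 4 GB test table: ~10M rows at roughly 400 bytes each.
-- Adjust row width and row count to match your real schema.
CREATE TABLE dbo.TestData (
    Id        int IDENTITY PRIMARY KEY,
    AccountId int       NOT NULL,
    Payload   char(400) NOT NULL
);

-- Cross-joining system views is a cheap way to produce millions of rows.
-- This can take several minutes and will grow the transaction log.
INSERT INTO dbo.TestData (AccountId, Payload)
SELECT TOP (10000000)
       ABS(CHECKSUM(NEWID())) % 100000,   -- pseudo-random account ids
       REPLICATE('x', 400)
FROM sys.all_objects a
CROSS JOIN sys.all_objects b
CROSS JOIN sys.all_objects c;

CREATE INDEX IX_TestData_AccountId ON dbo.TestData (AccountId);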

With the sample data in place, build a simple benchmark app that produces a mix of the query types you're expecting; scale the server's memory if you need to.
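For example, a minimal single-threaded driver might look like this (dbo.usp_TypicalReadQuery is a hypothetical stand-in for whatever your application actually runs):

-- Time 1000 executions of a representative read query.
DECLARE @i int = 0,
        @start datetime2 = SYSUTCDATETIME();
WHILE @i < 1000
BEGIN
    EXEC dbo.usp_TypicalReadQuery @AccountId = 42;
    SET @i += 1;
END;
SELECT DATEDIFF(millisecond, @start, SYSUTCDATETIME()) AS total_ms;

A single session understates contention, so a real test should also drive many concurrent connections (for example, with a load tool such as ostress from the RML Utilities).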

Measure the results with Windows performance counters. The most useful ones are under the PhysicalDisk object: Avg. Disk sec/Transfer, Disk Transfers/sec, and Avg. Disk Queue Length.

You can then apply some heuristics (also known as "experience") to those results and extrapolate them to a first-cut estimate for production I/O requirements.

If you absolutely can't build a prototype, then it's possible to make some educated guesses based on initial measurements, but it still takes work. For starters, turn on statistics:

SET STATISTICS IO ON

Before you run a test query, clear the RAM cache:

-- Flush dirty pages, then evict all clean pages from the buffer pool,
-- so the next query reads from disk (test servers only!)
CHECKPOINT
DBCC DROPCLEANBUFFERS

Then, run your query and look at physical reads + read-ahead reads to see the physical disk I/O demand. Repeat with a representative mix of queries, without clearing the RAM cache first, to get an idea of how much caching will help.
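Putting those pieces together, one cold-cache test pass might look like this (the query and table are placeholders for your own workload):

-- Test server only: this evicts the entire buffer pool.
SET STATISTICS IO ON;

CHECKPOINT;
DBCC DROPCLEANBUFFERS;

-- Your representative query goes here.
SELECT Id, AccountId
FROM dbo.TestData
WHERE AccountId = 42;

-- The Messages output then reports, per table, something like:
--   Table 'TestData'. Scan count 1, logical reads ..., physical reads ...,
--   read-ahead reads ...
-- physical reads + read-ahead reads = pages actually fetched from disk.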

Having said that, I would recommend against using IOPS alone as a target. I realize that SAN vendors and IT managers seem to love IOPS, but they are a very misleading measure of disk subsystem performance. As an example, there can be a 40:1 difference in deliverable IOPS when you switch from sequential I/O to random.

Other tips

You certainly cannot derive your estimates from logical reads. That counter is not very helpful, because it is often unclear how much of it is physical, and the CPU cost of each of these accesses is unknown. I do not look at this number at all.

You need to gather virtual file stats, which will show you the physical I/O. For example: http://sqlserverio.com/2011/02/08/gather-virtual-file-statistics-using-t-sql-tsql2sday-15/

Google for "virtual file stats sql server".
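A minimal version of such a query uses the sys.dm_io_virtual_file_stats DMV (the linked article builds a more complete collector around it):

-- Cumulative physical I/O per database file since the instance started.
-- Snapshot this twice during a test run and diff to get I/O per second.
SELECT DB_NAME(vfs.database_id)                           AS database_name,
       mf.name                                            AS file_name,
       vfs.num_of_reads,
       vfs.num_of_bytes_read / 1048576                    AS mb_read,
       vfs.io_stall_read_ms / NULLIF(vfs.num_of_reads, 0) AS avg_read_ms,
       vfs.num_of_writes
FROM sys.dm_io_virtual_file_stats(NULL, NULL) AS vfs
JOIN sys.master_files AS mf
  ON mf.database_id = vfs.database_id
 AND mf.file_id     = vfs.file_id
ORDER BY vfs.num_of_reads DESC;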

Note that you can only extrapolate I/O from the user count if you assume the buffer pool's cache hit ratio will stay the same. Estimating that is much harder: essentially, you need to estimate the working set of pages you will have under full load.
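On a test instance under representative load, you can observe that working set directly by checking what the buffer pool actually holds; a simple sketch:

-- How much of each database is currently cached in the buffer pool.
-- Each buffer is an 8 KB page.
SELECT DB_NAME(database_id) AS database_name,
       COUNT(*) * 8 / 1024  AS cached_mb
FROM sys.dm_os_buffer_descriptors
GROUP BY database_id
ORDER BY cached_mb DESC;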

If you can ensure that the buffer pool is always large enough to hold all hot data, you can essentially live without any physical reads. Then you only have to scale for writes (for example, with an SSD).

Licensed under: CC-BY-SA with attribution
Not affiliated with Stack Overflow