Question

Is there any existing (non-proprietary) solution how to mirror current day data onto separate disk with minimum delay (under second), so that I can use smaller, non redundant but more expensive disk for high rate of reads? (specifically for SQL Server 2012)

More background:

  • We have a transnational DB that needs to be redundant, there are few dozen thousands of new rows daily (handled without any problem); but main perf impact on DB is from reads that happen few times per second. Reads need to have fresh data (at max 1 second old), but they are only interested in current day data. All writes need to go to disk array with also very small delay.
  • I can use small fast but non-redundant SSD disk to optimize reads.

There are few proprietary options - e.g. send each update to two separate nonrelated DBs - one slower and redundant; and other faster for reading.

However is there any existing solution for this - how to mirror all inserts to secondary DB (and purge each midnight)?

Was it helpful?

Solution

You can use an indexed view to store the current days data on a separate dataspace. You'll have to build a new view each day so you may want to use a synonym to do the actual switching so as to avoid locks on insert while creating the view. Make sure you end date the filter on the column. This is how i would do it using to agent jobs:

Job 1 (10 Minutes to midnight)

create view my_view_yyyymmdd (yyyymmdd = today + 1)
with schema_binding 
as 
select from mytable where datecol >=day+1 and datecol < today+2 --dateliterals;
create clustered index on my_view_yyyymmdd (yyyymmdd = today + 1)


Job2 (midnight + 1 second)
drop synonym mysyn;
create synonym mysyn for my_view_yyyymmdd (yyyymmdd = today )
drop view my_view_yyyymmdd my_view_yyyymmdd (yyyymmdd = today - 1)

You don't mention updates so a clustered index on date shouldn't fragment enough to worry about. You may want to set the fillfactor on the index to 100 which will reduce the size on disk by 20 % over the default. This will help further reduce the IO hit.

OTHER TIPS

You say in the comments that the data to be queried fits in RAM on the SQL Server box. This means that SQL Server will never perform even a single physical IO to read data. It all comes from its buffer pool.

Read latencies will be extremely low. Disk performance will not matter at all. If you properly structure and index the data read queries will execute blindingly fast.

Usually, having enough RAM to fit all data into memory is the single most beneficial thing you can do to performance.

You can use any high availability solution you like. It will only impact writes, not reads. We already concluded in the comments that writes do not matter much here.

In other words, I don't think you have a problem.

what about having two databases, one for today and another for historical data (yesterday and previously), with all the same tables. when new data is added just write it to today's database and at midnight( or whenever is most appropriate) write todays data to historical database and truncate todays tables. You can keep todays database on SSD and historical on whatever disks you choose.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top