Question

Some background:

  • SQL Server 2012 Enterprise
  • OLTP table currently at 1B records
  • records increase 5-10M per day
  • SAN LUN with flash storage
  • two indexes (one clustered identity int)

What we have noticed is a gradual slowing of INSERTs over time, which will eventually become an issue to the business if left unchanged.

The question: We are planning to transition to use partitioning on this table soon, primarily for space management, but the main question here is whether partitioning can help improve the performance of INSERTs. (We know about performance increases for SELECTs already, please do not comment on that side of things if possible.)

I have done a fair amount of searching in forums and elsewhere, but have not found anything specific to INSERTs and partitioning. It seems reasonable to me that this would help maintain speed over time, given that we will partition on "date inserted" and that the INSERTs would always be directed to the most recent partition, but I am looking for confirmation.

Thanks all.

UPDATE: To clarify: a staging table (for switching in) is not an option, because this is an OLTP workload and the data must be readable immediately after it is inserted. Neither the insert nor the read can be delayed while waiting for a switch in.

Solution

Short answer: yes, it can help, because a partition switch is close to instantaneous. You would insert your data into a staging table with the same definition as your main partitioned table, and then switch it into the partitioned table, which is a metadata-only operation (it needs a schema modification lock, but moves no data).

Long answer: having such a large table might hurt performance over the long run, and statistics updates are quite difficult to manage unless you move to a later version of SQL Server (incremental, per-partition statistics arrived in SQL Server 2014).
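
As a rough illustration of what "later versions" make possible, the sketch below creates an incremental statistic and refreshes a single partition of it. This assumes SQL Server 2014 or later (it will not run on 2012), and every object name (pfStatsDemo, psStatsDemo, dbo.StatsDemo, st_StatsDemo_InsertedAt) is hypothetical.

```sql
-- Requires SQL Server 2014 or later. All object names are hypothetical.
CREATE PARTITION FUNCTION pfStatsDemo (date)
    AS RANGE RIGHT FOR VALUES ('2016-07-01', '2016-08-01');

CREATE PARTITION SCHEME psStatsDemo
    AS PARTITION pfStatsDemo ALL TO ([PRIMARY]);

CREATE TABLE dbo.StatsDemo
(
    Id         int  NOT NULL,
    InsertedAt date NOT NULL,
    CONSTRAINT PK_StatsDemo PRIMARY KEY CLUSTERED (InsertedAt, Id)
) ON psStatsDemo (InsertedAt);

-- Build the statistic so that it is maintained per partition.
CREATE STATISTICS st_StatsDemo_InsertedAt
    ON dbo.StatsDemo (InsertedAt)
    WITH FULLSCAN, INCREMENTAL = ON;

-- Refresh only partition 3 (the most recent range here)
-- instead of rescanning the whole table.
UPDATE STATISTICS dbo.StatsDemo (st_StatsDemo_InsertedAt)
    WITH RESAMPLE ON PARTITIONS (3);
```

On SQL Server 2012 there is no per-partition equivalent, so statistics on the partitioned table are updated for the table as a whole.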

If this is a read-heavy table, I'd consider using a partitioned view instead. Each month (for example) would get its own table, with a check constraint on a date column so the query optimizer knows which table holds which date range, and a view over the top of them:

CREATE VIEW dbo.all_periods AS  -- view name is illustrative
SELECT col1, col2, col3, col4 FROM period201501
UNION ALL
SELECT col1, col2, col3, col4 FROM period201502
...
UNION ALL
SELECT col1, col2, col3, col4 FROM period201608

(or whatever). Then the metadata operation is altering the view to include the new table, rather than switching a partition into the partitioned table.
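
A minimal sketch of that arrangement, assuming monthly tables keyed on a date column. The names here (dbo.demo_period201607, dbo.demo_all_periods, created_on, and so on) are hypothetical and only two months are shown. GO is the batch separator used by SSMS and sqlcmd, needed because CREATE VIEW and ALTER VIEW must be the first statement in a batch.

```sql
-- Hypothetical monthly tables. The CHECK constraint on created_on
-- tells the optimizer which date range each table can contain.
CREATE TABLE dbo.demo_period201607
(
    id         int         NOT NULL,
    created_on date        NOT NULL
        CONSTRAINT CK_demo_period201607
        CHECK (created_on >= '2016-07-01' AND created_on < '2016-08-01'),
    col1       varchar(50) NULL,
    CONSTRAINT PK_demo_period201607 PRIMARY KEY CLUSTERED (created_on, id)
);

CREATE TABLE dbo.demo_period201608
(
    id         int         NOT NULL,
    created_on date        NOT NULL
        CONSTRAINT CK_demo_period201608
        CHECK (created_on >= '2016-08-01' AND created_on < '2016-09-01'),
    col1       varchar(50) NULL,
    CONSTRAINT PK_demo_period201608 PRIMARY KEY CLUSTERED (created_on, id)
);
GO

CREATE VIEW dbo.demo_all_periods AS
SELECT id, created_on, col1 FROM dbo.demo_period201607
UNION ALL
SELECT id, created_on, col1 FROM dbo.demo_period201608;
GO

-- New month: create the next table, then redefine the view to include it.
CREATE TABLE dbo.demo_period201609
(
    id         int         NOT NULL,
    created_on date        NOT NULL
        CONSTRAINT CK_demo_period201609
        CHECK (created_on >= '2016-09-01' AND created_on < '2016-10-01'),
    col1       varchar(50) NULL,
    CONSTRAINT PK_demo_period201609 PRIMARY KEY CLUSTERED (created_on, id)
);
GO

ALTER VIEW dbo.demo_all_periods AS
SELECT id, created_on, col1 FROM dbo.demo_period201607
UNION ALL
SELECT id, created_on, col1 FROM dbo.demo_period201608
UNION ALL
SELECT id, created_on, col1 FROM dbo.demo_period201609;
GO
```

In this sketch new rows would be inserted straight into the current month's table and become visible through the view immediately. A query that filters on created_on should only need to touch the tables whose constraints overlap the filter, provided the constraints are trusted.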

OTHER TIPS

It depends on a lot of circumstances. Partitioning is primarily a benefit through better concurrency for INSERTs/UPDATEs/DELETEs, and through query optimizer shortcuts (partition elimination) when every related query filters on the partitioning column. A smaller partition will usually perform better for the former, as you surmised.
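
To make the "inserts land in the most recent partition" idea concrete, here is a small sketch of a table partitioned by month on its insert date, with a query showing which partition received the rows. All names (pfInsertedMonth, psInsertedMonth, dbo.Events, InsertedAt) and the boundary values are assumptions for illustration, not the asker's schema.

```sql
-- All object names and boundary dates are hypothetical.
CREATE PARTITION FUNCTION pfInsertedMonth (datetime2(0))
    AS RANGE RIGHT FOR VALUES ('2016-07-01', '2016-08-01', '2016-09-01');

-- One filegroup for every partition, to keep the sketch short.
CREATE PARTITION SCHEME psInsertedMonth
    AS PARTITION pfInsertedMonth ALL TO ([PRIMARY]);

CREATE TABLE dbo.Events
(
    EventId    int IDENTITY(1,1) NOT NULL,
    InsertedAt datetime2(0) NOT NULL
        CONSTRAINT DF_Events_InsertedAt DEFAULT (SYSUTCDATETIME()),
    Payload    varchar(100) NOT NULL,
    -- A unique index created on the partition scheme must include
    -- the partitioning column in its key.
    CONSTRAINT PK_Events PRIMARY KEY CLUSTERED (InsertedAt, EventId)
) ON psInsertedMonth (InsertedAt);

-- Ordinary OLTP insert: the row is routed by InsertedAt.
INSERT INTO dbo.Events (Payload) VALUES ('example row');

-- Show how many rows each partition currently holds.
SELECT $PARTITION.pfInsertedMonth(InsertedAt) AS partition_number,
       COUNT(*)                               AS row_count
FROM dbo.Events
GROUP BY $PARTITION.pfInsertedMonth(InsertedAt);
```

With a current-time default, every new row falls in the last range, so the index pages being modified belong to that one partition. Whether that actually speeds up INSERTs on a given system is something to measure rather than assume.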

However, the partitioning overhead can just as easily cause performance problems, depending on the complexity involved. You should review your indexing and table design before trying partitioning, but if you're sure about moving to partitioning (or need to for disk reasons), the best possible INSERT performance you can get out of it relies on partition switching.

The high-level premise is that you insert the data into a staging heap table throughout the day and then, when you're ready to move it into the partitioned table, apply the matching indexes and constraints. You can then use ALTER TABLE ... SWITCH to pop the staged data into your larger partitioned table as a new partition. Details can be found here and a more detailed overview here.
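
The steps above could look roughly like the following. This is a sketch under stated assumptions, not a production procedure: the names (pfMonth, psMonth, dbo.Sales, dbo.Sales_Staging) are hypothetical, everything lives on the PRIMARY filegroup, and the target partition is empty.

```sql
-- Hypothetical names throughout. Partition 3 covers SaleDate >= '2016-08-01'.
CREATE PARTITION FUNCTION pfMonth (date)
    AS RANGE RIGHT FOR VALUES ('2016-07-01', '2016-08-01');

CREATE PARTITION SCHEME psMonth
    AS PARTITION pfMonth ALL TO ([PRIMARY]);

CREATE TABLE dbo.Sales
(
    SaleId   int   NOT NULL,
    SaleDate date  NOT NULL,
    Amount   money NOT NULL,
    CONSTRAINT PK_Sales PRIMARY KEY CLUSTERED (SaleDate, SaleId)
) ON psMonth (SaleDate);

-- 1. Load into a heap on the same filegroup as the target partition.
CREATE TABLE dbo.Sales_Staging
(
    SaleId   int   NOT NULL,
    SaleDate date  NOT NULL,
    Amount   money NOT NULL
) ON [PRIMARY];

INSERT INTO dbo.Sales_Staging (SaleId, SaleDate, Amount)
VALUES (1, '2016-08-03', 10.00),
       (2, '2016-08-17', 25.50);

-- 2. Add the matching clustered index, plus a trusted CHECK constraint
--    proving every row fits the target partition's range.
ALTER TABLE dbo.Sales_Staging
    ADD CONSTRAINT PK_Sales_Staging PRIMARY KEY CLUSTERED (SaleDate, SaleId);

ALTER TABLE dbo.Sales_Staging WITH CHECK
    ADD CONSTRAINT CK_Sales_Staging_Range CHECK (SaleDate >= '2016-08-01');

-- 3. Switch the staged rows into the empty target partition.
--    This changes metadata only; no rows are copied.
ALTER TABLE dbo.Sales_Staging
    SWITCH TO dbo.Sales PARTITION 3;
```

After the switch dbo.Sales_Staging is empty and can be reused once its range constraint is replaced. Before the next period starts, a new boundary would typically be added with ALTER PARTITION SCHEME ... NEXT USED followed by ALTER PARTITION FUNCTION ... SPLIT RANGE.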

This is primarily used for data warehousing, so depending on your primary key structure and/or business needs this INSERT method may not work for you. You can also make the staging table an active, queryable component rather than a heap you just dump into, with the added performance implications of doing so.

Licensed under: CC-BY-SA with attribution
Not affiliated with dba.stackexchange