Question

The BI project I'm currently working on receives its data from a complex process based on stored procedures that use (too) many temp tables. This is something we do not have control over (or I would have changed it).
Our part in this project is mainly based on SSIS.

The processes run fine, but tempdb fills up relatively fast and then we start getting errors.

I'm looking for 2 things:

  1. How to prevent the tempdb from "overflowing" (or at least slow it down)?
  2. How to clean up the tempdb when it does overflow? (preferably using SSIS)

Solution

The first question is why tempdb is filling up, but you've already identified the likely source: excessive temporary table usage. It sounds as if reworking the stored procedures to use fewer resources is not an option (correct me if I am wrong).
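
Before changing anything, it may be worth confirming who is actually consuming tempdb. A minimal sketch against the tempdb space DMVs (the arithmetic just converts 8 KB pages to MB):

```sql
-- sys.dm_db_session_space_usage tracks tempdb allocations per session.
SELECT  s.session_id,
        s.login_name,
        (u.user_objects_alloc_page_count
         - u.user_objects_dealloc_page_count) * 8 / 1024     AS user_objects_mb,     -- temp tables, table variables
        (u.internal_objects_alloc_page_count
         - u.internal_objects_dealloc_page_count) * 8 / 1024 AS internal_objects_mb  -- sorts, spools, hash work
FROM    sys.dm_db_session_space_usage AS u
JOIN    sys.dm_exec_sessions AS s
        ON s.session_id = u.session_id
ORDER BY user_objects_mb DESC;
```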

The next thing I would look at is adding more space for tempdb, but that's a bandage, and one I'm sure you've already tried.

Something you can do is look at how your tempdb is defined. What is the initial size, and what is the growth pattern? Right-click on tempdb and select Properties, or have a DBA do it. Every time the SQL Server service restarts, tempdb is dropped and recreated, and the default sizing and growth values are not sane for a production box. If you see an 8 MB data file and a 1 MB log with a 10% growth pattern, you have an excellent opportunity to improve performance for everyone using the instance. I won't go into the details, but those settings lead to lots of little file growths, which is bad all around for performance. See "Maximizing TempDB performance by correct initial sizing" and the MSDN article on "Optimizing tempdb Performance". Fixing tempdb growth won't magically make your problems go away, but if you're tinkering with things anyway, it won't hurt.
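
As a concrete illustration, the same inspection and fix can be done in T-SQL instead of the Properties dialog. The logical file names tempdev and templog are the SQL Server defaults (yours may differ), and the sizes below are placeholders, not a recommendation:

```sql
-- Inspect tempdb's current size and growth settings
SELECT  name,
        type_desc,
        size * 8 / 1024 AS size_mb,
        CASE WHEN is_percent_growth = 1
             THEN CAST(growth AS varchar(10)) + ' %'
             ELSE CAST(growth * 8 / 1024 AS varchar(10)) + ' MB'
        END AS growth_setting
FROM    sys.master_files
WHERE   database_id = DB_ID('tempdb');

-- Pre-size the files and replace percentage growth with fixed increments.
ALTER DATABASE tempdb MODIFY FILE (NAME = tempdev, SIZE = 8192MB, FILEGROWTH = 512MB);
ALTER DATABASE tempdb MODIFY FILE (NAME = templog, SIZE = 2048MB, FILEGROWTH = 256MB);
```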

The one thing you can control is SSIS. I assume SSIS is calling these stored procs and such for each of the tables you're loading? You could look at serializing them, assuming you have multiple packages and/or dataflows running in parallel. This will increase your total run time but should spread your tempdb cost out over a longer period of time.
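
One low-effort way to serialize things, if the parallelism lives inside a single package, is the package-level MaxConcurrentExecutables property (default -1, i.e. number of logical processors + 2). For example, you could override it at run time with dtexec; the package name here is made up:

```
dtexec /F "LoadWarehouse.dtsx" /SET "\Package.Properties[MaxConcurrentExecutables];1"
```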

Another option is to look at the IsolationLevel in your packages. I'm weak in this area, but different isolation levels have different impacts on tempdb usage. Under snapshot isolation, SQL Server keeps versions of the rows you are updating in a version store inside tempdb, so that other sessions can continue reading the table, and only when you are done do your changes become the current rows in the real table. That costs more tempdb space than the other isolation levels. There is no freebie, though: if you switch to the SSIS default of Serializable, it might lower tempdb usage, but at the cost of more blocking and less concurrent access.
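
If you want to see whether the version store (or internal objects like sorts and spools) is what's actually eating tempdb, sys.dm_db_file_space_usage breaks it down; a quick check, using the usual 8 KB page-to-MB conversion:

```sql
USE tempdb;

-- How tempdb space is split between user objects (temp tables),
-- internal objects (sorts/spools), and the row-version store.
SELECT  SUM(user_object_reserved_page_count)     * 8 / 1024 AS user_objects_mb,
        SUM(internal_object_reserved_page_count) * 8 / 1024 AS internal_objects_mb,
        SUM(version_store_reserved_page_count)   * 8 / 1024 AS version_store_mb,
        SUM(unallocated_extent_page_count)       * 8 / 1024 AS free_mb
FROM    sys.dm_db_file_space_usage;
```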

OTHER TIPS

You can't! If you have a process A (the complex process you mentioned) that creates a lot of data in tempdb, you can't have a process B clean that data up, because you may break process A.

I know you said you can't, but the only real solution is to find a way of improving process A so it doesn't use so much space, or to increase the space available to tempdb.
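
For what it's worth, if the owners of process A ever allow even small changes, dropping each temp table as soon as it is no longer needed (instead of letting everything live until the procedure ends) lowers the peak tempdb footprint. A minimal sketch with a hypothetical table:

```sql
CREATE TABLE #staging (id int PRIMARY KEY, payload varchar(100));

-- ... populate and use #staging ...

DROP TABLE #staging;  -- release the pages for reuse now, instead of
                      -- waiting for the procedure or session to end
```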

Licensed under: CC-BY-SA with attribution