Question

I have a parallelism issue with my first "advanced" SSIS package.

First, here is the end of the package : SSIS Package

What does this package :

  • loads data from Excel and performs some operations / aggregations / controls
  • based on a control, it determines if the row should be integrated (left branch) or not (right branch)
  • if it is ok, some calculations are performed, the row is added and a log is written with some stats (this is the OLE DB Command of the left branch)
  • if the row is rejected, it is added in a rejection file and a log is written with some stats (OLE DB Command of the right branch)

Each file load generates only ONE log row, so success and failure logs info are written on the same row (identified by execution guid and imported file name as it can imported several files during the same execution), but the columns filled are not the same.

Both OLE DB Commands call the same stored procedure that :

  • Checks if the row exists
  • If it doesn't exists, it adds a new row
  • Else it updates the correct columns

This was working pretty well until someone raised an issue : sometimes an execution generates 2 rows : one containing the success info and one containing the rejected ones.
After analysing, I noticed that it may happen that both OLE DB Commands are executed at the same time, so both see no existing row and both insert a new one.

I'd like to know if there is a way to prevent the OLE DB Commands to be executed at the same time. Performance is not an issue here, so I don't care if it generates higher execution times.

The only solution I see is adding a unique constraint on the table and catching the error but I'm not sure this is a really "elegant" solution (I don't like the idea of duplicating the UPDATE statement, the "normal" one and the "exception" one) so I'm looking for a solution that doesn't involve catching exceptions !
Of course, if there is no better solution, I guess I'll do that.

Was it helpful?

Solution

You should be able to merge the two log paths into a single write operation. Unfortunately this means they will need to be sorted: http://msdn.microsoft.com/en-us/library/ms141703.aspx

Note that calling an OLEDBCommand over and over like this isn't that great of an idea for large operations - sometimes it's a lot better to bring all the data into a staging table and then use a single command to do the operation.

OTHER TIPS

Since you are using a stored proc in an OLEDB command, you can just use a union. that way you don't have to sort. The OLEDB command will process once for each row.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top