Question

In an SSIS data flow there is a Lookup component that looks up against a table with 18 million records. I have configured the Lookup with full cache.

    Default buffer size: 20485760
    Default buffer max rows: 100000
The lookup join is based on an ID column of varchar(13) type.

It gives the error shown below. What Lookup configuration is suitable for caching this many records?

Error: The buffer manager cannot write 8 bytes to file "C:\Users\usrname\AppData\Local\Temp\16\DTS{B98CD347-1EF1-4BC1-9DD9-C1B3AB2B8D73}.tmp". There was insufficient disk space or quota.

What would be the difference in performance if I used a Lookup with no cache? I understand that in full cache mode the data is cached before the pre-execute stage, so the component does not have to go back to the database. This full cache takes a large amount of memory and adds additional startup time to the data flow. My question is: what configuration do I have to set up in order to handle a large amount of data in full cache mode?

What is the solution if the lookup table has millions of records (and they don't fit in a full cache)?


Solution

Use a Merge Join component instead. Sort both inputs on the join key and choose an inner, left, or full join according to your requirements. By routing matched and unmatched rows appropriately downstream, you can reproduce the behavior of the Lookup component's outputs.

Merge Join usually performs better on larger datasets.
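For example, the sorting can be pushed into the source queries so no Sort transformation is needed. All table and column names below are placeholders for illustration; only the `ID` join column comes from the question:

```sql
-- Left input: the data rows flowing through the package
SELECT ID, SomeColumn
FROM dbo.SourceTable
ORDER BY ID;

-- Right input: the 18-million-row reference table.
-- Select only the columns you actually need, to keep buffers small.
SELECT ID, LookupValue
FROM dbo.ReferenceTable
ORDER BY ID;
```

In each OLE DB Source's Advanced Editor, set `IsSorted = True` on the output and `SortKeyPosition = 1` on the `ID` column so the Merge Join accepts the inputs as pre-sorted. A left outer Merge Join followed by a Conditional Split on NULL reference columns gives you the equivalent of the Lookup's match and no-match outputs.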

OTHER TIPS

You can point the data flow task's BufferTempStoragePath property (and BLOBTempStoragePath, for BLOB data) at a folder on one of your fast drives with plenty of free space. By default these properties fall back to the TEMP and TMP environment variables, so if the drive behind TMP cannot hold the spilled buffers (as in your case, where the Lookup cache overflows memory), the package fails with the error above. Redirecting the spill location to a larger drive lets the buffers page to disk and the data flow complete.
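As a sketch, the property can be overridden at execution time with dtexec's /SET option instead of editing the package. The package name, task name, and path below are placeholder assumptions:

```
dtexec /F "MyPackage.dtsx" ^
  /SET "\Package\Data Flow Task.Properties[BufferTempStoragePath]";"E:\SSISTemp" ^
  /SET "\Package\Data Flow Task.Properties[BLOBTempStoragePath]";"E:\SSISTemp"
```

The folder must already exist and the SSIS service account needs write permission on it.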

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow