Question

I have a financial system that needs to attach PDF receipts for each record saved on my system.

These receipts will be scanned by a dedicated device attached to the computer and saved as PDF files to be stored in my database.

Today the system saves the PDF file as varbinary(max), but because of the number of rows in my table, the filesize of my DB is increasing too fast. The average file size is about 1 to 2 MB.

What is the best way to store this kind of file without compromising my database performance?


Solution

You are not sacrificing database performance by using varbinary(MAX): you are not searching on the column, and you are not indexing it.

What is nice about keeping the file in the table is a single, consistent backup. The downside is a bigger backup.

Delivering the file to the client from varbinary(MAX) is going to be less efficient than from FILESTREAM or the file system.

1 to 2 MB is relatively small. If it were 200+ MB, you would typically want to avoid storing it in varbinary(MAX).

If table size alone is the problem, then FILESTREAM is probably your best solution. It will not hurt database performance, and the files are not stored in the table.

FILESTREAM enables SQL Server-based applications to store unstructured data, such as documents and images, on the file system. Applications can leverage the rich streaming APIs and performance of the file system and at the same time maintain transactional consistency between the unstructured data and corresponding structured data.

From a licensing perspective, I am pretty sure FILESTREAM data does not count toward database size. For example, on Express with its 10 GB database limit, FILESTREAM data does not count.
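To put the 10 GB limit in perspective, here is a rough back-of-the-envelope sketch of how many receipts would fit if they were kept in the table as varbinary(MAX). It assumes the limit is counted in binary units and ignores every other row, index, and page overhead in the database, so treat the numbers as an upper bound rather than a forecast. The function name is made up for the illustration.

```python
# Rough capacity estimate for in-table storage under a size-limited edition.
# Assumption: 10 GB = 10 * 1024 MB, and nothing else lives in the database.
EXPRESS_LIMIT_MB = 10 * 1024


def receipts_that_fit(avg_size_mb: float, limit_mb: float = EXPRESS_LIMIT_MB) -> int:
    """Return how many files of avg_size_mb fit under limit_mb."""
    if avg_size_mb <= 0:
        raise ValueError("avg_size_mb must be positive")
    return int(limit_mb // avg_size_mb)


if __name__ == "__main__":
    # The question quotes an average of 1 to 2 MB per receipt.
    for size in (1, 1.5, 2):
        print(f"{size} MB average: about {receipts_that_fit(size)} receipts")
```

At 1 to 2 MB per scan, that is only somewhere between roughly five and ten thousand receipts, which is why it matters whether the file data counts toward the limit.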

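Another benefit of keeping the file in the table is that, if the PDF has been OCR'd and an Adobe iFilter is installed, you can full-text search the document. You should be able to do the same with FILESTREAM: as far as I know, full-text indexing works on a FILESTREAM column the same way it does on a varbinary(MAX) column, provided the table has a column holding the file extension.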

Or you can manage the files completely separately and store only a path in SQL.

OTHER TIPS

The best way, and the one Documentum and other major packages use, is to store a path and filename in the database and put the file at that location.
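As a minimal sketch of that approach, the following copies a scanned PDF into a storage folder and records only its path in the database. It uses Python's built-in sqlite3 purely as a stand-in so the example runs anywhere; with SQL Server you would use your usual driver instead. The `receipt` table, its columns, and the function names are all hypothetical.

```python
import shutil
import sqlite3
from pathlib import Path


def create_schema(conn: sqlite3.Connection) -> None:
    # Hypothetical table: one row per record, holding only the file path.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS receipt ("
        "record_id INTEGER PRIMARY KEY, file_path TEXT NOT NULL)"
    )


def attach_receipt(conn, record_id: int, source_pdf, storage_root) -> Path:
    """Copy the PDF under storage_root and store only its path in the table."""
    storage_root = Path(storage_root)
    storage_root.mkdir(parents=True, exist_ok=True)
    target = storage_root / f"{record_id}.pdf"
    shutil.copyfile(source_pdf, target)
    conn.execute(
        "INSERT INTO receipt (record_id, file_path) VALUES (?, ?)",
        (record_id, str(target)),
    )
    conn.commit()
    return target


def load_receipt(conn, record_id: int) -> bytes:
    """Look up the stored path and read the file from disk."""
    row = conn.execute(
        "SELECT file_path FROM receipt WHERE record_id = ?", (record_id,)
    ).fetchone()
    if row is None:
        raise KeyError(record_id)
    return Path(row[0]).read_bytes()
```

Note that the file copy and the insert are not covered by one transaction here, which is exactly the consistency trade-off compared with varbinary(MAX) or FILESTREAM; backups of the folder and the database also have to be coordinated separately.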

For even smaller database size, have one table of common paths, and then make the filename based on the short primary or candidate key of that row; you'd only need to store the path ID and the extension, then.
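A small sketch of that scheme, assuming a lookup of common paths keyed by a path ID and a filename derived from the row's integer key. The directory names, the `COMMON_PATHS` mapping (which would be a table in the database), and the function name are invented for the illustration.

```python
from pathlib import PurePosixPath

# Hypothetical stand-in for the "common paths" table: path_id -> directory.
COMMON_PATHS = {
    1: "/srv/receipts/2023",
    2: "/srv/receipts/2024",
}


def receipt_location(record_id: int, path_id: int, extension: str) -> PurePosixPath:
    """Rebuild the full file location from the key, the path ID, and the extension."""
    directory = PurePosixPath(COMMON_PATHS[path_id])
    return directory / f"{record_id}.{extension}"


if __name__ == "__main__":
    # Each row only needs to store path_id=2 and extension="pdf".
    print(receipt_location(1057, 2, "pdf"))  # /srv/receipts/2024/1057.pdf
```

Because the filename is derived from the key, moving a whole folder of receipts only requires updating one row in the paths table.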

If your financial compliance rules require integrity or tamper protection, you could check with your compliance or legal group to see if storing a hash in the database or a public/private key signature in that location would suffice.
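For the hash option, a minimal sketch might compute a SHA-256 digest when the receipt is stored, keep the hex string in the database row, and recompute it whenever the file is read back. Whether this is sufficient is for your compliance group to decide; a plain hash only detects changes to the file if the database row itself is trusted. The function names are hypothetical.

```python
import hashlib
import hmac


def sha256_of_file(path, chunk_size: int = 1 << 20) -> str:
    """Hash the file in chunks so large scans are not loaded into memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_unmodified(path, stored_hex: str) -> bool:
    """Compare the current file hash with the value saved in the database."""
    return hmac.compare_digest(sha256_of_file(path), stored_hex)
```

You would call `sha256_of_file` once at save time and store the result alongside the path, then call `is_unmodified` before serving or auditing the file.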

Licensed under: CC-BY-SA with attribution
Not affiliated with dba.stackexchange