Problem

Right now we have a database table (SQL Server 2008 R2) that stores an uploaded file (PDF, DOC, TXT, etc.) in an image type column. A user uploads this file from an ASP.NET application. My project is to get a handle on the rate at which this table is growing, and I've come up with a couple of questions along the way.

  1. On the database side, I've discovered the image column type is supposedly deprecated. Will I gain any benefits by switching over to varbinary(max), or should I say varbinary(5767168) because that is my file size cap, or might I just as well leave it as an image type as far as space efficiency is concerned?

  2. On the application side, I want to compress the byte array. Microsoft's built-in GZip sometimes made the file bigger instead of smaller. I switched over to SharpZipLib, which is much better, but I still occasionally run into the same problem. Is there a way to estimate the average compression savings before I implement it on a wide scale? I'm having a hard time finding out what the underlying algorithm is that they use.

  3. Would it be worth writing a Huffman coding algorithm of my own, or will that present the same problem, where the compressed file is occasionally larger than the original?

For reference, in case it matters, here's the code in my app:

    using ICSharpCode.SharpZipLib.GZip;

    private static byte[] Compress(byte[] data)
    {
        MemoryStream output = new MemoryStream();

        using (GZipOutputStream gzip = new GZipOutputStream(output))
        {
            // Keep the underlying MemoryStream open so it can still be
            // read after the GZip stream is disposed.
            gzip.IsStreamOwner = false;
            gzip.Write(data, 0, data.Length);
        }
        // Disposing the GZip stream flushes the final compressed block.
        return output.ToArray();
    }

    private static byte[] Decompress(byte[] data)
    {
        MemoryStream output = new MemoryStream();

        // Wrap the compressed bytes directly instead of copying them
        // into a second stream by hand.
        using (MemoryStream input = new MemoryStream(data))
        using (GZipInputStream gzip = new GZipInputStream(input))
        {
            // A larger buffer means fewer Read calls on big files.
            byte[] buff = new byte[4096];
            int read;

            while ((read = gzip.Read(buff, 0, buff.Length)) > 0)
            {
                output.Write(buff, 0, read);
            }
        }
        return output.ToArray();
    }
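
A quick round-trip check of the pair looks like this (the file name is just a placeholder):

    using System;
    using System.IO;   // File
    using System.Linq; // SequenceEqual

    byte[] original = File.ReadAllBytes("sample.pdf"); // placeholder file
    byte[] compressed = Compress(original);
    byte[] restored = Decompress(compressed);

    Console.WriteLine("Original:   {0:N0} bytes", original.Length);
    Console.WriteLine("Compressed: {0:N0} bytes", compressed.Length);
    Console.WriteLine("Round-trip OK: {0}", restored.SequenceEqual(original));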

Thanks in advance for any help. :)

Solution 2

I hate to be a jerk and answer my own question, but I thought I'd summarize my findings into a complete answer for anyone else looking to store file/image data space-efficiently within a database:

* Using varbinary(MAX) versus Image?

There are many reasons for using varbinary(MAX), but top among them is that image is deprecated and will be removed altogether in a future version of SQL Server. Not starting any new projects with it just nips a future problem in the bud.

According to the info in this question: SQL Server table structure for storing a large number of images, varbinary(MAX) has more operations available to be used on it.
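
For anyone making the same switch, here's a minimal migration sketch; the table, column, and connection-string names are hypothetical. The follow-up UPDATE rewrites each value so existing rows are converted to the new format up front rather than lazily on a later update:

    using System.Data.SqlClient;

    // One-time migration; run during a maintenance window.
    using (var conn = new SqlConnection(connectionString))
    {
        conn.Open();

        // Change the column type in place.
        using (var alter = new SqlCommand(
            "ALTER TABLE dbo.UploadedFiles ALTER COLUMN FileData varbinary(max)", conn))
        {
            alter.CommandTimeout = 0; // large tables can take a while
            alter.ExecuteNonQuery();
        }

        // Rewriting each value converts the old image-format pages
        // immediately instead of leaving them to convert on a later update.
        using (var rewrite = new SqlCommand(
            "UPDATE dbo.UploadedFiles SET FileData = FileData", conn))
        {
            rewrite.CommandTimeout = 0;
            rewrite.ExecuteNonQuery();
        }
    }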

varbinary(MAX) is easy to stream from a .NET application by using a SqlParameter. Passing -1 as the length means 'MAX'. Like so:

    SQLCommand1.Parameters.Add("@binaryValue", SqlDbType.VarBinary, -1).Value = compressedBytes;
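
In context, a fuller sketch of the insert (the table, column, and connection-string names are placeholders):

    using System.Data;
    using System.Data.SqlClient;

    // Insert the compressed bytes into a varbinary(max) column.
    using (var conn = new SqlConnection(connectionString))
    using (var cmd = new SqlCommand(
        "INSERT INTO dbo.UploadedFiles (FileName, FileData) VALUES (@name, @binaryValue)", conn))
    {
        cmd.Parameters.Add("@name", SqlDbType.NVarChar, 260).Value = fileName;
        cmd.Parameters.Add("@binaryValue", SqlDbType.VarBinary, -1).Value = compressedBytes;

        conn.Open();
        cmd.ExecuteNonQuery();
    }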

* What compression algorithm to use?

I'm really not much closer to a decent answer on this one. I used ICSharpCode.SharpZipLib.GZip and found it compressed better than the built-in GZip classes, simply by running both on a bunch of files and comparing the results.

My results:

I reduced my total file size by about 20%. Unfortunately, a lot of the files I had were PDFs, which don't compress that well, but there was still some benefit. Not much luck (obviously) with file types that were already compressed.
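
If you want to measure the savings on your own data before committing, here's a minimal sketch that runs Compress over a sample set and reports the aggregate ratio (the directory path is a placeholder):

    using System;
    using System.IO;

    // Measure aggregate compression savings over a representative sample.
    long originalTotal = 0, compressedTotal = 0;

    foreach (string path in Directory.GetFiles(@"C:\SampleUploads")) // placeholder path
    {
        byte[] data = File.ReadAllBytes(path);
        byte[] compressed = Compress(data);

        originalTotal += data.Length;
        // Count whichever is smaller, so a "compressed" file never grows.
        compressedTotal += Math.Min(data.Length, compressed.Length);
    }

    Console.WriteLine("Savings: {0:P1}", 1.0 - (double)compressedTotal / originalTotal);

Storing whichever version is smaller (with a flag column noting whether a row is compressed) also sidesteps the occasional file that grows under GZip.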

Other tips

That's not a byte array; that's a BLOB. Ten years ago, you would have used the IMAGE datatype.

These days, it's more efficient to use VARBINARY(MAX). I really recommend that people use FILESTREAM for VarBinary(MAX), as it makes backing up the database (without the blobs) quite easy.
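
For the curious, reading a FILESTREAM column from .NET typically goes through SqlFileStream rather than pulling the whole value through the query pipeline. A minimal sketch, assuming a FILESTREAM-enabled FileData column on a hypothetical dbo.UploadedFiles table:

    using System.Data.SqlClient;
    using System.Data.SqlTypes;
    using System.IO;

    // Read a FILESTREAM value via the Win32 file API.
    // Table/column names and connection string are placeholders.
    using (var conn = new SqlConnection(connectionString))
    {
        conn.Open();

        // SqlFileStream only works inside a transaction.
        using (SqlTransaction tx = conn.BeginTransaction())
        {
            var cmd = new SqlCommand(
                "SELECT FileData.PathName(), GET_FILESTREAM_TRANSACTION_CONTEXT() " +
                "FROM dbo.UploadedFiles WHERE Id = @id", conn, tx);
            cmd.Parameters.AddWithValue("@id", fileId);

            string path;
            byte[] txContext;
            using (SqlDataReader reader = cmd.ExecuteReader())
            {
                reader.Read();
                path = reader.GetString(0);
                txContext = (byte[])reader[1];
            }

            using (var fs = new SqlFileStream(path, txContext, FileAccess.Read))
            using (var output = new MemoryStream())
            {
                fs.CopyTo(output);
                // output.ToArray() now holds the file bytes.
            }

            tx.Commit();
        }
    }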

Keep in mind that storing the native formats (without compression) will allow full-text search, which is pretty incredible if you think about it. You have to install an iFilter from Adobe to search inside PDFs, but it's a killer feature; I can't live without it.
