Can I split a single SQL 2008 DB Table into multiple filegroups, based on a discriminator column?

https://stackoverflow.com/questions/4675218

10-10-2019

Question

I've got a SQL Server 2008 R2 database which has a number of tables. Two of these tables contain a lot of large data, mainly because one of them has a VARBINARY(MAX) column and the sister table has a GEOGRAPHY column. (Why two tables? Read below if you're interested.***)

The data in these tables are geospatial shapes, such as zipcode boundaries.

Now, the first 70K-odd rows are for DataType = 1; the remaining 5 million rows are for DataType = 2.

Now, is it possible to split the table data into two files, so that all rows with DataType != 2 go into File_A and all rows with DataType = 2 go into File_B?

This way, when I back up the DB, I can skip File_B so my download is waaaaay smaller. Is this possible?

I'm guessing you might be thinking: why not keep them as TWO extra tables? Mainly because in the code, the data is conceptually the same; it just happens that I want to split the storage of this model data. It really messes up my model if I now have two aggregates in my model, instead of one.

***Entity Framework doesn't like tables with GEOGRAPHY, so I have to create a new table which transforms the GEOGRAPHY to VARBINARY, and then drop that into EF.

Solution

It's a bit of overkill, but you could use table partitioning to do this, as each partition can be mapped to a distinct filegroup. Some caveats:

Table partitioning is only available in Enterprise (and Developer) edition.
Like clustered indexes, you only get one partitioning per table, so be sure that this is how you'd want to partition your tables.
I'm not sure how well this would play out against "selective backups" or, much more importantly, partial restores. You'd want to test a lot of oddball recovery scenarios before going to Production with this.
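
As a rough illustration of the partitioning approach, here is a minimal T-SQL sketch. Every name in it (GeoDb, FG_Small, FG_Large, the file paths, pfDataType, psDataType, dbo.Shapes and its columns) is hypothetical and not taken from the question, and it assumes DataType is an int that only ever holds 1 or 2.

```sql
-- Hypothetical names throughout; adjust the database, paths, and columns to your schema.
ALTER DATABASE GeoDb ADD FILEGROUP FG_Small;
ALTER DATABASE GeoDb ADD FILEGROUP FG_Large;

-- One data file per filegroup (File_A / File_B from the question).
ALTER DATABASE GeoDb ADD FILE
    (NAME = N'File_A', FILENAME = N'D:\Data\GeoDb_File_A.ndf')
    TO FILEGROUP FG_Small;
ALTER DATABASE GeoDb ADD FILE
    (NAME = N'File_B', FILENAME = N'D:\Data\GeoDb_File_B.ndf')
    TO FILEGROUP FG_Large;
GO

-- RANGE LEFT with a single boundary value of 1:
--   partition 1 holds DataType <= 1, partition 2 holds DataType > 1.
CREATE PARTITION FUNCTION pfDataType (int)
    AS RANGE LEFT FOR VALUES (1);

-- Map partition 1 to FG_Small and partition 2 to FG_Large.
CREATE PARTITION SCHEME psDataType
    AS PARTITION pfDataType TO (FG_Small, FG_Large);
GO

-- The partitioning column must be part of every unique index,
-- so it is included in the clustered primary key here.
CREATE TABLE dbo.Shapes
(
    ShapeId  int            NOT NULL,
    DataType int            NOT NULL,
    Shape    varbinary(max) NOT NULL,
    CONSTRAINT PK_Shapes PRIMARY KEY CLUSTERED (DataType, ShapeId)
) ON psDataType (DataType);
```

An existing table would need its clustered index rebuilt on the partition scheme rather than a fresh CREATE TABLE, and how the resulting filegroups behave under backup and restore still needs the testing mentioned above.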

An older-fashioned way to do it would be to set up partitioned views. This gives you two tables, true, but the partitioned view "structure" is pretty solid and fairly elegant, and you wouldn't have to worry about having your data split across multiple backup files.
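
A partitioned view along these lines might look like the following sketch. The table, constraint, and view names are hypothetical, and the column list is only a guess at the shape of the data; the CHECK constraints on DataType are what let the optimizer route queries to the right member table.

```sql
-- Hypothetical member tables: one per DataType value.
CREATE TABLE dbo.Shapes_Type1
(
    ShapeId  int            NOT NULL,
    DataType int            NOT NULL
        CONSTRAINT CK_Shapes_Type1_DataType CHECK (DataType = 1),
    Shape    varbinary(max) NOT NULL,
    -- The partitioning column is part of the primary key so the view can be updatable.
    CONSTRAINT PK_Shapes_Type1 PRIMARY KEY (ShapeId, DataType)
);

CREATE TABLE dbo.Shapes_Type2
(
    ShapeId  int            NOT NULL,
    DataType int            NOT NULL
        CONSTRAINT CK_Shapes_Type2_DataType CHECK (DataType = 2),
    Shape    varbinary(max) NOT NULL,
    CONSTRAINT PK_Shapes_Type2 PRIMARY KEY (ShapeId, DataType)
);
GO

-- The view presents both tables as a single logical table.
CREATE VIEW dbo.AllShapes
AS
SELECT ShapeId, DataType, Shape FROM dbo.Shapes_Type1
UNION ALL
SELECT ShapeId, DataType, Shape FROM dbo.Shapes_Type2;
GO

-- With the CHECK constraints in place, this query should only need dbo.Shapes_Type1.
SELECT ShapeId, Shape
FROM dbo.AllShapes
WHERE DataType = 1;
```

Application code (or an Entity Framework mapping) could then target dbo.AllShapes instead of the two underlying tables, which may keep the model down to a single aggregate.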

OTHER TIPS

I think you might want to look into data partitioning. You can partition your data into multiple file groups, and therefore files, based on a key value, such as your DataType column.
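
Once a table is partitioned, a catalog query like the one below can confirm which filegroup each partition landed on and how many rows it holds. This is a sketch that assumes a partitioned table named dbo.Shapes, which is a hypothetical name; the catalog views themselves are standard in SQL Server 2008.

```sql
-- Lists each partition of the heap or clustered index with its filegroup and row count.
SELECT
    p.partition_number,
    fg.name AS filegroup_name,
    p.rows  AS row_count
FROM sys.partitions AS p
JOIN sys.indexes AS i
    ON i.object_id = p.object_id
   AND i.index_id  = p.index_id
JOIN sys.partition_schemes AS ps
    ON ps.data_space_id = i.data_space_id
JOIN sys.destination_data_spaces AS dds
    ON dds.partition_scheme_id = ps.data_space_id
   AND dds.destination_id      = p.partition_number
JOIN sys.filegroups AS fg
    ON fg.data_space_id = dds.data_space_id
WHERE p.object_id = OBJECT_ID(N'dbo.Shapes')   -- hypothetical table name
  AND i.index_id IN (0, 1)                     -- 0 = heap, 1 = clustered index
ORDER BY p.partition_number;
```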

Data partitioning can also help with performance. So, if you need that too, you can check out partition schemes for your indexes as well.
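
For example, a nonclustered index can be placed on the same partition scheme as its table so that it stays aligned with the table's partitions. The sketch below assumes a table dbo.Shapes partitioned on DataType by a scheme named psDataType; both names are hypothetical.

```sql
-- Aligned nonclustered index: partitioned by the same scheme and column as the table.
CREATE NONCLUSTERED INDEX IX_Shapes_ShapeId
    ON dbo.Shapes (ShapeId)
    ON psDataType (DataType);
```

Whether this actually improves performance depends on the workload; queries that filter on DataType are the ones most likely to benefit, since they can be limited to a single partition.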

Licensed under: CC-BY-SA with attribution
