Question

I have a database schema that I've inherited where all the integer columns have been defined as int but because of business rules many could be replaced by tinyiny and other data types that would require less storage. While I don't expect much of a performance improvement by using the smaller types my reasons for wanting to do so are:

  • The application uses SQL Server Express so I'd like to make the most of the 10GB database size before needing to purge data.

  • The application is shipped with a full initial copy of the database and updates are synced using the Microsoft Sync Framework. Some client PCs are on relatively slow Internet connections so the smaller the database size the better.

I was wondering if there was some way to quickly identify columns that may be 'oversized'? I realize that the results will need to be carefully reviewed to make sure that the column will never exceed the limits of the new data type.

Was it helpful?

Solution

You might be able to save yourself a lot of work if you can upgrade to SQL Server 2016 SP1 or later. Data compression is available even in Express edition starting with that service pack. You could simply apply row compression to all tables and indexes to save even more space than what you can get from changing data types. That's because the data types that you mention only use bytes needed to fit the actual values with row compression. Screenshot from the documentation:

enter image description here

OTHER TIPS

The following stored procedure will help identify those columns. It starts by creating a temporary table that stores the minimum and maximum values that each integer type can hold and then runs a dynamic query to find the minimum and maximum value for every table / column that defines one of those integer types. Once the min/max values have been determined it looks up the smallest type that can hold the value current values and gives a result such as the following:

table_name      column_name         current_type    min_val max_val proposed_type   proposed_min    proposed_max    space_saved
RideLegVetting  RideNumber          int             1       21      tinyint         0               255             2025654
RideLegVetting  LegNumber           int             0       99      tinyint         0               255             2025654

For each column it shows the curent type and min/max values found in the table along with a new proposed type and the min/max value it can hold. Based on the cardinality it also indicates an estimate of the amount of space that will be saved. The stored procecedure is as follows:

CREATE PROCEDURE OptimizeIntSizes AS
BEGIN
    SET NOCOUNT ON
    CREATE TABLE #DataTypeRanges
    (
        name nvarchar(128),
        size tinyint,
        low bigint,
        high bigint
    )
    INSERT INTO #DataTypeRanges VALUES (N'tinyint', 1, 0, 255)
    INSERT INTO #DataTypeRanges VALUES (N'smallint', 2, -32768, 32767)
    INSERT INTO #DataTypeRanges VALUES (N'int', 4, -2147483648, 2147483647)
    INSERT INTO #DataTypeRanges VALUES (N'bigint', 8, -9223372036854775808, 9223372036854775807) 
    CREATE TABLE #Results
    (
        table_name nvarchar(128),
        column_name nvarchar(128),
        current_type nvarchar(128),
        min_val bigint,
        max_val bigint,
        proposed_type nvarchar(128),
        proposed_min bigint,
        proposed_max bigint,
        space_saved bigint
    )
    DECLARE @table_name nvarchar(128)
    DECLARE @column_name nvarchar(128)
    DECLARE @current_type nvarchar(128)
    DECLARE @proposed_type nvarchar(128)
    DECLARE @low bigint
    DECLARE @high bigint
    DECLARE @size tinyint
    DECLARE @cardinality bigint
    DECLARE @min_val bigint
    DECLARE @max_val bigint
    DECLARE @proposed_min bigint
    DECLARE @proposed_max bigint
    DECLARE @proposed_size tinyint
    DECLARE @space_saved bigint
    DECLARE @sql nvarchar(max)
    DECLARE @params nvarchar(max)
    SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED
    DECLARE FindCursor CURSOR STATIC FOR
        SELECT tables.name AS table_name, cols.name AS column_name, types.name, dtr.size, dtr.low, dtr.high
        FROM sys.columns cols
            JOIN sys.tables tables ON tables.object_id = cols.object_id
            JOIN sys.types types ON types.system_type_id = cols.system_type_id
            JOIN #DataTypeRanges dtr ON dtr.name COLLATE DATABASE_DEFAULT = types.name COLLATE DATABASE_DEFAULT
    OPEN FindCursor
    FETCH FindCursor INTO @table_name, @column_name, @current_type, @size, @low, @high
    WHILE @@FETCH_STATUS = 0
    BEGIN
        SET @sql = N'SELECT @cardinality_out = COUNT(1), @min_val_out = MIN([' + @column_name + ']), @max_val_out = MAX([' + @column_name + ']) FROM [' + @table_name + ']'
        SET @params = N'@cardinality_out bigint OUTPUT, @min_val_out bigint OUTPUT, @max_val_out bigint OUTPUT'
        EXECUTE sp_executesql @sql, @params, @cardinality_out = @cardinality OUTPUT, @min_val_out = @min_val OUTPUT, @max_val_out = @max_val OUTPUT
        SELECT TOP 1 @proposed_type = name, @proposed_size = size, @proposed_min = low, @proposed_max = high
            FROM #DataTypeRanges
            WHERE @min_val >= low  AND @max_val <= high
            ORDER BY size
        IF @proposed_type <> @current_type AND @max_val IS NOT NULL
        BEGIN
            SET @space_saved = (@size - @proposed_size) * @cardinality
            INSERT INTO #Results VALUES (@table_name, @column_name, @current_type, @min_val, @max_val, @proposed_type, @proposed_min, @proposed_max, @space_saved)
        END
        FETCH FindCursor INTO @table_name, @column_name, @current_type, @size, @low, @high
    END
    CLOSE FindCursor
    DEALLOCATE FindCursor
    SELECT * FROM #Results ORDER BY space_saved DESC
END
Licensed under: CC-BY-SA with attribution
Not affiliated with dba.stackexchange
scroll top