Question

TL;DR

I'm looking for a way to efficiently identify the object located closest to the end of a SQL Server data file. This approach needs to remain performant against large data files.

What I have so far

The following query uses an undocumented Dynamic Management Function that shipped with SQL Server 2012: sys.dm_db_database_page_allocations. This DMF provides a rough equivalent to the DBCC IND command.

The following query identifies the last object in a given data file (Warning: Don't run this against a database larger than 25 GB unless you want to cancel it at some point):

-- Return object with highest Extent Page ID
SELECT   files.name as logical_file_name
        , files.physical_name as physical_file_name
        , OBJECT_SCHEMA_NAME(object_id) + N'.' + OBJECT_NAME(object_id) AS object_name
        , alloc.*
FROM sys.dm_db_database_page_allocations(DB_ID(), NULL, NULL, NULL, NULL) alloc
    INNER JOIN sys.database_files files
        ON alloc.extent_file_id = files.file_id
WHERE is_allocated = 1
    AND files.name = 'Logical_FileName'
ORDER BY files.name , files.physical_name, extent_page_id DESC

What's wrong with this approach

As the warning above implies, this query runs slower as the size of the database increases, because the function is really designed for a targeted look at a specific object rather than at a specific data file. When passing in the NULL parameters as I did, the function likely iterates through all objects within the database behind the scenes and returns the combined output. This accomplishes what I need, but it does so in a very brute-force way that doesn't lend itself to optimization.
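
For comparison, here is a rough sketch of the targeted usage the function seems to be designed for, scoped to a single table. The table name dbo.MyHeapTable is hypothetical, and because the function is undocumented its behavior may differ between versions:

```sql
-- Hypothetical table name; substitute one that exists in your database.
DECLARE @object_id INT = OBJECT_ID(N'dbo.MyHeapTable');

-- Scoping the DMF to one object returns only that object's pages,
-- so the cost no longer depends on the size of the whole database.
SELECT  alloc.extent_file_id
      , MAX(alloc.allocated_page_page_id) AS highest_allocated_page_id
FROM sys.dm_db_database_page_allocations(DB_ID(), @object_id, NULL, NULL, 'LIMITED') AS alloc
WHERE alloc.is_allocated = 1
GROUP BY alloc.extent_file_id;
```

This answers "where does this object end?" cheaply, but not "which object ends this file?", which is the problem described here.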

What I'm asking for

I'm hoping there's a way to iterate through the GAM, SGAM, and/or IAM chains to quickly identify the object at the end of a given data file. I'm assuming I have to push this approach outside of T-SQL to something like PowerShell and go back to using DBCC PAGE calls, or something of that nature, traversing page allocation maps to find the last object in a given data file. I'm hoping someone has already thrown that code together, or knows these structures and the output of these undocumented procedures better than I do.

Why do I need this?

This is the inevitable question many will ask, so I'm just going to answer it out of the gate. The project I've been working on is to bring a legacy system up to speed, but after consolidating a bunch of heap tables together (which was a nightmare for other reasons), I am now left with a lot of free space within my data files. I want to release this space back to the OS. However, the traditional approach of migrating objects to a different data file isn't viable at this stage, because I don't have enough free space to work with on the system (until I am able to release more space from this data file).

I've resorted to disabling file growth and running a DBCC SHRINKFILE TRUNCATEONLY command nightly to free up any open pages toward the end of the data file, but this is a slow and arduous process that fails about as often as it works. I'm hoping to identify the objects toward the end of the file so I can rebuild them manually and free up space on a quicker timetable.
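
As a point of reference, this is a minimal sketch of how the free space per data file can be measured before and after each shrink attempt. It relies only on sys.database_files and FILEPROPERTY; the column aliases are my own:

```sql
-- size and SpaceUsed are both reported in 8 KB pages; 128 pages = 1 MB.
SELECT  df.name AS logical_file_name
      , df.size / 128.0 AS file_size_mb
      , FILEPROPERTY(df.name, 'SpaceUsed') / 128.0 AS used_mb
      , (df.size - FILEPROPERTY(df.name, 'SpaceUsed')) / 128.0 AS free_mb
FROM sys.database_files AS df
WHERE df.type = 0;  -- rows (data) files only
```

Note that free_mb counts free space anywhere in the file, whereas TRUNCATEONLY can only release the unallocated portion at the very end, which is why the position of the last allocated object matters.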

In Summary

Is there a way to quickly identify the name of the object located at the end of a given data file? The method I'm employing now is not meeting my needs and I'm open to using any approach available.

Solution

I think this will do it. This code basically does the following:

  1. Retrieve highest allocated page ID from the last GAM interval in the file
  2. Retrieve highest allocated page ID from the last SGAM interval in the file
  3. Compare the two values to find the highest page
  4. Identify the last ObjectId (table) from the last allocated page
  5. Identify the index defined on the object as well as its partition
  6. Provide a DBCC SHRINKFILE command that will release only the remaining white-space at the end of the file back to the OS (which should be immediate) and is effectively equivalent to DBCC SHRINKFILE using TRUNCATEONLY

This is nested in a cursor that iterates through the data files within the database and executes pretty quickly based on my local testing. I've also added functionality to identify whether the end of a data file is occupied by pages that are not reserved by tables or indexes, such as an IAM or PFS page.
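
Before the full script, here is a small standalone sketch of the arithmetic it uses to locate the last GAM and SGAM pages. It mirrors the expressions in the script; the file size of 1,200,000 pages is a made-up value for illustration:

```sql
DECLARE @size INT = 1200000;                 -- hypothetical file size in 8 KB pages
DECLARE @pagesPerGamInterval INT = 511232;   -- pages covered by one GAM interval
DECLARE @interval INT = @size / @pagesPerGamInterval;  -- integer division

SELECT  @interval AS FullGamIntervals
      , CASE @interval WHEN 0 THEN 2 ELSE @interval * @pagesPerGamInterval END     AS LastGamPage
      , CASE @interval WHEN 0 THEN 3 ELSE @interval * @pagesPerGamInterval + 1 END AS LastSgamPage;
-- For @size = 1200000 this returns 2, 1022464, 1022465.
```

Files smaller than one interval fall back to the first GAM and SGAM pages of the file (pages 2 and 3).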

SET NOCOUNT ON;

-- Create Temp Table to push DBCC PAGE results into
CREATE TABLE #dbccPage_output(
      ID                INT IDENTITY(1,1)
    , [ParentObject]    VARCHAR(255)
    , [Object]          VARCHAR(255)
    , [Field]           VARCHAR(255)
    , [Value]           VARCHAR(255)
)
GO

-- Variables to hold pointer information for traversing GAM and SGAM pages
DECLARE @GAM_maxPageID INT, @SGAM_maxPageID INT, @maxPageID INT,
        @GAM_page INT, @SGAM_page INT
DECLARE @stmt VARCHAR(2000)

-- Final Output Table
DECLARE @myOutputTable TABLE
(
      LogicalFileName   VARCHAR(255)
    , ObjectID          BIGINT
    , IndexID           BIGINT
    , PartitionID       BIGINT
    , MaxPageID         BIGINT
)

-- Cursor to iterate through each file
DECLARE cursorFileIds CURSOR
FOR
        SELECT file_id, size
        FROM sys.database_files
        WHERE type = 0

-- Variable to hold fileID
DECLARE @fileID INT, @size INT, @interval INT

-- Inject the data into the cursor
OPEN cursorFileIds
FETCH NEXT FROM cursorFileIds
INTO @fileID, @size

-- Enter the While Loop.  This loop will end when the
--  end of the data injected into the cursor is reached.
WHILE @@FETCH_STATUS = 0
BEGIN
        -- Truncate table (mainly used for 2nd pass and forward)
        TRUNCATE TABLE #dbccPage_output

        -- Referenced if we need to step back a GAM/SGAM interval
        STEPBACK:

        -- # of full GAM intervals in the file (each interval covers 511,232 pages)
        SET @interval = @size / 511232
        -- Set GAM Page to read
        SET @GAM_page = CASE @interval WHEN 0 THEN 2 ELSE @interval * 511232 END
        -- Set SGAM page to read (always the next page after the GAM)
        SET @SGAM_page = CASE @interval WHEN 0 THEN 3 ELSE (@interval * 511232) + 1 END

        -- Search Last GAM Interval page
        SET @stmt = 'DBCC PAGE(0, ' + CAST(@fileID AS VARCHAR(10)) + ', ' + CAST(@GAM_page AS VARCHAR(20)) + ', 3) WITH TABLERESULTS, NO_INFOMSGS' -- GAM on Primary Datafile
        PRINT @stmt

        INSERT INTO #dbccPage_output ([ParentObject], [Object], [Field], [Value])
        EXEC (@stmt)

        -- Get Last Allocated Page Number
        SELECT TOP 1
                @GAM_maxPageID = REVERSE(SUBSTRING(REVERSE(Field), CHARINDEX(')', REVERSE(Field)) + 1, CHARINDEX(':', REVERSE(Field)) - CHARINDEX(')', REVERSE(Field)) - 1))
        FROM #dbccPage_output
        WHERE [Value] = '    ALLOCATED'
        ORDER BY ID DESC

        -- Truncate Table
        TRUNCATE TABLE #dbccPage_output

        -- Search Last SGAM Interval page
        SET @stmt = 'DBCC PAGE(0, ' + CAST(@fileID AS VARCHAR(10)) + ', ' + CAST(@SGAM_page AS VARCHAR(20)) + ', 3) WITH TABLERESULTS, NO_INFOMSGS' -- SGAM on Primary Datafile
        PRINT @stmt

        INSERT INTO #dbccPage_output ([ParentObject], [Object], [Field], [Value])
        EXEC (@stmt)

        -- Get Last Allocated Page Number
        SELECT TOP 1
                @SGAM_maxPageID = REVERSE(SUBSTRING(REVERSE(Field), CHARINDEX(')', REVERSE(Field)) + 1, CHARINDEX(':', REVERSE(Field)) - CHARINDEX(')', REVERSE(Field)) - 1))
        FROM #dbccPage_output
        WHERE [Value] = '    ALLOCATED'
        ORDER BY ID DESC

        -- Get highest page value between SGAM and GAM
        SELECT @maxPageID = MAX(t.value)
        FROM (VALUES (@GAM_maxPageID), (@SGAM_maxPageID)) t(value)

        TRUNCATE TABLE #dbccPage_output

        -- Check if GAM or SGAM is last allocated page in the chain, if so, step back one interval
        IF(@maxPageID IN (@GAM_page, @SGAM_page))
        BEGIN
            SET @size = ABS(@size - 511232)
            GOTO STEPBACK
        END

        -- Search Highest Page Number of Data File
        SET @stmt = 'DBCC PAGE(0, ' + CAST(@fileID AS VARCHAR(10)) + ', ' + CAST(CASE WHEN @maxPageID = @SGAM_maxPageID THEN @maxPageID + 7 ELSE @maxPageID END AS VARCHAR(50)) + ', 1) WITH TABLERESULTS, NO_INFOMSGS' -- Page ID of Last Allocated Object
        PRINT @stmt

        INSERT INTO #dbccPage_output ([ParentObject], [Object], [Field], [Value])
        EXEC (@stmt)

        -- Capture Object Name of DataFile
        INSERT INTO @myOutputTable
        SELECT (SELECT name FROM sys.database_files WHERE file_id = @fileID) AS LogicalFileName
            , CASE WHEN (SELECT [Value] FROM #dbccPage_output WHERE Field = 'm_type') IN ('1', '2') THEN -- If page type is data or index
                        CAST((SELECT [Value] FROM #dbccPage_output WHERE Field = 'Metadata: ObjectId') AS BIGINT)
                   ELSE CAST((SELECT [Value] FROM #dbccPage_output WHERE Field = 'm_type') AS BIGINT)
              END AS ObjectID
            , NULLIF(CAST((SELECT [Value] FROM #dbccPage_output WHERE Field = 'Metadata: IndexId') AS BIGINT), -1) AS IndexID
            , NULLIF(CAST((SELECT [Value] FROM #dbccPage_output WHERE Field = 'Metadata: PartitionId') AS BIGINT), 0) AS PartitionID
            , @maxPageID + 7 AS MaxPageID

        -- Reset Max Page Values
        SELECT @GAM_maxPageID = 0, @SGAM_maxPageID = 0, @maxPageID = 0

     -- Traverse the Data in the cursor
     FETCH NEXT FROM cursorFileIds
     INTO @fileID, @size
END

-- Close and deallocate the cursor because you've finished traversing all its data
CLOSE cursorFileIds
DEALLOCATE cursorFileIds

-- page type values pt. 1: https://www.sqlskills.com/blogs/paul/inside-the-storage-engine-using-dbcc-page-and-dbcc-ind-to-find-out-if-page-splits-ever-roll-back/
-- page type values pt. 2: https://www.sqlskills.com/blogs/paul/inside-the-storage-engine-anatomy-of-a-page/
-- ObjectIDs of either 0 or 99: https://www.sqlskills.com/blogs/paul/finding-table-name-page-id/
-- Output Object Closest to the End
SELECT  t.LogicalFileName
    ,   CAST(CASE WHEN t.IndexID IS NULL THEN 
                CASE t.ObjectID
                    WHEN 0 THEN  '>>No MetaData Found<<'  -- This isn't m_type, rather ObjectID
                    WHEN 1 THEN  '>>Data Page<<'
                    WHEN 2 THEN  '>>Index Page<<'
                    WHEN 3 THEN  '>>Text Mix Page<<'
                    WHEN 4 THEN  '>>Text Tree Page<<'
                    WHEN 6 THEN  '>>DCM Page<<'
                    WHEN 7 THEN  '>>Sort Page<<'
                    WHEN 8 THEN  '>>GAM Page<<'
                    WHEN 9 THEN  '>>SGAM Page<<'
                    WHEN 10 THEN '>>IAM Page<<'
                    WHEN 11 THEN '>>PFS Page<<'
                    WHEN 13 THEN '>>Boot Page<<'
                    WHEN 15 THEN '>>File Header Page<<'
                    WHEN 16 THEN '>>Diff Map Page<<'
                    WHEN 17 THEN '>>ML Map Page<<'
                    WHEN 18 THEN '>>Deallocated DBCC CHECKDB Repair Page<<'
                    WHEN 19 THEN '>>Temporary ALTER INDEX Page<<'
                    WHEN 20 THEN '>>Pre-Allocated BULK LOAD Page<<'
                    WHEN 99 THEN '>>Possible Page Corruption/Run DBCC CHECKDB<<'  -- This isn't m_type, rather ObjectID
                    ELSE CAST(t.ObjectID AS VARCHAR(50))
                END
            ELSE QUOTENAME(OBJECT_SCHEMA_NAME(t.ObjectID)) + '.' + QUOTENAME(OBJECT_NAME(t.ObjectID)) END AS VARCHAR(250)) AS TableName
    ,   QUOTENAME(i.name) AS IndexName
    ,   p.partition_number AS PartitionNumber
    ,   'DBCC SHRINKFILE(' + t.LogicalFileName + ', ' + CAST(CEILING((t.MaxPageID + 8) * 0.0078125) AS VARCHAR(50)) + ')' AS ShrinkCommand_Explicit
    ,   'DBCC SHRINKFILE(' + t.LogicalFileName + ', TRUNCATEONLY)' AS ShrinkCommand_TRUNCATEONLY
FROM @myOutputTable t
    LEFT JOIN sys.indexes i
        ON t.ObjectID = i.object_id
        AND t.IndexID = i.index_id
    LEFT JOIN sys.partitions p
        ON t.ObjectID = p.object_id
        AND t.PartitionID = p.partition_id

-- Cleanup
DROP TABLE #dbccPage_output
GO
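
The nested REVERSE/SUBSTRING expression in the script is easier to follow in isolation. The sketch below runs it against a sample Field value; the exact layout of that string is an assumption based on how DBCC PAGE prints allocation ranges on GAM pages, so check it against real output on your version:

```sql
-- Assumed sample of the Field column from a GAM page dump (format is an assumption).
DECLARE @Field VARCHAR(255) = '(1:0)        - (1:24)';

-- Reverse the string, take the characters between the first ')' and the first ':'
-- (i.e. the last page number in the range), then reverse them back.
SELECT CAST(REVERSE(SUBSTRING(REVERSE(@Field)
              , CHARINDEX(')', REVERSE(@Field)) + 1
              , CHARINDEX(':', REVERSE(@Field)) - CHARINDEX(')', REVERSE(@Field)) - 1)) AS INT) AS LastPageId;
-- Returns 24
```

The same expression also handles a single-page entry such as '(1:42743)', since it only looks at the last page reference in the string.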

OTHER TIPS

The following code checks each database page, from highest page number to lowest, to see if it is allocated. Once it finds the first allocated page, it shows the object associated with that page. It's not guaranteed to work, since the last allocated page may not reference an actual object; however, it should work most of the time.

SET NOCOUNT ON;
IF OBJECT_ID(N'tempdb..#dbcrep', N'U') IS NOT NULL
DROP TABLE #dbcrep;
CREATE TABLE #dbcrep
(
        ParentObject VARCHAR(128)
        , [Object] VARCHAR(128)
        , [Field] VARCHAR(128)
        , VALUE VARCHAR(2000)
);
DECLARE @cmd nvarchar(max);
DECLARE @PageNum int;
DECLARE @PageCount int;
DECLARE @FileID int;
DECLARE @Status varchar(2000);

SET @FileID = 1;

SET @PageCount = (
    SELECT df.size
    FROM sys.database_files df
    WHERE df.file_id = @FileID
    );
SET @PageNum = @PageCount - 1;
WHILE @PageNum > 0
BEGIN
    SET @cmd = N'DBCC PAGE (''' + DB_NAME() + N''', ' + CONVERT(nvarchar(20), @FileID) + N', ' + CONVERT(nvarchar(20), @PageNum) + N', 0) WITH TABLERESULTS, NO_INFOMSGS;';
    DELETE FROM #dbcrep;
    INSERT INTO #dbcrep (ParentObject, [Object], [Field], [VALUE])
    EXEC sys.sp_executesql @cmd;
    SELECT @Status = VALUE
    FROM #dbcrep
    WHERE ParentObject = 'PAGE HEADER:'
        AND Object = 'Allocation Status'
        AND Field LIKE 'GAM %';
    SET @PageNum -= 1;
    PRINT @Status;
    IF @Status <> 'NOT ALLOCATED' BREAK
END

SELECT ObjectName = s.name + N'.' + o.name
    , d.*
FROM #dbcrep d
    LEFT JOIN sys.all_objects o ON TRY_CONVERT(int, d.VALUE) = o.object_id  -- VALUE also holds non-numeric text
    LEFT JOIN sys.schemas s ON o.schema_id = s.schema_id
WHERE ParentObject = 'PAGE HEADER:'
    AND Object LIKE 'Page @0x%'  -- the buffer address differs on every run, so match by prefix
    AND Field = 'Metadata: ObjectId';

We get the total number of pages in the given file_id of the current database (sys.database_files.size is expressed in 8 KB pages), then loop downward from the last page, inspecting each page via DBCC PAGE and saving that output into a temporary table, until the first allocated page is found. The temp table is then joined to sys.all_objects to obtain the name of the object the page is allocated to.

In my test rig, I see the following results:

╔════════════════════╦══════════════╦══════════════════════════╦════════════════════╦════════════╗
║ ObjectName         ║ ParentObject ║ Object                   ║ Field              ║ VALUE      ║
╠════════════════════╬══════════════╬══════════════════════════╬════════════════════╬════════════╣
║ dbo.EmptyDatabases ║ PAGE HEADER: ║ Page @0x00000001BA28E000 ║ Metadata: ObjectId ║ 1938105945 ║
╚════════════════════╩══════════════╩══════════════════════════╩════════════════════╩════════════╝

The #dbcrep temp table contains the following details:

╔══════════════╦══════════════════════════╦═══════════════════════════════╦════════════════════╗
║ ParentObject ║ Object                   ║ Field                         ║ VALUE              ║
╠══════════════╬══════════════════════════╬═══════════════════════════════╬════════════════════╣
║ BUFFER:      ║ BUF @0x0000000200E95B80  ║ bpage                         ║ 0x00000001BA28E000 ║
║ BUFFER:      ║ BUF @0x0000000200E95B80  ║ bhash                         ║ 0x0000000000000000 ║
║ BUFFER:      ║ BUF @0x0000000200E95B80  ║ bpageno                       ║ (1:42743)          ║
║ BUFFER:      ║ BUF @0x0000000200E95B80  ║ bdbid                         ║ 7                  ║
║ BUFFER:      ║ BUF @0x0000000200E95B80  ║ breferences                   ║ 0                  ║
║ BUFFER:      ║ BUF @0x0000000200E95B80  ║ bcputicks                     ║ 0                  ║
║ BUFFER:      ║ BUF @0x0000000200E95B80  ║ bsampleCount                  ║ 0                  ║
║ BUFFER:      ║ BUF @0x0000000200E95B80  ║ bUse1                         ║ 10982              ║
║ BUFFER:      ║ BUF @0x0000000200E95B80  ║ bstat                         ║ 0x9                ║
║ BUFFER:      ║ BUF @0x0000000200E95B80  ║ blog                          ║ 0x2121215a         ║
║ BUFFER:      ║ BUF @0x0000000200E95B80  ║ bnext                         ║ 0x0000000000000000 ║
║ PAGE HEADER: ║ Page @0x00000001BA28E000 ║ m_pageId                      ║ (1:42743)          ║
║ PAGE HEADER: ║ Page @0x00000001BA28E000 ║ m_headerVersion               ║ 1                  ║
║ PAGE HEADER: ║ Page @0x00000001BA28E000 ║ m_type                        ║ 20                 ║
║ PAGE HEADER: ║ Page @0x00000001BA28E000 ║ m_typeFlagBits                ║ 0x0                ║
║ PAGE HEADER: ║ Page @0x00000001BA28E000 ║ m_level                       ║ 0                  ║
║ PAGE HEADER: ║ Page @0x00000001BA28E000 ║ m_flagBits                    ║ 0x204              ║
║ PAGE HEADER: ║ Page @0x00000001BA28E000 ║ m_objId (AllocUnitId.idObj)   ║ 227                ║
║ PAGE HEADER: ║ Page @0x00000001BA28E000 ║ m_indexId (AllocUnitId.idInd) ║ 256                ║
║ PAGE HEADER: ║ Page @0x00000001BA28E000 ║ Metadata: AllocUnitId         ║ 72057594052804608  ║
║ PAGE HEADER: ║ Page @0x00000001BA28E000 ║ Metadata: PartitionId         ║ 72057594043301888  ║
║ PAGE HEADER: ║ Page @0x00000001BA28E000 ║ Metadata: IndexId             ║ 1                  ║
║ PAGE HEADER: ║ Page @0x00000001BA28E000 ║ Metadata: ObjectId            ║ 1938105945         ║
║ PAGE HEADER: ║ Page @0x00000001BA28E000 ║ m_prevPage                    ║ (0:0)              ║
║ PAGE HEADER: ║ Page @0x00000001BA28E000 ║ m_nextPage                    ║ (0:0)              ║
║ PAGE HEADER: ║ Page @0x00000001BA28E000 ║ pminlen                       ║ 8                  ║
║ PAGE HEADER: ║ Page @0x00000001BA28E000 ║ m_slotCnt                     ║ 0                  ║
║ PAGE HEADER: ║ Page @0x00000001BA28E000 ║ m_freeCnt                     ║ 8096               ║
║ PAGE HEADER: ║ Page @0x00000001BA28E000 ║ m_freeData                    ║ 96                 ║
║ PAGE HEADER: ║ Page @0x00000001BA28E000 ║ m_reservedCnt                 ║ 0                  ║
║ PAGE HEADER: ║ Page @0x00000001BA28E000 ║ m_lsn                         ║ (321:6718:151)     ║
║ PAGE HEADER: ║ Page @0x00000001BA28E000 ║ m_xactReserved                ║ 0                  ║
║ PAGE HEADER: ║ Page @0x00000001BA28E000 ║ m_xdesId                      ║ (0:0)              ║
║ PAGE HEADER: ║ Page @0x00000001BA28E000 ║ m_ghostRecCnt                 ║ 0                  ║
║ PAGE HEADER: ║ Page @0x00000001BA28E000 ║ m_tornBits                    ║ 1253867700         ║
║ PAGE HEADER: ║ Page @0x00000001BA28E000 ║ DB Frag ID                    ║ 1                  ║
║ PAGE HEADER: ║ Allocation Status        ║ GAM (1:2)                     ║ ALLOCATED          ║
║ PAGE HEADER: ║ Allocation Status        ║ SGAM (1:3)                    ║ NOT ALLOCATED      ║
║ PAGE HEADER: ║ Allocation Status        ║ PFS (1:40440)                 ║ 0x0   0_PCT_FULL   ║
║ PAGE HEADER: ║ Allocation Status        ║ DIFF (1:6)                    ║ NOT CHANGED        ║
║ PAGE HEADER: ║ Allocation Status        ║ ML (1:7)                      ║ NOT MIN_LOGGED     ║
╚══════════════╩══════════════════════════╩═══════════════════════════════╩════════════════════╝
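
The header rows above also carry an IndexId and a PartitionId, which can be resolved to names in the same way as the ObjectId. A minimal sketch, with the three values copied from the sample output above (they will be different on any other system):

```sql
-- Values copied from the sample page header above; they will differ elsewhere.
DECLARE @ObjectId INT = 1938105945
      , @IndexId INT = 1
      , @PartitionId BIGINT = 72057594043301888;

SELECT  OBJECT_SCHEMA_NAME(@ObjectId) + N'.' + OBJECT_NAME(@ObjectId) AS ObjectName
      , i.name AS IndexName
      , i.type_desc AS IndexType
      , p.partition_number AS PartitionNumber
FROM sys.indexes AS i
    LEFT JOIN sys.partitions AS p
        ON  p.object_id = i.object_id
        AND p.index_id = i.index_id
        AND p.partition_id = @PartitionId
WHERE i.object_id = @ObjectId
    AND i.index_id = @IndexId;
```

Knowing the index and partition narrows down which structure would need to be rebuilt or moved to free the tail of the file.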
Licensed under: CC-BY-SA with attribution
Not affiliated with dba.stackexchange