space usage on sys.allocation_units and sp_spaceused
12-12-2020
Question
It is a known fact that the DMVs don't hold accurate information regarding the number of pages and the count of rows. However, when the stats are up to date, I can't see why they wouldn't.
I am working on a monitoring tool and want to know the disk size of each index, of the data, and so on. Eventually I would like to find the right fill factor, among other things.
The figures from my function and from the old sp_spaceused differ a little on space usage, but not on record count.
Can you see if there is anything missing in my select?
This is the sp_spaceused call (I then convert the numbers, which are in KB, to MB):
sp_spaceused 'tblBOrderRelationship'
go
select 318008/1024.00 AS reserved,
140208/1024.00 AS data,
177048/1024.00 AS index_size,
752/1024.00 AS unused
But when I run my select (code below), I get slightly different figures.
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED
SELECT
schema_name(t.schema_id) as SchemaName,
t.NAME AS TableName,
t.type_desc,
t.is_ms_shipped,
t.is_published,
t.lob_data_space_id,
t.filestream_data_space_id,
t.is_replicated,
t.has_replication_filter,
t.is_merge_published,
t.is_sync_tran_subscribed,
--t.is_filetable,
i.name as indexName,
i.type_desc,
i.is_unique,
i.is_primary_key,
i.is_unique_constraint,
i.fill_factor,
i.is_padded,
sum(p.rows) OVER (PARTITION BY t.OBJECT_ID,i.index_id) as RowCounts,
sum(a.total_pages) OVER (PARTITION BY t.OBJECT_ID,i.index_id) as TotalPages,
sum(a.used_pages) OVER (PARTITION BY t.OBJECT_ID,i.index_id) as UsedPages,
sum(a.data_pages) OVER (PARTITION BY t.OBJECT_ID,i.index_id) as DataPages,
(sum(a.total_pages) OVER (PARTITION BY t.OBJECT_ID,i.index_id) * 8) / 1024 as TotalSpaceMB,
(sum(a.used_pages) OVER (PARTITION BY t.OBJECT_ID,i.index_id) * 8) / 1024 as UsedSpaceMB,
(sum(a.data_pages) OVER (PARTITION BY t.OBJECT_ID,i.index_id) * 8) / 1024 as DataSpaceMB
FROM
sys.tables t
INNER JOIN
sys.indexes i ON t.OBJECT_ID = i.object_id
INNER JOIN
sys.partitions p ON i.object_id = p.OBJECT_ID AND i.index_id = p.index_id
INNER JOIN
sys.allocation_units a ON p.partition_id = a.container_id
WHERE
t.NAME NOT LIKE 'dt%' AND
i.OBJECT_ID > 255
AND T.NAME = 'tblBOrderRelationship'
[Screenshot: the figures]
[Screenshot: the bigger picture, including the index names]
Now doing some calculations to check the results:
--==================================
-- the figures from sp_spaceused
--==================================
select 318008/1024.00 AS reserved,
140208/1024.00 AS data,
177048/1024.00 AS index_size,
752/1024.00 AS unused
--==================================
-- the figures from my select
--==================================
select 137+61+56+54 AS reserved,
137 AS data,
61+56+54 AS index_size
It is not so far off, really, apart from the fact that I did not calculate the unused space!
What can I do to make this accurate?
AFTER CHANGES:
After I replaced 1024 with 1024.00, the results are much more accurate. I noticed that records have been inserted into the table in question, so the stats are obviously not fully up to date, but the results still match (under 1 MB difference, which is all right for me).
The new result sets are:
--==================================
-- the figures from sp_spaceused
--==================================
select
318072 /1024.00 AS reserved,
140208 /1024.00 AS data,
177096 /1024.00 AS index_size,
768 /1024.00 AS unused
go
--==================================
-- the figures from my select
--==================================
select 137.7578125+61.7968750+56.4218750+54.6406250 as reserved,
137.7578125 as data,
61.7968750+56.4218750+54.6406250 as index_size
Solution
Even though you fixed the immediate rounding issue, the overall algorithm to get per-object / index stats is incorrect. It does not properly handle LOB and row-overflow data. It also excludes: Indexed Views, FullText indexes, XML indexes, and a few other cases. Hence, you might not be seeing all of your data.
The following is an adaptation of the code I posted in an answer on Stack Overflow ("sp_spaceused - How to measure the size in GB in all the tables in SQL") that handles all of the cases that sp_spaceused handles. That S.O. question was only concerned with per-object stats, not per-index stats, so I have adjusted the code to handle things at the index level.
;WITH agg AS
( -- Get info for Tables, Indexed Views, etc
SELECT ps.[object_id] AS [ObjectID],
ps.index_id AS [IndexID],
NULL AS [ParentIndexID],
NULL AS [PassThroughIndexName],
NULL AS [PassThroughIndexType],
SUM(ps.in_row_data_page_count) AS [InRowDataPageCount],
SUM(ps.used_page_count) AS [UsedPageCount],
SUM(ps.reserved_page_count) AS [ReservedPageCount],
SUM(ps.row_count) AS [RowCount],
SUM(ps.lob_used_page_count + ps.row_overflow_used_page_count)
AS [LobAndRowOverflowUsedPageCount]
FROM sys.dm_db_partition_stats ps
GROUP BY ps.[object_id],
ps.[index_id]
UNION ALL
-- Get info for FullText indexes, XML indexes, Spatial indexes, etc
SELECT sit.[parent_id] AS [ObjectID],
sit.[object_id] AS [IndexID],
sit.[parent_minor_id] AS [ParentIndexID],
sit.[name] AS [PassThroughIndexName],
sit.[internal_type_desc] AS [PassThroughIndexType],
0 AS [InRowDataPageCount],
SUM(ps.used_page_count) AS [UsedPageCount],
SUM(ps.reserved_page_count) AS [ReservedPageCount],
0 AS [RowCount],
0 AS [LobAndRowOverflowUsedPageCount]
FROM sys.dm_db_partition_stats ps
INNER JOIN sys.internal_tables sit
ON sit.[object_id] = ps.[object_id]
WHERE sit.internal_type IN
(202, 204, 207, 211, 212, 213, 214, 215, 216, 221, 222, 236)
GROUP BY sit.[parent_id],
sit.[object_id],
sit.[parent_minor_id],
sit.[name],
sit.[internal_type_desc]
), spaceused AS
(
SELECT agg.[ObjectID],
agg.[IndexID],
agg.[ParentIndexID],
agg.[PassThroughIndexName],
agg.[PassThroughIndexType],
OBJECT_SCHEMA_NAME(agg.[ObjectID]) AS [SchemaName],
OBJECT_NAME(agg.[ObjectID]) AS [TableName],
SUM(CASE
WHEN (agg.IndexID < 2) THEN agg.[RowCount]
ELSE 0
END) AS [Rows],
SUM(agg.ReservedPageCount) * 8 AS [ReservedKB],
SUM(agg.LobAndRowOverflowUsedPageCount +
CASE
WHEN (agg.IndexID < 2) THEN (agg.InRowDataPageCount)
ELSE 0
END) * 8 AS [DataKB],
SUM(agg.UsedPageCount - agg.LobAndRowOverflowUsedPageCount -
CASE
WHEN (agg.IndexID < 2) THEN agg.InRowDataPageCount
ELSE 0
END) * 8 AS [IndexKB],
SUM(agg.ReservedPageCount - agg.UsedPageCount) * 8 AS [UnusedKB],
SUM(agg.UsedPageCount) * 8 AS [UsedKB]
FROM agg
GROUP BY agg.[ObjectID],
agg.[IndexID],
agg.[ParentIndexID],
agg.[PassThroughIndexName],
agg.[PassThroughIndexType],
OBJECT_SCHEMA_NAME(agg.[ObjectID]),
OBJECT_NAME(agg.[ObjectID])
)
SELECT sp.SchemaName,
sp.TableName,
sp.IndexID,
CASE
WHEN (sp.IndexID > 0) THEN COALESCE(si.[name], sp.[PassThroughIndexName])
ELSE N'<Heap>'
END AS [IndexName],
sp.[PassThroughIndexName] AS [InternalTableName],
sp.[Rows],
sp.ReservedKB,
(sp.ReservedKB / 1024.0 / 1024.0) AS [ReservedGB],
sp.DataKB,
(sp.DataKB / 1024.0 / 1024.0) AS [DataGB],
sp.IndexKB,
(sp.IndexKB / 1024.0 / 1024.0) AS [IndexGB],
sp.UsedKB AS [UsedKB],
(sp.UsedKB / 1024.0 / 1024.0) AS [UsedGB],
sp.UnusedKB,
(sp.UnusedKB / 1024.0 / 1024.0) AS [UnusedGB],
so.[type_desc] AS [ObjectType],
COALESCE(si.type_desc, sp.[PassThroughIndexType]) AS [IndexPrimaryType],
sp.[PassThroughIndexType] AS [IndexSecondaryType],
SCHEMA_ID(sp.[SchemaName]) AS [SchemaID],
sp.ObjectID
--,sp.ParentIndexID
FROM spaceused sp
INNER JOIN sys.all_objects so -- in case "WHERE so.is_ms_shipped = 0" is removed
ON so.[object_id] = sp.ObjectID
LEFT JOIN sys.indexes si
ON si.[object_id] = sp.ObjectID
AND (si.[index_id] = sp.IndexID
OR si.[index_id] = sp.[ParentIndexID])
WHERE so.is_ms_shipped = 0
--AND so.[name] LIKE N'' -- optional name filter
--ORDER BY ????
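To see how much of a table actually sits in LOB or row-overflow pages, a breakdown by allocation-unit type can help. The following is only a minimal sketch, not part of the query above: it assumes the table name from the question resolves in your default schema, and it relies on the documented rule that sys.allocation_units.container_id maps to hobt_id for IN_ROW_DATA and ROW_OVERFLOW_DATA (types 1 and 3) and to partition_id for LOB_DATA (type 2).

```sql
-- Sketch: pages per allocation-unit type for one table.
-- 'tblBOrderRelationship' is the table from the question; swap in your own.
SELECT  i.name                             AS IndexName,
        i.index_id,
        a.type_desc                        AS AllocationUnitType,
        SUM(a.total_pages)                 AS TotalPages,
        SUM(a.used_pages)                  AS UsedPages,
        SUM(a.total_pages - a.used_pages)  AS UnusedPages,   -- reserved but not used
        SUM(a.total_pages) * 8 / 1024.00   AS TotalSpaceMB   -- 8 KB per page
FROM    sys.indexes i
INNER JOIN sys.partitions p
        ON  p.[object_id] = i.[object_id]
        AND p.index_id    = i.index_id
INNER JOIN sys.allocation_units a
        ON (a.[type] IN (1, 3) AND a.container_id = p.hobt_id)       -- IN_ROW_DATA, ROW_OVERFLOW_DATA
        OR (a.[type] = 2       AND a.container_id = p.partition_id)  -- LOB_DATA
WHERE   i.[object_id] = OBJECT_ID(N'tblBOrderRelationship')
GROUP BY i.name, i.index_id, a.type_desc
ORDER BY i.index_id, a.type_desc;
```

If every row comes back as IN_ROW_DATA, the table has no LOB or row-overflow pages. The UnusedPages column corresponds to the "unused" figure (reserved minus used) that the question left out.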
OTHER TIPS
You're dividing an integer value by an INT (1024), so you'll only ever get a whole-number answer.
You therefore end up with a truncation problem in your own space calculations: each figure loses its fractional part before you sum them. This is why, when you sum them together, you get a different answer.
Although the difference is minimal, this is one of those key 'gotchas' when handling non-whole numbers in SQL Server.
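The effect is easy to reproduce in isolation. In this small sketch the page count of 17633 is an assumption, back-calculated from the 137.7578125 MB data figure in the question (17633 pages * 8 KB = 141064 KB); the column aliases are illustrative only.

```sql
-- 17633 pages is an assumed value derived from the question's data figure.
SELECT 17633 * 8 / 1024                            AS IntegerDivisionMB,  -- 137 (fraction truncated)
       17633 * 8 / 1024.00                         AS DecimalDivisionMB,  -- 137.7578125
       CAST(17633 * 8 / 1024.00 AS DECIMAL(18, 2)) AS RoundedMB;          -- 137.76
```

The first column is what the original query produced. The CAST in the third column is optional and only trims the result to two decimal places for display.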
Change the calculations in your query to divide by a decimal literal:
(sum(a.total_pages) OVER (PARTITION BY t.OBJECT_ID,i.index_id) * 8) / 1024.00 as TotalSpaceMB,
(sum(a.used_pages) OVER (PARTITION BY t.OBJECT_ID,i.index_id) * 8) / 1024.00 as UsedSpaceMB,
(sum(a.data_pages) OVER (PARTITION BY t.OBJECT_ID,i.index_id) * 8) / 1024.00 as DataSpaceMB