Database corruption: QueryStore internal table
25-12-2020
Question
This morning, the following email alert was received:
DATE/TIME: 2/28/2018 9:26:42 AM
DESCRIPTION: Attempt to fetch logical page (1:3948712) in database 9 failed. It belongs to allocation unit 72057594045857792 not to 72059184917512192.
COMMENT: (None)
JOB RUN: SQL Sentry 2.0 Alert Trap
Looking in the event log of the secondary replica there are three occurrences of the same message:
Source spid138
Message Attempt to fetch logical page (1:3948712) in database 9 failed. It belongs to allocation unit 72057594045857792 not to 72059184917512192.
Running the following on the secondary replica (2 node synchronous Availability Group):
DBCC TRACEON(3604);
DBCC PAGE (9, 1, 3948712, 3);
GO
DBCC TRACEOFF(3604);
Snippet of the results from either replica:
Page @0x00000070DAB8C000
m_pageId = (1:3948712) m_headerVersion = 1
m_type = 3 m_typeFlagBits = 0x0 m_level = 0
m_flagBits = 0x8200 m_objId (AllocUnitId.idObj) = 129 m_indexId (AllocUnitId.idInd) = 256
Metadata: AllocUnitId = 72057594046382080
Metadata: PartitionId = 72057594040811520
Metadata: IndexId = 1 Metadata: ObjectId = 197575742
m_prevPage = (0:0) m_nextPage = (0:0) pminlen = 0
m_slotCnt = 2 m_freeCnt = 1634 m_freeData = 6568
m_reservedCnt = 0 m_lsn = (46041:1506360:18)
m_xactReserved = 0 m_xdesId = (0:0)
m_ghostRecCnt = 0 m_tornBits = -99702035 DB Frag ID = 1
Running the following on the primary replica:
SELECT OBJECT_NAME(197575742);
Result: plan_persist_plan
Questions
1. Am I right in saying that I have a clustered index corruption of the plan_persist_plan table, which is part of Query Store?
2. Is the best/only fix to run the following:
ALTER DATABASE MyDatabase SET QUERY_STORE CLEAR;
3. If #2 is the best fix, is there any good way of preserving the data in Query Store that would be deleted?
4. Does this kind of corruption indicate a problem with the IO subsystem?
Other info
- Query Store is enabled, obviously. It has a capacity of 350 MB, is currently in Read-Write mode, and uses a 15-minute flush interval, hourly stats collection, capture mode ALL, automatic size-based cleanup, and a 5-day stale query threshold.
- DB id 9 is a business critical user database
- The error details are Error: 605, Severity: 21, State: 3.
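For reference, settings like the ones listed above can be read from the sys.database_query_store_options catalog view. This is only a minimal sketch: it assumes it is run in the context of the affected database, and it also returns actual_state_desc and readonly_reason, which can show whether Query Store has dropped out of Read-Write mode on its own.

```sql
-- Run in the context of the affected database.
-- Shows the configured and actual Query Store state plus the main settings.
SELECT desired_state_desc,
       actual_state_desc,
       readonly_reason,
       current_storage_size_mb,
       max_storage_size_mb,
       flush_interval_seconds,
       interval_length_minutes,
       stale_query_threshold_days,
       size_based_cleanup_mode_desc,
       query_capture_mode_desc
FROM sys.database_query_store_options;
```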
I have checked the Windows System Event log as per the guidance. This has yielded only "Informational" events, no errors.
DBCC CHECKTABLE ('sys.plan_persist_plan');
results:
DBCC results for 'sys.plan_persist_plan'. There are 12562 rows in 240 pages for object "sys.plan_persist_plan". DBCC execution completed. If DBCC printed error messages, contact your system administrator.
I cannot work out the correct command to rebuild the index; the following does not work:
ALTER INDEX PK_plan_persist_plan_cidx ON sys.plan_persist_plan REBUILD;
Solution
As noted in my comment above, I had a similar corruption issue with a Query Store internal table.
As you suggested yourself, I ran ALTER DATABASE MyDatabase SET QUERY_STORE CLEAR;
to attempt to fix the issue, and that worked fine. In SQL Server 2017, Microsoft added a repair procedure that can be attempted before clearing the data: sp_query_store_consistency_check
(source)
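A rough sketch of how the repair attempt might look, assuming SQL Server 2017 or later and using MyDatabase as the placeholder database name from the question. As far as I know, Query Store has to be switched off before the consistency check is run, and it is re-enabled afterwards. Clearing remains the fallback if the check does not help.

```sql
-- Assumption: SQL Server 2017 or later; MyDatabase is a placeholder name.
-- Query Store must be disabled before the consistency check is attempted.
ALTER DATABASE MyDatabase SET QUERY_STORE = OFF;
GO

USE MyDatabase;
GO
EXEC sp_query_store_consistency_check;
GO

-- Re-enable Query Store and put it back into Read-Write mode.
ALTER DATABASE MyDatabase SET QUERY_STORE = ON;
GO
ALTER DATABASE MyDatabase SET QUERY_STORE (OPERATION_MODE = READ_WRITE);
GO

-- Fallback if the problem persists (this deletes all Query Store data):
-- ALTER DATABASE MyDatabase SET QUERY_STORE CLEAR;
```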
If you want to preserve the data then probably the only method is to copy the tables - I can't find anyone who has created a script for that.
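One way to copy the data, as an untested sketch, is to SELECT ... INTO ordinary tables from the Query Store catalog views rather than from the internal tables themselves. The dbo.qs_copy_* table names are made up for illustration. Reading sys.query_store_plan may well hit the same 605 error if it touches the damaged page, and the copies are plain tables that cannot be loaded back into Query Store.

```sql
-- Run in the affected database before clearing Query Store.
-- Target table names (dbo.qs_copy_*) are hypothetical.
SELECT * INTO dbo.qs_copy_query_text
FROM sys.query_store_query_text;

SELECT * INTO dbo.qs_copy_query
FROM sys.query_store_query;

-- This view is backed by plan_persist_plan, so it may fail on the corrupt page.
SELECT * INTO dbo.qs_copy_plan
FROM sys.query_store_plan;

SELECT * INTO dbo.qs_copy_runtime_stats
FROM sys.query_store_runtime_stats;

SELECT * INTO dbo.qs_copy_runtime_stats_interval
FROM sys.query_store_runtime_stats_interval;
```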
Usually with corruption I too would be worried about my disks, but in this case I'm a little suspicious that the issue is with query store itself.
Other suggestions
To answer your question #3:
If #2 is the best fix, is there any good way of preserving the data in Query Store that would be deleted?
See "How can I export Query Store data?". In most cases it is not difficult to export the Query Store data, though I can't say whether your error will affect the export.
You may find some data missing when you export; see "Why is Query Store missing details?".
I fixed the error by changing the compatibility level of my database from 110 to 130 (SQL Server 2016).
My instance build version is 13.0.5062.0, but the databases were migrated from an older version and their compatibility level had never been changed.
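For anyone wanting to try the same thing, a minimal sketch follows. MyDatabase is a placeholder name. Raising the compatibility level can also change query plans, since newer levels use a different cardinality estimator, so it is worth testing first.

```sql
-- Check the current compatibility level (MyDatabase is a placeholder name).
SELECT name, compatibility_level
FROM sys.databases
WHERE name = N'MyDatabase';

-- Raise it to the SQL Server 2016 level.
ALTER DATABASE MyDatabase SET COMPATIBILITY_LEVEL = 130;
```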