Database corruption: QueryStore internal table

https://dba.stackexchange.com/questions/199023

25-12-2020

Question

This morning, the following email alert was received:

DATE/TIME: 2/28/2018 9:26:42 AM

DESCRIPTION: Attempt to fetch logical page (1:3948712) in database 9 failed. It belongs to allocation unit 72057594045857792 not to 72059184917512192.

COMMENT: (None)

JOB RUN: SQL Sentry 2.0 Alert Trap

Looking in the event log of the secondary replica there are three occurrences of the same message:

Source: spid138

Message: Attempt to fetch logical page (1:3948712) in database 9 failed. It belongs to allocation unit 72057594045857792 not to 72059184917512192.

Running the following on the secondary replica (2 node synchronous Availability Group):

DBCC TRACEON(3604);
DBCC PAGE (9, 1, 3948712, 3);
GO
DBCC TRACEOFF(3604);

Snippet of the results from either replica:

Page @0x00000070DAB8C000

m_pageId = (1:3948712)              m_headerVersion = 1                  m_type = 3
m_typeFlagBits = 0x0                m_level = 0                          m_flagBits = 0x8200
m_objId (AllocUnitId.idObj) = 129   m_indexId (AllocUnitId.idInd) = 256
Metadata: AllocUnitId = 72057594046382080
Metadata: PartitionId = 72057594040811520                                Metadata: IndexId = 1
Metadata: ObjectId = 197575742      m_prevPage = (0:0)                   m_nextPage = (0:0)
pminlen = 0                         m_slotCnt = 2                        m_freeCnt = 1634
m_freeData = 6568                   m_reservedCnt = 0                    m_lsn = (46041:1506360:18)
m_xactReserved = 0                  m_xdesId = (0:0)                     m_ghostRecCnt = 0
m_tornBits = -99702035              DB Frag ID = 1
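The page header already shows how an allocation unit ID is put together: the AllocUnitId.idInd value sits in the bits above 2^48 and the AllocUnitId.idObj value above 2^16. The sketch below checks that arithmetic against the header values and applies the same split to the two IDs from the error message. The formula is the commonly cited one and is not stated in the original post, so treat it as an assumption; here it is at least consistent with the Metadata: AllocUnitId line.

```sql
-- Sketch only: the composition formula is an assumption, not from the original post.
DECLARE @two48 bigint = POWER(CAST(2 AS bigint), 48);
DECLARE @two16 bigint = POWER(CAST(2 AS bigint), 16);

-- Compose: m_indexId (idInd) = 256 and m_objId (idObj) = 129 from the page header.
SELECT CAST(256 AS bigint) * @two48
     + CAST(129 AS bigint) * @two16 AS alloc_unit_id;  -- 72057594046382080

-- Decompose the two allocation unit IDs quoted in the 605 error message.
SELECT v.alloc_unit_id,
       v.alloc_unit_id / @two48            AS id_ind,
       (v.alloc_unit_id % @two48) / @two16 AS id_obj
FROM (VALUES (CAST(72057594045857792 AS bigint)),
             (CAST(72059184917512192 AS bigint))) AS v(alloc_unit_id);
```

Note that idObj and idInd are allocation unit components, not the real object and index IDs; those are the Metadata: ObjectId and Metadata: IndexId values.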

Running the following on the primary replica:

select OBJECT_NAME (197575742)

plan_persist_plan
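The allocation unit IDs from the error message can be resolved in a similar way. The following is a sketch that assumes the catalog views expose the internal Query Store tables in your context; it may return no rows for an allocation unit that no longer exists. MyDatabase is the placeholder name used in the question.

```sql
USE MyDatabase;
GO
-- Map allocation unit IDs to their owning object and index.
-- container_id joins to hobt_id for IN_ROW_DATA (1) and ROW_OVERFLOW_DATA (3),
-- and to partition_id for LOB_DATA (2).
SELECT au.allocation_unit_id,
       au.type_desc,
       p.object_id,
       OBJECT_NAME(p.object_id) AS object_name,
       p.index_id
FROM sys.allocation_units AS au
JOIN sys.partitions AS p
  ON (au.type IN (1, 3) AND au.container_id = p.hobt_id)
  OR (au.type = 2 AND au.container_id = p.partition_id)
WHERE au.allocation_unit_id IN (72057594045857792,   -- "belongs to" in the error
                                72059184917512192,   -- "not to" in the error
                                72057594046382080);  -- from the DBCC PAGE header
```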

Questions

1. Am I right in saying that I have a clustered index corruption of the plan_persist_plan table, which is part of Query Store?

2. Is the best/only fix to run the following:

   ALTER DATABASE MyDatabase SET QUERY_STORE CLEAR;

3. If #2 is the best fix, is there any good way of preserving the data in Query Store that would be deleted?

4. Does this kind of corruption indicate a problem with the IO subsystem?

Other info

Query Store is enabled, with a capacity of 350 MB. It is currently in Read-Write mode, with a flush interval of 15 minutes, hourly stats collection, capture mode ALL, size-based cleanup set to Auto, and a 5-day stale query threshold.
DB id 9 is a business-critical user database.
The error details are Error: 605, Severity: 21, State: 3.

I have checked the Windows System Event log as per the guidance. This has yielded only "Informational" events, no errors.

DBCC CHECKTABLE ('sys.plan_persist_plan');

results:

DBCC results for 'sys.plan_persist_plan'.
There are 12562 rows in 240 pages for object "sys.plan_persist_plan". 
DBCC execution completed. 
If DBCC printed error messages, contact your system administrator.

I cannot work out the correct command to rebuild the index; the following does not work:

ALTER INDEX PK_plan_persist_plan_cidx ON sys.plan_persist_plan REBUILD;

Solution

As noted in my comment above I had a similar corruption issue with a query store internal table.

As you yourself suggested, I used ALTER DATABASE MyDatabase SET QUERY_STORE CLEAR; to try to fix the issue, and that worked fine. In SQL Server 2017, Microsoft added a repair procedure that can be attempted before clearing the data: sp_query_store_consistency_check (source)
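A repair attempt with that procedure might look roughly like this. It is a sketch, assuming SQL Server 2017 or later and the placeholder database name MyDatabase from the question. Switching Query Store off before the check and back on afterwards follows Microsoft's general Query Store recovery guidance rather than anything stated in this thread, so verify it against the documentation for your build.

```sql
USE MyDatabase;
GO
-- Query Store has to be off while the consistency check runs.
ALTER DATABASE MyDatabase SET QUERY_STORE = OFF;
GO
EXEC sys.sp_query_store_consistency_check;
GO
ALTER DATABASE MyDatabase SET QUERY_STORE = ON;
GO
ALTER DATABASE MyDatabase SET QUERY_STORE (OPERATION_MODE = READ_WRITE);
GO
-- Confirm that the actual state matches the desired state afterwards.
SELECT actual_state_desc, desired_state_desc, readonly_reason
FROM sys.database_query_store_options;
```

If the actual state stays read-only or the error returns, clearing Query Store as described above remains the fallback.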

If you want to preserve the data then probably the only method is to copy the tables - I can't find anyone who has created a script for that.
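One way to copy the data before clearing it is to read the Query Store catalog views into ordinary tables. This is a minimal sketch, not a tested export script: QSArchive is a hypothetical, already existing database used only for illustration, and reads that touch the damaged page may still fail with error 605.

```sql
USE MyDatabase;
GO
-- Copy each Query Store catalog view into a plain table in the archive database.
-- QSArchive is a hypothetical name; the target tables must not exist yet.
SELECT * INTO QSArchive.dbo.qs_query_text             FROM sys.query_store_query_text;
SELECT * INTO QSArchive.dbo.qs_query                  FROM sys.query_store_query;
SELECT * INTO QSArchive.dbo.qs_plan                   FROM sys.query_store_plan;
SELECT * INTO QSArchive.dbo.qs_runtime_stats          FROM sys.query_store_runtime_stats;
SELECT * INTO QSArchive.dbo.qs_runtime_stats_interval FROM sys.query_store_runtime_stats_interval;
```

The copies are for later analysis only; they cannot be loaded back into Query Store.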

Usually with corruption I too would be worried about my disks, but in this case I'm a little suspicious that the issue is with query store itself.

Other suggestions

To answer your question #3:

If #2 is the best fix, is there any good way of preserving the data in Query Store that would be deleted?

See "How can I export Query Store data?". In most cases it is not difficult to export the Query Store data, but I can't say whether your error will affect the export.

You may find some data missing when you export; see "Why is Query Store missing details?".

I fixed the error by changing the compatibility level of my database from 110 to 130 (SQL Server 2016).

My instance build version is 13.0.5062.0, but the databases were migrated from an older version and their compatibility level was never changed.
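Checking and changing the compatibility level could be done as follows. This is a sketch using the placeholder name MyDatabase; raising the level also changes query optimizer behaviour, so it is worth testing on a non-production copy first.

```sql
-- Check the current compatibility level of the database.
SELECT name, compatibility_level
FROM sys.databases
WHERE name = N'MyDatabase';

-- Raise it to the SQL Server 2016 level.
ALTER DATABASE MyDatabase SET COMPATIBILITY_LEVEL = 130;
```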

Licensed under: CC-BY-SA with attribution

Not affiliated with dba.stackexchange