Dropping PLE on query

https://dba.stackexchange.com/questions/139468

02-10-2020

Question

At our company we are working on a project with a very large database. SQL Server uses 100 GB of RAM. What's odd is that before the first run of a query PLE is about 11k, and after running it PLE drops to about 70. When I check PLE again 15 minutes later it is about 1k, and when I run the query again it drops to 60. Why is this happening? If PLE increases in the time between runs, doesn't that mean all the needed data is in cache? If so, why does running the same query 15 minutes later cause PLE to drop again?

Here is the query:

select
    ResultType = case r.TypeID
        when 'dlp' then 'DLP'
        when 'bill' then 'BILL'
        when 'evtlog' then 'EVTLOG'
    end,
    SerialNumber, 
    ESerialNumber, 
    ResultDateTime, 
    DateTimeStamp as SavedInSystem
from 
    Results r
where
    TypeID in ('typeid1','typeid2')
    and DateTimeStamp > '2016-05-19 23:00:00'
    and SerialNumber in ('serialnumber')

I have a clustered index on datetimestamp and nonclustered indexes on typeid, datetimestamp and resultdatetime, resultid, typeid and a few others.

Update:

select
    (physical_memory_in_use_kb/1024) as Memory_usedby_Sqlserver_MB,
    (locked_page_allocations_kb/1024) as Locked_pages_used_Sqlserver_MB,
    (total_virtual_address_space_kb/1024) as Total_VAS_in_MB,
    process_physical_memory_low,
    process_virtual_memory_low
from sys.dm_os_process_memory

returns

Memory_usedby_Sqlserver_MB  Locked_pages_used_Sqlserver_MB  Total_VAS_in_MB process_physical_memory_low process_virtual_memory_low
102569  0   134217727   0   0

And here are the counters. Since the free pages counter was removed in the 2012 version, I added a few other counters:

object_name counter_name instance_name cntr_value cntr_type
MSSQL$ServerNameC3SCW:Buffer Manager Database pages 11356975 65792
MSSQL$ServerNameC3SCW:Buffer Manager Checkpoint pages/sec 1053996662 272696576
MSSQL$ServerNameC3SCW:Buffer Manager Page life expectancy 4233 65792
MSSQL$ServerNameC3SCW:Buffer Node Database pages 003 2975519 65792
MSSQL$ServerNameC3SCW:Buffer Node Page life expectancy 003 4892 65792
MSSQL$ServerNameC3SCW:Buffer Node Database pages 002 2938151 65792
MSSQL$ServerNameC3SCW:Buffer Node Page life expectancy 002 4051 65792
MSSQL$ServerNameC3SCW:Buffer Node Database pages 001 3002872 65792
MSSQL$ServerNameC3SCW:Buffer Node Page life expectancy 001 4052 65792
MSSQL$ServerNameC3SCW:Buffer Node Database pages 000 2440433 65792
MSSQL$ServerNameC3SCW:Buffer Node Page life expectancy 000 4052 65792
MSSQL$ServerNameC3SCW:Memory Manager Database Cache Memory (KB) 90855800 65792
MSSQL$ServerNameC3SCW:Memory Manager Free Memory (KB) 286672 65792
MSSQL$ServerNameC3SCW:Memory Manager Memory Grants Pending 0 65792
MSSQL$ServerNameC3SCW:Memory Manager Total Server Memory (KB) 104857608 65792
MSSQL$ServerNameC3SCW:Memory Node Database Node Memory (KB) 003 23804152 65792
MSSQL$ServerNameC3SCW:Memory Node Free Node Memory (KB) 003 73256 65792
MSSQL$ServerNameC3SCW:Memory Node Database Node Memory (KB) 002 23505208 65792
MSSQL$ServerNameC3SCW:Memory Node Free Node Memory (KB) 002 76392 65792
MSSQL$ServerNameC3SCW:Memory Node Database Node Memory (KB) 001 24022976 65792
MSSQL$ServerNameC3SCW:Memory Node Free Node Memory (KB) 001 77504 65792
MSSQL$ServerNameC3SCW:Memory Node Database Node Memory (KB) 000 19523464 65792
MSSQL$ServerNameC3SCW:Memory Node Free Node Memory (KB) 000 59520 65792

And here are the counters after running the query:

object_name counter_name instance_name cntr_value cntr_type
MSSQL$ServerNameC3SCW:Buffer Manager Database pages 11355652 65792
MSSQL$ServerNameC3SCW:Buffer Manager Checkpoint pages/sec 1054000434 272696576
MSSQL$ServerNameC3SCW:Buffer Manager Page life expectancy 310 65792
MSSQL$ServerNameC3SCW:Buffer Node Database pages 003 2980418 65792
MSSQL$ServerNameC3SCW:Buffer Node Page life expectancy 003 5417 65792
MSSQL$ServerNameC3SCW:Buffer Node Database pages 002 2946298 65792
MSSQL$ServerNameC3SCW:Buffer Node Page life expectancy 002 4591 65792
MSSQL$ServerNameC3SCW:Buffer Node Database pages 001 2995850 65792
MSSQL$ServerNameC3SCW:Buffer Node Page life expectancy 001 155 65792
MSSQL$ServerNameC3SCW:Buffer Node Database pages 000 2433086 65792
MSSQL$ServerNameC3SCW:Buffer Node Page life expectancy 000 165 65792
MSSQL$ServerNameC3SCW:Memory Manager Database Cache Memory (KB) 90845216 65792
MSSQL$ServerNameC3SCW:Memory Manager Free Memory (KB) 219240 65792
MSSQL$ServerNameC3SCW:Memory Manager Memory Grants Pending 0 65792
MSSQL$ServerNameC3SCW:Memory Manager Total Server Memory (KB) 104857608 65792
MSSQL$ServerNameC3SCW:Memory Node Database Node Memory (KB) 003 23843344 65792
MSSQL$ServerNameC3SCW:Memory Node Free Node Memory (KB) 003 51864 65792
MSSQL$ServerNameC3SCW:Memory Node Database Node Memory (KB) 002 23570384 65792
MSSQL$ServerNameC3SCW:Memory Node Free Node Memory (KB) 002 59680 65792
MSSQL$ServerNameC3SCW:Memory Node Database Node Memory (KB) 001 23966800 65792
MSSQL$ServerNameC3SCW:Memory Node Free Node Memory (KB) 001 58144 65792
MSSQL$ServerNameC3SCW:Memory Node Database Node Memory (KB) 000 19464688 65792
MSSQL$ServerNameC3SCW:Memory Node Free Node Memory (KB) 000 49552 65792
MSSQL$GKKTSQLC3SCW:Memory Manager Target Server Memory (KB) 104857608 65792

SELECT @@version returns:

Microsoft SQL Server 2012 - 11.0.5058.0 (X64) 
    May 14 2014 18:34:29 
    Copyright (c) Microsoft Corporation
    Enterprise Edition: Core-based Licensing (64-bit) on Windows NT 6.3  (Build 9600: )

In the execution plan I can see that the clustered index seek has a cost of 100%, while the sort and the hash match have a cost of 0%. There is a hint about a missing index, but it would be an additional index over 5 columns on a table with 29 million rows.

Solution

It is not unusual for a query to drop PLE. Queries can consume arbitrary amounts of memory. This generally happens for two reasons:

1. Work memory for sorting and hashing.
2. Filling the buffer pool with data that was read.

You can prove that (1) is happening by watching outstanding memory grants in the sys.dm_exec_query_memory_grants DMV while the query runs. If the query consumes a large amount of memory, you should try to get an execution plan that does not need that much. You can also try to limit the memory a query may be granted using Resource Governor.
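As a rough sketch of that check, run something like the following from a second session while the query is executing. The DMV and its columns are standard; the commented-out Resource Governor statements use an illustrative 10 percent cap on the default workload group, which is only an assumption and not a recommendation for this server.

```sql
-- Outstanding and pending memory grants, largest first.
-- Rows with grant_time IS NULL are still waiting for their grant.
SELECT
    session_id,
    request_time,
    grant_time,
    requested_memory_kb,
    granted_memory_kb,
    used_memory_kb,
    max_used_memory_kb,
    query_cost
FROM sys.dm_exec_query_memory_grants
ORDER BY granted_memory_kb DESC;

-- Optional, assumption: cap the grant of any single request in the default group.
-- ALTER WORKLOAD GROUP [default] WITH (REQUEST_MAX_MEMORY_GRANT_PERCENT = 10);
-- ALTER RESOURCE GOVERNOR RECONFIGURE;
```

If the query from the question never shows a large granted_memory_kb here, reason (1) is unlikely to be the cause and reason (2) is the one to look at.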

Issue (2) can be ameliorated by making the query read less data. Here, you'd create an index whose key columns support the WHERE clause and whose included columns cover the selected columns. You can also enable DATA_COMPRESSION on that index so the same rows occupy fewer pages.
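A minimal sketch of such an index for the query in the question might look like this. The index name, the dbo schema, and the key column order are my assumptions; the missing-index hint in the actual plan may suggest a different order, and PAGE compression should be tested against the CPU cost on your workload.

```sql
-- Hypothetical covering index for the query in the question.
-- Key columns: the filtered columns (SerialNumber, TypeID, then the range on DateTimeStamp).
-- Included columns: the remaining selected columns.
CREATE NONCLUSTERED INDEX IX_Results_SerialNumber_TypeID_DateTimeStamp
ON dbo.Results (SerialNumber, TypeID, DateTimeStamp)
INCLUDE (ESerialNumber, ResultDateTime)
WITH (DATA_COMPRESSION = PAGE, ONLINE = ON);  -- ONLINE = ON requires Enterprise Edition
```

With an index like this the query can seek directly to the matching serial number instead of reading every row after the date cutoff through the clustered index, so far fewer pages need to enter the buffer pool.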

Normally, huge scans are not devastating for the buffer pool because of "page disfavoring". Newly read pages of large scans are treated such that they are the first, not the last, pages to be evicted when memory must be freed. I don't recall the exact rules but generally SQL Server can scan huge data sets without killing the buffer pool.

So as an experiment create that index (if possible) and/or watch query memory grants.

OTHER TIPS

So, there really is insufficient information to answer this question. How large is the Results table? Where is the query plan?
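To gather that information, something along these lines could be run. This is only a sketch: dbo.Results assumes the table lives in the dbo schema, and the literal values are the placeholders from the question.

```sql
-- Size of the table and its indexes.
EXEC sp_spaceused N'dbo.Results';

-- Per-table logical, physical, and read-ahead reads appear in the Messages tab.
SET STATISTICS IO ON;

select
    ResultType = case r.TypeID
        when 'dlp' then 'DLP'
        when 'bill' then 'BILL'
        when 'evtlog' then 'EVTLOG'
    end,
    SerialNumber,
    ESerialNumber,
    ResultDateTime,
    DateTimeStamp as SavedInSystem
from
    Results r
where
    TypeID in ('typeid1','typeid2')
    and DateTimeStamp > '2016-05-19 23:00:00'
    and SerialNumber in ('serialnumber');

SET STATISTICS IO OFF;
```

A high count of physical or read-ahead reads on the second run would confirm that the pages were evicted between runs.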

I am assuming it is very large, so when you run this query PLE plummets to around 70 because you are bringing in a large amount of data that wasn't previously cached. That means that roughly 70 seconds later most, if not all, of what you brought into cache has been aged out. So when you run the query again 15 minutes later, you are bringing that huge amount of data back into cache and PLE plummets again. This is not a bug; it is how SQL Server works.
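One way to check this assumption is to count how many pages of the table are in the buffer pool before and after the query, and again a few minutes later. The following is a sketch that assumes the table is dbo.Results and that it is run in the table's own database. Be aware that scanning sys.dm_os_buffer_descriptors on a buffer pool of this size is itself an expensive query.

```sql
-- Cached pages of dbo.Results per NUMA node and index.
SELECT
    bd.numa_node,
    i.name                  AS index_name,
    COUNT_BIG(*)            AS cached_pages,
    COUNT_BIG(*) * 8 / 1024 AS cached_mb      -- pages are 8 KB each
FROM sys.dm_os_buffer_descriptors AS bd
JOIN sys.allocation_units AS au
    ON au.allocation_unit_id = bd.allocation_unit_id
JOIN sys.partitions AS p
    ON (au.type IN (1, 3) AND au.container_id = p.hobt_id)       -- in-row and row-overflow data
    OR (au.type = 2       AND au.container_id = p.partition_id)  -- LOB data
JOIN sys.indexes AS i
    ON i.object_id = p.object_id
   AND i.index_id  = p.index_id
WHERE bd.database_id = DB_ID()
  AND p.object_id = OBJECT_ID(N'dbo.Results')
GROUP BY bd.numa_node, i.name
ORDER BY bd.numa_node, cached_pages DESC;
```

If cached_mb for the clustered index jumps after the query and shrinks again before the next run, that matches the explanation above.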

In addition, I see that PLE is only plummeting on 2 of the 4 NUMA nodes: nodes 000 and 001 fall from 4052 to 165 and 155, while nodes 002 and 003 actually rise.
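To watch this per node while the query runs, the same counters the question posted can be filtered down to PLE only. This is a small sketch that uses nothing beyond the standard performance counter DMV.

```sql
-- Overall PLE (Buffer Manager) and per-NUMA-node PLE (Buffer Node).
SELECT
    RTRIM(object_name)   AS object_name,
    RTRIM(instance_name) AS numa_node,      -- empty for the Buffer Manager row
    cntr_value           AS ple_seconds
FROM sys.dm_os_performance_counters
WHERE counter_name = N'Page life expectancy'
  AND (object_name LIKE N'%Buffer Manager%' OR object_name LIKE N'%Buffer Node%')
ORDER BY object_name, instance_name;
```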

Licensed under: CC-BY-SA with attribution
