Question

I am testing how one of our stored procedures performs on SQL Server Standard versus SQL Server Enterprise edition.

I have created two virtual machines, installed SQL Server 2014 on each, and restored the same database on each instance. Then I generated some test data (a large volume, identical on both databases) and started testing whether there is a difference in execution time.

I have been told that there should be 4x better performance on the Enterprise edition when a large volume of data is used because there is NUMA Aware Large Page Memory and Buffer Array Allocation.

So far, there is no such difference and the execution time is almost the same (within about a second either way).

I cannot say I completely understand what NUMA is and how it works, but I guess hardware NUMA is on, as the following query returns 0 and 64:

SELECT DISTINCT memory_node_id
FROM sys.dm_os_memory_clerks

I am not interested in soft-NUMA.

Could anyone tell me whether such a 4x optimization really exists, and whether I should continue generating more data in order to see it?

This is the VM set up:

(screenshot of the VM configuration)


Solution

You can't on a host with only 1 node. Your query on sys.dm_os_memory_clerks shows you only have 1 physical node (0). The other node (64) is a logical node reserved for the Dedicated Administrator Connection (DAC).
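If you want to double-check this on your own instance, something along these lines should do it. Treat it as a rough sketch: it assumes you have VIEW SERVER STATE permission, and the column list is just the subset of sys.dm_os_nodes I find useful here.

```sql
-- List the SQLOS nodes and the memory node each one maps to.
-- The node reserved for the DAC should show node_state_desc = 'ONLINE DAC'.
SELECT node_id,
       node_state_desc,
       memory_node_id,
       online_scheduler_count
FROM sys.dm_os_nodes
ORDER BY node_id;
```

On a single-node VM you would expect one regular ONLINE row plus the DAC row, which matches the 0 and 64 you got from sys.dm_os_memory_clerks.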

In older versions of SQL Server, which merely supported NUMA rather than being NUMA aware/optimized, it was possible to tank performance when running on large NUMA hosts. Consider this: you have a large server with 4 NUMA nodes (say, an HP DL980). A query executing on node 0 retrieves a large amount of data, which gets cached on node 2. To access that data, you now need at least 2 hops across the CPUs and cache-to-cache transfers. They're REALLY fast compared to disk but still a LOT slower than pulling the data directly from node 0, where the query originated.

Versions of SQL Server that are NUMA aware/optimized will try to keep memory usage within the same local NUMA node to avoid the hops. Even if soft-NUMA is layered on top of this, the SQLOS memory nodes still follow the physical layout presented by the OS. That way, SQL Server will still optimize for physical local nodes even when they look like different nodes logically. You can also track remote node access from perfmon using the SQL Server:Buffer Node\Remote node page lookups/sec counter.
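If perfmon isn't handy, the same Buffer Node counters can be read from inside SQL Server. This is a minimal sketch, assuming VIEW SERVER STATE permission; the LIKE filters are just my way of sidestepping the instance-name prefix on object_name and the padding on counter_name.

```sql
-- Local and remote page lookups per buffer node.
-- These are "per second" counters, so cntr_value is cumulative:
-- sample twice and compare the deltas to get an actual rate.
SELECT RTRIM(instance_name) AS buffer_node,
       RTRIM(counter_name)  AS counter_name,
       cntr_value
FROM sys.dm_os_performance_counters
WHERE object_name LIKE '%Buffer Node%'
  AND counter_name LIKE '%node page lookups/sec%'
ORDER BY buffer_node, counter_name;
```

On a host with a single physical node I would expect the remote figure to stay at or near zero, since there is no other node to fetch pages from.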

These are very high-level descriptions, and they assume existing knowledge of NUMA, SQLOS, SQL Server memory management and query execution details. There are lots of good write-ups on these areas in the CSS blogs, and if you're keen on how all this started way back in 2005, read Slava's blog (http://blogs.msdn.com/b/slavao/). It's really old, but that work laid the foundation for what you see today.

Licensed under: CC-BY-SA with attribution
Not affiliated with dba.stackexchange