Question

We have a performance issue with our SQL Server 2012 Enterprise setup that I am unable to explain, and I am hoping you have an idea.

We have a fact table with a bunch of int columns that we aggregate, as well as a region dimension table.

This is the structure of our fact table:

  • regionId (int)
  • revenue (Decimal 10,2)
  • orderIntake (Decimal 10,2)

And this is the structure of our dimension table:

  • worldRegion (varchar(100))
  • cluster (varchar(100))
  • country (varchar(100))
  • regionId (int)

The fact table and the dimension table are connected via an INNER JOIN over the regionId columns. The performance of this is quite good as long as we don't restrict the countries.

E.g.

SELECT SUM(revenue) FROM factTable f INNER JOIN regionDim r ON f.regionId=r.regionId

is fast (<1 sec).

However

SELECT SUM(revenue) FROM factTable f INNER JOIN regionDim r ON f.regionId=r.regionId WHERE r.country IN ('France','Germany')

is pretty slow (> 8 sec) for around 500k records.

We do have the following indexes in place:

  • ColumnStore Index on the fact table on the regionId column
  • Clustered Index on dimension table (regionId,country,cluster,worldRegion)

Is there anything that we can change from either an index or an overall structure point of view?

Solution

The column order of the clustered index on the dimension table does not allow an index seek for the WHERE clause of the second query. This is because the rows are ordered by the first index column (regionId), then by the second (country), and so on, so a filter on country alone has to scan the whole index. Using only the second column is like using a phone book to search for someone by first name only. Try putting a separate index on the country column and see if performance improves.
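A minimal sketch of such an index, assuming the table and column names from the question; the index name IX_regionDim_country is made up for illustration. Because regionId is part of the clustering key, SQL Server carries it in the nonclustered index automatically, so the index should be able to supply both the filter column and the join column.

```sql
-- Hypothetical index name; table and column names are taken from the question.
-- The clustering key columns (including regionId) are stored in every
-- nonclustered index row, so no INCLUDE clause is needed for the join column.
CREATE NONCLUSTERED INDEX IX_regionDim_country
    ON regionDim (country);
```

Whether the optimizer actually picks this index depends on the data and statistics, so it is worth comparing the execution plan and timings before and after creating it.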

OTHER TIPS

Without the execution plan it is tricky to see what the problem is. I wonder whether it would perform more quickly if you first extracted the regionIds from the dimension table, into either a table expression or a temporary table, and then used them against the fact table.

Maybe this:

WITH regionIDcte AS
 (
 SELECT regionId 
 FROM   regionDim 
 WHERE country IN ('France','Germany')
 )
SELECT SUM(revenue) 
FROM factTable f 
WHERE EXISTS 
  (
  SELECT *
  FROM regionIDcte x
  WHERE x.regionId = f.regionId
  );
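
The temporary table variant mentioned above could look roughly like this. It is only a sketch: the temp table name #selectedRegions is made up, and whether it beats the table expression depends on the actual execution plans.

```sql
-- Materialize the matching region IDs first.
-- #selectedRegions is an illustrative name, not from the original post.
SELECT regionId
INTO   #selectedRegions
FROM   regionDim
WHERE  country IN ('France', 'Germany');

-- Aggregate the fact table against the small, pre-filtered ID list.
SELECT SUM(f.revenue)
FROM   factTable f
WHERE  EXISTS
  (
  SELECT 1
  FROM   #selectedRegions s
  WHERE  s.regionId = f.regionId
  );

DROP TABLE #selectedRegions;
```

Note that EXISTS counts each fact row at most once, so the result only matches the original INNER JOIN if regionId is unique in the dimension table.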
Licensed under: CC-BY-SA with attribution