Question

Given this scenario,

Sales information for three countries.

CountryA: 0.9M records.
CountryB: 0.8M records.
CountryC: 0.7M records.

Theoretically, what would be the expected performance difference(*) between the following approaches?

  1. A single Cube with one partition per Country.
  2. Three Cubes, one per Country.

(*) For single-Country queries, of course.


Solution

As long as the queries for the partitioned approach correctly target the appropriate partition, there should be little performance difference.

If the fact/dimension relationships are the same, go with the partitioned approach. It makes results that span partitions (such as TotalRevenue in Quarter 1 or TotalCost of ProductA) much easier to return; these are harder to calculate once the data is split into separate cubes. Targeting multiple cubes is also more complicated for analysts using self-service tools such as Excel PivotTables or Report Builder.
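To make that reasoning concrete, here is a toy Python model of partition elimination. It is not Analysis Services code, only a sketch under the assumption that each partition carries a slice naming its country and that a query touches only the partitions whose slice matches. The `Partition` class and `rows_scanned` function are hypothetical names invented for illustration; the record counts are the ones given in the question.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Partition:
    country: str   # assumed slice: the only country stored in this partition
    records: int   # number of fact records in the partition


# One cube, one partition per country (record counts from the question).
CUBE: List[Partition] = [
    Partition("CountryA", 900_000),
    Partition("CountryB", 800_000),
    Partition("CountryC", 700_000),
]


def rows_scanned(partitions: List[Partition], country: Optional[str] = None) -> int:
    """Count the records in the partitions a query would have to touch.

    country=None models a query that spans every country.
    """
    return sum(
        p.records
        for p in partitions
        if country is None or p.country == country
    )


# A single-country query touches only its own partition, which is the same
# amount of data a dedicated per-country cube would hold.
print(rows_scanned(CUBE, "CountryB"))

# A cross-country total is still a single query against the one cube.
print(rows_scanned(CUBE))
```

In this simplified model the single-country query reads no more data than a per-country cube would, while the cross-country total needs no stitching together of separate cubes.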

Here's what the SQL CAT team recommends regarding partition size:

In most cases, partitions should contain fewer than 20 million records size and each measure group should contain fewer than 2,000 total partitions. Also, avoid defining partitions containing fewer than two million records. Too many partitions causes a slowdown in metadata operations, and too few partitions can result in missed opportunities for parallelism.

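You are clearly under the 20 million record recommendation. Each of your partitions is also below the two-million-record lower bound, although with only three partitions the metadata overhead that guideline warns about is unlikely to matter. I'd also look at the data size of each partition. A leaner cube (only the necessary measures, with attribute relationships and hierarchies properly defined, etc.) can perform well with significantly more records per partition.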
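As a rough illustration, the record-count guideline quoted above can be checked with a few lines of Python. This is only a sketch: the thresholds come from the quoted recommendation, the counts come from the question, and `check_partition` is a hypothetical helper, not part of any Analysis Services tooling.

```python
# Thresholds from the quoted SQL CAT recommendation.
MIN_RECORDS = 2_000_000    # avoid partitions with fewer records than this
MAX_RECORDS = 20_000_000   # partitions should contain fewer records than this


def check_partition(records: int) -> str:
    """Classify a partition's record count against the quoted guideline."""
    if records >= MAX_RECORDS:
        return "too large"
    if records < MIN_RECORDS:
        return "below minimum"
    return "within guideline"


# Record counts from the question.
partitions = {
    "CountryA": 900_000,
    "CountryB": 800_000,
    "CountryC": 700_000,
}

for name, count in partitions.items():
    print(f"{name}: {count:,} records -> {check_partition(count)}")
```

All three partitions fall well under the upper bound and also under the lower bound, so at this data volume record count alone is unlikely to be the deciding factor between the two designs.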

Here's the full list of SQL CAT best practices:

http://sqlcat.com/sqlcat/b/top10lists/archive/2007/09/13/analysis-services-query-performance-top-10-best-practices.aspx

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow