Question

The customer table contains 9.5 million records. The customer_id column is the primary key. The database is Oracle.

Questions:

1) Should the table contain main partitions or sub-partitions? How do I decide? Also, I don't think indexing columnA or columnB will help here because of the type of data.

TableA.columnA (varchar): more than 80% of the records have the values 5, 6, or 7. The column only holds values from 1 to 7.
TableA.columnB (varchar): 90% of the records have the value 102. The column holds values from 1 to 999.

Moreover, the typical queries are (in no particular order):

Query1: where tableA.columnA = values
Query2: where tableA.columnB = values
Query3: where tableA.columnA = values AND/OR tableA.columnB = values

2) When we create sub-partitions, what happens if the query's WHERE clause only references the sub-partition column? Does query execution go directly to the sub-partition, or through the main partition?

3) The join contains tableA.partitioned_column = tableB.indexed_column

(eg. customer_Table.branch_code = branch_table.branch_code)

Does partitioning help in the case of JOIN? Will it improve performance?

No correct solution

OTHER TIPS

1) It's very difficult to answer without knowing the table structure, how the table is typically used, and so on. In general, though, partitioning is very often a necessity for big tables.

2) If the query does not specify the partition key, Oracle has to visit every partition to find the matching subpartition (which is not very slow), and then applies partition pruning at the subpartition level. That will still be significantly faster than having no subpartitions at all. The best case, though, is a WHERE clause that refers to both the partition key and the subpartition key.

3) I think it will almost certainly help, because Oracle can use partition pruning to fetch only the needed rows from tableA. To be 100% sure, check the query plan. The best case is when both join columns are partition keys.
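
As a rough sketch of points 2) and 3), the DDL below builds a composite list-list partitioned table and then asks Oracle for the plan of a query that filters on the subpartition column only. The column sizes, the partition and subpartition names, and the choice of list-list partitioning are all assumptions made for illustration; they are not taken from the question.

```sql
-- Hypothetical layout: partition on columnA, subpartition on columnB.
-- Column sizes and all partition/subpartition names are assumed.
CREATE TABLE tableA (
  customer_id NUMBER PRIMARY KEY,
  columnA     VARCHAR2(10),
  columnB     VARCHAR2(10)
)
PARTITION BY LIST (columnA)
SUBPARTITION BY LIST (columnB)
SUBPARTITION TEMPLATE (
  SUBPARTITION sp_102   VALUES ('102'),
  SUBPARTITION sp_other VALUES (DEFAULT)
)
(
  PARTITION p_5    VALUES ('5'),
  PARTITION p_6    VALUES ('6'),
  PARTITION p_7    VALUES ('7'),
  PARTITION p_rest VALUES (DEFAULT)
);

-- Query2 from the question: predicate on the subpartition column only.
EXPLAIN PLAN FOR
SELECT customer_id
FROM   tableA
WHERE  columnB = '102';

-- Show the plan, including the Pstart and Pstop columns.
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
```

In the plan output, the partition operation lines and the Pstart/Pstop columns indicate which partitions and subpartitions Oracle expects to touch. The same EXPLAIN PLAN check can be run against the join query to see whether pruning actually takes place there.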

If 80-90% of the rows in these columns share the same values, and those are the most frequently queried values, then partitioning will help only somewhat. You would be pruning just 10-20% of the data during these queries, so you probably want to find another way for Oracle to home in on the data your query needs (dates, perhaps?).
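
One way to put numbers on this before choosing a partition key is to measure the distribution directly. The query below is a minimal sketch that assumes the table and column names from the question; it reports each columnA value's share of the table, which is also the share a query on that value would still have to read after pruning.

```sql
-- Share of rows per columnA value (swap in columnB to check that column).
SELECT columnA,
       COUNT(*) AS row_count,
       ROUND(100 * RATIO_TO_REPORT(COUNT(*)) OVER (), 1) AS pct_of_table
FROM   tableA
GROUP  BY columnA
ORDER  BY row_count DESC;
```

If one value accounts for roughly 90% of the rows, as described for columnB = 102, a query on that value still reads about 90% of the table even with perfect pruning.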

The value distribution in your two columns also brings up the point of statistics and making sure they are being gathered properly (with histograms to describe the skew in these columns).
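
A minimal sketch of gathering such statistics, assuming the table lives in the current schema and is named TABLEA. The SKEWONLY setting asks Oracle to build histograms only on columns whose data it finds to be skewed; whether that is the right setting for this system is an assumption.

```sql
-- Gather table statistics, letting Oracle create histograms on skewed columns.
BEGIN
  DBMS_STATS.GATHER_TABLE_STATS(
    ownname    => USER,
    tabname    => 'TABLEA',
    method_opt => 'FOR ALL COLUMNS SIZE SKEWONLY',
    cascade    => TRUE   -- also gather statistics on the table's indexes
  );
END;
/

-- Check whether histograms now exist on the two skewed columns.
SELECT column_name, num_distinct, num_buckets, histogram
FROM   user_tab_col_statistics
WHERE  table_name = 'TABLEA'
AND    column_name IN ('COLUMNA', 'COLUMNB');
```

A HISTOGRAM value other than NONE for a column means the optimizer has distribution data it can use to tell the common values from the rare ones.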

As @psur points out, without knowing the details of your system it's hard to give concrete suggestions.

Licensed under: CC-BY-SA with attribution