Question

I have found that in Progress 10.1, when more than one index could be used to resolve a query, the database uses the first index in the list of indexes rather than the most optimal one, nor does it use a subset of the two indexes.

Did anyone else experience this?

=================================================================

Several indexes are defined, but the two we are looking at are:

XIE1cac_role_person: owning_entity_mnemonic, owning_entity_key, role_key

XIE2cac_role_person: contract_obj, person_role_code, effective_from_date

Initially my code was as follows, and it was using the first index, which returns a much larger dataset:

FOR EACH cac_role_person NO-LOCK
    WHERE cac_role_person.contract_obj = cbm_contract.contract_obj
      AND cac_role_person.owning_entity_mnemonic = "BROKER"
      AND ((cac_role_person.effective_to_date > TODAY
            AND cac_role_person.effective_to_date >=
                cbm_contract_component.contract_component_start_date)
        OR (cac_role_person.effective_to_date = ?
            AND cac_role_person.effective_from_date <=
                cbm_contract_component.contract_component_start_date)):

So I now force it to use the second index:

FOR EACH cac_role_person NO-LOCK USE-INDEX XIE2cac_role_person
    WHERE cac_role_person.contract_obj = cbm_contract.contract_obj
      AND cac_role_person.owning_entity_mnemonic = "BROKER"
      AND ((cac_role_person.effective_to_date > TODAY
            AND cac_role_person.effective_to_date >=
                cbm_contract_component.contract_component_start_date)
        OR (cac_role_person.effective_to_date = ?
            AND cac_role_person.effective_from_date <=
                cbm_contract_component.contract_component_start_date)):

The first version of the code resulted in about 4 000 fixes in 30 hours, and the improvement resulted in 70 000 fixes in 12 hours. (The loop is part of a much bigger piece, but this was the only change I needed to speed up processing 17 times.)

Solution

There are certainly cases where a programmer can make a better index choice than the compiler, but they are pretty rare.

Without knowing all of your actual index definitions (which you have not provided), it is not possible to fully evaluate which indexes the compiler could choose. Given what you have shared, the choice follows the rules (see below), but the rules are not as you describe above.

And without knowing the distribution of the data, it isn't really possible to say whether the chosen indexes are "best" or optimal. Having said that, it seems intuitive that a field with values like "BROKER" is going to be less selective than one called "contract_obj". But that's just a guess.

The Progress 4GL engine can use multiple indexes to resolve a query, but that doesn't mean that it will do so, nor does it mean that the result will necessarily be better if it does. To know whether it did, you need to compile with XREF and review the results.
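
A minimal sketch of doing that (the program and output file names here are hypothetical):

/* Compile the procedure and write the cross-reference to a file.      */
/* "broker_roles.p" and "broker_roles.xrf" are placeholder names.      */
COMPILE broker_roles.p XREF broker_roles.xrf.

/* In the .xrf output, look at the SEARCH entries: each one names the  */
/* table and the index the compiler selected, and WHOLE-INDEX is noted */
/* when the query could not be bracketed on that index.                */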

The 4GL engine uses static, compile-time index selection. You can find some very detailed information about the rules here: http://pugchallenge.org/downloads/352_Pick_an_Index_2013.pdf

The most important rule is: maximize the depth of equality matches on leading components. You have two possible equality matches:

cac_role_person.contract_obj = cbm_contract.contract_obj 
cac_role_person.owning_entity_mnemonic = "BROKER"

So your "best" indexes (without knowing anything about the data distribution) will almost certainly be ones that have those two fields as their leading components. Ideally your third component would be the cac_role_person.effective_to_date field. If you do not have any indexes that meet that criteria you might want to consider adding one.

The two indexes that you have shown each have a single equality match on the leading component, so they are of equal strength. Tie-breaker criteria then come into play -- if one of them is designated as the "primary" index, it wins. Otherwise, since no BY criteria are indicated, alphabetical order wins.
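
To illustrate the BY point only (this is not a change your query needs): a BY clause that one of the candidate indexes can satisfy without a separate sort may tip the choice. In this sketch the BY field is the second component of XIE1cac_role_person, whose leading component is equality-matched:

/* Hedged sketch: with equality on owning_entity_mnemonic (the leading  */
/* component of XIE1cac_role_person) and a BY on its second component,  */
/* that index can deliver the rows in the requested order.              */
FOR EACH cac_role_person NO-LOCK
    WHERE cac_role_person.owning_entity_mnemonic = "BROKER"
    BY cac_role_person.owning_entity_key:
    /* process the record */
END.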

If you lack appropriate indexes, or you are doing a table scan on purpose, then it is often fastest to specify the smallest index. You can determine which index that is by looking at the output of:

proutil dbName -C dbanalys > dbName.dba

The index with the fewest blocks is the one that you want. If they are all roughly the same size, go for the highest "utilization".
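
For example, a sketch of a deliberate whole-table read through the smallest index (the index name here is hypothetical -- substitute whichever index dbanalys reports as having the fewest blocks):

/* Deliberate full-table read: every row is touched anyway, so the     */
/* cheapest structure to walk is the physically smallest index.        */
FOR EACH cac_role_person NO-LOCK USE-INDEX XPKcac_role_person:
    /* process every row */
END.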

FWIW -- the SQL engine uses a cost-based optimizer. You must, however, regularly update the statistics if you want that to work well. (And it won't help your 4GL queries.) (The SQL syntax available inside the 4GL is embedded SQL-89, and it does not know about the cost-based optimizer -- so that will not help either. Trying to use SQL inside a 4GL session is the road to endless frustration -- don't go there.)

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow