Question about cost optimization formula for SELECT

https://dba.stackexchange.com/questions/137416

01-10-2020
|

Pergunta

I am a student in computer science and I have a question, which refers to cost optimization. I have to read the book (Fundamentals of Database Systems from Elmasry and Navathe) and try to present the topic to class. But I have some issues understanding one cost formula for B-tree-index:

Using a secondary (B+-tree) index: [...]If the comparison condition is (> >=, <, <=) and half the file records are assumed to satisfy the condition, then(very roughly) half of the first level index blocks are accessed, plus half the file records via the index. The cost estimate for this case, approximately, is C=x+(b_l1/2)+(r/2).[...]

Using Linear search: [...] C=b[...]

where

x = number of levels of multilevel index
r = number of records
b = number of blocks
b_li = number of first-level index blocks
C = cost function

I don't understand the formula for the secondary index. I think the formula should look like this:

C=x+(b_l1/2)+(b/2)

Can somebody explain me, why the formula uses the number of entries instead of number of blocks?

Thank you very much

Solução

Assuming "first level" means the leaf level of a nonclustered (secondary) index, the b_li/2 part accounts for the cost of scanning (using sequential I/O) through half those index blocks.

Assuming "file records" means the base table, the r/2 component represents the cost of looking up additional data not present in the secondary index. Since this is a per-row cost that is expected to result in a random I/O pattern, it makes sense to express this in terms of r.

Licenciado em: CC-BY-SA com atribuição

Não afiliado a dba.stackexchange