Question

I was reading the article on indexes at simple-talk, where it is written that

If a heap has a non-clustered index on it (as the primary key), and data is inserted into the table, two writes have to occur. One write for inserting the row, and one write for updating the non-clustered index. On the other hand, if a table has a clustered index as the primary key, inserts take only one write, not two writes. This is because a clustered index, and its data, are one in the same. Because of this, it is faster to insert rows into a table with a clustered index as the primary key than it is to insert the same data into a heap that has a non-clustered index as its primary key. This is true whether or not the primary key is monotonically increasing or not.

Isn't it wrong? Clustered indexed naturally resort to B+ trees, where only the keys are stored in intermediate nodes (unlike B-trees, where the entire record is stored. That is why B+ trees can fit more keys in a single page, resulting it being bulky in width but shorter in height), so all records are stored in the leaf pages (the pages themselves being logically sorted through linked-lists, while the data in each page being physically sorted). So if a record has to be updated, say the value 1 has to be updated to 7, won't the update need to be applied to both the key in the clustered index top node (this may, in cases, cause a re-structuring of the entire structure) and the corresponding value in the record in the leaf-page?

enter image description here Update: Okay, I did some study, and found that apart from the initial tree structure (where some values must be present twice, for example the key values in the node), when new values are inserted, they just fit into the leaf page, whereas the tree gets restructured to accommodate that. However, when, say 5 values are inserted, the 3rd value may cause the first inserted value (which is currently occupying only leaf level space) to cascade up, thus causing it to be written twice (once in the leaf level, other in the index level). Of course, such re-writes (although they don't occur at the time of insertion, they can occur later) will be much less compared to the two-time write which occurs each time there is an insertion into a heap with NCI, but still isn't it wrong to say that there is no re-write?

No correct solution

Licensed under: CC-BY-SA with attribution
Not affiliated with dba.stackexchange
scroll top