Question

I have created an archive table which will store data for selecting only.

Daily there will be a program to transfer a batch of records into the archive table. There are several columns which are indexed; while others are not.

I am concerned with time cost per batch insertion:
- 1st batch insertion: N1
- 2nd batch insertion: N2
- 3rd batch insertion: N3

The question is: will N1, N2, and N3 roughly be the same, or N3 > N2 > N1?

That is, will the time cost be a constant or incremental, with existence of several indexes?

All indexes are non-clustered.

The archive table structure is this:

create table document (
   doc_id   int unsigned primary key,
   owner_id int,  -- indexed
   title    smalltext,
   country  char(2),
   year     year(4),
   time     datetime,

   key ix_owner(owner_id)
}
Was it helpful?

Solution

Cost will be incremental, O(log(n)).

In practice, you will have a step on graph when the index size reaches a certain threshold and the index pages will stop fitting into the cache

The size of the cache is defined by key_buffer_size in MyISAM and innodb_buffer_pool_size in InnoDB.

Before this threshold, the cost will be proportional to the memory seek time, after the threshold, the cost will be proportional to the disk seek time (all multiplied by the log(n) of course)

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top