Question

This question is related to this one.

I have a page table with the following structure:

CREATE TABLE  mydatabase.page (
  pageid int(10) unsigned NOT NULL auto_increment,
  sourceid int(10) unsigned default NULL,
  number int(10) unsigned default NULL,
  data mediumtext,
  processed int(10) unsigned default NULL,
  PRIMARY KEY  (pageid),
  KEY sourceid (sourceid)
) ENGINE=MyISAM AUTO_INCREMENT=9768 DEFAULT CHARSET=latin1;

The data column contains text whose size is around 80KB - 200KB per record. The total size of the data stored in the data column is around 1.5GB.

Executing this query takes 0.08 seconds:

select pageid from page

But executing this query takes around 130.0 seconds:

select sourceid from page

As you see, I've got a primary index on page.pageid and an index on page.sourceid. So should the second query be taking THAT long?

Edit #1

EXPLAIN returned

id select_type table type  possible_keys key      key_len ref rows Extra
1  SIMPLE      page  index               sourceid 5           9767 Using index

I'm sorry but profiling didn't work... MySQL (its 4.1.22) did not recognize SHOW PROFILE query.

SHOW INDEX returned

Table Non_unique Key_name  Seq_in_index Column_name Collation Cardinality Sub_part Packed Null Index_type Comment
page  0          PRIMARY   1            pageid      A         9767                             BTREE 
page  1          sourceid  1            sourceid    A         3255                        YES  BTREE 
Was it helpful?

Solution

Did you try to enforce the use of the index? Like:

SELECT sourceid FROM page USE INDEX (sourceid_index)

Like sgehrig comments, check using EXPLAIN if the index is used? And share the result?

EXPLAIN select sourceid from page

It could also help to share the definiton of the indexes:

SHOW INDEX FROM page

OTHER TIPS

How different are your sourceid fields? If you have only a few different sourceid values compared to the number of rows then you can try increasing the size of the index.

As MySQL 4.1.22 is fairly old (02 November 2006) I'd suspect that it doesn't support the notion of covering indexes for secondary keys. EXPLAIN shows that the query actually uses the index, so I'd assume that the additional time is needed to read all the result rows (instead of just returning the index content when using covering indexes) to extract the sourceid column.

Do you have the possibility to check the query on a more recent MySQL server version?

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top