Question

I am new to MySQL, and I need to add indexes on an existing table (which contains roughly 200K rows).

Table mytable: (id:integer, created_time:timestamp, deleted_time:timestamp)

I have 2 queries which need to benefit from the index:

select s.id from mytable s
where s.completed_time is not null
and s.completed_time < ?
and ( s.deleted_time is null
    or s.deleted_time >= ? );

and :

select s.id from mytable s 
where
   s.completed_time is not null 
and (
   ( s.deleted_time is not null 
   and s.deleted_time >= ? 
   and s.deleted_time < ? )
 or ( s.completed_time >= ? 
   and s.completed_time < ? ) ) ;

I am considering introducing a multi column index (on completed_time and deleted_time) However, I am not sure if the condition "s.completed_time is not null" matches the criteria to make these queries use the composite index.

  • Do you have any thoughts about what is best (composite index or 2 indexes)? I am trying to use "explain" to figure out what's best but I am unsure on how to interpret the results.

  • And more generally: with a table having a composite index on (column1, column2), I understand that filtering on column2 only will not use the index. But what if I introduce a dummy condition like (column1 > MIN_VALUE), or (column1 is not null) when it is correct to do so?

Thanks!

Était-ce utile?

La solution

Assuming the two queries you mentionned will be "frequently" used, I would advise a composite index versus two distinct indexes on two columns.

As you already know, a query searching on two columns might sometimes use two separate indexes by (roughly) merging these two indexes into one. But this is sub-optimal, and has a cost in terms of performance.

Conversely, a composite index can only be used if the left-most columns are involved in the seach condition, or as the manual puts it:

MySQL can use multiple-column indexes for queries that test all the columns in the index, or queries that test just the first column, the first two columns, the first three columns, and so on

With regards to your suggested hack (introducing dummy conditions so as to be able to use the index), this might work, but I would rather advise creating a second index on column2 only (besides the two-column index on (column1, column2)). This comes at a (minor) cost, but is so much more elegant and reusable.

As for the suggestion of getting rid of NULL values, I strongly disagree. It is sematnically incorrect to use 0. 0 means "zero", NULL means "no value". All your tests would need to account for this special value, whereas IS NULL is standard and well understood everywhere. It is also just impractical in some situations (try to insert 0 with SQL_MODE='TRADITIONAL').

On the other hand, the performance gain is dubious (I believe this is mostly based on the false assumption that NULL values are not indexed). It is easy to verify that a query like s.completed_time IS NOT NULL will hit an index if such an index exists.

Autres conseils

Provided you get rid of the nulls as mentioned by a user. A composite index might work for the first query.

But the second query the index may not work. - As it has an OR between your indexed columns.

Usually it goes by the Left-Most columns in the index and in that order.

I suggest creating separate indexes. Though it has an overhead (May be) of http://dev.mysql.com/doc/refman/5.0/en/index-merge-optimization.html

Licencié sous: CC-BY-SA avec attribution
Non affilié à StackOverflow
scroll top