Question

I'm working on MySQL 5.5.29-0ubuntu0.12.04.1.

I need to create a query that can sort results either by date or by a score.

I read the documentation and the posts here on stackoverflow (specifically this) about how to optimize a query, but I'm still struggling to do it well. The key finding is that, to avoid the use of a temporary table, the ORDER BY or GROUP BY must contain only columns from the first table in the join order. That is why I use the STRAIGHT_JOIN clause and two slightly different queries.
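
One way to check this rule is to prefix a query with EXPLAIN and read the Extra column of the first row. The following is only a minimal sketch, cut down to the item and score tables defined later in this question; the exact plan depends on the data and the server version, so treat the comments as what one would typically expect rather than a guarantee.

```sql
-- item is read first and the ORDER BY column belongs to item,
-- so the optimizer can consider Zen_time_index instead of a temporary table.
EXPLAIN
SELECT STRAIGHT_JOIN item.id
FROM item
INNER JOIN score ON item.id = score.item_id
WHERE score.user_id = 1
ORDER BY item.zen_time DESC
LIMIT 0, 10;

-- item is still read first, but the ORDER BY column belongs to score.
-- Look for "Using temporary; Using filesort" in the Extra column.
EXPLAIN
SELECT STRAIGHT_JOIN item.id
FROM item
INNER JOIN score ON item.id = score.item_id
WHERE score.user_id = 1
ORDER BY score.score DESC
LIMIT 0, 10;
```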

To avoid confusion, I'm going to assign a number to each query configuration:

  1. order by date with STRAIGHT_JOIN clause
  2. order by score with STRAIGHT_JOIN clause
  3. order by date without STRAIGHT_JOIN clause
  4. order by score without STRAIGHT_JOIN clause

Query 1 follows; it takes about 2.5 seconds to complete:

SELECT STRAIGHT_JOIN item.id AS id
FROM item 
INNER JOIN score ON item.id = score.item_id 
LEFT JOIN url ON item.url_id = url.id 
LEFT JOIN doc ON url.doc_id = doc.id 
INNER JOIN feed ON feed.id = item.feed_id 
INNER JOIN user_feed ON feed.id = user_feed.feed_id AND score.user_id = user_feed.user_id 
LEFT JOIN star ON item.id = star.item_id AND score.user_id = star.user_id 
JOIN unseen ON item.id = unseen.item_id AND score.user_id = unseen.user_id 
WHERE score.user_id = 1 AND user_feed.id = 7 
ORDER BY zen_time DESC 
LIMIT 0, 10

Query 2 follows (the first two joined tables are swapped and the ordering column is different); it takes only about 0.01 seconds to complete:

SELECT STRAIGHT_JOIN item.id AS id
FROM score
INNER JOIN item ON item.id = score.item_id 
LEFT JOIN url ON item.url_id = url.id 
LEFT JOIN doc ON url.doc_id = doc.id 
INNER JOIN feed ON feed.id = item.feed_id 
INNER JOIN user_feed ON feed.id = user_feed.feed_id AND score.user_id = user_feed.user_id 
LEFT JOIN star ON item.id = star.item_id AND score.user_id = star.user_id 
JOIN unseen ON item.id = unseen.item_id AND score.user_id = unseen.user_id 
WHERE score.user_id = 1 AND user_feed.id = 7 
ORDER BY score DESC 
LIMIT 0, 10

The EXPLAIN and profiler results for the queries follow.

EXPLAIN for query 1: [image]

EXPLAIN for query 2: [image]

EXPLAIN for query 3: [image]

EXPLAIN for query 4: [image]

Profiler result for query 1: [image]

Profiler result for query 2: [image]

Profiler result for query 3: [image]

Profiler result for query 4: [image]
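
For readers who want to reproduce this kind of profiler output, the sketch below shows the session profiling statements available in MySQL 5.5. The SELECT is a simplified stand-in for the real queries, and the Query_ID of 1 is an assumption; use whatever ID SHOW PROFILES reports.

```sql
-- Enable profiling for the current session.
SET profiling = 1;

-- Run the statement to be measured (simplified stand-in for queries 1 to 4).
SELECT SQL_NO_CACHE item.id
FROM item
ORDER BY item.zen_time DESC
LIMIT 0, 10;

-- List the profiled statements with their Query_ID and total duration.
SHOW PROFILES;

-- Break one statement down by execution stage.
-- 1 is a placeholder; take the real Query_ID from SHOW PROFILES.
SHOW PROFILE FOR QUERY 1;

-- Switch profiling off again.
SET profiling = 0;
```

Stages such as "Copying to tmp table" or "Sorting result" in the breakdown usually point at the temporary table and filesort that the ORDER BY is triggering.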

The table definitions follow:

CREATE TABLE `doc` (
`id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
`md5` char(32) DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `Md5_index` (`md5`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

CREATE TABLE `feed` (
`id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
`url` text NOT NULL,
`title` text,
PRIMARY KEY (`id`),
FULLTEXT KEY `Title_url_index` (`title`,`url`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8;

CREATE TABLE `item` (
`id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
`feed_id` bigint(20) unsigned NOT NULL,
`url_id` bigint(20) unsigned DEFAULT NULL,
`md5` char(32) NOT NULL,
PRIMARY KEY (`id`),
KEY `Md5_index` (`md5`),
KEY `Zen_time_index` (`zen_time`),
KEY `Feed_index` (`feed_id`),
KEY `Url_index` (`url_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

CREATE TABLE `score` (
`id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
`user_id` bigint(20) unsigned NOT NULL,
`item_id` bigint(20) unsigned NOT NULL,
`score` float DEFAULT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `User_item_index` (`user_id`,`item_id`),
KEY `Score_index` (`score`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

CREATE TABLE `star` (
`id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
`user_id` bigint(20) unsigned NOT NULL,
`item_id` bigint(20) unsigned NOT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `User_item_index` (`user_id`,`item_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

CREATE TABLE `unseen` (
`id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
`user_id` bigint(20) unsigned NOT NULL,
`item_id` bigint(20) unsigned NOT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `User_item_index` (`user_id`,`item_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

CREATE TABLE `url` (
`id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
`doc_id` bigint(20) unsigned DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `Doc_index` (`doc_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

CREATE TABLE `user` (
`id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
`email` varchar(255) NOT NULL,
PRIMARY KEY (`id`),
KEY `IDX_Email` (`email`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

CREATE TABLE `user_feed` (
`id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
`user_id` bigint(20) unsigned NOT NULL,
`feed_id` bigint(20) unsigned NOT NULL,
PRIMARY KEY (`id`),
KEY `User_feed_index` (`user_id`,`feed_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

Here are the row counts for the tables involved in the query:

Score: 68657
Item: 197602
Url: 198354
Doc: 186113
Feed: 754
User_feed: 721
Star: 0
Unseen: 150762

Which approach should I take, given that my program needs to order results by both zen_time and score as fast as possible?


Solution

Because the query speeds differed so much, I decided to run a more thorough analysis based on the different results I want to obtain.

I need four result sets:

  1. Select all the items from a specific feed, order them by SCORE.score (intelligent order)
  2. Select all the items from a specific feed, order them by ITEM.zen_time (time order)
  3. Select all the items, order them by SCORE.score (intelligent order)
  4. Select all the items, order them by ITEM.zen_time (time order)

The query therefore has to be adapted to these conditions, and its variable parts are:

  • STRAIGHT_JOIN yes/no
  • First JOIN table score/item
  • WHERE condition on specific feed yes/no
  • ORDER BY score/zen_time

All of the tests were executed with SELECT SQL_NO_CACHE, so the query cache did not affect the timings.
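
As an illustration of how such a test query might look, here is a shortened sketch based on query 2 from the question, with the LEFT JOINs left out for brevity. SQL_NO_CACHE stops MySQL 5.5 from storing this statement's result in the query cache, which keeps repeated runs from being served from it.

```sql
-- Shortened variant of query 2, used only to show where SQL_NO_CACHE goes.
-- The modifiers follow the order given in the MySQL SELECT syntax reference.
SELECT STRAIGHT_JOIN SQL_NO_CACHE item.id AS id
FROM score
INNER JOIN item ON item.id = score.item_id
INNER JOIN feed ON feed.id = item.feed_id
INNER JOIN user_feed ON feed.id = user_feed.feed_id AND score.user_id = user_feed.user_id
WHERE score.user_id = 1 AND user_feed.id = 7
ORDER BY score.score DESC
LIMIT 0, 10;
```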

The results follow: [image]

Now it's clear what I have to do for each result set:

  1. No STRAIGHT_JOIN, first JOIN table SCORE
  2. No STRAIGHT_JOIN, first JOIN table SCORE
  3. STRAIGHT_JOIN (I beat the MySQL optimizer here :D), first JOIN table SCORE
  4. STRAIGHT_JOIN (I beat the MySQL optimizer here :D), first JOIN table ITEM
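
As a sketch of what these choices could look like in practice, here are result sets 1 and 4 rebuilt from the queries in the question. This is a reconstruction, not the exact statements that were timed: it assumes that "all the items" means dropping the user_feed.id filter, and the user id 1 and feed id 7 are just the sample values from the question.

```sql
-- Result set 1: one feed, ordered by score.
-- No STRAIGHT_JOIN, so the optimizer picks the join order; score is listed first.
SELECT item.id AS id
FROM score
INNER JOIN item ON item.id = score.item_id
LEFT JOIN url ON item.url_id = url.id
LEFT JOIN doc ON url.doc_id = doc.id
INNER JOIN feed ON feed.id = item.feed_id
INNER JOIN user_feed ON feed.id = user_feed.feed_id AND score.user_id = user_feed.user_id
LEFT JOIN star ON item.id = star.item_id AND score.user_id = star.user_id
JOIN unseen ON item.id = unseen.item_id AND score.user_id = unseen.user_id
WHERE score.user_id = 1 AND user_feed.id = 7
ORDER BY score.score DESC
LIMIT 0, 10;

-- Result set 4: all items, ordered by zen_time.
-- STRAIGHT_JOIN forces item to be read first, matching the ORDER BY column.
SELECT STRAIGHT_JOIN item.id AS id
FROM item
INNER JOIN score ON item.id = score.item_id
LEFT JOIN url ON item.url_id = url.id
LEFT JOIN doc ON url.doc_id = doc.id
INNER JOIN feed ON feed.id = item.feed_id
INNER JOIN user_feed ON feed.id = user_feed.feed_id AND score.user_id = user_feed.user_id
LEFT JOIN star ON item.id = star.item_id AND score.user_id = star.user_id
JOIN unseen ON item.id = unseen.item_id AND score.user_id = unseen.user_id
WHERE score.user_id = 1
ORDER BY item.zen_time DESC
LIMIT 0, 10;
```

Result sets 2 and 3 would follow the same pattern, swapping the ORDER BY column, the feed filter and the STRAIGHT_JOIN modifier according to the list above.
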
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow