Join without indexing the entire table
27-02-2021
Question
I have a query that pulls the 3 most recently commented topics on our forum. The issue is that COUNT(forum_posts.id) AS posts ends up reading the entire forum_posts table of 600,000 posts, and the query takes forever to complete. I'm left joining the table like this: LEFT JOIN forum_posts ON forum_posts.topic_id = forum_topics.id. Apparently I will have to use another approach. Any suggestions?
SELECT forum_topics.id,
forum_topics.title,
forum_topics.date_created,
forum_topics.updated,
forum_topics.last_activity,
forum_topics.category_id,
SUBSTR(forum_topics.content, 1, 70) as content,
forum_topics.author_id,
users.username,
users.avatar,
COUNT(forum_posts.id) as posts,
CASE WHEN forum_topics_seen.user_id = 49 then 1 else 0 end as seen,
forum_categories.category_order,
forum_categories.name as cat_name
FROM forum_topics
INNER JOIN users ON users.id = forum_topics.author_id
LEFT JOIN forum_topics_seen ON forum_topics_seen.topic_id = forum_topics.id AND forum_topics_seen.user_id = 49
LEFT JOIN forum_posts ON forum_posts.topic_id = forum_topics.id
LEFT JOIN forum_categories ON forum_categories.id = forum_topics.category_id
GROUP BY forum_topics.id
ORDER BY forum_topics.last_activity DESC LIMIT 3
Result from SHOW CREATE TABLE forum_posts
CREATE TABLE `forum_posts` (
`id` int(11) unsigned NOT NULL AUTO_INCREMENT,
`content` text CHARACTER SET utf8mb4 NOT NULL,
`author_id` int(10) unsigned NOT NULL,
`editor_id` int(10) unsigned DEFAULT NULL,
`topic_id` int(11) unsigned NOT NULL,
`date_created` varchar(50) COLLATE utf8_unicode_ci NOT NULL DEFAULT '',
`updated` varchar(50) COLLATE utf8_unicode_ci DEFAULT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `id` (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=102183 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci
CREATE TABLE `forum_topics_seen` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`topic_id` int(10) unsigned NOT NULL,
`user_id` int(10) unsigned NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=40 DEFAULT CHARSET=utf8mb4
CREATE TABLE `forum_categories` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`name` varchar(50) COLLATE utf8_unicode_ci NOT NULL,
`description` varchar(255) COLLATE utf8_unicode_ci NOT NULL,
`game` varchar(50) COLLATE utf8_unicode_ci DEFAULT NULL,
`icon` varchar(100) COLLATE utf8_unicode_ci DEFAULT NULL,
`category_order` int(11) DEFAULT NULL,
`order` int(11) DEFAULT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `id` (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=53 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci
CREATE TABLE `forum_topics` (
`id` int(11) unsigned NOT NULL AUTO_INCREMENT,
`title` varchar(50) CHARACTER SET utf8mb4 NOT NULL,
`content` longtext CHARACTER SET utf8mb4 NOT NULL DEFAULT '',
`author_id` int(11) unsigned DEFAULT NULL,
`editor_id` int(11) unsigned DEFAULT NULL,
`category_id` int(11) unsigned NOT NULL,
`date_created` varchar(50) COLLATE utf8_unicode_ci NOT NULL DEFAULT '',
`updated` varchar(50) COLLATE utf8_unicode_ci DEFAULT NULL,
`last_activity` varchar(50) COLLATE utf8_unicode_ci NOT NULL DEFAULT '',
`views` int(11) NOT NULL DEFAULT 0,
`is_locked` int(11) NOT NULL DEFAULT 0,
`is_sticky` int(11) NOT NULL DEFAULT 0,
PRIMARY KEY (`id`),
UNIQUE KEY `id` (`id`),
KEY `FK_forum_topics_forum_categories` (`category_id`)
) ENGINE=InnoDB AUTO_INCREMENT=7040 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci
CREATE TABLE users
(
id INTEGER,
username VARCHAR (30),
avatar VARCHAR (30)
);
Mysql version: 10.4.8-MariaDB
Solution
OK - I've wrestled with this and I've come up with a query which, while not perfect, should run better than yours!
I have no sample data, but here is the result of EXPLAIN EXTENDED for your query (see the fiddle here):
id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra
1 | SIMPLE | forum_topics | ALL | | | | | 1 | 100.00 | Using where; Using temporary; Using filesort
1 | SIMPLE | users | eq_ref | PRIMARY | PRIMARY | 4 | db_1663433212.forum_topics.author_id | 1 | 100.00 | Using where
1 | SIMPLE | forum_topics_seen | ref | topic_id,user_id | topic_id | 4 | db_1663433212.forum_topics.id | 1 | 100.00 | Using where
1 | SIMPLE | forum_posts | ALL | | | | | 1 | 100.00 | Using where; Using join buffer (flat, BNL join)
1 | SIMPLE | forum_categories | eq_ref | PRIMARY | PRIMARY | 4 | db_1663433212.forum_topics.category_id | 1 | 100.00 |
and the result of EXPLAIN EXTENDED for my query is:
id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra
1 | PRIMARY | ft | ALL | | | | | 1 | 100.00 | Using where; Using temporary; Using filesort
1 | PRIMARY | u | eq_ref | PRIMARY | PRIMARY | 4 | db_1663433212.ft.author_id | 1 | 100.00 | Using where
1 | PRIMARY | fts | ref | topic_id,user_id | topic_id | 4 | db_1663433212.ft.id | 1 | 100.00 | Using where
1 | PRIMARY | fc | eq_ref | PRIMARY | PRIMARY | 4 | db_1663433212.ft.category_id | 1 | 100.00 |
1 | PRIMARY | <derived2> | ALL | | | | | 2 | 100.00 | Using join buffer (flat, BNL join)
2 | DERIVED | forum_posts | index | | PRIMARY | 4 | | 1 | 100.00 | Using index
6 rows
If you look at the last line of mine, you will see that forum_posts is read through its PRIMARY KEY (type index, Extra "Using index"), whereas your query reads it with type ALL, i.e. a full table scan through a join buffer. I would imagine (hope) that this will result in a considerable improvement in performance.
Here is the query:
EXPLAIN EXTENDED WITH p_cnt AS
(
SELECT COUNT(*) AS pcnt FROM forum_posts
)
SELECT
ft.id,
ft.title,
ft.date_created,
ft.updated,
ft.last_activity,
ft.category_id,
SUBSTR(ft.content, 1, 70) as content,
ft.author_id,
u.username,
u.avatar,
p.pcnt,
CASE WHEN fts.user_id = 49 then 1 else 0 end as seen,
fc.category_order,
fc.name as cat_name
FROM forum_topics ft
INNER JOIN users u ON u.id = ft.author_id -- author_id needs to be a FOREIGN KEY but I can't get it to work in a fiddle!
LEFT JOIN forum_topics_seen fts ON fts.topic_id = ft.id
AND fts.user_id = 49
LEFT JOIN forum_categories fc ON fc.id = ft.category_id
CROSS JOIN p_cnt p
GROUP BY ft.id
ORDER BY ft.last_activity DESC
LIMIT 3
The most important thing is that I am using a CTE (Common Table Expression) to get the count of forum_posts, and then I basically use that value as a "constant" in the following query - this considerably simplifies matters. CTEs are very powerful and well worth getting to know!
I am using the explicit CROSS JOIN syntax to join my count to the rest - you could also write this line:
LEFT JOIN forum_categories fc ON fc.id = ft.category_id, p_cnt p
which is known as the comma syntax. Personally, I prefer to put it out there in bold and caps - it makes things clearer and is easier to read. Speaking of making things easier to read, note that my query uses table aliases which, IMHO, vastly improve readability - YMMV!
I would be grateful if you could run this SQL and report back here on any performance change (improvement! :-) ).
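One caveat on the query above: p_cnt returns a single total for the whole forum_posts table, so pcnt is the same number on every row rather than a count per topic. If the per-topic count from the question is what is wanted, a possible variant is sketched below. It is untested against the real data, it still aggregates the whole of forum_posts, and it assumes an index on forum_posts.topic_id, which the posted schema does not have.

```sql
-- Sketch only: per-topic post counts instead of one global total.
-- Assumption: an index on forum_posts(topic_id) exists (not in the posted schema).
WITH p_cnt AS
(
  SELECT topic_id, COUNT(*) AS pcnt
  FROM forum_posts
  GROUP BY topic_id
)
SELECT
  ft.id,
  ft.title,
  ft.last_activity,
  COALESCE(p.pcnt, 0) AS posts   -- topics without any posts get 0
FROM forum_topics ft
LEFT JOIN p_cnt p ON p.topic_id = ft.id
ORDER BY ft.last_activity DESC
LIMIT 3;
```

The remaining columns and joins from the full query can be added back unchanged; only the CROSS JOIN is replaced by a LEFT JOIN on topic_id.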
There were several things about your SQL that I think you should look at.
As I pointed out already, your original table definitions had lines like this:
PRIMARY KEY (`id`), UNIQUE KEY `id` (`id`)
The UNIQUE KEY `id` (`id`) is redundant, since PRIMARY KEYs are, by definition, unique.
You should ALWAYS put FOREIGN KEYs on fields which actually JOIN tables - an index is automatically created (if a suitable one does not already exist), which helps with performance. For example, I added:
FOREIGN KEY (topic_id) REFERENCES forum_topics (id)
I had difficulty (noted in the table definitions) adding some FOREIGN KEY definitions in the fiddle - not sure why this is. Maybe you'll have better luck?
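One possible cause for the author_id case, offered as a guess rather than a diagnosis: InnoDB requires the referencing and referenced integer columns to have the same size and sign, and the users table used in the fiddle declares id as a signed INT while author_id is unsigned. A minimal sketch under that assumption follows; the constraint name is made up for illustration.

```sql
-- Assumption: the FK failed because users.id was signed and author_id is unsigned.
CREATE TABLE `users` (
  `id` int(10) unsigned NOT NULL AUTO_INCREMENT,  -- unsigned, to match author_id
  `username` varchar(30),
  `avatar` varchar(30),
  PRIMARY KEY (`id`)
) ENGINE=InnoDB;

-- fk_forum_posts_author is a hypothetical constraint name.
ALTER TABLE `forum_posts`
  ADD CONSTRAINT `fk_forum_posts_author`
  FOREIGN KEY (`author_id`) REFERENCES `users` (`id`);
```

On a live system the ALTER TABLE will also be rejected if forum_posts already contains author_id values with no matching row in users, so orphaned rows would need cleaning up first.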
You should also index fields upon which you are filtering - I added these lines:
KEY (`topic_id`, `user_id`), -- added this
KEY (`user_id`), -- added this
There can be religious debates about which fields to index and which not - you can have loads of fun during the current lockdown experimenting with various indexes! :-)
I notice that you had one strange line in your table definition of forum_topics. You wrote:
KEY `FK_forum_topics_forum_categories` (`category_id`)
Now, were you trying to create a FOREIGN KEY for the field category_id in the table forum_topics pointing at the table forum_categories? If so, then the syntax goes like this:
-- FOREIGN KEY (category_id) REFERENCES forum_categories (id), -- couldn't get this to work
For some reason, as I mentioned above, I couldn't get a couple of FOREIGN KEY definitions to work and I don't have a running MariaDB system to test on - maybe you'll have better luck on a live system?
I hope this has been helpful - if you have other issues, don't hesitate to get back to me!
========================== Tables ==================================
users:
CREATE TABLE `users`
(
`id` INT(10) NOT NULL AUTO_INCREMENT PRIMARY KEY,
`username` VARCHAR (30),
`avatar` VARCHAR (30)
);
forum_posts:
CREATE TABLE `forum_posts` (
`id` int(11) unsigned NOT NULL AUTO_INCREMENT,
`content` text CHARACTER SET utf8mb4 NOT NULL,
`author_id` int(10) unsigned NOT NULL,
`editor_id` int(10) unsigned DEFAULT NULL,
`topic_id` int(11) unsigned NOT NULL,
`date_created` varchar(50) COLLATE utf8_unicode_ci NOT NULL DEFAULT '',
`updated` varchar(50) COLLATE utf8_unicode_ci DEFAULT NULL,
PRIMARY KEY (`id`)
-- FOREIGN KEY (author_id) REFERENCES users (id) -- can't get this to work!
) ENGINE=InnoDB AUTO_INCREMENT=102183 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
forum_categories:
CREATE TABLE `forum_categories` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`name` varchar(50) COLLATE utf8_unicode_ci NOT NULL,
`description` varchar(255) COLLATE utf8_unicode_ci NOT NULL,
`game` varchar(50) COLLATE utf8_unicode_ci DEFAULT NULL,
`icon` varchar(100) COLLATE utf8_unicode_ci DEFAULT NULL,
`category_order` int(11) DEFAULT NULL,
`order` int(11) DEFAULT NULL,
PRIMARY KEY (`id`)
-- UNIQUE KEY `id` (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=53 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
forum_topics:
CREATE TABLE `forum_topics` (
`id` int(11) unsigned NOT NULL AUTO_INCREMENT,
`title` varchar(50) CHARACTER SET utf8mb4 NOT NULL,
`content` longtext CHARACTER SET utf8mb4 NOT NULL DEFAULT '',
`author_id` int(11) unsigned DEFAULT NULL,
`editor_id` int(11) unsigned DEFAULT NULL,
`category_id` int(11) unsigned NOT NULL,
`date_created` varchar(50) COLLATE utf8_unicode_ci NOT NULL DEFAULT '',
`updated` varchar(50) COLLATE utf8_unicode_ci DEFAULT NULL,
`last_activity` varchar(50) COLLATE utf8_unicode_ci NOT NULL DEFAULT '',
`views` int(11) NOT NULL DEFAULT 0,
`is_locked` int(11) NOT NULL DEFAULT 0,
`is_sticky` int(11) NOT NULL DEFAULT 0,
PRIMARY KEY (`id`),
-- UNIQUE KEY `id` (`id`),
-- FOREIGN KEY (category_id) REFERENCES forum_categories (id), -- couldn't get this to work
KEY `FK_forum_topics_forum_categories` (`category_id`)
) ENGINE=InnoDB AUTO_INCREMENT=7040 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
forum_topics_seen:
CREATE TABLE `forum_topics_seen` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`topic_id` int(10) unsigned NOT NULL,
`user_id` int(10) unsigned NOT NULL,
PRIMARY KEY (`id`),
KEY (`topic_id`, `user_id`), -- added this
KEY (`user_id`) -- added this
) ENGINE=InnoDB AUTO_INCREMENT=40 DEFAULT CHARSET=utf8mb4;
Other suggestions
You need to add these indexes:
CREATE INDEX ix_forum_posts_topic_id ON forum_posts(topic_id);
CREATE INDEX ix_forum_topics_last_activity ON forum_topics(last_activity);
CREATE INDEX ix_forum_topics_seen_topic_id_user_id ON forum_topics_seen(topic_id, user_id);
After that you could use this simple query:
WITH CTE_topics AS (SELECT id FROM forum_topics ORDER BY last_activity DESC LIMIT 3)
SELECT forum_topics.id,
forum_topics.title,
forum_topics.date_created,
forum_topics.updated,
forum_topics.last_activity,
forum_topics.category_id,
SUBSTR(forum_topics.content, 1, 70) as content,
forum_topics.author_id,
users.username,
users.avatar,
(SELECT COUNT(forum_posts.id) FROM forum_posts WHERE forum_posts.topic_id = forum_topics.id) as posts,
CASE WHEN forum_topics_seen.user_id = 49 then 1 else 0 end as seen,
forum_categories.category_order,
forum_categories.name as cat_name
FROM forum_topics
INNER JOIN CTE_topics ON forum_topics.id = CTE_topics.id
INNER JOIN users ON users.id = forum_topics.author_id
LEFT JOIN forum_topics_seen ON forum_topics_seen.topic_id = forum_topics.id AND forum_topics_seen.user_id = 49
LEFT JOIN forum_categories ON forum_categories.id = forum_topics.category_id
ORDER BY forum_topics.last_activity DESC;
Turn the query inside-out. First, let's check the viability of such. Will this give you the correct 3 ids?
SELECT ft.id
FROM forum_topics AS ft
ORDER BY ft.last_activity DESC
LIMIT 3
If so, then build the rest of the query starting with that:
SELECT ...
FROM ( the-above-query ) AS x -- a derived table must have an alias
JOIN forum_topics AS ft2 ON ft2.id = x.id -- to get other ft2 columns
JOIN/LEFT-JOIN the other tables
ORDER BY ft2.last_activity DESC
The GROUP BY is probably not needed; dropping it is a speedup.
The other tables will need to look at only 3 rows each, instead of lots. Hence, it may be much faster.
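Filled in for the tables from the question, the inside-out form might look like the sketch below. It is untested; the aliases latest and fp are illustrative, the post count is done with a correlated subquery (one option among several), and it assumes at most one forum_topics_seen row per topic and user, since there is no GROUP BY to collapse duplicates.

```sql
-- Sketch of the inside-out query; aliases `latest` and `fp` are illustrative.
SELECT ft2.id,
       ft2.title,
       ft2.date_created,
       ft2.updated,
       ft2.last_activity,
       ft2.category_id,
       SUBSTR(ft2.content, 1, 70) AS content,
       ft2.author_id,
       u.username,
       u.avatar,
       ( SELECT COUNT(*)
           FROM forum_posts AS fp
          WHERE fp.topic_id = ft2.id ) AS posts,  -- evaluated for the 3 chosen topics only
       CASE WHEN fts.user_id = 49 THEN 1 ELSE 0 END AS seen,
       fc.category_order,
       fc.name AS cat_name
FROM ( SELECT ft.id
         FROM forum_topics AS ft
        ORDER BY ft.last_activity DESC
        LIMIT 3 ) AS latest
JOIN forum_topics AS ft2 ON ft2.id = latest.id
JOIN users AS u ON u.id = ft2.author_id
LEFT JOIN forum_topics_seen AS fts
       ON fts.topic_id = ft2.id AND fts.user_id = 49
LEFT JOIN forum_categories AS fc ON fc.id = ft2.category_id
ORDER BY ft2.last_activity DESC;
```

Because users is inner-joined outside the derived table here, a topic whose author_id has no matching user would drop out and fewer than 3 rows would come back.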
I don't understand the purpose of users. If it is part of the filtering, then the inside ("derived table") query will need to include it:
SELECT ft.id
FROM forum_topics AS ft
INNER JOIN users AS u ON u.id = ft.author_id
ORDER BY ft.last_activity DESC
LIMIT 3
and you would need to re-join to it to get username and avatar.
Other issues...
A PRIMARY KEY is a UNIQUE index; don't redundantly say UNIQUE(id).
Apart from category_id on forum_topics, you have no indexes other than id; that is terrible. These are probably useful:
forum_topics: (last_activity, id) -- "covering" for the first use
forum_posts: (topic_id, id) -- "covering"
forum_topics_seen: (user_id, topic_id) -- in either order
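As statements, the three suggestions could be applied roughly as follows. The index names are hypothetical, and on a 600,000-row table the ALTERs may take a little while.

```sql
-- Index names are made up for illustration; use your own naming scheme.
ALTER TABLE forum_topics      ADD INDEX ix_last_activity_id (last_activity, id);
ALTER TABLE forum_posts       ADD INDEX ix_topic_id_id (topic_id, id);
ALTER TABLE forum_topics_seen ADD INDEX ix_user_id_topic_id (user_id, topic_id);

-- Afterwards, check whether the optimizer picks up the new index
-- for the "latest 3 topics" step.
EXPLAIN
SELECT ft.id
FROM forum_topics AS ft
ORDER BY ft.last_activity DESC
LIMIT 3;
```

If the index is being used, the EXPLAIN row for ft should name it in the key column instead of showing type ALL with a filesort; if it does not, the plan is worth posting for a second look.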