Question

I have a simple TagMap table:

CREATE TABLE TagMap
(
TagID mediumint(7) unsigned,
ArticleID int(11) unsigned,
FOREIGN KEY(TagID) REFERENCES Tags(TagID) ON DELETE CASCADE,
FOREIGN KEY(ArticleID) REFERENCES Articles(ArticleID) ON DELETE CASCADE,
PRIMARY KEY(TagID,ArticleID)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE utf8mb4_unicode_ci KEY_BLOCK_SIZE=1

I get the co-tags (tags that appear on the same articles as a specific tag) with:

SELECT TagID AS TagID2,COUNT(*) FROM TagMap WHERE ArticleID IN(
SELECT ArticleID FROM TagMap WHERE TagID=1 -- This is TagID1
)
GROUP BY TagID

How can I run this query for all tags at once, to get rows of the form:

TagID1,TagID2,COUNT(*)

The table is huge (10-50M rows) and each article has tens of tags. Thus, performance is critical.

Solution

-- Self-join TagMap on ArticleID: each joined row is a pair of different tags on one article.
SELECT t1.TagID AS TagID1
    ,t2.TagID AS TagID2
    ,COUNT(1)
FROM TagMap AS t1
JOIN TagMap AS t2 ON t1.ArticleID = t2.ArticleID AND t1.TagID <> t2.TagID
GROUP BY t1.TagID, t2.TagID;
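
Note that the join condition t1.TagID <> t2.TagID returns every pair twice, once as (A, B) and once as (B, A), with the same count. If the mirrored rows are not needed, a variant along these lines (a sketch, not part of the original answer) would produce each unordered pair once and roughly halve the rows to group. The ArticleCount alias is an assumption made for readability:

```sql
-- Variant: count each unordered tag pair once.
-- t1.TagID < t2.TagID keeps (1, 2) and drops the mirrored (2, 1).
SELECT t1.TagID AS TagID1,
       t2.TagID AS TagID2,
       COUNT(*) AS ArticleCount   -- hypothetical alias
FROM TagMap AS t1
JOIN TagMap AS t2
  ON t1.ArticleID = t2.ArticleID
 AND t1.TagID < t2.TagID
GROUP BY t1.TagID, t2.TagID;
```

Because (TagID, ArticleID) is the primary key, each article contributes at most one row per pair, so the count is the number of articles that carry both tags.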

Update: an index on ArticleID may improve query performance, because the join matches rows on ArticleID while the primary key leads with TagID.
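
InnoDB normally creates an index on a foreign key column automatically when none exists, so TagMap may already have one on ArticleID. A sketch of how this could be checked before adding anything, assuming MySQL; the index name idx_article_tag is hypothetical:

```sql
-- List the indexes that already exist on TagMap.
SHOW INDEX FROM TagMap;

-- Ask the optimizer how it plans to run the self-join.
EXPLAIN
SELECT t1.TagID AS TagID1, t2.TagID AS TagID2, COUNT(*)
FROM TagMap AS t1
JOIN TagMap AS t2 ON t1.ArticleID = t2.ArticleID AND t1.TagID <> t2.TagID
GROUP BY t1.TagID, t2.TagID;

-- Only if no existing index starts with ArticleID: add one.
-- idx_article_tag is a hypothetical name.
ALTER TABLE TagMap ADD INDEX idx_article_tag (ArticleID, TagID);
```

On a table with tens of millions of rows the ALTER TABLE itself can take a long time, so it is probably worth running the first two statements before deciding.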

Licensed under: CC-BY-SA with attribution
Not affiliated with dba.stackexchange