Question

I have a simple TagMap table:

CREATE TABLE TagMap
(
TagID mediumint(7) unsigned,
ArticleID int(11) unsigned,
FOREIGN KEY(TagID) REFERENCES Tags(TagID) ON DELETE CASCADE,
FOREIGN KEY(ArticleID) REFERENCES Articles(ArticleID) ON DELETE CASCADE,
PRIMARY KEY(TagID,ArticleID)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE utf8mb4_unicode_ci KEY_BLOCK_SIZE=1

I get the co-tags (tags that appear together with a specific tag) with:

SELECT TagID AS TagID2, COUNT(*) FROM TagMap WHERE ArticleID IN (
SELECT ArticleID FROM TagMap WHERE TagID = 1 -- this is TagID1
)
GROUP BY TagID

How can I run this query for all tags at once to get

TagID1,TagID2,COUNT(*)

The table is huge (10-50M rows) and each article has tens of tags. Thus, performance is critical.

Solution

SELECT t1.TagID AS TagID1
    ,t2.TagID AS TagID2
    ,COUNT(1)
FROM TagMap AS t1
JOIN TagMap AS t2 ON t1.ArticleID = t2.ArticleID AND t1.TagID <> t2.TagID
GROUP BY t1.TagID, t2.TagID
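
If the direction of a pair does not matter, the join above returns every pair twice, once as (A, B) and once as (B, A). A possible variant, sketched here under the assumption that unordered pairs are enough, replaces `<>` with `<` so each pair is counted once, which roughly halves the rows to join and group. The alias `PairCount` is a name chosen for illustration only.

```sql
-- Count each unordered tag pair once (TagID1 < TagID2).
-- PairCount is an illustrative alias, not part of the original answer.
SELECT t1.TagID AS TagID1,
       t2.TagID AS TagID2,
       COUNT(*) AS PairCount
FROM TagMap AS t1
JOIN TagMap AS t2
  ON t1.ArticleID = t2.ArticleID
 AND t1.TagID < t2.TagID
GROUP BY t1.TagID, t2.TagID;
```

The count for (B, A) is the same as for (A, B), so the symmetric result can be recovered by swapping the two columns.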

Update: an index on ArticleID may improve query performance, because the join is on ArticleID while the primary key leads with TagID.
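
One way to act on this, as a sketch: check which indexes already exist, add one on ArticleID only if none is there, and confirm with EXPLAIN that the join uses it. The index name `idx_article_tag` is hypothetical. InnoDB normally creates an index for a foreign key column automatically, so the ALTER TABLE step may turn out to be unnecessary.

```sql
-- List the indexes TagMap already has. The foreign key on ArticleID
-- normally gets an index automatically in InnoDB.
SHOW INDEX FROM TagMap;

-- Only if no index starts with ArticleID: add one.
-- idx_article_tag is a hypothetical name.
ALTER TABLE TagMap ADD INDEX idx_article_tag (ArticleID, TagID);

-- Inspect the plan: the lookup into t2 should use an index on ArticleID.
EXPLAIN
SELECT t1.TagID AS TagID1, t2.TagID AS TagID2, COUNT(*)
FROM TagMap AS t1
JOIN TagMap AS t2 ON t1.ArticleID = t2.ArticleID AND t1.TagID <> t2.TagID
GROUP BY t1.TagID, t2.TagID;
```

On a table of this size, adding an index takes time and disk space, so it is worth running the SHOW INDEX and EXPLAIN steps first.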

Licensed under: CC-BY-SA with attribution
Not affiliated with dba.stackexchange