Help with Query. Finding records that have the same relationships (MySQL)

https://dba.stackexchange.com/questions/14391

16-10-2019

Question

I have a table of Listings that has a many to many relationship with a Taxons table. The table structure looks like this:

listings
----------------
id (int)
name (varchar)

listings_taxons
----------------
listing_id (int)
taxon_id (int)

taxons
----------------
id (int)
name (varchar)

My goal is to select all rows in the listings table that have a matching list of taxon ids. Each returned listing must be related to both taxons, so that the result set is the intersection of the listings for the two taxons.

Example: I have a listing called "Muffler" and it has the following taxons: "Ford", "Mustang", "Exhaust". If I query for all listings with "Ford" and "Exhaust" I should get all listings that have "Ford" and "Exhaust" as taxons.

How would I construct this query efficiently?

Solution

SELECT B.name
FROM
(
    SELECT BB.listing_id id,COUNT(1) taxon_count
    FROM
    (
        SELECT id taxon_id FROM taxons
        WHERE name IN ('Ford','Exhaust')
    ) AA
    INNER JOIN listings_taxons BB
    USING (taxon_id)
    GROUP BY listing_id HAVING COUNT(1) = 2
) A
INNER JOIN listings B USING (id);

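The innermost subquery AA finds the taxon ids for Ford and Exhaust. Joining AA to listings_taxons returns one row for every listing tagged with Ford, Exhaust, or both. Grouping by listing_id and keeping only the groups with COUNT(1) = 2 leaves the listings that have both taxons, because such a listing_id appears twice in the join, once per taxon (this relies on each listing_id/taxon_id pair being stored only once). Subquery A is then joined to listings with an INNER JOIN to fetch the names.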
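To make the counting step concrete, here is a minimal sketch that shows the per-listing match counts before the HAVING filter is applied. It is not part of the original answer: it runs the same kind of query through Python's sqlite3 module as a stand-in for MySQL, and the rows are copied from the sample data given further down in this answer.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE listings (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE taxons   (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE listings_taxons (
    listing_id INTEGER,
    taxon_id   INTEGER,
    PRIMARY KEY (listing_id, taxon_id)
);
INSERT INTO listings (name) VALUES
    ('SteeringWheel'), ('WindShield'), ('Muffler'), ('AC');
INSERT INTO taxons (name) VALUES
    ('Ford'), ('Escort'), ('Buick'), ('Exhaust'), ('Mustard');
INSERT INTO listings_taxons VALUES
    (1,1),(1,3),(1,5),(2,1),(2,2),(2,3),(2,5),
    (3,1),(3,4),(4,2),(4,3),(4,4),(5,1),(5,5);
""")

# How many of the two wanted taxons does each listing_id match?
counts = conn.execute("""
    SELECT lt.listing_id, COUNT(1) AS taxon_count
    FROM listings_taxons lt
    JOIN taxons t ON t.id = lt.taxon_id
    WHERE t.name IN ('Ford', 'Exhaust')
    GROUP BY lt.listing_id
    ORDER BY lt.listing_id
""").fetchall()

# Keeping only the groups with a count of 2 is what HAVING COUNT(1) = 2 does.
both = [listing_id for listing_id, taxon_count in counts if taxon_count == 2]

print(counts)  # [(1, 1), (2, 1), (3, 2), (4, 1), (5, 1)]
print(both)    # [3]
```

Only listing_id 3 (Muffler) reaches a count of 2, which matches the result of the full query.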

Make sure you have the following indexes:

ALTER TABLE listings_taxons ADD INDEX taxon_listing_ndx (taxon_id,listing_id);
ALTER TABLE taxons ADD INDEX name_id_ndx (name,id);
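As a rough illustration of why these two indexes help, the sketch below creates equivalent indexes in SQLite and asks its planner how it would run the two lookups the query depends on. This is an assumption-heavy stand-in: SQLite's planner is not MySQL's, the helper name plan is made up for this example, and on MySQL you would run EXPLAIN on the real query instead.

```python
import sqlite3

# SQLite stand-in for MySQL (assumption); index names follow the answer.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE taxons (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE listings_taxons (listing_id INTEGER, taxon_id INTEGER);
CREATE INDEX name_id_ndx ON taxons (name, id);
CREATE INDEX taxon_listing_ndx ON listings_taxons (taxon_id, listing_id);
""")

def plan(sql):
    """Hypothetical helper: return the 'detail' column of SQLite's query plan."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]

# Lookup of taxon ids by name, as done by subquery AA.
taxon_plan = plan("SELECT id FROM taxons WHERE name IN ('Ford', 'Exhaust')")

# Lookup of listing ids by taxon id, as done by the join to listings_taxons.
link_plan = plan("SELECT listing_id FROM listings_taxons WHERE taxon_id = 1")

print(taxon_plan)
print(link_plan)
```

Both lookups can be answered from the index alone, because each index contains the column being filtered on followed by the column being returned.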

Here is some sample data:

drop database if exists nwwatson;
create database nwwatson;
use nwwatson
create table listings
(id int not null auto_increment,
name varchar(25),
primary key (id),
key (name));
create table taxons like listings;
create table listings_taxons
(
    listing_id int,
    taxon_id int,
    primary key (listing_id,taxon_id),
    unique key (taxon_id,listing_id)
);
insert into listings (name) values ('SteeringWheel'),('WindShield'),('Muffler'),('AC');
insert into taxons (name) values ('Ford'),('Escort'),('Buick'),('Exhaust'),('Mustard');
insert into listings_taxons values
(1,1),(1,3),(1,5),(2,1),(2,2),(2,3),(2,5),
(3,1),(3,4),(4,2),(4,3),(4,4),(5,1),(5,5);
SELECT * FROM listings;
SELECT * FROM taxons;
SELECT * FROM listings_taxons;
SELECT B.name
FROM
(
    SELECT BB.listing_id id,COUNT(1) taxon_count
    FROM
    (
        SELECT id taxon_id FROM taxons
        WHERE name IN ('Ford','Exhaust')
    ) AA
    INNER JOIN listings_taxons BB
    USING (taxon_id)
    GROUP BY listing_id HAVING COUNT(1) = 2
) A
INNER JOIN listings B USING (id);

Here it is, executed:

mysql> drop database if exists nwwatson;
Query OK, 3 rows affected (0.09 sec)

mysql> create database nwwatson;
Query OK, 1 row affected (0.00 sec)

mysql> use nwwatson
Database changed
mysql> create table listings
    -> (
    -> id int not null auto_increment,
    -> name varchar(25),
    -> primary key (id),
    -> key (name)
    -> );
Query OK, 0 rows affected (0.08 sec)

mysql> create table taxons like listings;
Query OK, 0 rows affected (0.05 sec)

mysql> create table listings_taxons
    -> (
    ->     listing_id int,
    ->     taxon_id int,
    ->     primary key (listing_id,taxon_id),
    ->     unique key (taxon_id,listing_id)
    -> );
Query OK, 0 rows affected (0.08 sec)

mysql> insert into listings (name) values ('SteeringWheel'),('WindShield'),('Muffler'),('AC');
Query OK, 4 rows affected (0.06 sec)
Records: 4  Duplicates: 0  Warnings: 0

mysql> insert into taxons (name) values ('Ford'),('Escort'),('Buick'),('Exhaust'),('Mustard');
Query OK, 5 rows affected (0.06 sec)
Records: 5  Duplicates: 0  Warnings: 0

mysql> insert into listings_taxons values
    -> (1,1),(1,3),(1,5),(2,1),(2,2),(2,3),(2,5),
    -> (3,1),(3,4),(4,2),(4,3),(4,4),(5,1),(5,5);
Query OK, 14 rows affected (0.11 sec)
Records: 14  Duplicates: 0  Warnings: 0

mysql> SELECT * FROM listings;
+----+---------------+
| id | name          |
+----+---------------+
|  4 | AC            |
|  3 | Muffler       |
|  1 | SteeringWheel |
|  2 | WindShield    |
+----+---------------+
4 rows in set (0.00 sec)

mysql> SELECT * FROM taxons;
+----+---------+
| id | name    |
+----+---------+
|  3 | Buick   |
|  2 | Escort  |
|  4 | Exhaust |
|  1 | Ford    |
|  5 | Mustard |
+----+---------+
5 rows in set (0.00 sec)

mysql> SELECT * FROM listings_taxons;
+------------+----------+
| listing_id | taxon_id |
+------------+----------+
|          1 |        1 |
|          1 |        3 |
|          1 |        5 |
|          2 |        1 |
|          2 |        2 |
|          2 |        3 |
|          2 |        5 |
|          3 |        1 |
|          3 |        4 |
|          4 |        2 |
|          4 |        3 |
|          4 |        4 |
|          5 |        1 |
|          5 |        5 |
+------------+----------+
14 rows in set (0.00 sec)

mysql> SELECT B.name
    -> FROM
    -> (
    ->     SELECT BB.listing_id id,COUNT(1) taxon_count
    ->     FROM
    ->     (
    ->         SELECT id taxon_id FROM taxons
    ->         WHERE name IN ('Ford','Exhaust')
    ->     ) AA
    ->     INNER JOIN listings_taxons BB
    ->     USING (taxon_id)
    ->     GROUP BY listing_id HAVING COUNT(1) = 2
    -> ) A
    -> INNER JOIN listings B USING (id);
+---------+
| name    |
+---------+
| Muffler |
+---------+
1 row in set (0.00 sec)

mysql>

Give it a try!

Other tips

If I understand correctly, you want to perform relational division. Try this question, which shows lots of different ways to accomplish that: How to filter SQL results in a has-many-through relation.

I would go for the (multiple) JOIN solution but you can always test with your data and queries:

SELECT 
    li.*

FROM
    listings AS li

  JOIN
    listings_taxons AS lt1
      ON  lt1.listing_id = li.id
  JOIN
    taxons AS t1 
      ON  t1.id = lt1.taxon_id
      AND t1.name = 'Ford'

  JOIN
    listings_taxons AS lt2
      ON  lt2.listing_id = li.id
  JOIN
    taxons AS t2 
      ON  t2.id = lt2.taxon_id
      AND t2.name = 'Exhaust';
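The multiple-JOIN form needs one pair of joins per required taxon. As a hedged sketch of how that scales, the snippet below generates the join chain for any number of taxon names and runs it against SQLite standing in for MySQL. The helper name build_join_query is hypothetical, and the rows are the sample data from the first answer.

```python
import sqlite3

# SQLite stand-in for MySQL (assumption), loaded with the first answer's rows.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE listings (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE taxons   (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE listings_taxons (
    listing_id INTEGER,
    taxon_id   INTEGER,
    PRIMARY KEY (listing_id, taxon_id)
);
INSERT INTO listings (name) VALUES
    ('SteeringWheel'), ('WindShield'), ('Muffler'), ('AC');
INSERT INTO taxons (name) VALUES
    ('Ford'), ('Escort'), ('Buick'), ('Exhaust'), ('Mustard');
INSERT INTO listings_taxons VALUES
    (1,1),(1,3),(1,5),(2,1),(2,2),(2,3),(2,5),
    (3,1),(3,4),(4,2),(4,3),(4,4),(5,1),(5,5);
""")

def build_join_query(taxon_count):
    """Hypothetical helper: one (listings_taxons, taxons) join pair per taxon."""
    joins = []
    for i in range(1, taxon_count + 1):
        joins.append(
            f"JOIN listings_taxons AS lt{i} ON lt{i}.listing_id = li.id\n"
            f"JOIN taxons AS t{i} ON t{i}.id = lt{i}.taxon_id AND t{i}.name = ?"
        )
    return "SELECT li.id, li.name FROM listings AS li\n" + "\n".join(joins)

wanted = ["Ford", "Exhaust"]
# Names are bound as parameters, in the same order as the generated joins.
rows = conn.execute(build_join_query(len(wanted)), wanted).fetchall()
print(rows)  # [(3, 'Muffler')]
```

Every inner join narrows the result to listings that also carry the next taxon, so no GROUP BY or count is needed.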

There are many ways to solve this classical case of a relational division.
For a list of taxons (more than just a few), this form is one of the syntactically shortest:

SELECT l.*
FROM  (
   SELECT lt.listing_id
   FROM   taxons t
   JOIN   listings_taxons lt ON lt.taxon_id = t.id
   WHERE  t.name IN ('Ford', 'Mustang', 'Exhaust')
   GROUP  BY lt.listing_id
   HAVING COUNT(*) = 3
   ) x
JOIN   listings l ON l.id = x.listing_id;

This assumes a UNIQUE constraint on (listing_id, taxon_id) in table listings_taxons.
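To see why the UNIQUE assumption matters, here is a small sketch (SQLite standing in for MySQL, with made-up table names loose_links and strict_links and made-up listing ids 10 and 11). Without the constraint, a duplicated row lets a listing that only has Ford reach the required count; COUNT(DISTINCT ...) is one way to guard against that.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE taxons (id INTEGER PRIMARY KEY, name TEXT);
-- Hypothetical link table WITHOUT a uniqueness constraint.
CREATE TABLE loose_links (listing_id INTEGER, taxon_id INTEGER);
-- Hypothetical link table WITH the constraint the answer assumes.
CREATE TABLE strict_links (
    listing_id INTEGER,
    taxon_id   INTEGER,
    UNIQUE (listing_id, taxon_id)
);
INSERT INTO taxons (id, name) VALUES (1, 'Ford'), (4, 'Exhaust');
-- Listing 10 has Ford stored twice and no Exhaust; listing 11 has both.
INSERT INTO loose_links VALUES (10, 1), (10, 1), (11, 1), (11, 4);
""")

def matching(count_expr):
    """Return listing ids whose group satisfies `count_expr = 2`."""
    sql = f"""
        SELECT lt.listing_id
        FROM taxons t
        JOIN loose_links lt ON lt.taxon_id = t.id
        WHERE t.name IN ('Ford', 'Exhaust')
        GROUP BY lt.listing_id
        HAVING {count_expr} = 2
        ORDER BY lt.listing_id
    """
    return [row[0] for row in conn.execute(sql)]

naive = matching("COUNT(*)")                       # duplicate row is counted twice
robust = matching("COUNT(DISTINCT lt.taxon_id)")   # each taxon is counted once

# With the UNIQUE constraint in place the duplicate cannot be stored at all.
conn.execute("INSERT INTO strict_links VALUES (10, 1)")
try:
    conn.execute("INSERT INTO strict_links VALUES (10, 1)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

print(naive)               # [10, 11]
print(robust)              # [11]
print(duplicate_rejected)  # True
```

With the constraint in place, plain COUNT(*) is safe and avoids the extra work of de-duplicating inside each group.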

Compare it with the other methods under the related question @ypercube already linked to, to find out whether it is also among the fastest.
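The GROUP BY / HAVING COUNT(*) form in this answer extends to any number of taxons as long as the count in HAVING matches the length of the IN list. A minimal sketch of that idea, using SQLite as a stand-in for MySQL, a hypothetical helper named listings_with_all, and the sample rows from the first answer:

```python
import sqlite3

# SQLite stand-in for MySQL (assumption), loaded with the first answer's rows.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE listings (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE taxons   (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE listings_taxons (
    listing_id INTEGER,
    taxon_id   INTEGER,
    PRIMARY KEY (listing_id, taxon_id)
);
INSERT INTO listings (name) VALUES
    ('SteeringWheel'), ('WindShield'), ('Muffler'), ('AC');
INSERT INTO taxons (name) VALUES
    ('Ford'), ('Escort'), ('Buick'), ('Exhaust'), ('Mustard');
INSERT INTO listings_taxons VALUES
    (1,1),(1,3),(1,5),(2,1),(2,2),(2,3),(2,5),
    (3,1),(3,4),(4,2),(4,3),(4,4),(5,1),(5,5);
""")

def listings_with_all(conn, names):
    """Hypothetical helper: listings related to every taxon name in `names`."""
    names = list(dict.fromkeys(names))  # drop repeated names, keep order
    placeholders = ", ".join("?" for _ in names)
    sql = f"""
        SELECT l.id, l.name
        FROM (
            SELECT lt.listing_id
            FROM taxons t
            JOIN listings_taxons lt ON lt.taxon_id = t.id
            WHERE t.name IN ({placeholders})
            GROUP BY lt.listing_id
            HAVING COUNT(*) = ?
        ) x
        JOIN listings l ON l.id = x.listing_id
        ORDER BY l.id
    """
    # The last parameter is the number of distinct names that must all match.
    return conn.execute(sql, [*names, len(names)]).fetchall()

two = listings_with_all(conn, ["Ford", "Exhaust"])
three = listings_with_all(conn, ["Ford", "Buick", "Mustard"])
print(two)    # [(3, 'Muffler')]
print(three)  # [(1, 'SteeringWheel'), (2, 'WindShield')]
```

Besides the UNIQUE constraint on the link table, this sketch also assumes that taxon names are themselves unique; two taxons sharing a name could push a listing's count past the target.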

SELECT listings.*
FROM listings
INNER JOIN listings_taxons ON listings.id = listings_taxons.listing_id
INNER JOIN taxons ON listings_taxons.taxon_id = taxons.id
-- OR matches listings tagged with either pattern, not necessarily both
WHERE taxons.id IN
  (SELECT id
   FROM taxons
   WHERE name LIKE '%whatever%' OR name LIKE '%another%');

Is this what you mean?

Licensed under: CC-BY-SA with attribution

Not affiliated with dba.stackexchange