Question

I have a few tables with some (example) values:

people
----------------------------------
| id    | name                   |
----------------------------------
| 1     | Steve Jobs             |
| 2     | Bill Gates             |
| 3     | Linus Torvalds         |
| 4     | Nikola Tesla           |
| 5     | Henry Ford             |
----------------------------------


groups
----------------------------------
| id    | name                   |
----------------------------------
| 1     | Country                |
| 2     | Work                   |
----------------------------------


categories
------------------------------------------
| id    | gid   | name                   |
------------------------------------------
| 1     | 1     | USA                    |
| 2     | 1     | Finland                |
| 3     | 1     | Croatia                |
| 4     | 2     | Mac                    |
| 5     | 2     | iPhone                 |
| 6     | 2     | iPad                   |
| 7     | 2     | Windows                |
| 8     | 1     | Office                 |
| 9     | 1     | Linux                  |
| 10    | 1     | Coil                   |
| 11    | 1     | El. stuff              |
| 12    | 1     | Cars                   |
------------------------------------------


people_categories_map
--------------------------
| id    | pid   | cid    |
--------------------------
| 1     | 1     | 1      |
| 2     | 1     | 4      |
| 3     | 1     | 5      |
| 4     | 1     | 6      |
| 5     | 1     | 11     |
| 6     | 2     | 1      |
| 7     | 2     | 7      |
| 8     | 2     | 8      |
| 9     | 2     | 11     |
| 10    | 3     | 2      |
| 11    | 3     | 9      |
| 12    | 3     | 11     |
| 13    | 4     | 1      |
| 14    | 4     | 3      |
| 15    | 4     | 10     |
| 16    | 4     | 11     |
| 17    | 5     | 1      |
| 18    | 5     | 12     |
--------------------------

The challenge

What I'm trying to achieve is to list people based on matches from each group. For example, I want to list people whose country is USA (categories.id:1) OR Finland (categories.id:2) AND whose work is Linux (categories.id:9) OR Cars (categories.id:12). This query would match only Linus Torvalds (people.id:3) based on relations from the people_categories_map table.

However, so far I have only managed to make queries that either list all category matches or any category matches, i.e.:

SELECT people.id, people.name
FROM people
JOIN people_categories_map ON people_categories_map.pid = people.id
WHERE people_categories_map.cid IN (1,2);

SELECT people.id, people.name
FROM people
JOIN people_categories_map ON people_categories_map.pid = people.id
WHERE people_categories_map.cid IN (1,9)
GROUP BY people.id, people.name
HAVING COUNT(DISTINCT people_categories_map.cid) = 2;

Is it possible to combine these somehow, to make a query that returns a list of people that would be connected to "(category_x OR category_y) AND (category_z OR category_w OR category_m) AND (category_b OR category_c)" and so on... in other words, to return matches from all groups?


Solution

As Alexandre said in his comment, your design is problematic.

Your example is faulty too:

  1. Categories 8 to 12 should have gid 2, not 1.
  2. The query should return "Henry Ford" too, because he lives in the USA and works on Cars.

The following query is not pretty and uses a lot of joins, but it does the job (I'm sure it could be optimized, but it would be better to fix the database design):

SELECT *
FROM people_categories_map pmc1
INNER JOIN people_categories_map pmc2
ON  pmc1.pid = pmc2.pid AND pmc1.cid <> pmc2.cid
INNER JOIN people p
ON pmc1.pid = p.id
INNER JOIN categories c1
ON pmc1.cid = c1.id
INNER JOIN categories c2
ON pmc2.cid = c2.id
INNER JOIN groups g1
ON c1.gid = g1.id
INNER JOIN groups g2
ON c2.gid = g2.id
WHERE
    c1.name IN ('USA', 'Finland')
AND
    c2.name IN ('Linux', 'Cars');

Explanation

First we self-join the mapping table, so that each result row gives access to both a country and a work category of the same person. It is not a plain equi-join, because the join condition also contains an inequality (pmc1.cid <> pmc2.cid).

Then we join people once, but categories and groups twice: once for the country side and once for the work side, and we filter on each side. You could swap the two sides without losing or changing data; which one is called c1 and which c2 is only a matter of definition.
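To check the joins described above against the sample data, here is a minimal sketch using Python's sqlite3 module. The in-memory SQLite database is an assumption made only for this illustration (your server is probably something else), the gid values follow correction 1 above, and the two joins to groups are left out because they take no part in the filter.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for the real server.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE categories (id INTEGER PRIMARY KEY, gid INTEGER, name TEXT);
CREATE TABLE people_categories_map (
    id INTEGER PRIMARY KEY, pid INTEGER, cid INTEGER);
""")
conn.executemany("INSERT INTO people VALUES (?, ?)", [
    (1, "Steve Jobs"), (2, "Bill Gates"), (3, "Linus Torvalds"),
    (4, "Nikola Tesla"), (5, "Henry Ford"),
])
# gid values with categories 8 to 12 moved to group 2 (Work).
conn.executemany("INSERT INTO categories VALUES (?, ?, ?)", [
    (1, 1, "USA"), (2, 1, "Finland"), (3, 1, "Croatia"),
    (4, 2, "Mac"), (5, 2, "iPhone"), (6, 2, "iPad"), (7, 2, "Windows"),
    (8, 2, "Office"), (9, 2, "Linux"), (10, 2, "Coil"),
    (11, 2, "El. stuff"), (12, 2, "Cars"),
])
# (pid, cid) pairs in the same order as the question's mapping table.
conn.executemany(
    "INSERT INTO people_categories_map (pid, cid) VALUES (?, ?)", [
        (1, 1), (1, 4), (1, 5), (1, 6), (1, 11),
        (2, 1), (2, 7), (2, 8), (2, 11),
        (3, 2), (3, 9), (3, 11),
        (4, 1), (4, 3), (4, 10), (4, 11),
        (5, 1), (5, 12),
    ])

# Self-join of the mapping table: pmc1 carries the country, pmc2 the work.
rows = conn.execute("""
    SELECT p.id, p.name, c1.name, c2.name
    FROM people_categories_map pmc1
    INNER JOIN people_categories_map pmc2
        ON pmc1.pid = pmc2.pid AND pmc1.cid <> pmc2.cid
    INNER JOIN people p ON pmc1.pid = p.id
    INNER JOIN categories c1 ON pmc1.cid = c1.id
    INNER JOIN categories c2 ON pmc2.cid = c2.id
    WHERE c1.name IN ('USA', 'Finland')
      AND c2.name IN ('Linux', 'Cars')
    ORDER BY p.id
""").fetchall()

for row in rows:
    print(row)
```

With this data the sketch prints one row for Linus Torvalds (Finland, Linux) and one for Henry Ford (USA, Cars), which matches point 2 above.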

Remark: This is not efficient, but I don't believe it is worth optimizing.
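If you need more than two groups, adding one more self-join per group gets unwieldy. One possible alternative, which is not the query above but a different technique (GROUP BY with one HAVING condition per group), could look like the sketch below. The helper name people_matching_all_groups is hypothetical, SQLite is assumed only to keep the example runnable, and category ids are used instead of names.

```python
import sqlite3

# Assumption: in-memory SQLite, loaded with the question's sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE people_categories_map (
    id INTEGER PRIMARY KEY, pid INTEGER, cid INTEGER);
""")
conn.executemany("INSERT INTO people VALUES (?, ?)", [
    (1, "Steve Jobs"), (2, "Bill Gates"), (3, "Linus Torvalds"),
    (4, "Nikola Tesla"), (5, "Henry Ford"),
])
conn.executemany(
    "INSERT INTO people_categories_map (pid, cid) VALUES (?, ?)", [
        (1, 1), (1, 4), (1, 5), (1, 6), (1, 11),
        (2, 1), (2, 7), (2, 8), (2, 11),
        (3, 2), (3, 9), (3, 11),
        (4, 1), (4, 3), (4, 10), (4, 11),
        (5, 1), (5, 12),
    ])


def people_matching_all_groups(conn, id_groups):
    """Hypothetical helper: return (id, name) of every person linked to at
    least one category id from each list in id_groups."""
    if not id_groups or any(not ids for ids in id_groups):
        raise ValueError("every group needs at least one category id")
    having, params = [], []
    for ids in id_groups:
        marks = ", ".join("?" for _ in ids)
        # 1 if the person has any category of this group, otherwise 0.
        having.append(
            f"MAX(CASE WHEN m.cid IN ({marks}) THEN 1 ELSE 0 END) = 1")
        params.extend(ids)
    sql = (
        "SELECT p.id, p.name "
        "FROM people p "
        "JOIN people_categories_map m ON m.pid = p.id "
        "GROUP BY p.id, p.name "
        "HAVING " + " AND ".join(having) + " "
        "ORDER BY p.id"
    )
    return conn.execute(sql, params).fetchall()


# (USA OR Finland) AND (Linux OR Cars)
two_groups = people_matching_all_groups(conn, [[1, 2], [9, 12]])
# (USA) AND (Mac OR Windows) AND (El. stuff)
three_groups = people_matching_all_groups(conn, [[1], [4, 7], [11]])
print(two_groups)
print(three_groups)
```

Each inner list is one "OR" group and the lists are combined with "AND", so the shape "(x OR y) AND (z OR w OR m) AND ..." from the question maps directly onto the argument. Grouping by person also means nobody is listed twice.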


See a fiddle for the example

If you want a person only once, use

SELECT DISTINCT p.id, p.name
FROM ...
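
A small sketch of why DISTINCT can matter, again assuming SQLite purely for illustration. The sample data never produces a duplicate, so one hypothetical mapping row (Henry Ford linked to Linux) is added here to force one; it is not part of your tables. Category ids are used in the filter to keep the example short.

```python
import sqlite3

# Assumption: in-memory SQLite, loaded with the question's sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE people_categories_map (
    id INTEGER PRIMARY KEY, pid INTEGER, cid INTEGER);
""")
conn.executemany("INSERT INTO people VALUES (?, ?)", [
    (1, "Steve Jobs"), (2, "Bill Gates"), (3, "Linus Torvalds"),
    (4, "Nikola Tesla"), (5, "Henry Ford"),
])
conn.executemany(
    "INSERT INTO people_categories_map (pid, cid) VALUES (?, ?)", [
        (1, 1), (1, 4), (1, 5), (1, 6), (1, 11),
        (2, 1), (2, 7), (2, 8), (2, 11),
        (3, 2), (3, 9), (3, 11),
        (4, 1), (4, 3), (4, 10), (4, 11),
        (5, 1), (5, 12),
        (5, 9),  # hypothetical extra row: Henry Ford also linked to Linux
    ])

# Same self-join idea as above; {select} is filled in twice below.
template = """
    {select} p.id, p.name
    FROM people_categories_map pmc1
    INNER JOIN people_categories_map pmc2
        ON pmc1.pid = pmc2.pid AND pmc1.cid <> pmc2.cid
    INNER JOIN people p ON pmc1.pid = p.id
    WHERE pmc1.cid IN (1, 2)    -- USA, Finland
      AND pmc2.cid IN (9, 12)   -- Linux, Cars
    ORDER BY p.id
"""

plain = conn.execute(template.format(select="SELECT")).fetchall()
unique = conn.execute(template.format(select="SELECT DISTINCT")).fetchall()

print(len(plain), plain)    # one row per matching (country, work) pair
print(len(unique), unique)  # one row per person
```

Without DISTINCT, Henry Ford comes back once for USA plus Linux and once for USA plus Cars; with DISTINCT he is listed a single time.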
Licensed under: CC-BY-SA with attribution