How to filter multiple many to many relationship based on multiple tags?
-
02-03-2021 - |
문제
I've a scheme of two main tables: problem
and tag
and a relation (which is a many to many connector) table: problem_tags
which excerpt of them is like:
Problem table:
----+------------------------------------+--------
id | name | rating
----+------------------------------------+--------
1 | Special Permutation | 1600
2 | Binary String Reconstruction | 1500
3 | Special Elements | 1500
4 | Alice, Bob and Candies | 1300
5 | K-th Not Divisible by n | 1200
6 | Same Parity Summands | 1200
7 | Sum of Round Numbers | 800
8 | Skier | 1400
9 | Square? | 900
Tag table:
id | name
----+---------------------------
1 | constructive algorithms
2 | dfs and similar
3 | math
4 | brute force
5 | implementation
6 | two pointers
7 | binary search
8 | data structures
Problem tags table:
problem_id | tag_id
------------+--------
1 | 1
2 | 1
2 | 2
2 | 3
3 | 4
3 | 5
3 | 6
4 | 5
5 | 3
5 | 7
My question is how can I filter out problems based on multiple tags, i.e. all problems that are tagged math and binary search and brute force; or all problems that are tagged math but not constructive algorithms; or for a more complex one all problems that are only tagged with math and implementation and nothing else?
Currently I've come up with something like this:
- Find all problem's ids that are tagged math (joining tag and problem_tags table)
- Find all problem's ids that are tagged binary search
- Find all problem's ids that are tagged brute force
- Get intersection of all above ids
- Select problems where their ids is in the above intersection
But my solution lacks when it reaches the second example (only tagged with selected tags) and I think it's not the most optimal and SQL-ish way for doing it.
해결책
I had just asked that on a different forum. The best solution for a 3rd Normal Form schema design was to select all that is needed in the linking table and compare the Cardinality (count) of the matching results.
The other forum was for Oracle. The answer you seek is the PostgreSQL version of this post: https://community.oracle.com/message/15613817#15613817
-- copy paste from original post
-- Oracle version
select *
from my_recipes rp
where rp.recipe_id in
(
select tg.recipe_id
from my_recipe_tags tg
where tg.tag_id in (
-- expands string '1:2:3' into a table w/ 3 row
select to_number(regexp_substr(:p_search_tags,'[^:]+', 1, level) )
from dual
connect by regexp_substr(:p_search_tags, '[^:]+', 1, level) is not null
)
group by tg.recipe_id
-- counts cardinality of Tags in search
having count(*) = regexp_count(:p_search_tags, '[^:]+')
) ;
Untested (needs to be fixed)
select *
from PROBLEMS p
where p.id in (
select pt.problem_id
from PROBLEM_TAGS pt
join TAGS t on pt.tag_id = t.tag_id
where t.name in ('math','binary search', 'brute force') -- should be from a CTE
group by pt.problem_id
having count(*) = 3 -- count(*) from CTE
)
For the 2nd version, add AND p.id NOT IN
clause.
For the 3rd version add a filter that ensures pt.problem_id
has the exact cardinality of the queried tags.
and p.id in (
select pt.problem_id
from PROBLEM_TAGS pt
group by pt.problem_id
having count(*) = 2 -- you have two tags for 3rd one
-- you should pull Cardinality from CTE also
)