SQL many-to-many matching
09-06-2019
Question
I'm implementing a tagging system for a website. There are multiple tags per object and multiple objects per tag. This is accomplished by maintaining a table with two values per record: the id of the object and the id of the tag.
I'm looking to write a query to find the objects that match a given set of tags. Suppose I had the following data (in [object] -> [tags]* format):
apple -> fruit red food
banana -> fruit yellow food
cheese -> yellow food
firetruck -> vehicle red
If I want to match (red), I should get apple and firetruck. If I want to match (fruit, food) I should get (apple, banana).
How do I write a SQL query to do what I want?
@Jeremy Ruten,
Thanks for your answer. The notation above was only meant to show some sample data; my database does have a table with 1 object id and 1 tag per record.
Second, my problem is that I need to get all objects that match all tags. Substituting your OR for an AND like so:
SELECT object WHERE tag = 'fruit' AND tag = 'food';
Yields no results when run.
Solution
Given:
- object table (primary key id)
- objecttags table (foreign keys objectId, tagid)
- tags table (primary key id)
SELECT DISTINCT o.*
FROM object o
JOIN objecttags ot ON o.id = ot.objectid
JOIN tags t ON ot.tagid = t.id
WHERE t.name = 'fruit' OR t.name = 'food';
This seems backwards, since you want AND, but the two tags are never on the same row: each joined row pairs an object with exactly one tag, so no single row can be both 'fruit' and 'food', and an AND yields nothing. Note that the OR query returns every object that has at least one of the tags, and without the DISTINCT it would usually return duplicates, because you get one row per object per matching tag.
If you really want AND semantics in this case, you need a GROUP BY and a HAVING COUNT(*) = <number of tags in the OR list> in your query, for example:
SELECT o.name, COUNT(*) AS tag_count
FROM object o
JOIN objecttags ot ON o.id = ot.objectid
JOIN tags t ON ot.tagid = t.id
WHERE t.name = 'fruit' OR t.name = 'food'
GROUP BY o.name
-- 2 = number of tags listed above; assumes each (objectid, tagid) pair is stored only once
HAVING COUNT(*) = 2;
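To make the difference concrete, here is a minimal sketch that loads the sample data from the question into an in-memory SQLite database and runs both the per-row AND and the grouped query. The table and column names follow the schema assumed in this answer; using SQLite through Python's sqlite3 module is purely an assumption for illustration, and the composite primary key on objecttags is added so that the COUNT(*) comparison is safe.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE object (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE tags (id INTEGER PRIMARY KEY, name TEXT UNIQUE);
    CREATE TABLE objecttags (
        objectid INTEGER REFERENCES object(id),
        tagid INTEGER REFERENCES tags(id),
        PRIMARY KEY (objectid, tagid)  -- one row per (object, tag) pair
    );
""")

# Sample data from the question.
sample = {
    "apple": ["fruit", "red", "food"],
    "banana": ["fruit", "yellow", "food"],
    "cheese": ["yellow", "food"],
    "firetruck": ["vehicle", "red"],
}
for obj, tag_names in sample.items():
    obj_id = conn.execute(
        "INSERT INTO object (name) VALUES (?)", (obj,)
    ).lastrowid
    for tag in tag_names:
        conn.execute("INSERT OR IGNORE INTO tags (name) VALUES (?)", (tag,))
        tag_id = conn.execute(
            "SELECT id FROM tags WHERE name = ?", (tag,)
        ).fetchone()[0]
        conn.execute(
            "INSERT INTO objecttags (objectid, tagid) VALUES (?, ?)",
            (obj_id, tag_id),
        )

# Per-row AND: no single joined row carries two tag names, so nothing matches.
and_rows = conn.execute("""
    SELECT o.name
    FROM object o
    JOIN objecttags ot ON o.id = ot.objectid
    JOIN tags t ON ot.tagid = t.id
    WHERE t.name = 'fruit' AND t.name = 'food'
""").fetchall()

# OR + GROUP BY + HAVING: keep objects that matched both listed tags.
rows = conn.execute("""
    SELECT o.name
    FROM object o
    JOIN objecttags ot ON o.id = ot.objectid
    JOIN tags t ON ot.tagid = t.id
    WHERE t.name = 'fruit' OR t.name = 'food'
    GROUP BY o.name
    HAVING COUNT(*) = 2
    ORDER BY o.name
""").fetchall()
matches = [name for (name,) in rows]

print(and_rows)  # []
print(matches)   # ['apple', 'banana']
```

Cheese drops out because it matches only one of the two tags, which is exactly what the HAVING clause filters on.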
OTHER TIPS
Oh gosh I may have mis-interpreted your original comment.
The easiest way to do this in SQL would be to have three tables:
1) Tags ( tag_id, name )
2) Objects (whatever that is)
3) Object_Tag( tag_id, object_id )
Then you can ask virtually any question you want of the data quickly, easily, and efficiently (provided you index appropriately). If you want to get fancy, you can allow multi-word tags, too (there's an elegant way, and a less elegant way, I can think of).
I assume that's what you've got, so this SQL below will work:
The literal way:
SELECT obj
FROM object
WHERE EXISTS( SELECT *
              FROM tags
              WHERE tag = 'fruit'
              AND oid = object_id )
AND EXISTS( SELECT *
            FROM tags
            WHERE tag = 'Apple'
            AND oid = object_id )
There are also other ways you can do it, such as:
SELECT oid
FROM tags
WHERE tag = 'Apple'
INTERSECT
SELECT oid
FROM tags
WHERE tag = 'fruit'
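As a rough check that the two forms agree, the sketch below runs both against the question's sample data in an in-memory SQLite database. It assumes the two-table layout implied by these queries (object(object_id, obj) and tags(oid, tag)), picks the ids 1 to 4 arbitrarily, and swaps in the tags 'fruit' and 'food' from the question; none of that is confirmed by the original post.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE object (object_id INTEGER PRIMARY KEY, obj TEXT);
    CREATE TABLE tags (oid INTEGER, tag TEXT);
    INSERT INTO object VALUES
        (1, 'apple'), (2, 'banana'), (3, 'cheese'), (4, 'firetruck');
    INSERT INTO tags VALUES
        (1, 'fruit'), (1, 'red'), (1, 'food'),
        (2, 'fruit'), (2, 'yellow'), (2, 'food'),
        (3, 'yellow'), (3, 'food'),
        (4, 'vehicle'), (4, 'red');
""")

# One correlated EXISTS per required tag.
exists_rows = conn.execute("""
    SELECT obj
    FROM object
    WHERE EXISTS (SELECT 1 FROM tags WHERE tag = 'fruit' AND oid = object_id)
      AND EXISTS (SELECT 1 FROM tags WHERE tag = 'food' AND oid = object_id)
    ORDER BY obj
""").fetchall()
via_exists = [obj for (obj,) in exists_rows]

# INTERSECT keeps only the object ids present in both result sets.
intersect_rows = conn.execute("""
    SELECT oid FROM tags WHERE tag = 'fruit'
    INTERSECT
    SELECT oid FROM tags WHERE tag = 'food'
""").fetchall()
via_intersect = sorted(oid for (oid,) in intersect_rows)

print(via_exists)     # ['apple', 'banana']
print(via_intersect)  # [1, 2]
```

Both approaches need one extra subquery or INTERSECT branch per additional tag. Also note that not every database supports INTERSECT, so check yours before relying on it.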
@Kyle: Your query should be more like:
SELECT object WHERE tag IN ('fruit', 'food');
Your query was looking for rows where the tag was both fruit AND food, which is impossible seeing as the field can only have one value, not both at the same time.
Combine Steve M.'s suggestion with Jeremy's and you'll get a single record with what you are looking for:
select object
from tblTags
where tag = @firstMatch
and (
    @secondMatch is null
    or object in (select object from tblTags where tag = @secondMatch)
)
Now, that doesn't scale very well, but it will get what you are looking for. I think there is a better way to go about this, so that you can match any number N of tags without a great deal of impact to the code, but it currently escapes me.
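One possible way to handle N tags, sketched here as an assumption rather than as what this answer had in mind, is to build the IN list from the tag set and compare the number of distinct matched tags against N. The helper name objects_with_all_tags is hypothetical, the tblTags(object, tag) layout is taken from the query above, and SQLite via Python's sqlite3 module is used only for illustration.

```python
import sqlite3


def objects_with_all_tags(conn, tags):
    """Return the objects that carry every tag in `tags` (hypothetical helper)."""
    wanted = sorted(set(tags))  # de-duplicate so the count comparison is exact
    if not wanted:
        return []
    # One "?" placeholder per tag; the values themselves are bound, not interpolated.
    placeholders = ", ".join("?" for _ in wanted)
    sql = f"""
        SELECT object
        FROM tblTags
        WHERE tag IN ({placeholders})
        GROUP BY object
        HAVING COUNT(DISTINCT tag) = ?
        ORDER BY object
    """
    return [row[0] for row in conn.execute(sql, (*wanted, len(wanted)))]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tblTags (object TEXT, tag TEXT)")
conn.executemany("INSERT INTO tblTags VALUES (?, ?)", [
    ("apple", "fruit"), ("apple", "red"), ("apple", "food"),
    ("banana", "fruit"), ("banana", "yellow"), ("banana", "food"),
    ("cheese", "yellow"), ("cheese", "food"),
    ("firetruck", "vehicle"), ("firetruck", "red"),
])

print(objects_with_all_tags(conn, ["red"]))            # ['apple', 'firetruck']
print(objects_with_all_tags(conn, ["fruit", "food"]))  # ['apple', 'banana']
```

COUNT(DISTINCT tag) is used instead of COUNT(*) so that an accidental duplicate row in tblTags cannot inflate the count.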
I recommend the following schema.
Objects: objectID, objectName
Tags: tagID, tagName
ObjectTag: objectID,tagID
With the following query.
select distinct
    o.objectName
from
    ObjectTag ot
    join Objects o
        on o.objectID = ot.objectID
    join Tags t
        on t.tagID = ot.tagID
where
    t.tagName in ('red','fruit')
I'd suggest making your table have 1 tag per record, like this:
apple -> fruit
apple -> red
apple -> food
banana -> fruit
banana -> yellow
banana -> food
Then you could just
SELECT object WHERE tag = 'fruit' OR tag = 'food';
If you really want to do it your way though, you could do it like this:
SELECT object WHERE tag LIKE 'red' OR tag LIKE '% red' OR tag LIKE 'red %' OR tag LIKE '% red %';
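The four LIKE patterns are there so that 'red' only matches as a whole space-separated word. The sketch below shows the difference from a plain '%red%' search; the table name items is hypothetical, the "crayon" row is an extra illustrative row that is not in the question, and SQLite via Python's sqlite3 module is only an assumption for the demo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (object TEXT, tag TEXT)")
conn.executemany("INSERT INTO items VALUES (?, ?)", [
    ("apple", "fruit red food"),
    ("banana", "fruit yellow food"),
    ("cheese", "yellow food"),
    ("firetruck", "vehicle red"),
    ("crayon", "colored wax"),  # extra illustrative row, not in the question
])


def match(where_clause):
    """Run a fixed, trusted WHERE clause and return the matching object names."""
    sql = f"SELECT object FROM items WHERE {where_clause} ORDER BY object"
    return [row[0] for row in conn.execute(sql)]


# Exact value, last word, first word, or a word in the middle.
whole_word = match(
    "tag LIKE 'red' OR tag LIKE '% red' "
    "OR tag LIKE 'red %' OR tag LIKE '% red %'"
)
# A plain substring search also hits 'colored'.
naive = match("tag LIKE '%red%'")

print(whole_word)  # ['apple', 'firetruck']
print(naive)       # ['apple', 'crayon', 'firetruck']
```

Patterns with a leading wildcard generally cannot use an ordinary index, which is one more reason to prefer the one-tag-per-record layout.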