MySQL union within derived table (related_id=a AND related_id=b) OR (related_id=z)

Question 1

If you are looking for performance on MySQL you should definitely avoid nested queries and unions: most of them make MySQL create a temporary table and scan it without indexes. There are rare cases where the derived temporary table can still use indexes, but they only work under specific circumstances and on certain MySQL versions and distributions.

My suggestion would be to rewrite the query to inner/outer joins only, like this:

select distinct u.* from users as u 
  left outer join tags_data as t on 
    t.user_id=u.user_id and t.tag_id=1003 
  inner join tags_data as t2 on 
    t2.user_id=u.user_id 
    and (t2.tag_id=1004 or (t2.tag_id=1001 and t.tag_id=1003));

If you can be sure that no user has both tag 1004 and the pair (1001 and 1003), and that each (user_id, tag_id) pair appears at most once in tags_data, you may also remove the "distinct" from this query, which avoids creating a temporary table.
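
Whether the "distinct" is safe to drop can be checked against the data first. This is a minimal sketch, assuming only the tags_data(user_id, tag_id) columns used in the query above. The first query lists users who carry all three tags, the second lists repeated (user_id, tag_id) rows. If both come back empty, the join should produce at most one row per user.

```sql
-- Users holding all three tags: they would appear more than once without distinct.
select user_id
from tags_data
where tag_id in (1001, 1003, 1004)
group by user_id
having count(distinct tag_id) = 3;

-- Repeated (user_id, tag_id) rows would also multiply the join output.
select user_id, tag_id, count(*) as copies
from tags_data
where tag_id in (1001, 1003, 1004)
group by user_id, tag_id
having count(*) > 1;
```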

You should also definitely use indexes, like these:

create index tags_data__user_id__idx on tags_data(user_id);
create index tags_data__tag_id__idx on tags_data(tag_id);

With these indexes in place, a result set of 150k+ rows should be easy to query.
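
To see whether the indexes are actually picked up, the query can be run through MySQL's EXPLAIN. The sketch below also tries a composite index as a possible alternative to the two single-column ones. That index and its name are assumptions of this example, not part of the original schema, and whether it beats the separate indexes is something to measure rather than take for granted.

```sql
-- Hypothetical alternative: one index covering both join columns.
create index tags_data__user_id__tag_id__idx on tags_data(user_id, tag_id);

-- Inspect the plan: the "key" column shows which index each table uses,
-- and "Using temporary" in the "Extra" column indicates a temporary table.
explain
select distinct u.* from users as u
  left outer join tags_data as t on
    t.user_id=u.user_id and t.tag_id=1003
  inner join tags_data as t2 on
    t2.user_id=u.user_id
    and (t2.tag_id=1004 or (t2.tag_id=1001 and t.tag_id=1003));
```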

Question 2

Use an inner query that groups up all tags for each user into one value, then use a simple filter in the where clause:

select u.*
from users u
join (
  select user_id, group_concat(tag_id order by tag_id) tags
  from tags_data
  group by user_id
) t on t.user_id = u.user_id
where tags rlike '1001.*1003|1004'

See SQLFiddle of this query running against your sample data.

If there were many tags, you could add where tag_id in (1001, 1003, 1004) to the inner query to reduce the size of the tags list as a small optimization. Testing will show whether this makes much difference.

This should perform pretty well, because each table is scanned only once.
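
For reference, the query with the optional pre-filter added to the inner select might look like the sketch below. It uses only the tables and columns already shown. Two side effects seem worth noting: the concatenated list then holds at most three ids, so it stays well under MySQL's default group_concat_max_len of 1024, and the pattern can no longer match inside a longer id such as 11004.

```sql
select u.*
from users u
join (
  select user_id, group_concat(tag_id order by tag_id) tags
  from tags_data
  where tag_id in (1001, 1003, 1004)  -- keep only the tags the filter looks at
  group by user_id
) t on t.user_id = u.user_id
where tags rlike '1001.*1003|1004';
```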

Question 3

Efficient, but inelegant, and not flexible at all:

SELECT users.*
FROM users
LEFT JOIN tags_data AS tag1001
    ON (tag1001.user_id = users.user_id AND tag1001.tag_id = 1001)
LEFT JOIN tags_data AS tag1003
    ON (tag1003.user_id = users.user_id AND tag1003.tag_id = 1003)
LEFT JOIN tags_data AS tag1004
    ON (tag1004.user_id = users.user_id AND tag1004.tag_id = 1004)
WHERE (tag1001.tag_id IS NOT NULL AND tag1003.tag_id IS NOT NULL)
   OR tag1004.tag_id IS NOT NULL;
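
A quick way to sanity-check this kind of query is to run it against a few hand-made rows. The schema and data below are hypothetical and exist only for illustration (the real tables are not shown in the thread). The WHERE clause is spelled with explicit IS NOT NULL checks, which gives the same result as testing the tag ids for truthiness as long as no tag id is 0.

```sql
-- Hypothetical schema and rows, for illustration only.
CREATE TABLE users (
    user_id INT PRIMARY KEY,
    name    VARCHAR(50)
);
CREATE TABLE tags_data (
    user_id INT NOT NULL,
    tag_id  INT NOT NULL,
    PRIMARY KEY (user_id, tag_id)   -- assumes one row per user/tag pair
);

INSERT INTO users VALUES
    (1, 'has 1001 and 1003'),
    (2, 'has 1004'),
    (3, 'has 1001 only'),
    (4, 'has all three');
INSERT INTO tags_data VALUES
    (1, 1001), (1, 1003),
    (2, 1004),
    (3, 1001),
    (4, 1001), (4, 1003), (4, 1004);

-- Users 1, 2 and 4 satisfy (1001 AND 1003) OR 1004; user 3 does not.
SELECT users.*
FROM users
LEFT JOIN tags_data AS tag1001
    ON (tag1001.user_id = users.user_id AND tag1001.tag_id = 1001)
LEFT JOIN tags_data AS tag1003
    ON (tag1003.user_id = users.user_id AND tag1003.tag_id = 1003)
LEFT JOIN tags_data AS tag1004
    ON (tag1004.user_id = users.user_id AND tag1004.tag_id = 1004)
WHERE (tag1001.tag_id IS NOT NULL AND tag1003.tag_id IS NOT NULL)
   OR tag1004.tag_id IS NOT NULL
ORDER BY users.user_id;
```

Because each LEFT JOIN matches at most one row per user under the primary key assumed here, no DISTINCT is needed.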