Optimizing a query depends on several factors. The most important is the database engine. The second is the characteristics of the data. Your question provides information on neither of these.
Particularly important are the sizes of the two tables: the number of rows in each table and the number of distinct values of idTokenN in each one. It is quite possible that the LEFT OUTER JOIN is determining the performance characteristics of the query.
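As a rough sketch, the table statistics mentioned above could be collected with a query like the following. It uses only the table and column names that appear in this answer (path, relation, idTokenN); the output column aliases are made up for illustration.

```sql
-- Row counts and distinct join-key counts for both tables.
-- The aliases table_name, row_count and distinct_idTokenN are illustrative.
SELECT 'path' AS table_name,
       COUNT(*) AS row_count,
       COUNT(DISTINCT idTokenN) AS distinct_idTokenN
FROM path
UNION ALL
SELECT 'relation',
       COUNT(*),
       COUNT(DISTINCT idTokenN)
FROM relation;
```

If row_count is much larger than distinct_idTokenN in both tables, the join on idTokenN can multiply rows considerably, which would point to the join as the expensive step.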
The very first thing you can do is remove the DISTINCT keyword. It is never needed in IN subqueries, because IN only tests for membership, and some database engines may not ignore it and will do extra work to remove duplicates.
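To illustrate, here is a hypothetical filter of the kind being discussed. The question's actual query may look different; this shape is an assumption built from the path table and its idPath and isTV columns.

```sql
-- Hypothetical example; the real query in the question may differ.

-- With DISTINCT: redundant, since IN only checks whether a match exists.
SELECT p.idPath
FROM path p
WHERE p.idPath IN (SELECT DISTINCT idPath FROM path WHERE isTV = 'true');

-- Without DISTINCT: returns the same rows.
SELECT p.idPath
FROM path p
WHERE p.idPath IN (SELECT idPath FROM path WHERE isTV = 'true');
```

Both statements return the same rows, so dropping DISTINCT cannot change the result. Whether it changes the execution plan depends on the engine.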
Another step toward optimizing the query is to remove the IN subqueries altogether. In some databases, these do not optimize well. They can be replaced by a JOIN to an aggregation subquery:
    SELECT p.idPath, p.token, p.isTV, r.rel
    FROM path p
    LEFT OUTER JOIN relation r
        ON p.idTokenN = r.idTokenN
    JOIN (
        -- One row per idPath, with a 0/1 flag for each of the two conditions
        SELECT idPath,
               MAX(CASE WHEN isTV = 'true' THEN 1 ELSE 0 END) AS HasTv,
               (CASE WHEN COUNT(*) BETWEEN 2 AND 3 THEN 1 ELSE 0 END) AS Has2_3
        FROM path
        GROUP BY idPath
    ) pf
        ON p.idPath = pf.idPath
       AND pf.HasTv = 1
       AND pf.Has2_3 = 1;
There are definitely other things you can do, but beyond this, they become database dependent.