Question

Are there rules of thumb for developers on when to use a join instead of a subquery, or are the two equivalent?

Solution

It depends on the RDBMS. You should compare the execution plans for both queries.

In my experience with Oracle 10 and 11, execution plans are always the same.
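As a rough sketch of what comparing plans can look like, here is the same question asked both ways against SQLite through Python's sqlite3 module. SQLite stands in for Oracle only because it ships with Python, and the customers/orders schema is made up for illustration. Other engines have their own tools (EXPLAIN PLAN in Oracle, EXPLAIN in MySQL and PostgreSQL).

```python
import sqlite3

# Hypothetical schema, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL);
    CREATE INDEX idx_orders_customer ON orders (customer_id);
""")

# Customers with at least one order over 100, written as a subquery...
subquery_sql = """
    SELECT id, name FROM customers
    WHERE id IN (SELECT customer_id FROM orders WHERE total > 100)
"""
# ...and as a join (DISTINCT collapses customers with several matching orders).
join_sql = """
    SELECT DISTINCT c.id, c.name
    FROM customers c
    JOIN orders o ON o.customer_id = c.id
    WHERE o.total > 100
"""

def query_plan(sql):
    # EXPLAIN QUERY PLAN returns four columns; the last one is the readable detail.
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return [row[3] for row in rows]

subquery_plan = query_plan(subquery_sql)
join_plan = query_plan(join_sql)

for label, plan in (("subquery", subquery_plan), ("join", join_plan)):
    print(label)
    for step in plan:
        print("   ", step)
```

The exact plan text varies between SQLite versions, so read the output rather than relying on any particular wording.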

OTHER TIPS

The first principle is "state the query accurately". The second principle is "state the query simply and obviously" (which is where you usually make choices). The third is "state the query so it will process efficiently".

If it's a DBMS with a good query processor, equivalent query designs should result in query plans that are the same (or at least equally efficient).

My greatest frustration upon using MySQL for the first time was how conscious I had to be to anticipate the optimizer. After long experience with Oracle, SQL Server, Informix, and other dbms products, I very seldom expected to concern myself with such issues. It's better now with newer versions of MySQL, but it's still something I end up needing to pay attention to more often than with the others.

Performance-wise, there is no difference between them in most modern DB engines.

The problem with subqueries is that you might end up with a sub-resultset that has no key, so joining against it would be more expensive.

If possible, always try to write JOIN queries and filter in the ON clause instead of WHERE (although for inner joins it should be the same, as modern engines are optimized for this).
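A small sketch of the ON versus WHERE point, using SQLite via Python's sqlite3 module and a made-up customers/orders schema (both are assumptions for illustration). For an inner join the two placements return the same rows. For a LEFT JOIN they do not, which is worth keeping in mind when moving filters around.

```python
import sqlite3

# Hypothetical schema and data, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL);
    INSERT INTO customers VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cid');
    INSERT INTO orders VALUES (10, 1, 50.0), (11, 1, 75.0), (12, 3, 20.0);
""")

def rows(sql):
    return conn.execute(sql).fetchall()

# Inner join: filter in ON versus filter in WHERE.
inner_on = rows("""
    SELECT c.name, o.total FROM customers c
    JOIN orders o ON o.customer_id = c.id AND o.total > 60
    ORDER BY c.id
""")
inner_where = rows("""
    SELECT c.name, o.total FROM customers c
    JOIN orders o ON o.customer_id = c.id
    WHERE o.total > 60
    ORDER BY c.id
""")

# Left join: the ON filter keeps unmatched customers, the WHERE filter drops them.
left_on = rows("""
    SELECT c.name, o.total FROM customers c
    LEFT JOIN orders o ON o.customer_id = c.id AND o.total > 60
    ORDER BY c.id
""")
left_where = rows("""
    SELECT c.name, o.total FROM customers c
    LEFT JOIN orders o ON o.customer_id = c.id
    WHERE o.total > 60
    ORDER BY c.id
""")

print(inner_on == inner_where)         # same rows for the inner join
print(len(left_on), len(left_where))   # different row counts for the left join
```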

Theoretically every subquery can be changed to a join query.
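To make that concrete, here is a minimal sketch of one such rewrite: an IN subquery turned into a join. It uses SQLite through Python's sqlite3 module, and the tables and data are invented for the example. Note that the join needs DISTINCT to match the subquery, because a customer with several orders would otherwise show up once per order.

```python
import sqlite3

# Hypothetical schema and data, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL);
    INSERT INTO customers VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cid');
    INSERT INTO orders VALUES (10, 1, 50.0), (11, 1, 75.0), (12, 3, 20.0);
""")

# Customers that have at least one order, written with a subquery.
with_subquery = conn.execute("""
    SELECT id, name FROM customers
    WHERE id IN (SELECT customer_id FROM orders)
    ORDER BY id
""").fetchall()

# The same question as a join; DISTINCT removes the duplicate row for Ann.
with_join = conn.execute("""
    SELECT DISTINCT c.id, c.name
    FROM customers c
    JOIN orders o ON o.customer_id = c.id
    ORDER BY c.id
""").fetchall()

# Without DISTINCT the join is NOT equivalent: Ann appears once per order.
naive_join = conn.execute("""
    SELECT c.id, c.name
    FROM customers c
    JOIN orders o ON o.customer_id = c.id
    ORDER BY c.id
""").fetchall()

print(with_subquery)
print(with_join)
print(naive_join)
```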

As with many things, it depends:
- how complex the subquery is
- how often the subquery is executed within the query

I try to avoid subqueries whenever I can. In particular, avoid them when you expect large result sets, because the subquery may be executed once for each row of the result set.

take care, Alex

Let's ignore the performance impact for now (as we should if we are aware that "Premature optimization is the root of all evil").

Choose what looks clearer and easier to maintain.

In SQL Server a correlated subquery usually performs worse than a join or, often even better for performance, a join to a derived table. I almost never write a subquery for anything that will have to be performed multiple times. This is because correlated subqueries often effectively turn your query into a cursor and run one row at a time. In databases it is usually better to do things in a set-based fashion.
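For readers unsure what the two shapes look like, here is a sketch of a correlated subquery next to the equivalent join to a derived table. It runs on SQLite through Python's sqlite3 module rather than SQL Server, and the schema and data are made up, so it shows only that the results match, not how either form performs.

```python
import sqlite3

# Hypothetical schema and data, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL);
    INSERT INTO customers VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cid');
    INSERT INTO orders VALUES (10, 1, 50.0), (11, 1, 75.0), (12, 3, 20.0);
""")

# Correlated subquery: the inner SELECT refers to c.id, so it is logically
# evaluated once per customer row.
correlated = conn.execute("""
    SELECT c.id, c.name,
           (SELECT SUM(o.total) FROM orders o WHERE o.customer_id = c.id) AS spent
    FROM customers c
    ORDER BY c.id
""").fetchall()

# Join to a derived table: totals are aggregated once for all customers,
# then joined back. LEFT JOIN keeps customers without orders (spent is NULL).
derived = conn.execute("""
    SELECT c.id, c.name, t.spent
    FROM customers c
    LEFT JOIN (
        SELECT customer_id, SUM(total) AS spent
        FROM orders
        GROUP BY customer_id
    ) t ON t.customer_id = c.id
    ORDER BY c.id
""").fetchall()

print(correlated)
print(derived)
```

Whether the derived-table form is actually faster depends on the engine and the data, so it is still worth checking the execution plan.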

Licensed under: CC-BY-SA with attribution