質問

I have read that joins are better than subqueries.

But

EXPLAIN QUERY PLAN
SELECT Queue.Id, NULL
    FROM Queue
    INNER JOIN LastQueue
    ON Queue.Id=LastQueue.Id

gives

Array
(
    [0] => Array
        (
            [selectid] => 0
            [order] => 0
            [from] => 0
            [detail] => SCAN TABLE Queue (~1000000 rows)
        )

    [1] => Array
        (
            [selectid] => 0
            [order] => 1
            [from] => 1
            [detail] => SEARCH TABLE LastQueue USING INTEGER PRIMARY KEY (rowid=?) (~1 rows)
        )

)

while

EXPLAIN QUERY PLAN
SELECT Queue.Id, NULL
    FROM Queue
    WHERE (SELECT 1 FROM LastQueue WHERE Queue.Id=LastQueue.Id) IS NOT NULL

gives

Array
(
    [0] => Array
        (
            [selectid] => 0
            [order] => 0
            [from] => 0
            [detail] => SCAN TABLE Queue (~500000 rows)
        )

    [1] => Array
        (
            [selectid] => 0
            [order] => 0
            [from] => 0
            [detail] => EXECUTE CORRELATED SCALAR SUBQUERY 1
        )

    [2] => Array
        (
            [selectid] => 1
            [order] => 0
            [from] => 0
            [detail] => SEARCH TABLE LastQueue USING INTEGER PRIMARY KEY (rowid=?) (~1 rows)
        )

)

I think I have to multiply the number of rows to get an idea of cost. I am right?

Then,

  • Using join: 1000000*1
  • Using subquery: 500000*1*1

Then, is the subquery is faster than join?

Is there a good tutorial to learn how to understand EXPLAIN/EXPLAIN QUERY PLAN results?

And why does it say that SCAN TABLE Queue is ~1000000 and ~500000, when that table has 76 rows?

役に立ちましたか?

解決

The line EXECUTE CORRELATED SCALAR SUBQUERY 1 is just there because you're using a different query syntax.

The actual execution is the same in both cases: SQLite goes through all records of the Queue table, and tries to look up the corresponding record in the LastQueue table.

That the estimated record counts are different is of no concern to you, because you know that the actual number of records is the same.

Joins might be better than subqueries in other databases, but in SQLite, which uses only nested loop joins, the only difference is that a join allows SQLite to choose the outer and the inner table in the join.

In any case, you should write the query in the most simple and maintainable way, and optimize it only if you have measured that you get a noticeable and necessary improvement.


Please note that instead of:

WHERE (SELECT ...) IS NOT NULL

it would be more idiomatic to write:

WHERE EXISTS (SELECT ...)

他のヒント

the explain is giving inacurrate counts for your tables. Try using analyze to collect stats on tables, then try your explain again.

I think you will find both queries will return at almost the exact same speed. Maybe the subquery is slower due to an extra step. The important thing to notice here is "SCAN TABLE" which means its looking at all the rows on disk.

http://www.sqlite.org/lang_analyze.html

Also is it possible there are no indexes on your table? Because it should be using them but it appears it is not. Make sure you use primary key on your create table statement.

http://www.sqlite.org/lang_createtable.html

ライセンス: CC-BY-SA帰属
所属していません StackOverflow
scroll top