Question

I am using PostgreSQL 9.4.4. I have a query like this:

 SELECT COUNT(*) FROM A,B WHERE A.a = B.b

a and b are the Primary Keys of tables A and B, so there are B-indexes on a & b

By default, PostgreSQL will use seq-scan on AB and use hash join, I force it to do the index-scan and index-only-scan.

The result showed that, seq scan is much faster than the other two, it takes more time to do the full scan on a,b for index-scan and index-only-scan.

EXPLAIN ANALYZE SELECT COUNT(*) FROM journal,paper WHERE journal.paper_id = paper.paper_id;

enter image description here

Can someone explain it?

Thank you so much!

Was it helpful?

Solution 2

I know the reason now.

I need to vacuum the tables before using index-only-scan, otherwise, if a sufficiently high number of heap pages where modified since the last vacuum, the planner will not choose to use index-only scan. When only a small amount of pages have been changed, an index-only scan may happen, which then involves heap fetches.

If I force it to use index-only-scan, it will fetch data from table for each tuple scanned, which can cause great cost.

OTHER TIPS

This is quite a common query (pardon the pun! :-) ) from people running queries which perform full table scans (FTS), when the poster feels that the system should make use of the index(es).

Basically, it boils down to the explanation given here. If the tables are so small, the optimiser will say that "it's not worth the bother of going to the index, doing a lookup and then fetching the data, instead, I'll just slurp in all the data and pick out what I need", i.e. perform an FTS.

[EDIT in answer to @txsing's comment]

For an MVCC (multi-version concurrency control) database, you have to traverse every record for a count at a given moment - that's why, for example, a COUNT(*) is much more expensive for MySQL's InnoDB rather than MyISAM.

An excellent explantion (for PostgreSQL) is available here. The guy who wrote this post is a "major contributor" to PostgreSQL (thanks to @dezso for leading me to that post).

Licensed under: CC-BY-SA with attribution
Not affiliated with dba.stackexchange
scroll top