postgresql - pk versus unique index

https://stackoverflow.com/questions/12932119

07-07-2021
|

Question

In a PostgreSQL 9 database there is a table that contains a serial field X which is a PK (oid enabled), and other fields.

Using postgres's pgadmin with that table - a query takes 30 seconds.

If I add a unique index on the same field X - same query in pgadmin takes 3 seconds.

PKs are implicitly indexes: http://www.postgresql.org/docs/current/static/indexes-unique.html

So why does the explicit index make a difference?

Is this a pgadmin issue?

Do I need an explicit index for the PK field or not?

La solution

No difference here (pg 9.1.2), I think it is an artifact (schema, case-insignificance?)

DROP SCHEMA tmp CASCADE;
CREATE SCHEMA tmp ;
SET search_path=tmp;

CREATE TABLE lutser
        ( id INTEGER NOT NULL PRIMARY KEY
        , val INTEGER NOT NULL
        );
INSERT INTO lutser(id,val)
SELECT g, g %31
FROM generate_series(1,100000) g
        ;

DELETE FROM lutser WHERE random() < .5;

VACUUM ANALYZE lutser;

EXPLAIN ANALYZE
SELECT COUNT(*) FROM lutser
WHERE id >= 1000 AND id < 2000
        ;

CREATE INDEX lutser_id ON lutser(id);

VACUUM ANALYZE lutser;

EXPLAIN ANALYZE
SELECT COUNT(*) FROM lutser
WHERE id >= 1000 AND id < 2000
        ;

Results:

NOTICE:  drop cascades to table tmp.lutser
DROP SCHEMA
CREATE SCHEMA
SET
NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index "lutser_pkey" for table "lutser"
CREATE TABLE
INSERT 0 100000
DELETE 50051
VACUUM
                                                          QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=20.28..20.29 rows=1 width=0) (actual time=0.294..0.295 rows=1 loops=1)
   ->  Index Scan using lutser_pkey on lutser  (cost=0.00..19.03 rows=499 width=0) (actual time=0.015..0.216 rows=487 loops=1)
         Index Cond: ((id >= 1000) AND (id < 2000))
 Total runtime: 0.343 ms
(4 rows)

CREATE INDEX
VACUUM
                                                         QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=19.03..19.04 rows=1 width=0) (actual time=0.232..0.232 rows=1 loops=1)
   ->  Index Scan using lutser_id on lutser  (cost=0.00..17.79 rows=497 width=0) (actual time=0.033..0.185 rows=487 loops=1)
         Index Cond: ((id >= 1000) AND (id < 2000))
 Total runtime: 0.266 ms
(4 rows)

Autres conseils

Take a look at the query plans (in pgAdmin you can select the EXPLAIN option in the query window, if I remember correctly), to see what's going on in more detail. Are you sure you're not just reading data from the postgresql cache with the second query, or is consistently behaving like that, regardless of the order in which the queries are fired?

Also, run a vacuum after deleting/inserting/manipulating lots of data as that can make a huge difference. I'd be surprised if adding a (superfluous) index is the cause. And as @Eelke points out, are you sure there's an PK already defined on that column?

Licencié sous: CC-BY-SA avec attribution

Non affilié à StackOverflow