Question

I'm in a situation very similar to the one described here:

I've got a SaaS situation where I'm using 1000+ schemas in a single database (each schema contains the same tables, just different data per tenant). I used schemas so that the shared app servers could share a connection to the single database for all schemas. Things are working fine.

and, while the application itself appears to be working fine, some queries involving system catalogs are very slow. Also, psql's auto-completion is totally useless and \dt is very slow.

In particular, I need to calculate the on-disk size of each schema using something like this:

SELECT sum(pg_total_relation_size(c.oid)) AS size
FROM   pg_namespace n
JOIN   pg_class     c ON c.relnamespace = n.oid
WHERE  n.nspname = 'abbiecarmer'
AND    c.relkind = 'r';

which is quite slow. Looking at the query plan, I see

Aggregate  (cost=136903.16..136903.17 rows=1 width=4) (actual time=1024.420..1024.420 rows=1 loops=1)
  ->  Hash Join  (cost=8.28..136902.86 rows=59 width=4) (actual time=143.247..1016.749 rows=60 loops=1)
        Hash Cond: (c.relnamespace = n.oid)
        ->  Seq Scan on pg_class c  (cost=0.00..133645.24 rows=866333 width=8) (actual time=0.045..943.029 rows=879788 loops=1)
              Filter: (relkind = 'r'::"char")
              Rows Removed by Filter: 2610112
        ->  Hash  (cost=8.27..8.27 rows=1 width=4) (actual time=0.032..0.032 rows=1 loops=1)
              Buckets: 1024  Batches: 1  Memory Usage: 1kB
              ->  Index Scan using pg_namespace_nspname_index on pg_namespace n  (cost=0.00..8.27 rows=1 width=4) (actual time=0.029..0.030 rows=1 loops=1)
                    Index Cond: (nspname = 'abbiecarmer'::name)
Total runtime: 1024.476 ms

If I understand it right, this tells me that ~90% of the query time is spent sequentially scanning the pg_class relation.
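
To put that seq scan in perspective: with 1000+ schemas each containing the same set of tables (plus their indexes, sequences, and TOAST relations), pg_class is enormous. An illustrative query to see what the scan has to wade through:

SELECT relkind, count(*)
FROM   pg_class
GROUP  BY relkind
ORDER  BY count(*) DESC;

The plan above visits roughly 3.5 million pg_class rows in total (879788 returned plus 2610112 removed by the filter), so the cost is no surprise.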

I stopped the postmaster, ran the backend in single-user mode, and added the following indexes:

CREATE INDEX pg_class_relnamespace_index ON pg_class(relnamespace);
REINDEX INDEX pg_class_relnamespace_index;

CREATE INDEX pg_class_reltablespace_index ON pg_class(reltablespace);
REINDEX INDEX pg_class_reltablespace_index;
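
For reference, the whole procedure looked roughly like this; a sketch assuming a standard data directory in $PGDATA and a database named mydb (both placeholders), where the backend's -O switch is what allows the structure of system tables to be modified:

$ pg_ctl stop -D $PGDATA
$ postgres --single -O -D $PGDATA mydb
backend> CREATE INDEX pg_class_relnamespace_index ON pg_class(relnamespace);
backend> (exit with Ctrl-D)
$ pg_ctl start -D $PGDATA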

(The second index is there because I've also got thousands of tablespaces.) Now the query is ~100 times faster and the plan looks much nicer:

Aggregate  (cost=846.91..846.92 rows=1 width=4) (actual time=10.609..10.610 rows=1 loops=1)
  ->  Nested Loop  (cost=0.00..846.61 rows=60 width=4) (actual time=0.069..0.320 rows=60 loops=1)
        ->  Index Scan using pg_namespace_nspname_index on pg_namespace n  (cost=0.00..8.27 rows=1 width=4) (actual time=0.023..0.024 rows=1 loops=1)
              Index Cond: (nspname = 'abbiecarmer'::name)
        ->  Index Scan using pg_class_relnamespace_index on pg_class c  (cost=0.00..837.59 rows=75 width=8) (actual time=0.043..0.271 rows=60 loops=1)
              Index Cond: (relnamespace = n.oid)
              Filter: (relkind = 'r'::"char")
              Rows Removed by Filter: 102
Total runtime: 10.696 ms

However, in the thread above, Tom Lane, one of the Postgres core contributors, says:

There are a lot of gotchas here, notably that the session in which you create the index won't know it's there (so in this case, a reindex on pg_class would likely be advisable afterwards). I still think you'd be nuts to try it on a production database, but ...
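
If I read that correctly, the advisable follow-up would be something along these lines, run from a fresh session once the index exists (REINDEX on a system catalog is permitted for superusers without any special flags):

REINDEX TABLE pg_class;

-- sanity check: the new indexes should show up here
SELECT indexrelid::regclass
FROM   pg_index
WHERE  indrelid = 'pg_class'::regclass;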

I'm also worried that modifying the system catalogs seems to be completely disabled in Postgres 9.0 and 9.1 (I'm using 9.2); I suppose that was done for a reason?
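
For what it's worth, attempting the same CREATE INDEX from an ordinary session is rejected outright; I'd expect an error along the lines of:

CREATE INDEX pg_class_relnamespace_index ON pg_class(relnamespace);
ERROR:  permission denied: "pg_class" is a system catalog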

So, the question is: what are the gotchas in adding an index to a system catalog in Postgres, and would I be nuts to (eventually) do that on a production system?
