Question

I have this from EXPLAIN ANALYZE

 ->  Nested Loop  (cost=2173.66..30075.48 rows=77 width=4)
                  (actual time=30.949..399.463 rows=95959 loops=1)

So there's a difference in almost 3 orders of magnitude in expected rows vs actual rows and it's resulted in a very slow query.

I bumped default_statistics_target to 10000 and ran VACUUM/ANALYZE to get the query planner up to date with the new statistics. How can i get the query planner to choose a better join strategy?

I'm using postgres 9.3.1. All of my planner cost constants are still default so:

seq_page_cost: 1
random_page_cost: 4
cpu_tuple_cost: .01
cpu_index_tuple_cost: .005
cpu_operator_cost: .0025
effective_cache_size: 128MB

I set enable_nested_loops = false and the query actually didn't run much faster. I was under the impression though that a large discrepancy in the number of rows the query planner estimated to be returned and the actual would likely result in an suboptimal query plan

The entire query plan looks like:

Aggregate  (cost=30444.87..30444.88 rows=1 width=0) (actual time=535.077..535.077     rows=1 loops=1)
      ->  Nested Loop  (cost=2174.08..30444.68 rows=76 width=0) (actual time=23.208..527.062 rows=95451 loops=1)
        ->  Nested Loop  (cost=2173.66..30075.48 rows=77 width=4) (actual time=23.200..351.275 rows=95959 loops=1)
          ->  Hash Left Join  (cost=2173.24..28013.64 rows=401 width=4) (actual time=23.188..133.224 rows=103609 loops=1)
                Hash Cond: (access_rights.target_id = departments.id)
                Join Filter: ((access_rights.target_type)::text = 'Department'::text)
                Filter: ((((access_rights.target_type)::text = 'Company'::text) AND (access_rights.target_id = 173)) OR (((access_rights.target_type)::text = 'User'::text) AND (access_rights.target_id = 11654)) OR (((access_rights.target_type)::text = 'UserGroup'::text) AND (access_rights.target_id = 126)) OR (((access_rights.target_type)::text = 'Department'::text) AND (departments.lft <= 7) AND (departments.rgt >= 8)))
                Rows Removed by Filter: 59127
                ->  Bitmap Heap Scan on access_rights  (cost=2135.97..27236.01 rows=26221 width=14) (actual time=22.844..79.391 rows=162736 loops=1)
                      Recheck Cond: ((((target_type)::text = 'Company'::text) AND (target_id = 173) AND ((section)::text = 'shop'::text)) OR (((target_type)::text = 'User'::text) AND (target_id = 11654) AND ((section)::text = 'shop'::text)) OR (((target_type)::text = 'UserGroup'::text) AND (target_id = 126) AND ((section)::text = 'shop'::text)) OR ((target_type)::text = 'Department'::text))
                      Filter: (((section)::text = 'shop'::text) AND (((active_on IS NOT NULL) AND (active_on <= '2013-10-29'::date) AND ((inactive_on IS NULL) OR (inactive_on > '2013-10-29'::date)) AND (frozen_activation IS NULL)) OR ((frozen_activation)::text = 'active'::text)))
                      Rows Removed by Filter: 9294
                      ->  BitmapOr  (cost=2135.97..2135.97 rows=80823 width=0) (actual time=22.530..22.530 rows=0 loops=1)
                            ->  Bitmap Index Scan on index_access_rights_on_tt_ti_cfc_cfv_ti_s  (cost=0.00..643.10 rows=6861 width=0) (actual time=16.106..16.106 rows=96993 loops=1)
                                  Index Cond: (((target_type)::text = 'Company'::text) AND (target_id = 173) AND ((section)::text = 'shop'::text))
                            ->  Bitmap Index Scan on index_access_rights_on_tt_ti_cfc_cfv_ti_s  (cost=0.00..4.77 rows=12 width=0) (actual time=0.033..0.033 rows=0 loops=1)
                                  Index Cond: (((target_type)::text = 'User'::text) AND (target_id = 11654) AND ((section)::text = 'shop'::text))
                            ->  Bitmap Index Scan on index_access_rights_on_tt_ti_cfc_cfv_ti_s  (cost=0.00..11.68 rows=112 width=0) (actual time=0.238..0.238 rows=1200 loops=1)
                                  Index Cond: (((target_type)::text = 'UserGroup'::text) AND (target_id = 126) AND ((section)::text = 'shop'::text))
                            ->  Bitmap Index Scan on index_access_rights_on_target_type  (cost=0.00..1450.21 rows=73837 width=0) (actual time=6.148..6.148 rows=73837 loops=1)
                                  Index Cond: ((target_type)::text = 'Department'::text)
                ->  Hash  (cost=24.34..24.34 rows=1034 width=12) (actual time=0.331..0.331 rows=1034 loops=1)
                      Buckets: 1024  Batches: 1  Memory Usage: 45kB
                      ->  Seq Scan on departments  (cost=0.00..24.34 rows=1034 width=12) (actual time=0.004..0.179 rows=1034 loops=1)
          ->  Index Scan using tickets_pkey on tickets  (cost=0.42..5.13 rows=1 width=8) (actual time=0.002..0.002 rows=1 loops=103609)
                Index Cond: (id = access_rights.ticket_id)
                Filter: (((hold_until IS NULL) OR (hold_until <= '2013-10-29 00:00:00'::timestamp without time zone)) AND (company_id = 173))
                Rows Removed by Filter: 0
    ->  Index Scan using events_pkey on events  (cost=0.42..4.78 rows=1 width=4) (actual time=0.001..0.002 rows=1 loops=95959)
          Index Cond: (id = tickets.event_id)
          Filter: ((NOT activity) AND ((canceled_at IS NULL) OR (canceled_at > '2013-10-29 23:11:37.486572'::timestamp without time zone)))
          Rows Removed by Filter: 0
Total runtime: 535.165 ms

We have 17GB ram

The Point of this query is to find events that have tickets that a user has shop access to. Access can be determined in a variety of ways. If a user is part of a department that has access rights to a given ticket, if a users department is a parent of a department that has access(nested set lft, rgt etc). A user can have access if the entire company is given an access_right to those tickets. A user can be a part of a UserGroup that has access. A user can be given individual access rights to the tickets. A users company must own the tickets. Tickets can be "frozen" or "inactive" in which case a user won't have access. A ticket is inactive if the "active_on" > Today or "inactive_on" < Today. Tickets are not available if they tickets.hold_until > Today

The query I'm running is

EXPLAIN ANALYZE
SELECT count(*) AS count_all
FROM "events"
INNER JOIN tickets ON events.id = tickets.event_id
INNER JOIN access_rights ON access_rights.ticket_id = tickets.id
LEFT OUTER JOIN departments ON departments.id = access_rights.target_id
     AND access_rights.target_type = 'Department'
WHERE ((("events"."activity" = 'f') AND (events.canceled_at IS NULL OR events.canceled_at > '2013-10-29 23:11:37.486572'))
AND ((((((access_rights.section = 'shop') AND (access_rights.target_type = 'Company'
AND access_rights.target_id = 173)) OR ((access_rights.section = 'shop')
AND (access_rights.target_type = 'User' AND access_rights.target_id = 11654)) OR ((access_rights.section = 'shop')
AND (access_rights.target_type = 'UserGroup'
AND access_rights.target_id IN ('126'))) OR ((access_rights.section = 'shop')
AND (access_rights.target_type = 'Department'
AND departments.lft <= 7 AND departments.rgt >= 8))) 
AND ((access_rights.section = 'shop')
AND ((((access_rights.section = 'shop')
AND (access_rights.active_on IS NOT NULL
AND access_rights.active_on <= '2013-10-29'
AND (access_rights.inactive_on IS NULL OR access_rights.inactive_on > '2013-10-29')))
AND (access_rights.frozen_activation IS NULL)) OR ((access_rights.section = 'shop')
AND (access_rights.frozen_activation = 'active')))))
AND (tickets.hold_until IS NULL OR tickets.hold_until <= '2013-10-29'))
AND (tickets.company_id = 173)));

Tables:

CREATE TABLE tickets (
    hold_until timestamp without time zone,
    event_id integer,
    id integer NOT NULL
 );

Indexes:
    "tickets_pkey" PRIMARY KEY, btree (id)
    "index_tickets_on_company_id" btree (company_id)
    "index_tickets_on_created_at" btree (created_at)
    "index_tickets_on_creation_id" btree (creation_id)
    "index_tickets_on_event_id" btree (event_id)
    "index_tickets_on_hold_until" btree (hold_until)

Foreign-key constraints:
    "tickets_attendee_id_fk" FOREIGN KEY (attendee_id) REFERENCES attendees(id)
    "tickets_company_id_fk" FOREIGN KEY (company_id) REFERENCES companies(id)
    "tickets_event_id_fk" FOREIGN KEY (event_id) REFERENCES events(id)

CREATE TABLE events (
     id integer NOT NULL,
     activity boolean DEFAULT false NOT NULL
 );

Indexes:
    "events_pkey" PRIMARY KEY, btree (id)
    "index_events_on_id_and_te_id" UNIQUE, btree (id, te_id)
    "index_events_on_activity" btree (activity)
    "index_events_on_canceled_at" btree (canceled_at)
    "index_events_on_company_id" btree (company_id)
    "index_events_on_name" btree (name)
    "index_events_on_occurs_at" btree (occurs_at)

Foreign-key constraints:
    "events_company_id_fk" FOREIGN KEY (company_id) REFERENCES companies(id)

CREATE TABLE departments (
   id integer NOT NULL,
   parent_id integer,
   lft integer NOT NULL,
   rgt integer NOT NULL
);

Indexes:
   "departments_pkey" PRIMARY KEY, btree (id)
   "index_departments_on_company_id_and_parent_id_and_name" UNIQUE, btree (company_id, parent_id, name)
   "index_departments_on_company_id" btree (company_id)
   "index_departments_on_lft" btree (lft)
   "index_departments_on_name" btree (name)
   "index_departments_on_parent_id" btree (parent_id)
   "index_departments_on_rgt" btree (rgt)

Foreign-key constraints:
   "departments_company_id_fk" FOREIGN KEY (company_id) REFERENCES companies(id)

CREATE TABLE access_rights (
   id integer NOT NULL,
   target_type character varying(255) NOT NULL,
   target_id integer NOT NULL,
   ticket_id integer NOT NULL,
   active_on date,
   visible boolean,
   inactive_on date,
   frozen_activation character varying(255)
);

Indexes:
   "access_rights_pkey" PRIMARY KEY, btree (id)
   "index_access_rights_on_tt_ti_cfc_cfv_ti_s" UNIQUE, btree (target_type, target_id, custom_field_condition, custom_field_value, ticket_id, section)
   "index_access_rights_on_active_on" btree (active_on)
   "index_access_rights_on_custom_field_value" btree (custom_field_value)
   "index_access_rights_on_frozen_activation" btree (frozen_activation)
   "index_access_rights_on_inactive_on" btree (inactive_on)
   "index_access_rights_on_section" btree (section)
   "index_access_rights_on_target_id" btree (target_id)
   "index_access_rights_on_target_type" btree (target_type)
   "index_access_rights_on_target_type_and_target_id" btree (target_type, target_id) CLUSTER
   "index_access_rights_on_ticket_id" btree (ticket_id)
   "index_access_rights_on_visible" btree (visible)

Foreign-key constraints:
   "access_rights_ticket_id_fk" FOREIGN KEY (ticket_id) REFERENCES tickets(id)

I know that's a lot, thanks for taking the time to look it over

Était-ce utile?

La solution

Server configuration

That much is clear: the default settings are very conservative and intended to work for small installations with limited resources out of the box. For a dedicated DB server, some default settings are just inadequate. You have to tune your settings.

To start with, if you have enough RAM to cache all or most of your DB, set random_page_cost drastically lower. And increase the relative cost of CPU operations. Something like (this is pure guesswork!):

seq_page_cost: 1
random_page_cost: 1.2
cpu_tuple_cost: .02
cpu_index_tuple_cost: .02
cpu_operator_cost: .005

And effective_cache_size is regularly much too low. For a dedicated DB server this can be as high as three quarters of your total RAM.

@Craig has assembled a long list of advice for performance tuning:
Optimise PostgreSQL for fast testing

The Postgres Wiki has even more.

Query

Too many redundant parentheses, too hard to read. Use table aliases and format before trying to debug - much less presenting to the general public. After untangling:

SELECT count(*) AS count_all
FROM   events           e
JOIN   tickets          t ON t.event_id = e.id
JOIN   access_rights    a ON a.ticket_id = t.id
LEFT   JOIN departments d ON d.id = a.target_id
                         AND a.target_type = 'Department'
WHERE  e.activity = 'f'
AND   (e.canceled_at IS NULL OR e.canceled_at > '2013-10-29 23:11:37')

AND   (t.hold_until IS NULL OR t.hold_until <= '2013-10-29')
AND    t.company_id = 173;

AND    a.section = 'shop'
AND   (a.target_type = 'Company'   AND a.target_id = 173
   OR  a.target_type = 'User'      AND a.target_id = 11654
   OR  a.target_type = 'UserGroup' AND a.target_id IN (126)
   OR                                  d.lft <= 7 AND d.rgt >= 8
    -- a.target_type = 'Department' is redundant
) 
AND   (a.frozen_activation = 'active'
   OR     a.active_on <= '2013-10-29'
     AND (a.inactive_on IS NULL OR a.inactive_on > '2013-10-29')
     AND  a.frozen_activation IS NULL
)

Major points

  • Redundant: AND a.active_on IS NOT NULL, since you also have AND a.active_on <= '2013-10-29'

  • AND a.target_id IN ('126') should be AND a.target_id = 126 or at least AND a.target_id IN (126) (numeric constant).

  • a.target_type = 'Department' is redundant, since its already in the LEFT JOIN

  • AND a.section = 'shop' is redundant many times.

  • target_type_id should most probably be an enum or integer referencing a table target_type instead of a varchar(255).

    CREATE TABLE access_rights (
       ...
      ,target_type_id integer NOT NULL REFERENCES target_type(target_type_id)
       ...
    );
    

    Similar for a.frozen_activation and a.section.

That would also make the index I am going to propose more effective.

Indices

Add a few multicolumn / partial indices. Tailor yourself, I don't know cardinalities and data distribution. Note the DESC clauses in strategic places.

CREATE INDEX e_idx ON events (company_id, event_id, hold_until)
WHERE activity = FALSE;

CREATE INDEX t_idx ON tickets (company_id, event_id, hold_until DESC);

CREATE INDEX a_idx1 ON access_rights (target_type_id, target_id)
WHERE section = 'shop';

CREATE INDEX a_idx2 ON access_rights
                   (frozen_activation, active_on DESC, inactive_on)
WHERE section = 'shop';

CREATE INDEX d_idx ON departments (target_type, lft DESC, rgt);

Other than that you only need the primary keys and indices on foreign keys. All other indices you display would then be useless for this query. Delete some if they are not needed elsewhere.

For details on how to tailor these indices consider the related answer on dba.SE:

Licencié sous: CC-BY-SA avec attribution
Non affilié à StackOverflow
scroll top