Question

The Situation

We have a database hosted on RDS with a few hundred tables, a few of which are quite large.

We recently upgraded the database from 9.5.22 to 11.8 and performance is significantly degraded.

After upgrading, we ran VACUUM ANALYZE on the instance (as opposed to ./analyze_new_cluster.sh as we're unable to run a shell on RDS instances).

This did not help. I spun up another standalone 11.8 instance of the database and ran VACUUM FULL ANALYZE; that database exhibits the same query planner behavior, so including FULL in the VACUUM command did not help either (contrary to what some SO answers suggest).

We have found one query that shows the most drastic change in performance before and after the upgrade:

SELECT f.uuid, p.name
FROM flights f 
LEFT OUTER JOIN passengers p 
    ON f.uuid = p.flight_id 
WHERE f.uuid IN (< UUIDs >)
ORDER BY f.date_created ASC;

Previously P95 latency was under 4 ms. Now, P95 is 15 seconds.

The trouble arises when the WHERE clause includes 5 or more UUIDs.

The tables involved have the following (simplified) structure:

                    Table "public.flights"
 Column       | Type                     | Modifiers | Storage | Stats target 
--------------+--------------------------+-----------+---------+--------------
 uuid         | uuid                     | not null  | plain   |              
 date_created | timestamp with time zone | not null  | plain   |              

Indexes:
    "flights_pkey" PRIMARY KEY, btree (uuid)


                    Table "public.passengers"
 Column    | Type                   | Modifiers                     | Storage | Stats target
-----------+------------------------+-------------------------------+---------+-------------
 id        | bigint                 | not null default nextval(...) | plain   |             
 flight_id | uuid                   | not null                      | plain   |             
 name      | character varying(128) | not null                      | plain   |             

Indexes:
    "passengers_pkey" PRIMARY KEY, btree (id)
    "passengers_a08cee2d" btree (flight_id)
Foreign-key constraints:
    "p_flight_id_75a46b87233dc365_fk_flights_uuid" FOREIGN KEY (flight_id) REFERENCES flights(uuid) DEFERRABLE INITIALLY DEFERRED

The flights table has approx 17 million rows. The passengers table has approx 2.6 billion rows.


The Execution Plans

postgres 9.5 instance (with 50 UUIDs in the WHERE clause)

 Sort  (cost=7273695.73..7273707.45 rows=4688 width=36) (actual time=0.420..0.420 rows=0 loops=1)
   Sort Key: f.date_created
   Sort Method: quicksort  Memory: 25kB
   ->  Nested Loop Left Join  (cost=1652.68..7273409.89 rows=4688 width=36) (actual time=0.408..0.408 rows=0 loops=1)
         ->  Index Scan using flights_pkey on flights f  (cost=0.56..428.86 rows=50 width=24) (actual time=0.406..0.406 rows=0 loops=1)
               Index Cond: (uuid = ANY ('{2c0adac6-79bb-48a1-a0ba-bd8f537d68de,...,a6605812-9a5b-46c4-9989-4d24d195e1c0}'::uuid[]))
         ->  Bitmap Heap Scan on passengers p  (cost=1652.12..145082.56 rows=37706 width=28) (never executed)
               Recheck Cond: (f.uuid = flight_id)
               ->  Bitmap Index Scan on passengers_a08cee2d  (cost=0.00..1642.70 rows=37706 width=0) (never executed)
                     Index Cond: (f.uuid = flight_id)
 Planning time: 0.289 ms
 Execution time: 0.479 ms
(12 rows)

postgres 11 instance (with 50 UUIDs in the WHERE clause)

 Gather Merge  (cost=3149109.16..3149552.99 rows=3804 width=36) (actual time=3880.756..3882.219 rows=0 loops=1)
   Workers Planned: 2
   Workers Launched: 2
   ->  Sort  (cost=3148109.14..3148113.89 rows=1902 width=36) (actual time=3878.194..3878.194 rows=0 loops=3)
         Sort Key: f.date_created
         Sort Method: quicksort  Memory: 25kB
         Worker 0:  Sort Method: quicksort  Memory: 25kB
         Worker 1:  Sort Method: quicksort  Memory: 25kB
         ->  Nested Loop Left Join  (cost=745.27..3148005.54 rows=1902 width=36) (actual time=3878.170..3878.170 rows=0 loops=3)
               ->  Parallel Seq Scan on flights f  (cost=0.00..669647.32 rows=21 width=24) (actual time=3878.167..3878.168 rows=0 loops=3)
                     Filter: (uuid = ANY ('{2c0adac6-79bb-48a1-a0ba-bd8f537d68de,...,a6605812-9a5b-46c4-9989-4d24d195e1c0}'::uuid[]))
                     Rows Removed by Filter: 5631600
               ->  Bitmap Heap Scan on passengers p  (cost=745.27..117695.86 rows=32120 width=28) (never executed)
                     Recheck Cond: (f.uuid = flight_id)
                     ->  Bitmap Index Scan on passengers_a08cee2d  (cost=0.00..737.24 rows=32120 width=0) (never executed)
                           Index Cond: (f.uuid = flight_id)
 Planning Time: 0.286 ms
 Execution Time: 3882.262 ms
(18 rows)

My Best Assessment

In both scenarios, the scans on the passengers table are not executed. This is actually because the UUIDs I provide to the query did not exist in the flights table. I merely wanted to pass in a larger number to trigger the different behavior on how to scan the flights table.

In the postgres 9.5 instance, it performs an index scan with an index condition: it expects 50 rows (the number of UUIDs I provide to the query) and returns none (as none of them exist).

In the postgres 11 instance, it wants to perform a sequential scan (in parallel) on the table with a filter. The filter essentially removes all rows returned by the sequential scan(s).

When fewer than 10 UUIDs are passed to the WHERE clause, the postgres 11 instance generates the same index scan plan as the postgres 9.5 instance. That makes me think a difference in the statistics is causing this; however, the statistics I checked appeared similar in both instances (see below), unless I am not pulling the right values, which is very likely.

I have read many SO answers about "bad queries", but they don't address what I think may be a result of a major version upgrade.

I've checked the default_statistics_target for each database (both are 100) and the random_page_cost (both are 4).

I recognize that setting enable_seqscan to OFF is not a permanent solution, however it does coerce the postgres 11 instance to return a query plan identical to that of the postgres 9.5 instance.

I experimented by setting max_parallel_workers_per_gather = 0, which also had the desired effect of coercing the postgres 11 instance to return a query plan that avoided the sequential scan, but I do not think it is wise to disable that functionality for the database.

Changing the ORDER BY value (including removing it entirely from the query) had no impact on the query plan.

-- on pg 11 instance with enable_seqscan = OFF OR max_parallel_workers_per_gather = 0

 Sort  (cost=5901559.44..5901570.85 rows=4566 width=36)
   Sort Key: f.date_created
   ->  Nested Loop Left Join  (cost=745.83..5901281.90 rows=4566 width=36)
         ->  Index Scan using flights_pkey on flights f  (cost=0.56..428.99 rows=50 width=24)
               Index Cond: (uuid = ANY ('{2c0adac6-79bb-48a1-a0ba-bd8f537d68de,...,a6605812-9a5b-46c4-9989-4d24d195e1c0}'::uuid[]))
         ->  Bitmap Heap Scan on passengers p  (cost=745.27..117695.86 rows=32120 width=28)
               Recheck Cond: (f.uuid = flight_id)
               ->  Bitmap Index Scan on passengers_a08cee2d  (cost=0.00..737.24 rows=32120 width=0)
                     Index Cond: (f.uuid = flight_id)
(9 rows)
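The experiments above can be reproduced per session without touching instance-wide configuration; a minimal sketch, assuming the tables from this question (the UUID shown is one of the placeholder values from the plans, and SET LOCAL confines the change to the transaction):

```sql
BEGIN;
-- Try one of these; neither persists past the transaction.
SET LOCAL enable_seqscan = off;
-- SET LOCAL max_parallel_workers_per_gather = 0;

EXPLAIN (ANALYZE, BUFFERS)
SELECT f.uuid, p.name
FROM flights f
LEFT OUTER JOIN passengers p ON f.uuid = p.flight_id
WHERE f.uuid IN ('2c0adac6-79bb-48a1-a0ba-bd8f537d68de')  -- placeholder list
ORDER BY f.date_created ASC;

ROLLBACK;  -- discard the setting changes
```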

I'm reaching the point where I'm stabbing in the dark and was attempting to compare the pg_stats values for the uuid column in the flights table. Both of them show similar values for the null_frac, avg_width, n_distinct, and correlation values.
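For reference, a minimal sketch of the kind of pg_stats comparison described above, run against both instances (table and column names taken from this question):

```sql
-- Compare planner statistics for flights.uuid between the 9.5 and 11 instances.
SELECT attname, null_frac, avg_width, n_distinct, correlation
FROM pg_stats
WHERE schemaname = 'public'
  AND tablename  = 'flights'
  AND attname    = 'uuid';
```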


My Question

Given the above, what am I missing to help the postgres query planner avoid the expensive sequential scan?

All settings and statistics appear to be the same between the two instances; only the postgres version differs.

The 9.5 instance does not have any columns with stats targets that differ from the default. So before someone suggests increasing that value: why would that help the postgres 11 instance when the postgres 9.5 instance produces a "good" plan without it?

Is there something about postgres 11 (parallel workers?) that makes it think it can perform the sequential scan faster than the index scan? This seems unlikely given that the planner expects to return 21 rows, but at a huge cost:

Parallel Seq Scan on flights f  (cost=0.00..669647.32 rows=21

Thanks.


Edit:

Our Solution

Based on feedback, we disabled parallel queries by setting max_parallel_workers_per_gather = 0 and the problem went away.

We've also increased statistics targets (even though folks have expressed doubt that will help) and will experiment with ways to enable parallel queries in the future without triggering this same "bad" behavior.
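A sketch of how the per-column statistics target was raised (the column choice and the value 1000 are illustrative, not a recommendation):

```sql
-- Raise the sample size used by ANALYZE for this column, then rebuild stats.
ALTER TABLE flights ALTER COLUMN uuid SET STATISTICS 1000;
ANALYZE flights;
```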

Bonus: latency graph for query before and after disabling parallel queries: latency


The solution

Your statistics haven't changed much between versions. They are quite a bit off in both. But what changed is that the bad stats make parallel plans look attractive, now that parallel plans exist.

This is actually because the UUIDs I provide to the query did not exist in the flights table. I merely wanted to pass in a larger number to trigger the different behavior on how to scan the flights table.

Querying a column with high cardinality for values that happen to not exist is inherently hard to estimate well. Why are you doing this? It sounds here like you are doing it to intentionally create a problem, but it also sounds like you happened upon this problem because it was occurring naturally. How can both be true? Maybe the problem you have artificially created does not have the same root cause (or the same solution) as the natural problems you are running into.

I experimented by setting max_parallel_workers_per_gather = 0, which also had the desired effect of coercing the postgres 11 to return a query plan that avoided the sequential scan, but I do not think it is wise to disable that functionality for the database.

The reason for setting this to off seems clear (assuming your one example is a representative instance of the real problems), whereas your reason for not wanting to turn it off seems pretty nebulous. Did you do the upgrade specifically to get access to parallel queries?

So before someone suggests to increase that value, why would that help the postgres 11 instance if the postgres 9.5 instance produces a "good" plan without them?

It is easy to make the right decision accidentally, if you don't have many options to choose from. Opening up parallelization offers many more ways to screw up, and having bad stats makes it likely that one of those bad ways gets chosen. Having said that, increasing the stats target wouldn't help anyway, unless you could increase it to over 17 million, which you can't (and even then I don't think it would help).

Is there something about postgres 11 (parallel workers?) that makes it think it can perform the sequential scan faster than the index scan? This seems unlikely given that the planner expects to return 21 rows but at a huge cost

It is not just the seq scan that it thinks benefits from the parallel query. By doing the seq scan on flights in parallel, it thinks that the index scan on passengers will also get done in parallel inherently, and that is where it thinks most of the supposed benefit will come from. Although this is not a complete explanation, as by turning off enable_seqscan, I would still expect to use a parallel plan, just with a parallel index scan or parallel bitmap heap scan on flights. I can't explain why it would give up on parallel altogether just due to enable_seqscan=off. And I can't reproduce that behavior in v11 with simulated data.
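If the goal is to keep parallelism available but make the planner less eager to choose a parallel seq scan here, the standard planner cost knobs can be raised instead of zeroing the worker count; a sketch with illustrative, untuned values (defaults shown in comments):

```sql
-- Make parallel plans pay a higher entry price; can be set per session,
-- per database, or cluster-wide via ALTER SYSTEM.
SET parallel_setup_cost = 10000;           -- default 1000
SET parallel_tuple_cost = 0.5;             -- default 0.1
SET min_parallel_table_scan_size = '1GB';  -- default 8MB
```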

Other tips

I had a similar problem. When I use a CTE, the problem goes away.

with my_uids as (
    select distinct unnest(array['a','b','c','d']) uid order by 1
    -- 'a','b','c','d' <--- your UUIDs here
)
SELECT f.uuid, p.name
FROM flights f
JOIN my_uids mu ON (mu.uid = f.uuid)
LEFT OUTER JOIN passengers p
    ON f.uuid = p.flight_id
ORDER BY f.date_created ASC;
Licensed under: CC-BY-SA with attribution