PostgreSQL v9.6, postgres_fdw

Foreign table

CREATE FOREIGN TABLE user_info (
  id bigint ,
  info jsonb 
) 
SERVER server_test_fdw OPTIONS(SCHEMA_NAME 'public', TABLE_NAME 'user_info_raw' );
-- user_info_raw is a large table (100 million records, 200 GB)

Sample data of info column

{"key1": 1, "key2": 0.678}
{"key1": 1, "key2": 1.0}
{"key1": 1, "key2": 0.986} 
{"key1": 2, "key2": 0.75}
{"key1": 2, "key2": 0.639} 

Query on foreign table (updated)

SELECT id, info 
FROM user_info
WHERE info ->> 'key1'= '1' -- OR using jsonb_extract_path_text(info, 'key1')  = '1'
ORDER BY id 
LIMIT 10; 

Limit  (cost=10750829.63..10750829.65 rows=10 width=40) (actual time=550059.320..550059.326 rows=10 loops=1)
   ->  Sort  (cost=10750829.63..10751772.77 rows=377257 width=40) (actual time=550059.318..550059.321 rows=10 loops=1)
         Sort Key: id
         Sort Method: top-N heapsort  Memory: 26kB
         ->  Foreign Scan on user_info (cost=100.00..10742677.24 rows=377257 width=40) (actual time=1.413..536718.366 rows=68281020 loops=1)
               Filter: ((info ->> 'key1'::text) = '1'::text)
               Rows Removed by Filter: 7170443
 Planning time: 4.097 ms
 Execution time: 550059.597 ms

Query on user_info_raw

EXPLAIN ANALYSE
SELECT id, info 
FROM user_info_raw
WHERE info ->> 'key1'= '1'
ORDER BY id 
LIMIT 10;

 Limit  (cost=0.57..1296.95 rows=10 width=59) (actual time=0.043..0.073 rows=10 loops=1)
   ->  Index Scan using idx_user_info_raw_info on user_info_raw  (cost=0.57..68882850.88 rows=531346 width=59) (actual time=0.042..0.070 rows=10 loops=1)
         Filter: ((info ->> 'key1'::text) = '1'::text)
 Planning time: 0.192 ms
 Execution time: 0.102 ms 

select pg_size_pretty(pg_table_size('user_info_raw'));
 pg_size_pretty 
----------------
 223 GB

It takes 10 milliseconds to execute the query on the user_info_raw (remote server).

But, It takes a lot of time when using theuser_infoforeign table. When I remove ORDER BY id, the query executes very fast.

I think that my query on foreign table should send to the remote server for executing, but it's not, I don't know why, may be due to of this excerpt from postgres_fdw document

By default, only WHERE clauses using built-in operators and functions will be considered for execution on the remote server. Clauses involving non-built-in functions are checked locally after rows are fetched. If such functions are available on the remote server and can be relied on to produce the same results as they do locally, performance can be improved by sending such WHERE clauses for remote execution

How can I fix the issue related to ORDER BY in foreign table ?


UPDATED

Adding use_remote_estimate on server and foreign table doesn't help.

有帮助吗?

解决方案

Sort operation is performed on local postgres server, not on remote. If your code always sort the results, you can just create a view on remote postgres with order by clause and then create a foreign table pointing to the view.

On remote server, adding a view

CREATE VIEW vw_user_info_raw
SELECT id, info 
FROM user_info_raw
ORDER BY id 

On local server, create foreign table references to that view

CREATE FOREIGN TABLE user_info (
  id bigint ,
  info jsonb 
) 
SERVER server_test_fdw OPTIONS(SCHEMA_NAME 'public', TABLE_NAME 'vw_user_info_raw' );

Refer to Postgres Admin Forum

许可以下: CC-BY-SA归因
不隶属于 dba.stackexchange
scroll top