Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Postgres_FDW not pushing down WHERE criteria

I'm working with two PostgreSQL 9.6 databases and am trying to query one of the DB's from the other using postgres_fdw (one is a production backup DB that has the data and the other is a db for doing various analyses).

I've come across some odd behavior though where certain types of WHERE clauses in a query aren't being passed to the remote DB but instead are retained in the local DB and used to filter the results received from the remote DB. This is causing the remote DB to try to send way more info than the local DB needs across the network and the affected queries are dramatically slower (15 seconds vs 15 minutes).

I've mostly seen this with timestamp related clauses, the below examples are how I first came across the problem, but I've seen it in several other variations such as replacing CURRENT_TIMESTAMP with a TIMESTAMP literal (slow) or TIMESTAMP WITH TIME ZONE literal (fast).

Is there a setting somewhere that I'm missing that'll help with this? I'm setting this up for a team to use with a mixed level of SQL backgrounds, most aren't experienced with reviewing EXPLAIN plans and whatnot. I've come up with some workarounds (such as putting the relative time clauses in sub-SELECT), but I keep coming across new instances of the problem.

An example:

SELECT      var_1
           ,var_2
FROM        schema_A.table_A
WHERE       execution_ts <= CURRENT_TIMESTAMP - INTERVAL '1 hour'
        AND execution_ts >= CURRENT_TIMESTAMP - INTERVAL '1 week' - INTERVAL '1 hour'
ORDER BY    var_1

Explain Plan

Sort  (cost=147.64..147.64 rows=1 width=1048)
  Output: table_A.var_1, table_A.var_2
  Sort Key: (table_A.var_1)::text
  ->  Foreign Scan on schema_A.table_A  (cost=100.00..147.63 rows=1 width=1048)
        Output: table_A.var_1, table_A.var_2
        Filter: ((table_A.execution_ts <= (now() - '01:00:00'::interval)) 
             AND (table_A.execution_ts >= ((now() - '7 days'::interval) - '01:00:00'::interval)))
        Remote SQL: SELECT var_1, execution_ts FROM model.table_A
                    WHERE ((model_id::text = 'ABCD'::text))
                      AND ((var_1 = ANY ('{1,2,3,4,5}'::bigint[])))

The above takes around 15-20 minutes to run, while the below completes in seconds.

SELECT      var_1
           ,var_2
FROM        schema_A.table_A
WHERE       execution_ts <= (SELECT CURRENT_TIMESTAMP - INTERVAL '1 hour')
        AND execution_ts >= (SELECT CURRENT_TIMESTAMP - INTERVAL '1 week' - INTERVAL '1 hour')
ORDER BY    var_1

Explain Plan

Sort  (cost=158.70..158.71 rows=1 width=16)
  Output: table_A.var_1, table_A.var_2
  Sort Key: table_A.var_1
  InitPlan 1 (returns $0)
    ->  Result  (cost=0.00..0.01 rows=1 width=8)
          Output: (now() - '01:00:00'::interval)
  InitPlan 2 (returns $1)
    ->  Result  (cost=0.00..0.02 rows=1 width=8)
          Output: ((now() - '7 days'::interval) - '01:00:00'::interval)
  ->  Foreign Scan on schema_A.table_A  (cost=100.00..158.66 rows=1 width=16)
        Output: table_A.var_1, table_A.var_2
        Remote SQL: SELECT var_1, var_2 FROM model.table_A
                    WHERE ((execution_ts <= $1::timestamp with time zone))
                      AND ((execution_ts >= $2::timestamp with time zone))
                      AND ((model_id::text = 'ABCD'::text))
                      AND ((var_1 = ANY ('{1,2,3,4,5}'::bigint[])))
like image 681
Sven the Mediocre Avatar asked Dec 14 '22 16:12

Sven the Mediocre


1 Answers

Any function that is not IMMUTABLE will not be pushed down.

See function is_foreign_expr in contrib/postgres_fdw/deparse.c:

/*
 * Returns true if given expr is safe to evaluate on the foreign server.
 */
bool
is_foreign_expr(PlannerInfo *root,
                RelOptInfo *baserel,
                Expr *expr)
{
...
    /*   
     * An expression which includes any mutable functions can't be sent over
     * because its result is not stable.  For example, sending now() remote
     * side could cause confusion from clock offsets.  Future versions might
     * be able to make this choice with more granularity.  (We check this last
     * because it requires a lot of expensive catalog lookups.)
     */
    if (contain_mutable_functions((Node *) expr))
        return false;

    /* OK to evaluate on the remote server */
    return true;
}
like image 97
Laurenz Albe Avatar answered Dec 29 '22 10:12

Laurenz Albe