Function Scan FDW Pushdown — Technical Analysis
Context and Core Problem
This thread concerns a long-running patch series by Alexander Pyhalov (Postgres Pro) to extend postgres_fdw — and the FDW infrastructure more broadly — so that function RTEs (RangeTblEntry of kind RTE_FUNCTION) can participate in pushdown to a remote server. Today, when a query joins a foreign table against a set-returning function (SRF) such as generate_series(), unnest(), or a user-defined function, the planner has no way to express the function scan on the remote side. The function is always executed locally, and the join must be executed locally as well, which forces large foreign relations to be fetched back to the coordinator even when the join cardinality would be small.
Architecturally the problem sits at the intersection of three subsystems:
-
Path generation for joins involving non-base relations. postgres_fdw's existing pushdown machinery (
foreign_join_ok,deparseSelectStmtForRel) assumes both sides of a join are base foreign relations (or already-pushed-down joins) belonging to the same user mapping / server. AFUNCTIONRTE is neither a base rel nor a foreign rel — it has noForeignScanplan node and noPgFdwRelationInfo. Integrating it requires the "asymmetric join" concept (local-side relation pushed into a remote join), a topic previously explored by Pyhalov in the asymmetric partition-wise join and "pushdown of constants/VALUES" patch series. -
Deparsing a function expression as a remote FROM item. The deparser must emit a
ROWS FROM (...)/ function-call clause that is safe to execute remotely. That requires: (a) the function must be known to exist with identical semantics on the remote (the usualcheck_shippable/is_shippablegating viaextensionsoption), (b) its argument expressions must themselves be shippable, and (c) the function must not depend on executor-only state such asParamnodes referring to outer plan values, because those cannot be resolved inside the remote SQL text. -
Plan-tree bookkeeping across
set_plan_references. Once the planner rewrites varnos intoOUTER_VAR/INNER_VARand compacts the rangetable, the FDW must still be able to reconstruct which function RTE was folded into whichForeignScan. This is the "tricky part" the author explicitly calls out.
What the Patch Does
From the two messages we see three concrete design choices that have solidified by the March 2026 revision:
1. Restricting which functions are pushable
The patch now forbids pushdown for:
- Functions returning sets of composite/complex types (i.e.
RECORDor named row types in set-returning context). - Functions whose argument expressions contain
Paramnodes.
The composite-type restriction is a deparser/tuple-descriptor safety measure: a remote ROWS FROM (f(...)) result must be projected into a TupleDesc that matches the local expectation, and composite-returning SRFs expand into multiple columns whose types and names come from the function's declared signature. Round-tripping that through a remote query — where the remote server may resolve the composite type differently, especially if the function is polymorphic or returns RECORD with a column definition list — is fragile. Disallowing it is the conservative correct choice for a first commit.
The Param-in-arguments restriction matters because function RTE arguments are evaluated once per scan rescan, typically driven by outer tuples in a nested loop (LATERAL). If those arguments contain PARAM_EXEC nodes, the value lives in the executor's es_param_exec_vals and cannot be textually inlined into remote SQL at plan time. Supporting this would require parameterized remote execution (sending parameter values on each rescan), which is a substantially larger project touching postgresGetForeignPlan, create_cursor, and the prepared-statement reuse path. Excluding it keeps the patch tractable.
2. Storing function metadata in ForeignScan.fdw_private
Pyhalov and Gleb Kashkin moved the per-function bookkeeping into the ForeignScan's private data. Previously (presumably) it lived on PgFdwRelationInfo or a side structure keyed by RTI, which breaks after set_plan_references renumbers and removes unused RTEs. By serializing the information into fdw_private (which goes through _copyForeignScan/_outForeignScan and survives setrefs intact as a List * of Nodes), the executor-time deparse and the EXPLAIN path can still identify the functions that were folded into the scan.
The "correct rti mappings after setrefs" comment implies they keep either the original RTIs or a mapping list, and re-resolve them during postgresBeginForeignScan / explain. This is the same pattern used for pushed-down joins, where fs_relids records the original relids.
3. Decoupling from the asymmetric-join patch
The author explicitly removed changes that were only needed for asymmetric join. This is strategically important: the asymmetric-join patch has been in review limbo for several cycles, and tying function-scan pushdown to it would block both. By narrowing scope to cases that work within the existing join-pushdown framework (i.e. joins between foreign tables where a function RTE appears as an additional FROM item that can be deparsed inline), the patch becomes independently committable. It targets PostgreSQL 20.
Key Technical Insights
-
Shippability of functions is already partially modeled via the
extensionsFDW option (is_shippableinshippable.c). This patch extends the concept from "appears in a WHERE clause" to "appears as a FROM-clause generator," which requires the additional guarantee that the function produces the same rows in the same order/structure on the remote side — hence the tighter restrictions on return type. -
Result-type matching is the silent killer for function pushdown. Even for
generate_series(int,int)the remote must returnint4, notint8, matching the local TupleDesc. The patch presumably relies on the function'sprorettypebeing a scalar base type resolved identically on both ends (a reasonable assumption when the function is built-in or comes from a shipped extension). -
Interaction with LATERAL: Function RTEs are frequently LATERAL. Forbidding
Paramarguments effectively forbids most LATERAL function scans that reference outer foreign columns — which is unfortunate because that is one of the most valuable pushdown cases (SELECT ... FROM foreign_t, LATERAL unnest(foreign_t.arr)). A future extension would handle this via remote-sideLATERALdeparse, which is representable in SQL but requires the LATERAL reference to resolve to a Var of a relation already on the remote side of the same subquery — tractable but out of scope here. -
Minimal churn to core: By keeping the information in
fdw_private, no new fields onForeignScanorPlannerInfoare required, which keeps the core-vs-contrib boundary clean and improves the patch's chances of review.
Participants and Weight of Opinion
- Alexander Pyhalov (Postgres Pro) is the patch author and a recurring contributor to postgres_fdw pushdown work (asymmetric join, aggregate pushdown, join pushdown improvements). His domain expertise on this code path is substantial; multiple of his postgres_fdw patches have been committed.
- Gleb Kashkin (Postgres Pro) co-developed the fdw_private restructuring — likely responsible for the setrefs RTI-mapping fix.
- Álvaro Herrera is referenced as having provided feedback in August 2025 (the patch is a response to his review). As a senior committer with deep planner/executor familiarity, his sign-off would carry significant weight for commit.
The two visible messages are both rebase/update posts from the author; no substantive technical disagreement appears in this excerpt. The direction — narrow scope, conservative restrictions, decouple from asymmetric join — reads as a direct response to reviewer guidance.
Outlook
The patch is positioned for PG20 and has been reduced to a self-contained feature. The main review risks that remain:
- Proving the restriction set is actually sufficient (e.g., VOLATILE functions, functions with
SETOF recordbut fixed column list via coldeflist, polymorphic functions instantiated to composite). - Cost modeling — a pushed-down function scan's rowcount/cost should still reflect local
prorows, but join selectivity on the remote may differ. - EXPLAIN output clarity: showing the function call inside the remote SQL rather than as a separate Function Scan node.