Deep Technical Analysis: Generic Plans and "Initial" Pruning
The Core Problem
When PostgreSQL executes prepared statements against partitioned tables using generic plans, performance degrades linearly with partition count due to AcquireExecutorLocks(). This function, called during CheckCachedPlan() in the plan cache validation path, locks every relation referenced in the plan tree — including all partitions — even though runtime "initial" partition pruning will eliminate most of them before any scan occurs.
For a table with 2048 partitions where a query like SELECT * FROM partitioned_table WHERE key = $1 targets exactly one partition, the system still acquires 2048+ locks on every execution of the generic plan. This is pure overhead: the locks on 2047 partitions serve no purpose because those partitions are immediately pruned away during executor startup.
Why This Matters Architecturally
The problem sits at the intersection of three subsystems:
-
Plan Cache (
plancache.c): Validates cached plans by re-acquiring locks on all referenced relations, checking for invalidation messages that would mark the plan stale. -
Runtime Partition Pruning: A two-phase system — "initial" pruning (using PARAM_EXTERN values known at executor startup) and "exec" pruning (using PARAM_EXEC values computed during execution). Initial pruning can eliminate partitions before any scan begins.
-
Executor Startup (
ExecutorStart/InitPlan): Builds the PlanState tree and opens relations. Currently happens after all locks are acquired.
The fundamental tension is that pruning requires parameter values and some executor machinery, but locking currently precedes all executor work. Bridging this gap without violating invariants about plan immutability, lock ordering, and relation validity is the central challenge.
Evolution of Proposed Solutions
Approach 1: Pruning Inside AcquireExecutorLocks() (v1-v5, Dec 2021 – Mar 2022)
The initial approach performed initial pruning within AcquireExecutorLocks() itself, using the parameter values to determine which partitions to skip. This required:
- Creating a temporary execution context for evaluating pruning expressions
- Walking the plan tree to identify which nodes survive pruning
- Locking only the surviving partitions
Key concern (Robert Haas): This creates a situation where parts of the plan tree reference relations that were never locked. If DDL (e.g., DROP INDEX) concurrently modifies a pruned-but-unlocked partition, the plan contains references to non-existent objects. While we won't execute those plan nodes, code like EXPLAIN or auto_explain might inspect them.
Second concern (Robert Haas): Performing pruning twice (once for locking, once during executor init) risks getting different answers if an IMMUTABLE-labeled function is actually VOLATILE, potentially leading to the executor operating on unlocked relations.
Approach 2: ExecutorPrep Phase (v4-v8, Feb – Mar 2022)
Robert suggested creating a new executor "prep" phase that would:
- Walk the plan tree before full initialization
- Determine which relations need locks (collecting RT indexes)
- Perform initial pruning and record results in a
PlanPrepOutputstructure - Pass this structure through to
ExecutorStart()so pruning isn't repeated
This evolved through several iterations with naming changes (ExecutorPrep → ExecutorGetLockRels) and structural refinements.
Andres Freund's concern: Added overhead even when pruning doesn't help — extra allocations (makeNode, palloc0, AllocSetContextCreate) that affect already-profiled bottlenecks for simple queries.
Approach 3: PartitionPruneInfo in PlannedStmt (v9-v16, Apr – May 2022)
David Rowley proposed a cleaner architecture:
- Move
PartitionPruneInfofrom Append/MergeAppend nodes into a flat list inPlannedStmt - Index into this list from the Append/MergeAppend nodes
- During
AcquireExecutorLocks(), iterate over this list to perform pruning - Store results alongside the plan for executor reuse
This eliminated the need for a full plan tree walk and the plan_tree_walker infrastructure. The PlannedStmt.minLockRelids bitmapset tracked which relations must always be locked (unprunable ones), and pruning results determined which additional partitions to lock.
Tom Lane's skepticism: Questioned whether the optimization matters under real-world conditions, arguing that if pruning saves enough to be worthwhile, the plan cache would already prefer custom plans. Also noted the patches were "desperately undercommented."
Approach 4: Locks in ExecutorStart() (v32-v52, Jan 2023 – Aug 2024)
Tom Lane proposed a fundamentally different architecture: eliminate AcquireExecutorLocks() entirely and integrate lock acquisition into executor startup. The executor would:
- Lock relations as it descends the plan tree during ExecInitNode()
- After each lock acquisition, check if the CachedPlan has been invalidated
- If invalidated, abort initialization and signal the caller to replan
This required:
- A new
EXEC_FLAG_GET_LOCKSeflags bit - ExecInitNode() functions returning NULL on invalidation detection
- ExecEndNode() handling partially-initialized PlanState trees
- All ExecutorStart() callers implementing a replan-and-retry loop
Challenges encountered:
- Partially initialized PlanState trees were difficult to clean up safely
- The PORTAL_MULTI_QUERY case required moving ExecutorStart() from PortalRun() to PortalStart(), which broke snapshot handling and CommandCounterIncrement() ordering
- Lock ordering became plan-shape-dependent, introducing potential deadlocks
- ForeignScan and CustomScan End callbacks might be called without Begin having run
Approach 5: Hybrid — Pruning Before ExecInitNode() (v53-v57, Sep 2024 – Jan 2025)
Amit combined insights from multiple approaches:
- Perform initial pruning as a separate step in InitPlan(), before ExecInitNode()
- Lock only unpruned partitions at that point
- Check CachedPlan validity after locking
- If invalid, replan using a new
GetSingleCachedPlan()function
This was committed in PostgreSQL 18 as commit 525392d57, along with preparatory commits (bb3ec16e, d47cbf47, cbc12791).
The Revert (May 2025)
Tom Lane discovered critical flaws in the committed UpdateCachedPlan() mechanism:
- Rule expansion: DO ALSO rules can change the number of PlannedStmts in stmt_list, breaking the assumption that list structure is preserved
- Concurrent iteration: Code iterating stmt_list could see an inconsistent state
- Validity flag: Setting
plan->is_valid = trueafter replanning was unsafe due to possible intervening invalidation events
The commit was reverted, but preparatory infrastructure (moving PartitionPruneInfo to PlannedStmt, performing initial pruning before ExecInitNode(), tracking unpruned relids) was retained.
Approach 6: ExecutorPrep() Called from GetCachedPlan() (v58-v9, Nov 2025 – Mar 2026)
The current approach (post-revert) revisits the idea of performing pruning during plan cache validation:
ExecutorPrep()factors out permission checks, range table initialization, and initial pruning fromInitPlan()GetCachedPlan()callsExecutorPrep()to determine surviving partitions, then locks only those- Results (an EState) are passed via
CachedPlanPrepDatato ExecutorStart() - The CachedPlan itself is never modified
Key refinement (v10, Mar-Apr 2026): Moved lock acquisition out of GetCachedPlan() entirely, making callers responsible via AcquireExecutorLocks() as an external function. This avoids the layering violation of GetCachedPlan() doing too much. Multi-statement CachedPlans (from rules) fall back to conservative locking.
Key Design Decisions and Tradeoffs
1. Plan Immutability
Plans in the plan cache are strictly read-only. Any approach that modifies the plan tree or CachedPlan state from within the executor violates this invariant. The reverted patch's UpdateCachedPlan() crossed this line.
2. Pruning Consistency
Initial pruning must produce the same result both when deciding what to lock and when deciding what to execute. Running it twice is both wasteful and theoretically unsafe. All successful designs carry the pruning result forward rather than recomputing it.
3. Lock Ordering
The system must maintain deterministic lock ordering to avoid deadlocks. Executor-driven locking makes ordering dependent on plan shape (join order), which is unstable across replans. Locking in plan cache validation or a dedicated prep phase maintains a more predictable order.
4. Multi-Statement Plans
DO ALSO rules can create CachedPlans containing multiple PlannedStmts. Statement N+1's pruning expressions might depend on state modified by statement N (via CommandCounterIncrement). This forces conservative locking for multi-statement plans.
5. Extension Compatibility
ExecutorStart_hook is widely used. Any design that makes standard_ExecutorStart() capable of "failing" forces extension authors to handle a new code path. The final approach avoids this by performing pruning/locking before the hook is called.
Current State (v10/v11, 2026)
The patch series is structured as:
- Move lock acquisition out of GetCachedPlan() — callers now own locking responsibility
- Refactor initial pruning setup — ExecDoInitialPruning() handles both setup and pruning
- Introduce ExecutorPrep() — factors out range table init, permissions, and pruning from InitPlan()
- Pruning-aware locking for single-statement cached plans —
ExecutorPrepAndLock()performs pruning, then locks only survivors
Performance results show the optimization eliminates the O(N) partition scaling problem:
- At 1024 partitions: ~5,450 TPS → ~20,500 TPS (277% improvement)
- At 4096 partitions: ~1,560 TPS → ~17,294 TPS (1008% improvement)
- No regression for custom plans or non-partitioned queries
Unresolved Issues
-
Costing: The planner doesn't account for runtime pruning benefits when comparing generic vs. custom plan costs, causing
plan_cache_mode = autoto still prefer custom plans even when generic would be faster with pruning. -
Parallel workers: Workers need the leader's pruning results to avoid re-evaluating (and potentially getting different answers). The current patch includes a debug-only verification that worker results match leader results.
-
Non-portal callers: SPI and SQL function paths don't yet use pruning-aware locking, falling back to conservative locking.