2026-06-04 · claude-opus-4-6

Deep Technical Analysis: Deep Copy with Mutation in PostgreSQL Tree Walkers

Core Problem

PostgreSQL's query processing pipeline frequently needs to create modified copies of expression trees (parse trees, query trees, etc.). The current approach requires two separate full tree traversals: first a copyObject() to deep-copy the entire tree, then a mutation pass (via expression_tree_mutator() or similar) to modify specific nodes in the copy. This pattern appears throughout the codebase in rewriting, partition management, and view handling.

The fundamental architectural tension is:

copyObject() performs a complete deep copy of all Node structures and non-Node data (C strings, arrays).
expression_tree_mutator() / query_tree_mutator() traverse and can transform Node structures, but they do NOT deep-copy non-Node leaf data (plain C strings, scalar arrays). They only shallow-copy (FLATCOPY) the immediate node and recurse into child Nodes.

This means mutators cannot substitute for copyObject() when the caller needs a fully independent copy — the non-Node fields would still alias the original. Robert Haas discovered this the hard way.

Why This Matters Architecturally

Performance Costs

Double traversal: Walking a potentially large expression tree twice (once to copy, once to mutate) doubles function call overhead, stack manipulation, and instruction cache pressure.
Redundant memory allocation: copyObject() allocates nodes that will be immediately replaced during the mutation pass. These intermediate allocations waste memory and fragment the heap.
Cache locality degradation: The two-pass approach means the mutated tree's memory layout doesn't follow traversal order. A single copy-and-mutate pass would produce nodes allocated consecutively in traversal order, improving cache line utilization in subsequent accesses.

Code Complexity

The copy-then-mutate pattern is spread across multiple subsystems:

Rewrite rules (ChangeVarNodes after copyObject in rewriteTargetView)
Partition management (map_partition_varattnos called twice after copyObject in CreateTriggerFiringOn)
Target list manipulation (ReplaceVarsFromTargetList after copy and ChangeVarNodes)

Proposed Solutions

Robert Haas's Implicit Proposal

A "fully-deeply-copying mutator" that combines copyObject() semantics with mutation in a single pass. This would deep-copy everything (including non-Node data like C strings) while simultaneously applying transformations to nodes as they're copied. The mutator callback would receive nodes and either return them transformed or let the default deep-copy proceed.

Ashutosh Bapat's Proposal: Parameterized copyObject

Add a boolean flag to copyObject() controlling whether subtrees are recursively copied:

copyObject(node, true) = current behavior (full deep copy)
copyObject(node, false) = copy only the node itself plus non-Node data (strings, arrays), but NOT recurse into child Node fields

This would replace the FLATCOPY macro used inside mutators with a proper copyObject(node, false) call. Implementation would modify the auto-generated _copy* functions to conditionally define COPY_NODE_FIELD as COPY_SCALAR_FIELD when the flag is false.

Limitation: This doesn't directly solve the two-pass problem — it's more about making the mutator's internal copying more correct/complete. It doesn't merge the copy and mutation into one pass.

David Rowley's Pragmatic Assessment

Rather than pursuing a complex refactoring, Rowley suggests:

Benchmark first: Add extra copyObject() calls at the identified sites to see if the additional traversal produces measurable slowdown. If not, the optimization isn't worth the code complexity.
Recognize compounding complexity: In cases like CreateTriggerFiringOn where map_partition_varattnos() is called twice, a single-pass solution would need to perform both attribute remappings simultaneously, requiring non-trivial refactoring of the remapping logic.
Memory layout benefits: Acknowledges that single-pass copy-mutate would produce better memory locality due to consecutive allocation in traversal order, assuming palloc doesn't reuse freelist chunks.

Key Design Tradeoffs

Approach	Advantage	Disadvantage
Current (copy + mutate)	Simple, separation of concerns, each function has clear semantics	Double traversal, wasted allocations, poor cache locality
Combined copy-mutate	Single traversal, better memory layout, fewer allocations	More complex code, harder to compose multiple transformations, each mutation site needs custom logic
Parameterized copyObject	Cleaner internal mutator implementation	Doesn't solve the fundamental two-pass issue

Unresolved Questions

Is this actually measurable? Rowley's suggestion to benchmark by adding extra copies is practical but hasn't been executed.
Composability: When multiple transformations are needed (e.g., two map_partition_varattnos calls), combining them into a single pass requires merging logically separate operations, potentially creating maintenance burdens.
Code generation implications: PostgreSQL auto-generates copy/equal/read/write functions from node definitions. Any fundamental change to copy semantics needs to integrate with this generation infrastructure.
Where does non-Node data live? The original observation — that mutators don't deep-copy C strings — is a subtle correctness concern. If a mutator returns a modified tree that shares string pointers with the original, and the original is later freed, dangling pointers result.

Assessment

This thread identifies a real but potentially minor performance issue. The pattern of copy-then-mutate is architecturally inelegant and theoretically wasteful, but the actual performance impact is unclear. The discussion died without a patch or benchmark results, suggesting the community views this as a "nice to have" optimization rather than a pressing need. The most actionable next step (per Rowley) would be empirical measurement to determine if the extra traversals are even detectable in realistic workloads.

deep copy with mutation?

Latest Update