[Patch]Add Graph* node support to expression_tree_mutator

First seen: 2026-04-21 15:56:44+00:00 · Messages: 6 · Participants: 4

Latest Update

2026-05-06 · opus 4.7

Missing Graph Node Handlers in expression_tree_mutator_impl()

Core Problem

PostgreSQL's SQL/PGQ (Property Graph Queries) implementation introduces several new parse/plan node types to represent GRAPH_TABLE ... MATCH ... COLUMNS (...) constructs:

The tree-traversal infrastructure in nodeFuncs.c exposes two symmetric facilities: expression_tree_walker() (read-only traversal) and expression_tree_mutator() (traversal that may substitute nodes, used heavily by the rewriter, planner, subquery pullup, and various parse-analysis transformations). Symmetry between the walker and the mutator is an invariant that PostgreSQL relies on: any node type that can appear in an expression tree must be recognized by both, otherwise transformation passes will fail with "unrecognized node type: N" coming from the default: branch of the switch in the mutator.

When the graph-table feature was added, the walker was taught about all three node types but the mutator was not. Because range_table_mutator_impl() descends into RangeTblEntry::graph_table and RangeTblEntry::graph_table_columns via MUTATE(), any code path that mutates an RTE containing a GRAPH_TABLE eventually hits the missing cases. The reporter's trigger is a GRAPH_TABLE subquery inside a HAVING clause:

SELECT 1 FROM hv GROUP BY id
HAVING (SELECT COUNT(*) FROM GRAPH_TABLE(gh MATCH (a) COLUMNS (a.id AS x)))

HAVING forces subquery processing during parse analysis / transformation, which invokes the mutator on the sub-RTE and trips over node 106 (T_GraphPattern). The user-visible symptom is a bare ERROR: unrecognized node type: 106.

Why the Omission Is Architecturally Significant

The initial reaction (documented in Ashutosh Bapat's first response) was skepticism: the name expression_tree_mutator suggests it handles only Expr-rooted nodes, and graph pattern nodes could be thought of as "structural" nodes that should be rewritten into regular expressions before any generic mutator sees them. But that mental model is wrong for two reasons Ashutosh identifies:

  1. The mutator is already called during the transformation phase, before any rewrite to lower-level operators happens. GRAPH_TABLE nodes legitimately persist in the tree at the point where mutators run.
  2. expression_tree_mutator_impl() has always handled non-Expr structural nodes — List, TableFunc, RangeTblRef, etc. — so the "expression only" framing is historical misnaming, not a real contract.

Given that, the correct fix is simply to restore walker/mutator symmetry.

Patch Evolution and Technical Details

The patch went through three substantive refinements:

1. Initial patch — add the three missing switch cases

The v1 patch added case T_GraphPattern, case T_GraphElementPattern, case T_GraphPropertyRef to expression_tree_mutator_impl(), each doing FLATCOPY of the node followed by MUTATE on the relevant sub-trees.

2. Review concern: shallow-copied char *elvarname

Ashutosh flagged that GraphPropertyRef.elvarname is a char * that FLATCOPY copies by pointer only — the mutated node shares the string buffer with the original. He resolved his own concern by noting that XmlExpr.name and other existing nodes do the same; the convention in the mutator is that immutable scalar strings are not duplicated. This is consistent with the broader PostgreSQL style where node-tree copies assume string constants are effectively immortal (palloc'd in the same memory context or longer-lived).

3. Missing labelexpr and quantifier handling on GraphElementPattern

The more substantive defect Ashutosh found: the walker also did not descend into GraphElementPattern.labelexpr (a boolean expression) or GraphElementPattern.quantifier (an IntList). Both omissions needed fixing:

Ashutosh also extended raw_expression_tree_walker() to cover labelexpr, keeping raw-tree and analyzed-tree traversal consistent. He updated the regression test so the query exercises the labelexpr path.

4. Robert Haas's addition: T_GraphLabelRef

Robert independently hit the same bug and pointed out one more missing case: T_GraphLabelRef belongs in the "Primitive node types with no expression subnodes" fall-through group of the mutator (where nodes are simply FLATCOPY-ed and returned), mirroring its placement in the walker. Ashutosh's final patch also added T_GraphPropertyLabel to that same primitive-node group.

5. Commit-time polish

Peter Eisentraut committed the patch, noting two cleanups:

Key Technical Insights

  1. Walker/mutator symmetry is a hard invariant. Any patch introducing a new Node type that can appear in expression trees must touch both functions, plus raw_expression_tree_walker() if the node appears pre-analysis, plus copyfuncs/equalfuncs/outfuncs/readfuncs. Missing the mutator is a latent bug that only surfaces when a specific transformation pass runs over the node — here, HAVING-clause subquery processing.

  2. Mutator handling of Int/Oid lists is not optional. Even though walkers can ignore them, mutators must copy them so that the output tree is fully independent of the input. This is easy to overlook because there's nothing "interesting" to mutate.

  3. The "primitive node" fall-through group in both walker and mutator is the right home for label-reference nodes that carry only scalar payloads — avoiding the trap of writing no-op cases that look like they do something.

  4. range_table_mutator_impl() is the caller chain that exposed this. Graph tables live inside RTEs, so any RTE-level mutation (subquery pullup, flatten_join_alias_vars, etc.) routes through these new node types. This is why HAVING with a GRAPH_TABLE subquery is a reliable trigger but a plain top-level GRAPH_TABLE might not be.

  5. GRAPH_TABLE nodes persist through transformation, contrary to the initial intuition that they would be lowered to ordinary expressions early. The fix acknowledges that the graph-pattern node types are first-class citizens of the expression-tree infrastructure during parse analysis and planning.

Participant Roles

Broader Significance

This thread is a small but representative example of a recurring class of bug in PostgreSQL: feature additions that introduce new Node types but under-cover the tree-traversal infrastructure. The graph-table feature is new enough that these gaps are still being shaken out. The fix's cleanliness — and the fact that two independent developers hit it within a week — suggests there may be value in a systematic audit of all T_Graph* nodes against every traversal function (copyfuncs, equalfuncs, outfuncs, readfuncs, expression_tree_walker, expression_tree_mutator, raw_expression_tree_walker, planstate_tree_walker, etc.) before the feature stabilizes.