Missing Graph Node Handlers in expression_tree_mutator_impl()
Core Problem
PostgreSQL's SQL/PGQ (Property Graph Queries) implementation introduces several new parse/plan node types to represent GRAPH_TABLE ... MATCH ... COLUMNS (...) constructs:
T_GraphPattern— the overall MATCH patternT_GraphElementPattern— an individual vertex/edge pattern element (with alabelexprboolean expression and aquantifierIntList)T_GraphPropertyRef— reference to a property of a bound graph element (carrieschar *elvarname)T_GraphPropertyLabel/T_GraphLabelRef— primitive label-reference nodes with no expression subnodes
The tree-traversal infrastructure in nodeFuncs.c exposes two symmetric facilities: expression_tree_walker() (read-only traversal) and expression_tree_mutator() (traversal that may substitute nodes, used heavily by the rewriter, planner, subquery pullup, and various parse-analysis transformations). Symmetry between the walker and the mutator is an invariant that PostgreSQL relies on: any node type that can appear in an expression tree must be recognized by both, otherwise transformation passes will fail with "unrecognized node type: N" coming from the default: branch of the switch in the mutator.
When the graph-table feature was added, the walker was taught about all three node types but the mutator was not. Because range_table_mutator_impl() descends into RangeTblEntry::graph_table and RangeTblEntry::graph_table_columns via MUTATE(), any code path that mutates an RTE containing a GRAPH_TABLE eventually hits the missing cases. The reporter's trigger is a GRAPH_TABLE subquery inside a HAVING clause:
SELECT 1 FROM hv GROUP BY id
HAVING (SELECT COUNT(*) FROM GRAPH_TABLE(gh MATCH (a) COLUMNS (a.id AS x)))
HAVING forces subquery processing during parse analysis / transformation, which invokes the mutator on the sub-RTE and trips over node 106 (T_GraphPattern). The user-visible symptom is a bare ERROR: unrecognized node type: 106.
Why the Omission Is Architecturally Significant
The initial reaction (documented in Ashutosh Bapat's first response) was skepticism: the name expression_tree_mutator suggests it handles only Expr-rooted nodes, and graph pattern nodes could be thought of as "structural" nodes that should be rewritten into regular expressions before any generic mutator sees them. But that mental model is wrong for two reasons Ashutosh identifies:
- The mutator is already called during the transformation phase, before any rewrite to lower-level operators happens. GRAPH_TABLE nodes legitimately persist in the tree at the point where mutators run.
expression_tree_mutator_impl()has always handled non-Exprstructural nodes —List,TableFunc,RangeTblRef, etc. — so the "expression only" framing is historical misnaming, not a real contract.
Given that, the correct fix is simply to restore walker/mutator symmetry.
Patch Evolution and Technical Details
The patch went through three substantive refinements:
1. Initial patch — add the three missing switch cases
The v1 patch added case T_GraphPattern, case T_GraphElementPattern, case T_GraphPropertyRef to expression_tree_mutator_impl(), each doing FLATCOPY of the node followed by MUTATE on the relevant sub-trees.
2. Review concern: shallow-copied char *elvarname
Ashutosh flagged that GraphPropertyRef.elvarname is a char * that FLATCOPY copies by pointer only — the mutated node shares the string buffer with the original. He resolved his own concern by noting that XmlExpr.name and other existing nodes do the same; the convention in the mutator is that immutable scalar strings are not duplicated. This is consistent with the broader PostgreSQL style where node-tree copies assume string constants are effectively immortal (palloc'd in the same memory context or longer-lived).
3. Missing labelexpr and quantifier handling on GraphElementPattern
The more substantive defect Ashutosh found: the walker also did not descend into GraphElementPattern.labelexpr (a boolean expression) or GraphElementPattern.quantifier (an IntList). Both omissions needed fixing:
labelexpr: a genuine boolean expression tree containingColumnRef/BoolExprat raw-parse time and ordinary booleanExprnodes after analysis. It must be WALKed and MUTATEd; skipping it means rewriter substitutions (e.g.,Varrenumbering during subquery pullup, or constant folding) silently miss these subtrees.quantifier: anIntList. Walkers conventionally skip bare Int/Oid lists (they contain no Node children worth visiting), but mutators must still copy them, otherwise the mutated tree shares list storage with the original. This is the same convention used for other IntList/OidList members across the mutator.
Ashutosh also extended raw_expression_tree_walker() to cover labelexpr, keeping raw-tree and analyzed-tree traversal consistent. He updated the regression test so the query exercises the labelexpr path.
4. Robert Haas's addition: T_GraphLabelRef
Robert independently hit the same bug and pointed out one more missing case: T_GraphLabelRef belongs in the "Primitive node types with no expression subnodes" fall-through group of the mutator (where nodes are simply FLATCOPY-ed and returned), mirroring its placement in the walker. Ashutosh's final patch also added T_GraphPropertyLabel to that same primitive-node group.
5. Commit-time polish
Peter Eisentraut committed the patch, noting two cleanups:
- Reformatted the test query per earlier style suggestion.
- Reordered the switch cases so walker and mutator present nodes in the same order — a low-cost maintenance win that makes future divergences easier to spot by diffing the two functions.
Key Technical Insights
-
Walker/mutator symmetry is a hard invariant. Any patch introducing a new Node type that can appear in expression trees must touch both functions, plus
raw_expression_tree_walker()if the node appears pre-analysis, plus copyfuncs/equalfuncs/outfuncs/readfuncs. Missing the mutator is a latent bug that only surfaces when a specific transformation pass runs over the node — here, HAVING-clause subquery processing. -
Mutator handling of Int/Oid lists is not optional. Even though walkers can ignore them, mutators must copy them so that the output tree is fully independent of the input. This is easy to overlook because there's nothing "interesting" to mutate.
-
The "primitive node" fall-through group in both walker and mutator is the right home for label-reference nodes that carry only scalar payloads — avoiding the trap of writing no-op cases that look like they do something.
-
range_table_mutator_impl()is the caller chain that exposed this. Graph tables live inside RTEs, so any RTE-level mutation (subquery pullup,flatten_join_alias_vars, etc.) routes through these new node types. This is whyHAVINGwith a GRAPH_TABLE subquery is a reliable trigger but a plain top-level GRAPH_TABLE might not be. -
GRAPH_TABLE nodes persist through transformation, contrary to the initial intuition that they would be lowered to ordinary expressions early. The fix acknowledges that the graph-pattern node types are first-class citizens of the expression-tree infrastructure during parse analysis and planning.
Participant Roles
- Satya Narlapuram — reporter and author of the initial patch; provided a minimal reproducer.
- Ashutosh Bapat (committer, significant optimizer/partitioning experience) — did the substantive review, identified the
labelexpr/quantifiergap, extended coverage toraw_expression_tree_walker, and produced the revised patches. His domain familiarity with node-tree traversal conventions drove the real depth of the fix. - Robert Haas (major committer) — independently encountered the bug and added the
T_GraphLabelRefobservation, reinforcing that this was affecting real work, not just a theoretical corner case. - Peter Eisentraut (major committer, owner of much of the SQL/PGQ work in PostgreSQL) — committed the final patch with minor cosmetic fixes, including reordering switch cases for walker/mutator symmetry. His ownership of the graph-table feature makes him the natural committer.
Broader Significance
This thread is a small but representative example of a recurring class of bug in PostgreSQL: feature additions that introduce new Node types but under-cover the tree-traversal infrastructure. The graph-table feature is new enough that these gaps are still being shaken out. The fix's cleanliness — and the fact that two independent developers hit it within a week — suggests there may be value in a systematic audit of all T_Graph* nodes against every traversal function (copyfuncs, equalfuncs, outfuncs, readfuncs, expression_tree_walker, expression_tree_mutator, raw_expression_tree_walker, planstate_tree_walker, etc.) before the feature stabilizes.