Analysis: Unknown-type Literal Resolution in GRAPH_TABLE ... COLUMNS
Core Problem
PostgreSQL's type system has a peculiar intermediate type — UNKNOWNOID — used for string literals whose type hasn't yet been inferred from context. For example, in SELECT 'val1', the literal 'val1' initially has type unknown; the parser/analyzer is expected to resolve it to a concrete type (usually text) before execution, because many downstream operations (ORDER BY comparisons, UNION type unification, output formatting via type output functions) cannot operate on the pseudo-type unknown.
The workhorse for this late resolution on SELECT target lists is resolveTargetListUnknowns(), invoked from transformSelectStmt(). It walks the target list and coerces any remaining unknown-typed entries to text — but only when pstate->p_resolve_unknowns is true. That flag is significant: in contexts like INSERT ... SELECT or set-operation arms, unknown resolution is deliberately deferred so type unification with the target columns / peer arms can drive the choice.
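The gating described above can be illustrated with a toy model. This is a minimal Python sketch, not PostgreSQL code: the `TargetEntry` class, the string type labels, and the `resolve_unknowns` parameter are stand-ins invented for illustration, and the flag check is folded into the function for brevity regardless of where the real check lives.

```python
from dataclasses import dataclass

UNKNOWN = "unknown"   # stand-in for UNKNOWNOID
TEXT = "text"         # stand-in for the text type


@dataclass(frozen=True)
class TargetEntry:
    """Hypothetical, simplified target-list entry: a name plus a type label."""
    name: str
    type_name: str


def resolve_target_list_unknowns(target_list, resolve_unknowns):
    """Toy analogue of the late-resolution pass.

    When resolve_unknowns is False the entries are returned unchanged, so the
    caller (e.g. an INSERT or a set operation) can pick the type itself.
    """
    if not resolve_unknowns:
        return list(target_list)
    return [
        TargetEntry(te.name, TEXT) if te.type_name == UNKNOWN else te
        for te in target_list
    ]


columns = [TargetEntry("c", UNKNOWN), TargetEntry("n", "integer")]
resolved = resolve_target_list_unknowns(columns, resolve_unknowns=True)
deferred = resolve_target_list_unknowns(columns, resolve_unknowns=False)
print([te.type_name for te in resolved])   # the unknown entry becomes text
print([te.type_name for te in deferred])   # the unknown entry is left alone
```

Only entries still labelled unknown are touched; columns that already have a concrete type pass through in both modes.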
The bug report concerns SQL/PGQ's GRAPH_TABLE construct (new in the SQL:2023 property graph feature landing in PG). Its parse-analysis routine, transformRangeGraphTable(), processes the COLUMNS (...) clause by calling transformExpr() and assign_list_collations(), but it never calls resolveTargetListUnknowns(). Consequently, a query like:
SELECT * FROM GRAPH_TABLE (g, MATCH ... COLUMNS ('val1' AS c)) ORDER BY c;
leaks unknown-typed columns out of the graph-table subquery, producing downstream failures (ORDER BY has no comparison operator for unknown, UNION cannot unify, and the client wire protocol can't format the value).
Why This Matters Architecturally
GRAPH_TABLE is essentially a new relation-producing construct that must obey the same invariants as any other range-table entry: its output columns must have fully resolved types by the end of parse analysis. Failure to run the same "cleanup" step that SELECT target lists get means the rest of the planner/executor sees a malformed RTE. This is a classic instance of a new parse-analysis path forgetting to call one of the implicit contract-enforcing steps that transformSelectStmt() performs at the tail end.
Proposed Fix
The patch adds a call to resolveTargetListUnknowns() in transformRangeGraphTable() on the columns target list. Two iterations of the patch:
- v1: called resolveTargetListUnknowns() after assign_list_collations().
- v2 (after Junwang Zhao's review): moved the call to before assign_list_collations(), mirroring the ordering in transformSelectStmt() (resolveTargetListUnknowns → assign_query_collations). Functionally it may not matter, since collation assignment on a freshly coerced text node produces the same result, but consistency with the existing pattern aids readability and guards against future divergence if collation logic starts depending on concrete types.
Ashutosh Bapat's review raised the key subtlety: the fix must respect pstate->p_resolve_unknowns. If GRAPH_TABLE appears in a context where unknown resolution should be deferred (e.g., nested in set operations or an INSERT source), blindly coercing to text would be wrong. The unconditional call would override the caller's policy. Ashutosh also collapsed the test battery (ORDER BY, UNION, output) into a single query that exercises all three pathways, and added a test where p_resolve_unknowns is false to verify the deferred case.
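Why deferral matters can be shown with a toy model of set-operation type unification. This is a Python sketch under simplifying assumptions: `unify_arm_types` and `arm_column_type` are hypothetical helpers, and PostgreSQL's real unification rules (type categories, preferred types, implicit casts) are considerably richer than the three cases modelled here.

```python
UNKNOWN = "unknown"
TEXT = "text"


def unify_arm_types(left, right):
    """Toy rule for one output column of a two-arm set operation.

    An unresolved literal adopts the peer arm's type, two unresolved
    literals fall back to text, and two different concrete types fail.
    """
    if left == UNKNOWN and right == UNKNOWN:
        return TEXT
    if left == UNKNOWN:
        return right
    if right == UNKNOWN:
        return left
    if left != right:
        raise TypeError(f"types {left} and {right} cannot be matched")
    return left


def arm_column_type(literal_type, resolve_unknowns):
    """Type an arm reports for a bare literal column, given the caller's policy."""
    if literal_type == UNKNOWN and resolve_unknowns:
        return TEXT
    return literal_type


# Deferred resolution: the literal stays unknown and adopts the peer's type.
deferred = unify_arm_types(arm_column_type(UNKNOWN, False), "integer")

# Eager coercion inside the arm: the literal is already text, so unification fails.
try:
    unify_arm_types(arm_column_type(UNKNOWN, True), "integer")
    eager_failed = False
except TypeError:
    eager_failed = True

print(deferred, eager_failed)
```

In this model the eager path fails in the same way that `SELECT '1'::text UNION SELECT 2` is rejected while `SELECT '1' UNION SELECT 2` is accepted: once the literal has been pinned to text, the peer arm can no longer drive the choice.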
Peter Eisentraut's Concern
Peter (committer, and the person who has been shepherding the SQL/PGQ work) reports two problems with the latest submission:
- Patch application failure: the subject line [PATCH 5/5] suggests this is the tail of an unpublished local patch series, so context (probably other GRAPH_TABLE fixes) is missing from the public thread.
- Test no longer reproduces on master: applying only the test manually, output matches expected. This implies either (a) another commit incidentally fixed the symptom (possibly by changing where resolution happens in a parent context), or (b) the test is insufficiently targeted and needs to isolate the GRAPH_TABLE RTE as the sole producer of the unknown column, without any outer context that would cause resolveTargetListUnknowns to be invoked on the wrapping SELECT.
Explanation (b) looks attractive at first but does not hold up. If the test query is something like SELECT c FROM GRAPH_TABLE(... COLUMNS ('val1' AS c)), the outer SELECT's target list contains a Var referencing column c, and that Var carries a fixed vartype set when the Var is built from the RTE. An outer resolveTargetListUnknowns pass therefore cannot repair a column that the RTE already exposes as unknown. Explanation (a) is the more plausible one: another change, possibly one of the other GRAPH_TABLE patches in the series, already moved or added the resolution step, making this particular fix redundant. Hence Peter's request for either a reproducing test on plain master or confirmation that the fix is subsumed.
Key Technical Insights
- resolveTargetListUnknowns vs. assign_list_collations ordering: the established convention in transformSelectStmt is resolution-then-collation. The v2 patch aligns GRAPH_TABLE with this.
- p_resolve_unknowns is load-bearing: any new target-list-like context must consult it rather than unconditionally coercing. This is the same flag that makes SELECT 'x' UNION SELECT 1::text type-unify correctly.
- GRAPH_TABLE columns are a target-list-equivalent: they produce the output tuple descriptor of the RTE, so they need the full target-list finalization protocol, not just expression transformation + collation.
- Patch-series hygiene: submitting a tail patch (5/5) without the preceding context makes independent verification impossible, which is why Peter couldn't apply it.
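The finalization order summarized in the points above (resolve unknowns if the caller allows it, then assign collations) can be sketched as a small pipeline. This is a toy Python model, not the patch itself: `Column`, `coerce_unknowns`, `assign_collations`, and `finalize_columns` are hypothetical names, and the assumption that only text is collatable is a simplification. The order is observable here only because the toy coercion step never touches collation; in PostgreSQL the two orders may well be equivalent.

```python
from dataclasses import dataclass, replace
from typing import Optional

UNKNOWN = "unknown"
TEXT = "text"
COLLATABLE = {TEXT}   # toy assumption: only text is collatable in this model


@dataclass(frozen=True)
class Column:
    """Hypothetical output column: name, type label, optional collation."""
    name: str
    type_name: str
    collation: Optional[str] = None


def coerce_unknowns(cols):
    """Relabel any still-unknown column as text."""
    return [replace(c, type_name=TEXT) if c.type_name == UNKNOWN else c for c in cols]


def assign_collations(cols):
    """Give every collatable column the default collation."""
    return [replace(c, collation="default") if c.type_name in COLLATABLE else c for c in cols]


def finalize_columns(cols, p_resolve_unknowns):
    """Resolution first (only if the caller's policy allows it), then collation."""
    if p_resolve_unknowns:
        cols = coerce_unknowns(cols)
    return assign_collations(cols)


cols = [Column("c", UNKNOWN), Column("n", "integer")]
finalized = finalize_columns(cols, p_resolve_unknowns=True)
deferred = finalize_columns(cols, p_resolve_unknowns=False)
print(finalized)
print(deferred)
```

With resolution enabled, column c leaves the pipeline as text with a collation; with it disabled, c stays unknown and uncollated so that an enclosing context can still decide its type.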
Status
As of Peter's last message, the patch is in limbo pending either (a) rebased resubmission with a test that demonstrably fails on master, or (b) acknowledgment that a parallel fix obsoleted it.