2026-05-18 · claude-opus-4-6

Technical Analysis: Misleading Error Message in ProcessUtilitySlow T_CreateStatsStmt

Core Problem

This thread begins with a seemingly simple observation about a misleading error message but evolves into a significant architectural refactoring discussion about how CREATE STATISTICS is processed in PostgreSQL's utility command pipeline.

The Surface Issue

When a user writes:

CREATE STATISTICS alt_stat2 ON a, b FROM tftest(1);

where tftest is a table-returning function, the error message returned is:

ERROR: only a single relation is allowed in CREATE STATISTICS

This is misleading because:

- The user is providing a single relation, just the wrong kind.
- The actual problem is that a table function is not a plain table name; the number of relations has nothing to do with it.

The Deeper Architectural Issues

Upon investigation, multiple deeper problems were identified:

- Redundant relation resolution: The relation name is resolved (via RangeVarGetRelid) twice: once in ProcessUtilitySlow() and again inside CreateStatistics(). This is both wasteful and potentially dangerous (the CVE-2014-0062 pattern: resolving the same non-fully-qualified name twice might yield different results due to concurrent DDL).
- ProcessUtilitySlow doing too much work: The case T_CreateStatsStmt block in ProcessUtilitySlow performs parse analysis (transformStatsStmt) and opens the relation, which violates the design intent of ProcessUtilitySlow as merely a dispatching switch.
- Double locking: Both ProcessUtilitySlow and CreateStatistics acquire ShareUpdateExclusiveLock on the same relation. The second acquisition becomes unnecessary once the code is restructured to open the relation only once.
- The error check itself is mischaracterized: As Peter Eisentraut pointed out, the !IsA(rel, RangeVar) check doesn't examine the relation kind (relkind) at all. It checks whether the FROM clause entry is a table name as opposed to some other grammar production (VALUES, JOIN, table function, XMLTABLE, etc.). The error message should reflect this distinction.

Proposed Solutions

1. Error Message Fix (Committed by Álvaro)

The immediate fix changed the error message to better reflect what the check actually validates:

ERROR: cannot create statistics on specified relation
DETAIL: CREATE STATISTICS only supports tables, materialized views, foreign tables, and partitioned tables.

Tom Lane argued against listing "partitioned tables" separately, since they are generally subsumed under "tables". This was accepted, so the DETAIL wording shown above reflects the proposal before that item was dropped.

2. Peter Eisentraut's Grammar-Level Fix (Committed)

Peter proposed that the error could be eliminated entirely by tightening the grammar from:

FROM from_list

to:

FROM qualified_name_list

This would reject non-table-name entries at parse time rather than requiring a runtime check. He also proposed a wording change making the error about "table names in the FROM clause." Tom Lane agreed on the wording but insisted on keeping ERRCODE_FEATURE_NOT_SUPPORTED rather than a syntax error code, since multi-relation statistics is intentionally left as syntax space for future features.

3. The Refactoring Patch (Under Discussion)

The larger refactoring (iterated through v1–v5+ by jian he, with significant v4 input from Álvaro) restructures the pipeline:

- Move parse analysis into CreateStatistics(): Instead of ProcessUtilitySlow calling transformStatsStmt and then CreateStatistics, the transform is done inside CreateStatistics itself.
- Open relation only once: The relation is opened with ShareUpdateExclusiveLock in CreateStatistics and the Relation object is passed directly to transformStatsStmt.
- Simplify ATPostAlterTypeParse(): Since CreateStatistics() now handles transformation internally, the special-case code in ATPostAlterTypeParse that calls transformStatsStmt separately becomes unnecessary.

Tradeoff: Error Position Reporting

jian he identified that the refactoring loses error position information in one edge case:

ALTER TABLE t ALTER COLUMN a SET DATA TYPE text;

When this triggers re-validation of a statistics expression like (a + 1 IS NOT NULL), the current code reports the error position in the ALTER TABLE statement. After refactoring, the position is lost. Álvaro argued this is acceptable because the position was misleading anyway — it points to a location in the ALTER TABLE statement that has nothing to do with the operator error.

4. Future Design Question: Pre/Post Transform Nodes

Álvaro's latest message (May 2026) raises whether CreateStatsStmt should be split into two separate node types: one for the pre-transform state (raw parser output) and one for the post-transform state. This is a pattern used elsewhere in PostgreSQL (e.g., RawStmt vs. planned statements) and would make the data flow clearer, avoiding the stmt->transformed flag pattern.
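
To make the pre/post-transform idea concrete, here is a minimal sketch, assuming two hypothetical struct types (RawCreateStats and AnalyzedCreateStats) and hypothetical function names. None of these exist in PostgreSQL, and the thread does not specify what a split would look like; the sketch only shows how separate types let the compiler enforce the ordering that a transformed flag has to check at run time.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned int Oid;

/* Hypothetical pre-transform node: exactly what the parser produced. */
typedef struct RawCreateStats
{
    const char *relname;    /* unresolved table name */
    const char *expr_text;  /* untransformed expression */
} RawCreateStats;

/* Hypothetical post-transform node: name already resolved to an OID. */
typedef struct AnalyzedCreateStats
{
    Oid         relid;
    const char *expr_text;
} AnalyzedCreateStats;

/* Parse analysis is the only way to obtain an AnalyzedCreateStats. */
static AnalyzedCreateStats
analyze_create_stats(const RawCreateStats *raw, Oid resolved_relid)
{
    AnalyzedCreateStats out;

    assert(raw != NULL);
    out.relid = resolved_relid;
    out.expr_text = raw->expr_text;
    return out;
}

/*
 * Execution accepts only the analyzed form, so handing it a raw node is
 * an incompatible-pointer diagnostic at compile time rather than a
 * run-time flag check.
 */
static Oid
execute_create_stats(const AnalyzedCreateStats *stmt)
{
    assert(stmt != NULL);
    return stmt->relid;
}
```

The cost of this design is a second node type to maintain; the benefit is that no code path can mistake an untransformed statement for a transformed one.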

Key Technical Insights

The `!IsA(rel, RangeVar)` Check

The grammar for CREATE STATISTICS ... FROM from_list accepts the full from_list production, which can include JOINs, subqueries, VALUES, XMLTABLE, and table functions. The IsA(rel, RangeVar) check is a runtime guard that rejects everything except plain table names. This was intentionally designed to leave grammar space for future multi-relation statistics support.
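
The two distinct failure modes (wrong number of FROM entries versus a single entry that is not a plain table name) can be illustrated with a small self-contained model. This is a sketch, not PostgreSQL source: the reduced NodeTag enum, the simplified IsA macro, the StatsFromCheck result type, and check_stats_from_clause are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, reduced set of node tags; the real enum is far larger. */
typedef enum
{
    T_RangeVar,         /* plain (possibly qualified) table name */
    T_RangeFunction,    /* table function, e.g. tftest(1) */
    T_JoinExpr,         /* JOIN */
    T_RangeSubselect    /* subquery or VALUES */
} NodeTag;

typedef struct Node
{
    NodeTag type;
} Node;

/* Simplified stand-in for PostgreSQL's IsA() macro. */
#define IsA(nodeptr, tag) (((const Node *) (nodeptr))->type == T_##tag)

typedef enum
{
    STATS_FROM_OK,
    STATS_FROM_NOT_SINGLE,      /* zero or several FROM entries */
    STATS_FROM_NOT_TABLE_NAME   /* one entry, but not a table name */
} StatsFromCheck;

/*
 * Classify the FROM list of a CREATE STATISTICS statement. Note that
 * nothing here looks at relkind: the test is purely about which grammar
 * production produced the entry.
 */
static StatsFromCheck
check_stats_from_clause(Node **relations, size_t nrelations)
{
    assert(relations != NULL);

    if (nrelations != 1)
        return STATS_FROM_NOT_SINGLE;
    if (!IsA(relations[0], RangeVar))
        return STATS_FROM_NOT_TABLE_NAME;
    return STATS_FROM_OK;
}
```

Keeping the two outcomes separate is what allows each to carry its own error message, which is the point of the wording fix discussed in this thread.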

Lock Semantics

When ProcessUtilitySlow resolves the name to an OID and acquires ShareUpdateExclusiveLock, and then CreateStatistics does relation_open(relid, ShareUpdateExclusiveLock) again, the second call is a no-op (same lock level already held). Tom Lane pointed out that the second call should use NoLock to make explicit that we expect the lock to already be held. The refactoring eliminates this issue entirely.
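
The difference between re-requesting the lock and passing NoLock can be sketched with a toy backend-local lock table. Everything below is an illustrative assumption: the lock-mode values, the fixed-size table, the counter bookkeeping, and the function names are not PostgreSQL's lock manager. The model only captures the idea that NoLock documents, and in this sketch asserts, that the caller already holds the lock.

```c
#include <assert.h>

typedef unsigned int Oid;

/* Illustrative lock modes; the numeric values are not PostgreSQL's. */
typedef enum
{
    NoLock = 0,
    ShareUpdateExclusiveLock = 4
} LockMode;

#define MAX_HELD 8

/* Toy backend-local table of held locks, one slot per (relation, mode). */
static struct
{
    Oid      relid;
    LockMode mode;
    int      count;
} held[MAX_HELD];
static int nheld = 0;

/* How many times this backend has acquired (relid, mode). */
static int
hold_count(Oid relid, LockMode mode)
{
    for (int i = 0; i < nheld; i++)
        if (held[i].relid == relid && held[i].mode == mode)
            return held[i].count;
    return 0;
}

static void
lock_relation(Oid relid, LockMode mode)
{
    for (int i = 0; i < nheld; i++)
    {
        if (held[i].relid == relid && held[i].mode == mode)
        {
            held[i].count++;    /* already held: bookkeeping only */
            return;
        }
    }
    assert(nheld < MAX_HELD);
    held[nheld].relid = relid;
    held[nheld].mode = mode;
    held[nheld].count = 1;
    nheld++;
}

/*
 * Toy stand-in for opening a relation by OID. With NoLock the caller
 * promises the lock is already held, and the assertion checks it.
 */
static Oid
open_relation(Oid relid, LockMode mode)
{
    if (mode != NoLock)
        lock_relation(relid, mode);
    else
        assert(hold_count(relid, ShareUpdateExclusiveLock) > 0);
    return relid;
}
```

In this model, opening with ShareUpdateExclusiveLock a second time silently succeeds, whereas opening with NoLock fails loudly if an earlier step forgot to take the lock.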

The CVE-2014-0062 Pattern

Resolving a name to OID twice without holding a lock continuously between the two resolutions can be exploited: an attacker could rename objects between the two resolutions to cause the second resolution to target a different object. The refactoring closes this (theoretical) window by resolving once and propagating the OID/Relation forward.
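
The hazard can be reproduced in miniature with a toy name-to-OID catalog. This is a sketch under stated assumptions: CatalogEntry, lookup_relid, the ConcurrentDdl callback, and both act_* functions are hypothetical, and the callback merely simulates another session renaming objects between the two steps.

```c
#include <assert.h>
#include <string.h>

typedef unsigned int Oid;
#define InvalidOid ((Oid) 0)

/* Toy catalog row: a relation name and the OID it currently maps to. */
typedef struct CatalogEntry
{
    const char *name;
    Oid         oid;
} CatalogEntry;

/* Simulated concurrent DDL that runs between the check and the action. */
typedef void (*ConcurrentDdl) (CatalogEntry *catalog, int n);

static Oid
lookup_relid(const CatalogEntry *catalog, int n, const char *name)
{
    assert(name != NULL);
    for (int i = 0; i < n; i++)
        if (strcmp(catalog[i].name, name) == 0)
            return catalog[i].oid;
    return InvalidOid;
}

/* Example interference: swap the names of the first two entries. */
static void
swap_first_two(CatalogEntry *catalog, int n)
{
    if (n >= 2)
    {
        const char *tmp = catalog[0].name;

        catalog[0].name = catalog[1].name;
        catalog[1].name = tmp;
    }
}

/* Risky pattern: check by name, then resolve the name again to act. */
static Oid
act_after_second_lookup(CatalogEntry *catalog, int n, const char *name,
                        ConcurrentDdl between)
{
    Oid checked = lookup_relid(catalog, n, name);

    (void) checked;             /* permission checks would use this OID */
    if (between)
        between(catalog, n);
    return lookup_relid(catalog, n, name);  /* may be another object */
}

/* Safer pattern: resolve once and carry the OID forward. */
static Oid
act_on_carried_oid(CatalogEntry *catalog, int n, const char *name,
                   ConcurrentDdl between)
{
    Oid checked = lookup_relid(catalog, n, name);

    if (between)
        between(catalog, n);
    return checked;             /* same object that was checked */
}
```

With a catalog containing "t" and "other" and swap_first_two as the interference, the first function ends up acting on the object originally named "other", while the second still acts on the object that was checked.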

Multi-Relation Statistics (Hypothetical Future)

The thread touches on how multi-relation extended statistics might work. jian he argues that pg_statistic_ext would need to store all associated relation OIDs (not just one stxrelid) for expression deparsing to work. This is relevant because the grammar intentionally accepts from_list (not just a single table name) in anticipation of this feature.

Glossary Improvement (Side Topic)

The thread also led to an improvement of the "relation" entry in the PostgreSQL glossary. Tom Lane proposed clearer wording distinguishing the mathematical meaning ("a set of tuples" — the origin of "relational database") from the PostgreSQL-specific meaning ("an SQL object with a pg_class entry"). Álvaro committed this improvement.
