[PATCH] Fix duplicate errmsg in ALTER TABLE SPLIT PARTITION

First seen: 2026-04-21 04:34:11+00:00 · Messages: 18 · Participants: 4

Latest Update

2026-05-06 · opus 4.7

Overview

This thread, opened by Ayush Tiwari in April 2026, begins as a trivial cosmetic fix to transformPartitionCmdForSplit() in parse_utilcmd.c — correcting a case where two errmsg() calls appear back-to-back in the same ereport(ERROR, ...) invocation. It evolves through review into a broader cleanup of the error-reporting hygiene around the ALTER TABLE ... SPLIT/MERGE PARTITION code paths, and ultimately surfaces a more interesting structural question about whether PostgreSQL's ereport() machinery should defend against this class of bug at all.

The Core Problem

PostgreSQL's error reporting API (ereport() / errmsg() / errdetail() / errhint()) is a variadic-style macro framework where each auxiliary function appends or overwrites fields in the ErrorData struct being constructed for the current error. Crucially, calling errmsg() twice in the same ereport() silently overwrites the first message — the user sees only the second, and the first is lost entirely. This is exactly what had happened in transformPartitionCmdForSplit():

ereport(ERROR,
        errcode(ERRCODE_INVALID_OBJECT_DEFINITION),
        errmsg("can not split non-DEFAULT partition \"%s\"", ...),
        errmsg("new partition cannot be DEFAULT because ..."),   /* wins */
        parser_errposition(...));

The second call was semantically meant to be the reason (a detail), not a replacement primary message. The user-visible effect was that the primary error lost the partition name context and effectively presented as a detail masquerading as an errmsg.

This matters architecturally for two reasons:

  1. Style guide compliance. PostgreSQL's message style guide mandates that errmsg() be a short primary line (lowercase start, no period), errdetail() a full sentence (capitalized, period-terminated), and errhint() actionable guidance. The original code violated the three-tier contract.
  2. Translatability. Each errmsg()/errdetail() is a separate translation unit. Two errmsg()s effectively orphan a translation string (it would be translated but never shown), and the translator has no way to know the intended relationship between the two.

Scope Creep Driven by Review

John Naylor (a committer, acting as the senior reviewer here) pushed the patch well beyond the one-line fix by pointing out several adjacent issues in the same feature's code:

  1. "can not" vs. "cannot" — The codebase overwhelmingly uses "cannot" as one word; the SPLIT/MERGE PARTITION code had five occurrences of the two-word form.
  2. Confusing primary messages — e.g., "can not split to partition \"X\" together with partition \"Y\"" is almost unparseable to a user. The underlying condition is that two of the new partition bounds are not adjacent (there is a gap or an overlap between X and Y).
  3. Duplicated error text in regression testspartition_split.sql/partition_merge.sql contain -- ERROR: <verbatim error text> comments above each expected-failure statement. This doubles the maintenance cost of any wording change: change the C code, change the .out file (expected), and change the SQL comment. Naylor's observation that "this isn't done anywhere else" is correct — most regress tests either use -- should fail or simply rely on the .out diff.
  4. Grammar in an errhint: "To split DEFAULT partition one of the new partition must be DEFAULT" — missing articles and incorrect plural agreement.
  5. Parser message: "DEFAULT partition should be one" is almost nonsensical English; it was rewritten as "cannot specify more than one DEFAULT partition".

The Key Design Decision: How to Structure the Adjacency Error

The most substantive design debate was over how to phrase the non-adjacency error produced by check_two_partitions_bounds_range() in partbounds.c. Three positions emerged:

Naylor's structural argument — that errmsg should consistently reference the partition the DDL acts on, not the incidental pair that failed validation — is a notable API-design principle. It means tools that scan logs by error message can correlate failures by the target object rather than by transient intermediate names.

The Signature Refactor

To let check_two_partitions_bounds_range() produce the "cannot split partition X" errmsg, it needed access to splitPartOid. Ayush's v5 added an explicit Oid splitPartOid parameter alongside the existing bool is_merge, passing InvalidOid from the merge call site. Naylor then observed that the boolean is now derivable: is_merge = !OidIsValid(splitPartOid). v6 removed the redundant flag, using a local derived variable inside the function. This is a small but clean example of eliminating correlated parameters — the two were never independent and the compiler/reader no longer needs to reason about their consistency.

The Meta-Question: Should ereport() Catch This?

The most architecturally interesting moment in the thread came late, when Naylor asked whether "automated tooling can detect cases like this going forward." Jian He responded with a concrete experiment: add an Assert(edata->message == NULL) inside errmsg() itself (before EVALUATE_MESSAGE overwrites it), and correspondingly NULL the pointer in FreeErrorDataContents. His local run showed 373 tests passing with no failures, suggesting that — after this cleanup patch — no call site in the tree does a double errmsg() anymore.

This is a meaningful observation because:

This assertion was not committed as part of this thread but is the most interesting follow-up it produced. It would belong logically in elog.c and would need a buildfarm sweep to verify no extensions rely on the double-call (mis)behavior.

Participant Weight

Implications

The final patch set (v6, two patches) lands:

The thread is a good illustration of how a one-line bug fix in PostgreSQL often surfaces a cluster of related issues in a single feature — SPLIT/MERGE PARTITION was added relatively recently (PG17) and its error-reporting hygiene had not yet been through the kind of style-guide scrutiny older code has accumulated. It also demonstrates the project's standard practice of splitting mechanical/cosmetic changes (test comments) from semantic ones (message wording) into separate patches to make review and potential revert easier.