[PATCH] Fix duplicate errmsg in ALTER TABLE SPLIT PARTITION

2026-05-06 · opus 4.7

Overview

This thread, opened by Ayush Tiwari in April 2026, begins as a trivial cosmetic fix to transformPartitionCmdForSplit() in parse_utilcmd.c — correcting a case where two errmsg() calls appear back-to-back in the same ereport(ERROR, ...) invocation. It evolves through review into a broader cleanup of the error-reporting hygiene around the ALTER TABLE ... SPLIT/MERGE PARTITION code paths, and ultimately surfaces a more interesting structural question about whether PostgreSQL's ereport() machinery should defend against this class of bug at all.

The Core Problem

PostgreSQL's error reporting API (ereport() / errmsg() / errdetail() / errhint()) is a variadic-style macro framework where each auxiliary function appends or overwrites fields in the ErrorData struct being constructed for the current error. Crucially, calling errmsg() twice in the same ereport() silently overwrites the first message — the user sees only the second, and the first is lost entirely. This is exactly what had happened in transformPartitionCmdForSplit():

ereport(ERROR,
        errcode(ERRCODE_INVALID_OBJECT_DEFINITION),
        errmsg("can not split non-DEFAULT partition \"%s\"", ...),
        errmsg("new partition cannot be DEFAULT because ..."),   /* wins */
        parser_errposition(...));

The second call was semantically meant to be the reason (a detail), not a replacement primary message. The user-visible effect was that the primary error lost the partition name context and effectively presented as a detail masquerading as an errmsg.

This matters architecturally for two reasons:

Style guide compliance. PostgreSQL's message style guide mandates that errmsg() be a short primary line (lowercase start, no period), errdetail() a full sentence (capitalized, period-terminated), and errhint() actionable guidance. The original code violated the three-tier contract.
Translatability. Each errmsg()/errdetail() is a separate translation unit. Two errmsg()s effectively orphan a translation string (it would be translated but never shown), and the translator has no way to know the intended relationship between the two.

Scope Creep Driven by Review

John Naylor (a committer, acting as the senior reviewer here) pushed the patch well beyond the one-line fix by pointing out several adjacent issues in the same feature's code:

"can not" vs. "cannot" — The codebase overwhelmingly uses "cannot" as one word; the SPLIT/MERGE PARTITION code had five occurrences of the two-word form.
Confusing primary messages — e.g., "can not split to partition \"X\" together with partition \"Y\"" is almost unparseable to a user. The underlying condition is that two of the new partition bounds are not adjacent (there is a gap or an overlap between X and Y).
Duplicated error text in regression tests — partition_split.sql/partition_merge.sql contain -- ERROR: <verbatim error text> comments above each expected-failure statement. This doubles the maintenance cost of any wording change: change the C code, change the .out file (expected), and change the SQL comment. Naylor's observation that "this isn't done anywhere else" is correct — most regress tests either use -- should fail or simply rely on the .out diff.
Grammar in an errhint: "To split DEFAULT partition one of the new partition must be DEFAULT" — missing articles and incorrect plural agreement.
Parser message: "DEFAULT partition should be one" is almost nonsensical English; it was rewritten as "cannot specify more than one DEFAULT partition".

The Key Design Decision: How to Structure the Adjacency Error

The most substantive design debate was over how to phrase the non-adjacency error produced by check_two_partitions_bounds_range() in partbounds.c. Three positions emerged:

Ayush's v2 proposal: Fold the information into errmsg ("cannot split non-adjacent partitions X and Y"), keep a specific errdetail naming which bound didn't match, and drop the errhint as redundant.
Jian He's counter: Keep the errhint (which references the user-facing SQL syntax ALTER TABLE ... MERGE PARTITIONS) because the detail is what's redundant, not the hint. The hint's value is that it tells the user what SQL-level invariant they violated, not just which pair of bounds failed.
Naylor's resolution (which prevailed): Keep all three fields but fix each at its appropriate level:
- errmsg should name the old partition being acted upon — symmetric between SPLIT and MERGE — rather than pairs of new partitions. This required threading splitPartOid into the helper.
- errdetail should be a grammatically complete sentence with articles on both bounds: "The lower bound of partition X is not equal to the upper bound of partition Y."
- errhint should disambiguate "old partition bounds" (for MERGE) vs "new partition bounds" (for SPLIT).

Naylor's structural argument — that errmsg should consistently reference the partition the DDL acts on, not the incidental pair that failed validation — is a notable API-design principle. It means tools that scan logs by error message can correlate failures by the target object rather than by transient intermediate names.

The Signature Refactor

To let check_two_partitions_bounds_range() produce the "cannot split partition X" errmsg, it needed access to splitPartOid. Ayush's v5 added an explicit Oid splitPartOid parameter alongside the existing bool is_merge, passing InvalidOid from the merge call site. Naylor then observed that the boolean is now derivable: is_merge = !OidIsValid(splitPartOid). v6 removed the redundant flag, using a local derived variable inside the function. This is a small but clean example of eliminating correlated parameters — the two were never independent and the compiler/reader no longer needs to reason about their consistency.

The Meta-Question: Should ereport() Catch This?

The most architecturally interesting moment in the thread came late, when Naylor asked whether "automated tooling can detect cases like this going forward." Jian He responded with a concrete experiment: add an Assert(edata->message == NULL) inside errmsg() itself (before EVALUATE_MESSAGE overwrites it), and correspondingly NULL the pointer in FreeErrorDataContents. His local run showed 373 tests passing with no failures, suggesting that — after this cleanup patch — no call site in the tree does a double errmsg() anymore.

This is a meaningful observation because:

It implies the bug class is small enough to be eradicated and then enforced by an assertion in debug builds.
The cost is near-zero (one pointer check in an already expensive error path).
It would also catch double errdetail() / errhint() if extended, which are equally latent bugs — harder to notice because both may be grammatically plausible, but equally incorrect.
Jian He's hedge about HAVE_PG_INTEGER_CONSTANT_P reflects a real concern: ereport_domain has two code paths depending on whether the compiler supports __builtin_constant_p with sufficient semantics. The assertion would need to hold on both, and his experiment only covered one.

This assertion was not committed as part of this thread but is the most interesting follow-up it produced. It would belong logically in elog.c and would need a buildfarm sweep to verify no extensions rely on the double-call (mis)behavior.

Participant Weight

John Naylor — a committer — clearly drives the direction of the patch. His review expands scope, rejects weak wording choices, and enforces the style guide. His opinion is decisive on all wording questions.
Ayush Tiwari — the patch author — is responsive and iterates quickly (six revisions in ~two weeks), taking direction well.
Jian He — an active reviewer on many patches — provides a counterbalancing opinion on errhint/errdetail tradeoffs and contributes the most technically interesting idea (the Assert in errmsg).
Yuchen Li — essentially a +1 drive-by review.

Implications

The final patch set (v6, two patches) lands:

0001: Message-text and structural fixes across parse_utilcmd.c, partbounds.c, tablecmds.c; signature change to check_two_partitions_bounds_range().
0002: Test cleanup removing duplicated error text from .sql files.

The thread is a good illustration of how a one-line bug fix in PostgreSQL often surfaces a cluster of related issues in a single feature — SPLIT/MERGE PARTITION was added relatively recently (PG17) and its error-reporting hygiene had not yet been through the kind of style-guide scrutiny older code has accumulated. It also demonstrates the project's standard practice of splitting mechanical/cosmetic changes (test comments) from semantic ones (message wording) into separate patches to make review and potential revert easier.