StringInfo fixes, v19 edition. Plus a few oddities — Technical Analysis
Core Problem
This thread addresses two intertwined concerns that recur each PostgreSQL development cycle:
-
Misuse of
appendStringInfo()/appendPQExpBuffer()variants — these are printf-style functions that invokevsnprintf()internally. When callers pass a plain literal with no format arguments (or a literal"%s"with a single string), they pay unnecessary CPU cost for format parsing whenappendStringInfoString()/appendStringInfoChar()would be far cheaper. This is a recurring "whack-a-mole" problem — Rowley previously cleaned it up in commits8461424fd(v17) and928394b66(v18), and is now doing the v19 round. -
Construction of translatable messages from fragments — discovered incidentally in
append_tuple_value_detail()(introduced in 48efefa6ca for logical replication conflict reporting), where calls like_("."),_(": "),_(", ")were being concatenated to build conflict detail messages. This violates PostgreSQL's message style guide because translators cannot meaningfully localize disconnected punctuation fragments — the same"."token has no stable meaning across contexts, and some languages (e.g., French) have entirely different spacing/punctuation conventions (e.g.," : "with a non-breaking space before the colon).
Patch Structure and Disposition
Rowley proposed three patches:
- 0001 — An instrumentation/detection patch that wraps
appendStringInfoas a macro forcing at least one vararg argument, plusStaticAssertchecks catching constant format strings like"hello"or bare"%s". Not intended for commit, but used to locate all offending callsites. Required collateral edits (e.g., inxml.c,WRITE_*_FIELDmacros in outfuncs.c) to keep the tree compiling. - 0002 — Speculative: touches the weird translation-fragment code in
append_tuple_value_detail(), plus minorHELP0("\n")→HELPC('\n')conversions in psql and an fe-auth-oauth.c tweak. Rowley was explicitly unsure about committing this. - 0003 — Safe mechanical cleanup of v19-era regressions (code from
627acc3ca,41d69e6dc, and others). Pushed on 2026-04-13.
Key Technical Debate: Make the Misuse Impossible
Andres Freund's proposal is the architecturally significant contribution. Rather than repeatedly hand-auditing, make the compiler enforce correct usage:
extern void appendStringInfoInternal(StringInfo str, const char *fmt, ...)
pg_attribute_printf(2, 3);
#define appendStringInfo(str, fmt, ...) \
appendStringInfoInternal(str, fmt, __VA_ARGS__)
With C99/GCC variadic macros, appendStringInfo(buf, "matched") fails to compile because __VA_ARGS__ is empty and expands to a trailing comma with nothing after. The error message is imperfect but actionable.
Rowley's counter-concerns:
- The wrapper approach still doesn't catch
appendStringInfo(buf, "%s", str)— that needs theStaticAssert+__builtin_constant_p()approach, which isn't portable to all supported compilers. Risk: CI passes on one toolchain, breaks another post-commit. - For
libpq'sappendPQExpBuffer, the ABI export makes macro wrapping harder (external consumers link against the symbol directly). - Collateral damage: the macro change forced ugly workarounds in node-output macros (
WRITE_CHAR_FIELD,WRITE_FLOAT_FIELD) which, as Andres pointed out, should themselves be usingappendStringInfoString()since node-tree output is performance-sensitive. Rowley agreed these were genuine oversights but hesitated to fix them post-freeze since they aren't v19-new code.
No consensus was reached on committing the macro enforcement — this remains a v20 opportunity.
The Translation Fragment Fix
Tom Lane's reply anchored the design: the helper append_tuple_value_detail() should produce (v1, v2, v3) with untranslated SQL-syntax punctuation (parentheses, commas are universal SQL notation), and callers should use complete translatable sentences with a single %s placeholder:
appendStringInfo(&err_detail,
_("Could not find the row to be updated: %s.\n"),
tuple_buf.data);
Vignesh picked up the implementation. Several iterations refined edge cases:
- Hou Zhijie caught over-parenthesization:
get_tuple_desc()already emits parenthesized values (replica identity (a)=(10)), so wrapping the whole thing again produced(replica identity (a)=(10)). - Kuroda Hayato raised the French colon spacing issue — the colon and surrounding whitespace must be inside the translatable string, because French requires
" : "(narrow no-break space before) and other languages may omit colons entirely. This forced moving from a single template string to per-case if/else branches where each branch owns its punctuation, an acceptable tradeoff. - Peter Smith noted a
first-flag ambiguity: the "insufficient privileges" fallback usedfirst==trueas its trigger, butfirstonly guarantees no iterations produced output, not that privileges were the cause. Also a blank-line regression inCT_UPDATE_DELETED. Vignesh dropped the privileges fallback on Amit Kapila's advice (keep scope minimal) and fixed the whitespace. - Peter also proposed normalizing message wording ("Updating the row..." → "The row to be updated was...") across all conflict types. Vignesh (correctly, per project norms) deferred this as scope creep.
Kuroda clarified this is not a backpatch candidate — it's a translation-quality improvement, not a bug fix. Amit Kapila pushed the final v4 on 2026-05-04.
Architectural Implications
-
Recurring maintenance burden: The thread is the third consecutive release requiring a StringInfo cleanup sweep. This argues strongly for Andres's compile-time enforcement, even if the error messages are mediocre. The cost of one ugly diagnostic beats annual manual audits across millions of lines.
-
Translation contract: PostgreSQL's style rule "don't construct messages from parts" is not pedantry — it's a hard requirement driven by the fact that word order, punctuation spacing, and even presence of separators vary per locale. The
_(".")pattern is particularly pernicious becausegettextwill merge identical msgids across the whole catalog, so one"."translation has to satisfy every caller — an impossibility. -
Node-tree output hot path: Andres's observation that
WRITE_CHAR_FIELD/WRITE_FLOAT_FIELDuseappendStringInfowhereappendStringInfoStringwould suffice is the performance-relevant nugget buried here. Node serialization runs on every plan cache miss, every parallel worker handoff, and every catalog dump involving stored expression trees. Converting these avoids per-callvsnprintftraversal.
Participant Weight
- Tom Lane (senior committer) — laid down the design principle that ended further debate on the translation-fragment question.
- Andres Freund (senior committer) — proposed the structural fix (compile-time enforcement) and flagged the highest-impact performance misses.
- David Rowley (committer, optimizer expertise) — owns this cleanup work cycle-over-cycle; authored and pushed 0003.
- Amit Kapila (committer, logical replication area owner) — gatekeeper for the
conflict.cchanges (code originated in 48efefa6ca, logical-rep conflict detection); pushed the final translation fix and enforced scope discipline. - Vignesh C — picked up and iterated the patch.
- Hou Zhijie, Kuroda Hayato, Peter Smith — Fujitsu reviewers, each catching a distinct class of issue (semantic correctness, i18n correctness, logic/cosmetic).