Technical Analysis: Message Wording/Formatting Cleanup Patches
Core Problem
This thread addresses a collection of internationalization (i18n), style consistency, and naming issues discovered during Japanese translation of PostgreSQL error messages. While individually minor, these issues reflect deeper architectural concerns about how PostgreSQL constructs user-facing messages and maintains naming consistency across subsystems.
The four patches target distinct problems:
- Property graph object descriptions not translatable: getObjectDescription() for PropgraphElementRelationId builds object names incrementally via appendStringInfo(), which hardcodes English word order. Languages like Japanese have fundamentally different syntactic structures (SOV vs. SVO), making incremental string construction untranslatable without restructuring the format string.
- Inconsistent quoting in libpq protocol error: fe-protocol3.c uses backquotes around a parameter name, violating PostgreSQL's established convention of double quotes for identifiers/names in error messages. This matters for consistency in message catalogs and automated log parsing.
- Missing trailing period in HINT message: A trivial but important style violation in be-secure-openssl.c that breaks the uniform punctuation convention for hint messages.
- Inconsistent naming, "datachecksum" vs "datachecksums": The data checksums background worker subsystem uses both singular and plural forms inconsistently across process names, comments, and code, creating confusion about the intended naming convention.
Proposed Solutions and Design Decisions
Patch 0001: Translatable Property Graph Descriptions
The fix replaces incremental appendStringInfo() calls with a single format string containing all substitution parameters. This is a well-known i18n pattern — translators need complete sentences/phrases to reorder components for their target language. The architectural lesson is that any code constructing user-visible text must treat the entire phrase as an atomic translation unit.
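The contrast between the two construction styles can be sketched in plain C. This is a minimal illustration under stated assumptions, not the PostgreSQL code: snprintf() stands in for appendStringInfo(), the _() macro is a no-op stand-in for gettext, and the function names and message wording are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* No-op stand-in for gettext's _() macro (assumption: no real catalog here). */
#define _(msgid) (msgid)

/*
 * Incremental construction: two separate msgids glued together in English
 * order. A translator sees only fragments and cannot reorder them.
 */
static void
describe_incremental(char *buf, size_t len, const char *elem, const char *graph)
{
	size_t		used;

	assert(len > 0);
	snprintf(buf, len, _("element %s"), elem);
	used = strlen(buf);
	snprintf(buf + used, len - used, _(" of property graph %s"), graph);
}

/*
 * Single format string: one msgid carries the whole phrase, so a translation
 * can place the two substitutions wherever its grammar requires.
 */
static void
describe_single(char *buf, size_t len, const char *elem, const char *graph)
{
	assert(len > 0);
	snprintf(buf, len, _("element %s of property graph %s"), elem, graph);
}
```

Both functions produce the same English text. Only the second hands the translator a complete phrase as one unit.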
Patch 0002: Quoting Convention Fix
Daniel Gustafsson suggested going further than just fixing backquotes to double quotes — the hardcoded parameter name _pq_.test_protocol_negotiation should be passed as a %s argument rather than embedded in the format string. This serves two purposes: (1) it matches the established pattern for identifier references in messages, and (2) it makes the message more maintainable if the parameter name changes. Jacob Champion agreed and pushed the fix.
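A rough sketch of that pattern follows. The message wording, the macro, and the function name are hypothetical and are not taken from fe-protocol3.c. Only the parameter name comes from the thread.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Parameter name from the thread, defined in one place. */
#define TEST_PROTOCOL_PARAM "_pq_.test_protocol_negotiation"

/*
 * Hypothetical message text. The name is passed as an argument, so the
 * translatable format string contains no identifier, and the %s is wrapped
 * in double quotes rather than backquotes.
 * Returns the snprintf() result.
 */
static int
format_unexpected_param(char *buf, size_t len, const char *param_name)
{
	assert(buf != NULL && len > 0);
	return snprintf(buf, len,
					"unexpected value for parameter \"%s\"", param_name);
}
```

With this shape, renaming the parameter touches only the argument, and the format string in the message catalog stays unchanged.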
Patch 0003: Missing Period in HINT
Daniel further improved the hint beyond just adding a period, making the text more helpful by adding "configure SSL in" to clarify the second alternative:
- errhint("If ssl_sni is enabled then add configuration to \"%s\", else \"%s\"",
+ errhint("If ssl_sni is enabled then add configuration to \"%s\", else configure SSL in \"%s\".",
Patch 0004: Naming Consistency Debate
This patch generated the most substantive discussion, revealing a naming design tension:
- Original patch proposal: Standardize on singular "datachecksum" everywhere
- Daniel Gustafsson's position (as subsystem author): "DataChecksumsXXX" was deliberately chosen because the feature is data_checksums (plural GUC). The singular "checksum" is used only where referring to a single entity (e.g., the cluster state). He agreed to rename user-facing "datachecksum launcher" → "datachecksums launcher" but pushed back on the file rename.
- Tomas Vondra's position: Use plural "data checksums" almost everywhere since the feature manipulates all checksums in a cluster, consistent with the GUC name. Supported renaming datachecksum_state.c → datachecksums_state.c.
- Resolution: Daniel pushed the worker/launcher rename to plural but held off on the file rename, indicating uncertainty about whether that change improves clarity.
Architectural Significance
Internationalization Architecture
The property graph issue (0001) illustrates a systemic risk in PostgreSQL's message construction patterns. Any code that builds descriptive strings incrementally — common in object description functions — is potentially untranslatable. The gettext system requires complete format strings to allow translators to reorder components. This is particularly important as PostgreSQL adds new object types (like property graph elements) that need descriptive messages.
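To show what reordering looks like once a complete format string exists, here is a small sketch using POSIX positional specifiers (%1$s, %2$s). These are a POSIX extension rather than ISO C. The msgid wording is hypothetical, and the "translation" is an English stand-in with the arguments swapped, not an entry from any real message catalog.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* msgid as it might appear in the source (hypothetical wording). */
static const char *const msgid_en = "element %s of property graph %s";

/*
 * Stand-in for a translated msgstr: English words, but with the two
 * arguments swapped via positional specifiers.
 */
static const char *const msgstr_reordered = "property graph %2$s, element %1$s";

/* Format either string; the caller always passes arguments in one order. */
static int
format_description(char *buf, size_t len, const char *fmt,
				   const char *elem, const char *graph)
{
	assert(buf != NULL && len > 0);
	return snprintf(buf, len, fmt, elem, graph);
}
```

The calling code never changes. Only the format string selected at run time decides where each argument lands, which is impossible when the phrase is assembled from fragments.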
Naming Convention Governance
The datachecksums debate highlights the lack of a formal naming policy for background worker subsystems. When a feature spans multiple files, process names, injection points, and GUC parameters, inconsistency is almost inevitable without explicit conventions. The resolution (partial rename, defer file rename) shows the pragmatic approach — fix user-visible inconsistencies first, defer internal changes that might cause unnecessary churn.
Message Style as API Contract
PostgreSQL's error messages are effectively a stable interface consumed by monitoring tools, log analyzers, and human operators. The quoting convention (double quotes for identifiers), punctuation rules (trailing periods on hints), and naming consistency all contribute to a parseable, predictable message format.
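As an illustration of how mechanical these conventions are, the following sketch checks two of them on a hint format string. It is a hypothetical lint helper written for this analysis, not an existing PostgreSQL tool, and it covers only the trailing period and the no-backquote rule.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical style check: returns true if a hint format string ends with
 * a period and contains no backquotes.
 */
static bool
hint_follows_style(const char *hint)
{
	size_t		len;

	assert(hint != NULL);
	len = strlen(hint);
	if (len == 0 || hint[len - 1] != '.')
		return false;			/* hints end with a period */
	if (strchr(hint, '`') != NULL)
		return false;			/* names are quoted with "...", not `...` */
	return true;
}
```

Applied to the two hint strings from patch 0003, the original text fails the check because it ends with a closing quote, while the revised text passes.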