Annual pgindent Run: Tooling Hygiene and Formatting Improvements
Context and Purpose
PostgreSQL maintains strict code formatting conventions enforced via pgindent, a customized wrapper around BSD indent. Once per development cycle, traditionally shortly before beta1 of a new major version, the project performs a tree-wide reindentation run. This serves several practical purposes:
- Normalizes formatting drift that inevitably accumulates from hundreds of contributors over a year of development, since not every commit lands in perfectly pgindent-clean form.
- Establishes a clean baseline before beta, minimizing subsequent churn that would complicate back-patching.
- Refreshes auxiliary artifacts and housekeeping items (typedefs.list, OID renumbering, Perl tidy, copyright years) that depend on the current state of the tree.
Tom Lane's message here is the traditional "prepping" announcement that kicks off this cycle for the v19 (2026) development series, timed after minor-release dust settles to avoid conflicting with back-patch activity.
Technical Mechanics
typedefs.list Synchronization
src/tools/pgindent/typedefs.list is critical to pgindent's operation because BSD indent cannot parse C declarations without knowing which identifiers are typedef names (versus variable names). Misclassifying a typedef causes incorrect spacing around pointer declarations, cast expressions, and variable declarations.
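The ambiguity described above can be modelled with a small sketch. This is not pgindent's or BSD indent's actual logic; the function name format_star and the typedef set passed to it are hypothetical, and the sketch ignores the column alignment that pgindent also applies to declarations. It only shows why the same token sequence is formatted differently depending on whether the first identifier is a known typedef.

```python
import re

def format_star(stmt, typedefs):
    """Format 'X * y;' as a declaration if X is a known typedef,
    otherwise as a multiplication expression (illustrative only)."""
    m = re.fullmatch(r'\s*(\w+)\s*\*\s*(\w+)\s*;\s*', stmt)
    if m is None:
        return stmt  # not the simple 'X * y;' shape; leave untouched
    left, right = m.groups()
    if left in typedefs:
        # Known type name: the star binds to the declared variable.
        return f"{left} *{right};"
    # Unknown identifier: treated as a binary operator with spaces.
    return f"{left} * {right};"

print(format_star("Node * n;", {"Node"}))  # Node *n;
print(format_star("Node * n;", set()))     # Node * n;
```

With the typedef missing from the list, the declaration comes out spaced like an expression, which is the classic symptom of a stale typedefs.list.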
The canonical list is generated by the buildfarm by extracting DWARF debug symbols from built binaries across multiple platforms — this catches platform-specific typedefs (Windows, BSD variants) that a single developer's build would miss. The in-tree file is maintained manually by committers adding new typedefs as they go, but Tom notes this is "not perfect" — the delta between manual maintenance and buildfarm-authoritative state represents accumulated drift.
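The "delta" between the hand-maintained list and the buildfarm-generated one is just a pair of set differences. A minimal sketch, assuming both lists are available as sequences of names (the function name typedef_drift and the sample names are illustrative, not taken from the message):

```python
def typedef_drift(in_tree, buildfarm):
    """Return (missing, stale): names the buildfarm found that the
    in-tree list lacks, and names only the in-tree list still has."""
    in_tree, buildfarm = set(in_tree), set(buildfarm)
    missing = sorted(buildfarm - in_tree)  # should be added
    stale = sorted(in_tree - buildfarm)    # candidates for removal
    return missing, stale

# Illustrative names only.
missing, stale = typedef_drift(
    in_tree=["Datum", "OldThing", "RBTNode"],
    buildfarm=["Datum", "NewThing", "RBTNode"],
)
print(missing)  # ['NewThing']
print(stale)    # ['OldThing']
```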
Multiline Comment Formatting (Patch [1])
The change enforces that multiline comment blocks begin with an empty first line:
Before:
    /* isn_out
     * ...
     */
After:
    /*
     * isn_out
     * ...
     */
This matches the style of the vast majority of existing PostgreSQL comments, but the convention has been applied inconsistently. The significance is not functional but maintenance-related: uniform comment structure makes automated tooling (comment extraction, documentation generation) and human diff review more predictable. The 99% figure Tom cites indicates this is a low-risk mechanical change.
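The comment rule can be sketched as a small line-based transformation. This is a rough model under stated assumptions, not the patch itself: the function name open_multiline_comment is hypothetical, the input is assumed to be the lines of a single block comment without trailing newlines, and openers not followed by whitespace (such as "/*-----") are deliberately left alone.

```python
import re

def open_multiline_comment(lines):
    """Move text off the opening line of a multiline block comment."""
    if len(lines) < 2:
        return lines  # single-line comments are not affected
    # Match '/*' followed by whitespace and then some text.
    m = re.match(r'^(\s*)/\*[ \t]+(\S.*)$', lines[0])
    if m is None:
        return lines  # opening line is already a bare '/*' (or special)
    indent, text = m.group(1), m.group(2)
    # Emit a bare opener, then the displaced text as a normal body line.
    return [indent + "/*", indent + " * " + text] + lines[1:]

before = ["/* isn_out", " * ...", " */"]
print("\n".join(open_multiline_comment(before)))
```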
Space After Comma Before Ellipsis (Patch [2])
Two distinct beneficiaries:
- Variadic function declarations: const char *fmt,...) → const char *fmt, ...). This aligns with standard C style guides and PostgreSQL's own convention for other comma-separated lists. BSD indent historically treated ... as a special token and elided the space.
- Designated initializers: .color = RBTBLACK,.left = RBTNIL → .color = RBTBLACK, .left = RBTNIL. This is arguably a more substantive readability win: the dense form makes it genuinely harder to parse field boundaries, especially in larger struct initializers. The RBTree sentinel example is illustrative; similar patterns appear throughout the tree in places like node tag initialization and GUC tables.
Tom's assessment that "I don't see any places where it makes anything worse" is the key approval criterion — pgindent changes are accepted based on Pareto-improvement: they must not regress any existing well-formatted code.
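Both cases above reduce to the same textual pattern: a comma immediately followed by a period. A naive regex sketch of that rule follows. It is only an approximation for illustration (the function name is hypothetical, and unlike a real indenter it does not skip string literals or comments):

```python
import re

def space_after_comma_before_dot(line):
    """Insert a space after any comma that is directly followed by '.'.
    Covers both ',...' (varargs) and ',.field' (designated initializers)."""
    return re.sub(r',(?=\.)', ', ', line)

print(space_after_comma_before_dot("const char *fmt,...)"))
print(space_after_comma_before_dot(".color = RBTBLACK,.left = RBTNIL"))
```

Lines that already have the space are left unchanged, since the lookahead only fires when the period follows the comma directly.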
Architectural Significance
While superficially cosmetic, the pgindent run and its associated tool changes have real engineering implications:
- Back-patching friction: A major reformatting creates a "wall" in git blame and git log -p history. Doing it once per cycle at a predictable time (pre-beta) concentrates the pain and allows tooling (e.g., git blame --ignore-revs-file) to work around it cleanly.
- Extension authors: pgindent rules are the de facto standard for out-of-tree extensions; improvements propagate outward.
- Review burden reduction: Consistent mechanical formatting means human reviewers can focus on semantic changes rather than style nits during patch review.
Process Notes
The bundled work items — pgindent, pgperltidy, renumber_oids.pl, copyright.pl — are batched for efficiency. renumber_oids.pl is particularly important: during development, newly-assigned catalog OIDs tend to cluster in ad-hoc ranges that developers pick; renumbering compacts them into the proper permanent range (typically moving development OIDs from 8000+ down to the next available production range) before the catalog is frozen at beta1. After beta1, OIDs become part of the on-disk ABI for pg_upgrade and cannot be renumbered.
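The compaction step can be illustrated with a simplified model. This is not the logic of renumber_oids.pl itself: the function name, its parameters, and the target value of 6000 are assumptions chosen for the example, with only the 8000+ development range taken from the text above.

```python
def renumber(used_oids, first_mapped=8000, last_mapped=9999, target=6000):
    """Map each OID in the development range to the next free OID
    at or above 'target'. Returns {old_oid: new_oid}."""
    in_dev_range = lambda o: first_mapped <= o <= last_mapped
    # OIDs outside the development range keep their values.
    taken = {o for o in used_oids if not in_dev_range(o)}
    mapping = {}
    next_oid = target
    for oid in sorted(o for o in used_oids if in_dev_range(o)):
        while next_oid in taken:
            next_oid += 1  # skip OIDs that are already assigned
        mapping[oid] = next_oid
        taken.add(next_oid)
        next_oid += 1
    return mapping

print(renumber({6000, 6001, 8123, 8500, 9001}))
# {8123: 6002, 8500: 6003, 9001: 6004}
```

The scattered development OIDs end up as a contiguous run directly after the OIDs already in use, which is the compaction the paragraph describes.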
This message is essentially a committer's heads-up. Tom Lane, as the de facto steward of pgindent and release mechanics for decades, is the natural driver. The email serves as both a proposal and advance notice: committers with pending patches should land them before the reindent to avoid merge conflicts, and those with opinions on the formatting changes have a window to object before they're applied tree-wide.