Analysis: Making NumericVar Storage Semantics Explicit
Context and Architectural Background
PostgreSQL's numeric type has two distinct representations that the code juggles constantly:
- The packed
Numericdatum — the on-disk/varlena form with a compact header and digit array, subject to TOAST. NumericVar— the in-memory working representation used throughoutnumeric.cfor arithmetic, comparison, aggregation, and formatting. It carriesndigits,weight,sign,dscale, and a pointerdigitsinto aNumericDigitbuffer, withbuftracking the actual allocation base (becausedigitsmay point pastbufto leave room for a carry digit).
Over roughly two decades of incremental evolution, numeric.c accumulated several implicit conventions around how the digit storage behind a NumericVar is managed:
- A freshly
init_var'd variable has no live digits. init_var_from_num()makes theNumericVaralias (borrow) digits directly out of a detoastedNumericdatum — read-only, no ownership.alloc_var()replaces any current storage with a heap-allocated writable buffer owned by the variable.- Static constants (
const_zero,const_one, etc.) expose read-only digit arrays that are aliased intoNumericVars during computation. - Some hot paths use an on-stack
NumericDigitarray (viaNUMERIC_LOCAL_NDIG/NUMERIC_LOCAL_NMAXstyle inline buffers) to avoidpallocfor small operands. set_var_from_var,strip_var, and the arithmetic result-placement code all have to be careful about aliasing between input and output variables (e.g.,add_var(x, y, x)).
None of this is codified in the types. It lives in comments, function naming, and programmer discipline. The result is code that is correct but fragile: any patch touching numeric storage must reconstruct the invariants from scratch, and the rules around when free_var is safe, when a buffer may be written to, and when digits point into foreign memory are easy to get wrong.
What the Patch Proposes
Chenhui Mo proposes a refactor-only patch (explicitly disclaiming algorithmic or performance changes) that reifies these implicit contracts into an explicit state machine on NumericVar.
New fields on NumericVar
capacity— physical size of the owned digit buffer (distinct fromndigits, which is logical length).state— one ofNUMVAR_EMPTY,NUMVAR_BORROWED,NUMVAR_OWNED_INLINE,NUMVAR_OWNED_HEAP.borrow_kind— whenstate == NUMVAR_BORROWED, discriminatesNUMVAR_BORROW_EXTERNAL(aliasing a packedNumericor static constant) fromNUMVAR_BORROW_INLINE(aliasing the inline buffer of a siblingBufferedNumericVar).inline_buf— back-pointer or flag indicating association with a fixed-size inline home.
New wrapper: BufferedNumericVar
A stack-allocatable composite of a NumericVar plus a fixed-size inline digit array. It serves two roles:
- Provide writable inline storage that avoids heap allocation for common-sized operands.
- Act as a workspace to hold a detoasted
Numericwhose digits are then borrowed by aNumericVarinNUMVAR_BORROW_INLINEstate.
The author emphasizes BufferedNumericVar is not a new SQL-visible representation — it is purely an internal helper, analogous to how StringInfoData wraps a growable buffer.
Helper stratification
The patch reorganizes helpers into four layers with disciplined responsibilities:
- Raw storage helpers — only allocate/free
NumericDigitarrays. - Storage-state helpers — perform legal transitions (e.g., borrowed → owned-heap via copy-on-write, empty → owned-inline).
- Variable-level helpers —
init,copy,acquire_buffer,release. - Result preparation / finalization — handle alias-safe output placement for arithmetic, so
add_var(a, b, a)patterns are expressed through the type system rather than ad-hocset_var_from_varjuggling.
Integration scope
Crucially, the patch does not merely add infrastructure. It rewires the top-level SQL entry points — cmp_numerics, numeric_hash, numeric_add, numeric_sub, numeric_mul, numeric_div — to drive the new model. The inner kernels (add_var, sub_var, mul_var, div_var) are not rewritten; their result-placement preludes and epilogues are migrated to the new helpers, leaving the numeric algorithms themselves untouched. Transcendental functions (sqrt, ln, log, power) are only adjusted minimally for compatibility.
Design Tradeoffs and Likely Points of Contention
This is a submission that will almost certainly draw heavy scrutiny from the numeric-code maintainers (Dean Rasheed, Tom Lane, Jeff Davis, historically Tomas Vondra). The following tensions are immediate:
1. Growth of NumericVar's footprint
Adding capacity, state, borrow_kind, and inline_buf fields enlarges sizeof(NumericVar). NumericVar is allocated extensively on the stack in aggregation and per-row paths. Even modest growth can matter — and the information is arguably redundant with what can be deduced from existing fields (buf == NULL already signals no owned allocation; comparing digits against buf tells you about the carry-digit offset). Reviewers will ask whether the explicitness is worth the bytes.
2. "Borrowed" semantics vs. existing conventions
The current code already has a working convention: if you didn't call alloc_var/set_var_from_* that allocates, you don't own the digits. Making this explicit via state fields is valuable for readers but adds a runtime invariant that every helper must maintain. Getting this wrong (e.g., a helper that forgets to transition state) silently regresses to the old implicit situation while adding overhead.
3. BufferedNumericVar vs. the existing inline-buffer pattern
numeric.c already has patterns like NumericDigit local_digits[NUMERIC_LOCAL_NMAX] plus manual pointer fix-up. Introducing a new wrapper type means churning every call site that used the ad-hoc pattern. Reviewers will weigh whether a typed wrapper is cleaner than the existing idiom, or merely different.
4. Refactor-only patches are a hard sell
PostgreSQL's community has a well-known bias against large refactors that do not fix a bug or enable a concrete follow-up. The author anticipates this by framing the work as enabling "future internal numeric optimizations," but without a concrete follow-up patch demonstrating the payoff, acceptance is historically unlikely. Compare to prior numeric work (e.g., Dean Rasheed's div_var_fast improvements, the numeric sum accumulator rewrite) — those landed because they delivered measurable wins, with cleanup as a side effect.
5. Regression test inflation
The patch adds tests up to NUMERIC(1000, 1000) and uses digest-based verification for large outputs. PostgreSQL regression tests avoid both very long expected output and non-deterministic/opaque verification. Digest-based checks specifically tend to be rejected because they obscure failures. This will likely be trimmed significantly in review.
6. Risk/reward for a hot path
cmp_numerics and numeric_hash are hit on every numeric comparison in sorts, hash joins, and hash aggregation. Any additional branching on state in these paths — even predictable branches — is a regression risk. The author claims no measurable regression, but reviewers will demand their own benchmarks on sort- and hash-heavy workloads.
Key Technical Insights
-
The real invariant being formalized is: a
NumericVar's digit storage is in exactly one of {none, aliased-read-only, owned-writable}, and write operations are legal only in the owned state. This is essentially a manually-enforced linear/affine type discipline, and the patch tries to make the discriminator explicit. -
Copy-on-write transition is the natural operation this model enables: a borrowed
NumericVarthat is about to be mutated should transparently upgrade to owned-heap (or owned-inline if it fits). The patch's "storage-state helpers" layer is where this lives. This is the main functional (vs. documentation) benefit. -
Alias-safe result placement — handling
op_var(x, y, x)orop_var(x, y, y)— is currently handled by defensive copying inside each kernel. A typed result-preparation helper can centralize this, and is a legitimate cleanup even independent of the state model. -
Separation of
Numeric(datum) fromNumericVar(working form) is reinforced by making the "borrowed from packed datum" state a first-class thing. This could pay off if future work wants to, e.g., do comparisons without ever materializing aNumericVarat all (a direction some have explored for JIT / expression evaluation).
Likely Trajectory
As a solo first post from an apparently new contributor, proposing a large refactor to extremely mature code with no accompanying optimization payload, this patch faces an uphill path. Probable reviewer responses:
- Request to split further: the alias-safe result-placement helpers could land independently and are the least controversial piece.
- Push back on
BufferedNumericVaras redundant with existing inline-buffer patterns. - Reject digest-based regression output.
- Ask for a concrete follow-up patch showing the optimization enabled (e.g., avoiding a
pallocin a hot path, or a copy-on-write win measurable in a benchmark). - Scrutinize
sizeof(NumericVar)impact.
The refactor is internally coherent and the author clearly understands the code. Whether it lands will depend almost entirely on whether a follow-up demonstrates the payoff, and whether the core-team numeric maintainers (particularly Dean Rasheed) see the explicit state model as genuinely easier to maintain or as additional bookkeeping for no net gain.