[PATCH] contrib/xml2: guard against signed integer overflow in parse_params

First seen: 2026-05-04 11:26:50+00:00 · Messages: 2 · Participants: 2

Latest Update

2026-05-06 · opus 4.7

Analysis: Proposed Overflow Guard in contrib/xml2 parse_params

Core Problem (as framed by submitter)

The submitter, Varik Matevosyan, proposed a defensive patch against signed-integer overflow in contrib/xml2's parse_params routine. The specific code pattern under scrutiny is the conventional PostgreSQL idiom of growing a dynamically-sized array by doubling a capacity counter (max_params *= 2 or similar), where the loop terminates because AllocSizeIsValid (or the palloc call itself) rejects the oversized request before the signed integer wraps to a negative value.

The submitter argued two things:

  1. Relying on signed-integer overflow behavior is technically undefined behavior (UB) in C, even if in practice AllocSizeIsValid catches the oversized allocation first.
  2. The current safety is "incidental" — it works only because the MaxAllocSize ceiling happens to be crossed before overflow occurs.

The submitter acknowledged up front that the overflow is unreachable in current builds (since text input is bounded by MaxAllocSize, bounding nparams below the doubling threshold), framing this as a stylistic/robustness cleanup to align with "explicit overflow-checking idiom used elsewhere in the tree."

Tom Lane's Rejection and the Architectural Reasoning

Tom Lane (core committer, final authority on code-style matters across the tree) rejected the patch with a concise but architecturally important explanation:

  1. The UB argument is incorrect in practice. AllocSizeIsValid rejects any allocation request >= 1GB (i.e., MaxAllocSize). Because array growth doubles the capacity, the allocation-size check will fail one iteration before the signed integer counter itself could overflow INT_MAX (~2GB worth of element count for typical element sizes, and much earlier for larger structs). Thus the overflow path is genuinely unreachable — not merely unlikely.

  2. This is the intentional design of MaxAllocSize. Tom pointed to the comment in utils/memutils.h explaining that MaxAllocSize is deliberately set to 0x3fffffff (1GB − 1) rather than 2GB precisely to provide this headroom: it guarantees that size calculations involving doubling or modest multiplication cannot overflow a signed 32-bit integer before the allocator itself rejects the request. In other words, the "incidental" safety the submitter described is actually a deliberate invariant baked into the memory context API contract.

  3. Consistency argument inverted. The submitter claimed the patch would make parse_params more consistent with the rest of the tree. Tom countered that the opposite is true: "many many places in our tree" use exactly this idiom, so adding explicit pre-overflow checks here would make parse_params an outlier, not a conformer.

Key Architectural Insight: MaxAllocSize as a Safety Contract

The deeper lesson here is that MaxAllocSize = 0x3fffffff is not an arbitrary limit — it is a load-bearing invariant that callers throughout the codebase implicitly rely on. The 1GB cap (versus a hypothetical 2GB cap which would still fit in a signed int32) provides a guaranteed ~1GB margin before INT_MAX, which is enough to absorb:

This means idiomatic PostgreSQL growth loops of the form:

if (nparams >= max_params) {
    max_params *= 2;
    params = repalloc(params, max_params * sizeof(...));
}

are safe by construction without explicit overflow checks, as long as the element size is reasonable and the per-allocation cap is respected by repalloc/palloc. Adding belt-and-suspenders checks clutters the code and obscures this well-understood invariant.

Tradeoffs and Implications

Outcome

The patch was rejected on first review with no support from other hackers. The thread is effectively closed after a single committer response. No code change will be made.