Fix mismatched deallocation functions

First seen: 2026-05-06 23:26:42+00:00 · Messages: 1 · Participants: 1

Latest Update

2026-05-07 · opus 4.7

Core Problem: Allocator/Deallocator Mismatches in Frontend Code

PostgreSQL's frontend utility code (src/fe_utils, src/bin, libpq, etc.) exposes two parallel families of memory management wrappers declared in src/include/common/fe_memutils.h:

  1. pg_-prefixed functions: pg_malloc, pg_malloc0, pg_realloc, pg_strdup, pg_free. These are thin wrappers around the libc allocator (malloc/realloc/free) that add error handling — typically calling exit(1) on OOM so callers don't have to check return values.

  2. p-prefixed functions: palloc, palloc0, pstrdup, pnstrdup, pfree, repalloc. In the backend these are the MemoryContext-aware allocators. In the frontend, however, these are also wrappers around libc malloc/free (since there is no MemoryContext machinery in frontend binaries), provided to allow code shared between backend and frontend to compile unchanged.

Because both families ultimately call malloc/free under the hood in frontend builds, mixing them (e.g., pnstrdup() paired with free(), or pg_malloc() paired with pfree()) is functionally harmless today. However, this invariant is implicit and fragile, and it creates two concrete problems:

Why the mismatches matter

  1. Blocks __attribute__((malloc)) annotations. GCC's malloc attribute has a two-argument form, __attribute__((malloc(deallocator))), that tells the compiler which deallocator is valid for a given allocator. When enabled, GCC emits -Wmismatched-dealloc warnings when a pointer returned from one allocator family is passed to a deallocator associated with a different family. Tristan's motivating example is:

    char *text_copy = pnstrdup(rl_line_buffer + start, end - start);
    ...
    free(text_copy);   // warning: mismatched allocation function
    

    To adopt the malloc attribute cluster-wide (which enables better diagnostics, dead-store elimination, and leak detection in static analyzers), mismatches must first be eliminated.

  2. Cognitive overhead during code review. A reader auditing a frontend code path cannot tell, at a glance, whether pfree(x) is safe without chasing the provenance of x across function boundaries. Establishing the discipline "allocator family must match deallocator family" makes local reasoning sound.

  3. Future-proofing. If the frontend p* wrappers were ever to gain real behavior (e.g., a minimal context allocator for frontend tools, or instrumentation/poisoning in debug builds), the mismatches would silently become bugs.

The Proposed Solution

The patch is a mechanical refactor generated by Coccinelle (spatch), a semantic patch tool for C. A .cocci script encodes transformation rules of the form "if a pointer returned by X is passed to Y, rewrite Y to Y'". The author ran:

spatch --sp-file allocators.cocci --allow-inconsistent-paths --in-place .

The --allow-inconsistent-paths flag is notable: it tells Coccinelle to proceed even when it cannot prove a single consistent control-flow path for every match. For allocator/deallocator pairing this is acceptable because the transformation is local (rewriting a call site based on the type/origin of its argument), but it does mean a human review of the diff is required to catch spurious rewrites.

The approach is attractive because:

Key Technical Insights

The two-family frontend allocator design is historical, not accidental

The p-family exists in the frontend specifically so that files compiled into both postgres and frontend binaries (e.g., things under src/common, parser pieces, some catalog code) can call palloc/pfree unconditionally. The pg_-family is the "native" frontend API. The mismatches accumulated organically because, absent compiler enforcement, the distinction is invisible at runtime.

__attribute__((malloc(deallocator))) as a forcing function

This is the real architectural lever. Once allocator functions are annotated like:

extern void pg_free(void *);
extern void *pg_malloc(size_t) __attribute__((malloc(pg_free)));
extern void pfree(void *);
extern void *palloc(size_t) __attribute__((malloc(pfree)));

the compiler statically enforces the pairing. This turns an undocumented convention into a checked invariant, and it gives static analyzers (Coverity, clang-analyzer, GCC's -fanalyzer) much better leak-detection precision.

Scope: frontend only

Nothing in the described change touches the backend palloc/MemoryContext implementation. The backend already has well-defined semantics (palloc ↔ pfree, both tied to CurrentMemoryContext). The cleanup is purely about the frontend shim layer.

Risk profile

The patch is low-risk by construction:

Participant Dynamics

Only the original poster, Tristan Partin, has weighed in at the time of this thread snapshot. Tristan has contributed build-system and tooling work to PostgreSQL (notably around Meson) and has a track record of infrastructure/hygiene patches, which fits the nature of this proposal. No committer has yet responded, and the follow-up patch adding the malloc attribute annotations — which is the actual payoff of this cleanup — is promised but not yet posted.

Likely Points of Debate (anticipated)

  1. Whether to collapse the two families entirely. A reviewer might ask: if p* and pg_* do the same thing in the frontend, why keep both? The answer is backend code sharing, but someone may propose making frontend p* simply #defines to pg_*.
  2. Coccinelle script maintenance. Should the .cocci script be committed to the tree (e.g., under src/tools/) so the invariant can be re-checked periodically, similar to pgindent?
  3. Churn vs. benefit. Large mechanical diffs complicate git blame archaeology. The benefit (enabling __attribute__((malloc))) is concrete, which should justify the churn, but a committer may want the annotation patch posted alongside so the payoff is visible.