2026-06-01 · claude-opus-4-6

Make Memory Checking / Sanitizing Infrastructure Better

Core Problem

PostgreSQL's memory context allocators (aset.c, slab.c, generation.c) include debug-build instrumentation to detect memory corruption such as buffer overruns and use-after-free. The current infrastructure nevertheless has significant gaps that let real memory corruption bugs go undetected, particularly in automated testing environments (CI, fuzzing, TAP tests).

The fundamental architectural tension is between PostgreSQL's custom memory allocator design, which carves individual allocations (chunks) out of larger malloc'd blocks, and the assumptions of external memory-checking tools like AddressSanitizer (ASan) and Valgrind, which by default track memory at the system malloc level. Without explicit annotations, ASan sees only the boundaries of each block, not the boundaries of the individual palloc'd chunks inside it. A read past the end of a 32-byte palloc into adjacent space within the same block is therefore invisible to ASan.
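The blind spot can be illustrated with a toy bump allocator. This is a minimal sketch, not PostgreSQL code: ToyBlock, toy_alloc and the other toy_ names are hypothetical, and the real allocators also place a chunk header between chunks, which this sketch omits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical toy block: one malloc'd region that hands out chunks by
 * bumping an offset, loosely modelled on how a block is sub-allocated. */
typedef struct ToyBlock
{
    char   *start;              /* start of the malloc'd region */
    size_t  used;               /* bytes handed out so far */
    size_t  size;               /* total bytes in the region */
} ToyBlock;

/* Allocate the backing region with a single malloc; returns 1 on success. */
static int
toy_block_init(ToyBlock *blk, size_t size)
{
    blk->start = malloc(size);
    blk->used = 0;
    blk->size = size;
    return blk->start != NULL;
}

/* Hand out the next `size` bytes of the block, or NULL if it is full. */
static void *
toy_alloc(ToyBlock *blk, size_t size)
{
    void *p;

    assert(blk->start != NULL);
    if (blk->size - blk->used < size)
        return NULL;
    p = blk->start + blk->used;
    blk->used += size;
    return p;
}

/* Returns 1 if the first byte past chunk `a` is the first byte of `b`. */
static int
toy_chunks_adjacent(const void *a, size_t len, const void *b)
{
    return (const char *) a + len == (const char *) b;
}
```

Two 32-byte chunks taken from the same ToyBlock are adjacent, so reading a[32] lands in the second chunk. Because both chunks live inside one malloc'd region, a malloc-level tool has no reason to flag that access.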

Why This Matters Architecturally

PostgreSQL's memory context system is the foundation of all memory management in the backend. Every buffer, tuple, expression result, and plan node lives in a memory context. Undetected corruption in these allocations can manifest as:

Silent data corruption in query results
Crashes far removed from the actual bug (making debugging extremely difficult)
Security vulnerabilities (heap buffer overflows)
Bugs that pass CI but cause production failures

The weakness is particularly acute for requests whose size is already a power of two, which are extremely common in PostgreSQL (buffers and common data structures are frequently sized that way). These are precisely the allocations that currently get no sentinel protection.

Detailed Technical Analysis of Each Issue

Issue 1: Missing Sentinels for Power-of-Two Allocations

In aset.c's size-class freelist design, allocations are rounded up to the next power-of-two size class. If the requested size already equals a power-of-two, adding a sentinel byte would push it into the next size class (doubling memory usage for that allocation). The current code avoids this by simply not placing a sentinel for these allocations.

Proposed fix: Add the sentinel space after determining the size class, similar to how the per-allocation chunk header (MemoryChunk / formerly AllocChunkData) space is handled. The sentinel bytes become part of the "overhead" that's absorbed into the size class, rather than driving size class selection.
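The size-class arithmetic behind this issue can be sketched as follows. This is an illustration under stated assumptions, not the aset.c implementation: all toy_ names and the TOY_ constants (minimum class, header size, sentinel size) are hypothetical, and reserving the sentinel space next to the header is only one reading of the proposed fix.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MIN_CLASS     8     /* assumed smallest size class */
#define TOY_MAX_CLASS     8192  /* assumed largest size class */
#define TOY_HEADER_SIZE   8     /* assumed per-chunk header size */
#define TOY_SENTINEL_SIZE 8     /* assumed sentinel size */

/* Round a request up to the next power-of-two size class. */
static size_t
toy_size_class(size_t request)
{
    size_t cls = TOY_MIN_CLASS;

    assert(request > 0 && request <= TOY_MAX_CLASS);
    while (cls < request)
        cls <<= 1;
    return cls;
}

/* Current behaviour as described above: a sentinel is only placed when
 * the request leaves spare room inside its size class. */
static int
toy_sentinel_fits_in_class(size_t request)
{
    return request < toy_size_class(request);
}

/* Naive fix: count one sentinel byte as part of the request, which lets
 * the sentinel drive size class selection. */
static size_t
toy_class_sentinel_in_request(size_t request)
{
    return toy_size_class(request + 1);
}

/* Sketch of the proposed approach: choose the class from the request
 * alone, then reserve sentinel space as overhead, like the header. */
static size_t
toy_chunk_stride(size_t request)
{
    return TOY_HEADER_SIZE + toy_size_class(request) + TOY_SENTINEL_SIZE;
}
```

With these assumed constants, a 64-byte request has no room for a sentinel in its 64-byte class, the naive fix moves it to the 128-byte class, and the overhead-style layout keeps it in the 64-byte class at a total stride of 80 bytes.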

Issue 2: Single-Byte Sentinel is Insufficient

A 1-byte sentinel at offset requested_size only catches writes that touch that exact byte. A contiguous overrun that starts at the end of the allocation will hit it, but because of C alignment rules and common access patterns (e.g., storing a 4-byte int or 8-byte pointer at the next aligned offset), many stray writes skip over the sentinel entirely. An 8-byte or 16-byte sentinel would catch aligned writes of any fundamental C type.
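The difference between a narrow and a wide sentinel can be shown with a small simulation. This is a sketch with hypothetical helper names (toy_set_sentinel, toy_sentinel_ok, toy_stray_store64); the fill value is an arbitrary illustrative choice.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_SENTINEL_BYTE 0x7E  /* illustrative fill value */

/* Write `width` sentinel bytes directly after the requested size. */
static void
toy_set_sentinel(unsigned char *chunk, size_t request, size_t width)
{
    assert(width > 0);
    memset(chunk + request, TOY_SENTINEL_BYTE, width);
}

/* Returns 1 if every sentinel byte is still intact, 0 otherwise. */
static int
toy_sentinel_ok(const unsigned char *chunk, size_t request, size_t width)
{
    for (size_t i = 0; i < width; i++)
    {
        if (chunk[request + i] != TOY_SENTINEL_BYTE)
            return 0;
    }
    return 1;
}

/* Simulate a buggy, naturally aligned 8-byte store at `offset`. */
static void
toy_stray_store64(unsigned char *chunk, size_t offset)
{
    uint64_t value = 0;

    memcpy(chunk + offset, &value, sizeof(value));
}
```

For a 20-byte request, an aligned 8-byte store at offset 24 is entirely out of bounds. It leaves a 1-byte sentinel at offset 20 untouched, while an 8-byte sentinel covering offsets 20 to 27 is overwritten and the check fails as intended.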

Issue 3: WARNING Instead of Crash is Counterproductive

The current behavior of issuing a WARNING when corruption is detected is problematic in three ways:

Delayed failure: The process continues running with corrupted memory, eventually crashing in an unrelated location
Invisible in automated testing: TAP tests and background workers don't monitor WARNING output, so corruption goes unreported in CI
Unfuzzable: Fuzzing relies on crashes (signals) to detect bugs; WARNINGs don't trigger the sanitizer/fuzzer's error detection

The complication with using PANIC is the recursive error path: AllocSetCheck() runs during AllocSetReset(), which can be called during error handling. If ErrorContext itself is corrupt, issuing PANIC triggers errstart() → MemoryContextReset(ErrorContext) → AllocSetCheck() → another PANIC, creating infinite recursion. Andres suggests emitting WARNINGs then crashing post-hoc, which avoids this recursion.
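The warn-then-crash idea can be sketched as a check that only reports and counts problems, followed by a crash outside the error-reporting machinery. This is a minimal sketch under assumptions: ToyChunk and the toy_ functions are hypothetical, fprintf stands in for a WARNING, and abort() is an assumed crash mechanism rather than what the prototype patches do.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for a chunk whose sentinel has been examined. */
typedef struct ToyChunk
{
    const char *name;            /* label used in the warning text */
    int         sentinel_intact; /* 1 if the sentinel survived */
} ToyChunk;

/* Report every damaged chunk as a warning and return how many were
 * found. This function never raises an error itself, so it cannot
 * re-enter error handling. */
static int
toy_check_chunks(const ToyChunk *chunks, size_t n)
{
    int problems = 0;

    assert(chunks != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
    {
        if (!chunks[i].sentinel_intact)
        {
            fprintf(stderr, "WARNING: write past chunk end in %s\n",
                    chunks[i].name);
            problems++;
        }
    }
    return problems;
}

/* Crash only after all warnings are out. abort() raises SIGABRT
 * directly, without going through an error-reporting path. */
static void
toy_check_and_crash(const ToyChunk *chunks, size_t n)
{
    if (toy_check_chunks(chunks, n) > 0)
        abort();
}
```

Separating detection from the crash keeps all diagnostics in the log while still producing a signal that CI harnesses and fuzzers can observe.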

Issue 4: ASan Integration at Sub-Allocation Granularity

PostgreSQL already has Valgrind annotations (VALGRIND_MAKE_MEM_DEFINED, VALGRIND_MAKE_MEM_NOACCESS, etc.) throughout the allocator code. The proposal is to generalize these annotations to also call ASan's manual poisoning API (__asan_poison_memory_region / __asan_unpoison_memory_region).

This would give ASan sub-block granularity: when a chunk is freed back to a freelist, the memory is poisoned; when allocated, only the requested size is unpoisoned. This provides most of Valgrind's precision at a fraction of the runtime cost (ASan typically adds 2x overhead vs Valgrind's 20-50x).

The limitation is that ASan cannot track uninitialized reads (only Valgrind and MemorySanitizer (MSan) can do that), but it catches use-after-free and buffer overflows effectively.
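The poison-on-free, unpoison-on-allocate scheme described above might look roughly like this. It is a sketch, not the proposed patch: the TOY_ macros and ToySlot type are hypothetical, and only __asan_poison_memory_region and __asan_unpoison_memory_region come from ASan's public interface. Without ASan the macros compile to no-ops.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Detect ASan under both Clang (__has_feature) and GCC. */
#if defined(__has_feature)
#if __has_feature(address_sanitizer)
#define TOY_USE_ASAN 1
#endif
#endif
#if defined(__SANITIZE_ADDRESS__) && !defined(TOY_USE_ASAN)
#define TOY_USE_ASAN 1
#endif

#ifdef TOY_USE_ASAN
#include <sanitizer/asan_interface.h>
#define TOY_MEM_NOACCESS(addr, size) \
    __asan_poison_memory_region((addr), (size))
#define TOY_MEM_ACCESSIBLE(addr, size) \
    __asan_unpoison_memory_region((addr), (size))
#else
#define TOY_MEM_NOACCESS(addr, size)   ((void) (addr), (void) (size))
#define TOY_MEM_ACCESSIBLE(addr, size) ((void) (addr), (void) (size))
#endif

/* Hypothetical single reusable slot, standing in for a freelist entry. */
typedef struct ToySlot
{
    char   *mem;
    size_t  capacity;
} ToySlot;

/* Create the slot; while it is unused, all of it is poisoned. */
static int
toy_slot_init(ToySlot *slot, size_t capacity)
{
    slot->mem = malloc(capacity);
    slot->capacity = capacity;
    if (slot->mem == NULL)
        return 0;
    TOY_MEM_NOACCESS(slot->mem, capacity);
    return 1;
}

/* Hand out the slot, unpoisoning only the bytes actually requested. */
static void *
toy_slot_alloc(ToySlot *slot, size_t request)
{
    assert(request <= slot->capacity);
    TOY_MEM_ACCESSIBLE(slot->mem, request);
    return slot->mem;
}

/* Return the slot to the "freelist": poison everything again. */
static void
toy_slot_free(ToySlot *slot)
{
    TOY_MEM_NOACCESS(slot->mem, slot->capacity);
}

/* Conservatively unpoison before giving the memory back to malloc. */
static void
toy_slot_destroy(ToySlot *slot)
{
    TOY_MEM_ACCESSIBLE(slot->mem, slot->capacity);
    free(slot->mem);
}
```

When built with -fsanitize=address, an access past the requested size or an access after toy_slot_free should be reported, even though the memory still belongs to a live malloc'd region.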

Issue 5: Per-Allocation malloc Mode for Maximum Sanitizer Effectiveness

A mode where every palloc becomes its own malloc (bypassing freelist pooling entirely) would give ASan/Valgrind perfect precision, as each allocation would have its own red zones and quarantine behavior. This is conceptually similar to running with a "dumb allocator" and is the gold standard for memory debugging, at massive performance cost. The suggestion for slab to use "something slightly different" acknowledges that slab's fixed-size design has different characteristics.
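A per-allocation malloc mode could be sketched as a context that tracks every allocation separately so that a reset can still free them all. The names here (ToyContext, ToyNode, toy_ctx_alloc, toy_ctx_reset) are hypothetical and the design is an assumption for illustration; freeing a single allocation is omitted for brevity.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Bookkeeping node kept in its own malloc, so the user memory is
 * surrounded only by the sanitizer's red zones, not by our header. */
typedef struct ToyNode
{
    struct ToyNode *next;
    void           *mem;    /* the user-visible allocation */
} ToyNode;

/* Hypothetical context: a list of independently malloc'd allocations. */
typedef struct ToyContext
{
    ToyNode *head;
    size_t   live;          /* number of allocations not yet freed */
} ToyContext;

/* Every allocation is its own malloc; nothing is pooled or reused. */
static void *
toy_ctx_alloc(ToyContext *ctx, size_t size)
{
    ToyNode *node;

    assert(ctx != NULL);
    node = malloc(sizeof(ToyNode));
    if (node == NULL)
        return NULL;
    node->mem = malloc(size > 0 ? size : 1);
    if (node->mem == NULL)
    {
        free(node);
        return NULL;
    }
    node->next = ctx->head;
    ctx->head = node;
    ctx->live++;
    return node->mem;
}

/* Resetting the context frees each allocation individually, so later
 * accesses are caught by the tool's own use-after-free detection. */
static void
toy_ctx_reset(ToyContext *ctx)
{
    assert(ctx != NULL);
    while (ctx->head != NULL)
    {
        ToyNode *node = ctx->head;

        ctx->head = node->next;
        free(node->mem);
        free(node);
        ctx->live--;
    }
}
```

Keeping the bookkeeping node separate from the user memory is a deliberate choice in this sketch: an inline header would sit directly before the allocation and absorb small underflows before they reach a red zone.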

Design Tradeoffs

Sentinel size vs memory overhead: Larger sentinels waste more memory in debug builds but catch more bugs
Crash vs WARNING: Crashing gives better debuggability but requires careful handling of error-path recursion
ASan poisoning granularity vs performance: Sub-block poisoning adds overhead to every palloc/pfree but dramatically improves detection
Per-allocation malloc mode: Maximizes tool effectiveness but makes performance testing impossible in this mode

Current Status

This is an early-stage RFC (Request for Comments). Andres has prototype patches for issues 1-4 but is soliciting architectural feedback before polishing. No patches have been posted to the list yet.

Latest Update

Incremental Update: Peter Eisentraut's Response