Fix incorrect size check in statext_dependencies_deserialize

First seen: 2026-05-19 14:29:56+00:00 · Messages: 1 · Participants: 1

Latest Update

2026-05-20 · claude-opus-4-6

Core Problem

The function statext_dependencies_deserialize() in PostgreSQL's extended statistics subsystem contains a bug in the sanity check that validates the size of incoming serialized bytea data before deserialization. The issue is a semantic mismatch: the macro used in the check expects the number of attributes in a single item, but it is passed the number of items.

Technical Details

PostgreSQL's extended statistics system supports functional dependencies (CREATE STATISTICS ... (dependencies)), which track probabilistic relationships between columns. These statistics are serialized into the pg_statistic_ext_data catalog and deserialized when needed by the planner.

During deserialization in statext_dependencies_deserialize():

  1. The code reads ndeps (the number of dependency entries) from the serialized header.
  2. It then performs a sanity check to ensure the bytea is at least large enough to contain the claimed data.
  3. The bug: The check uses SizeOfItem(ndeps), which computes the size of a single dependency item that has ndeps attributes. This is semantically wrong — ndeps here represents the count of dependency entries, not the number of attributes in one entry.
  4. The fix: It should use MinSizeOfItems(ndeps), which correctly computes header_size + ndeps * minimum_single_item_size — i.e., the minimum total size needed to hold ndeps minimally-sized dependency items (each with the minimum number of attributes, which is 2).
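
The difference between the two bounds can be sketched in a few lines of standalone C. This is an illustration under stated assumptions, not a copy of the PostgreSQL source: the macro bodies follow the layout described above (a header of three 32-bit fields, then per item a double, an attribute count, and the attribute numbers), AttrNumber is replaced by a 16-bit stand-in, and buggy_min_size() and fixed_min_size() are hypothetical helper names.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for PostgreSQL's AttrNumber (assumed to be a 16-bit integer). */
typedef int16_t AttrNumber;

/* Header fields: magic, type, ndeps (assumed to be three 32-bit values). */
#define SizeOfHeader (3 * sizeof(uint32_t))

/* One serialized item: degree, attribute count, then natts attributes. */
#define SizeOfItem(natts) \
    (sizeof(double) + sizeof(AttrNumber) * (1 + (natts)))

/* Smallest possible item: a dependency has at least two attributes. */
#define MinSizeOfItem SizeOfItem(2)

/* Header plus ndeps items, each of minimum size. */
#define MinSizeOfItems(ndeps) \
    (SizeOfHeader + (ndeps) * MinSizeOfItem)

/* Buggy bound: the item count is treated as an attribute count. */
size_t buggy_min_size(uint32_t ndeps)
{
    return SizeOfItem((size_t) ndeps);
}

/* Fixed bound: header plus ndeps minimally-sized items. */
size_t fixed_min_size(uint32_t ndeps)
{
    return MinSizeOfItems((size_t) ndeps);
}
```

With an 8-byte double and a 2-byte AttrNumber, a header claiming 100 dependencies yields a buggy bound of 210 bytes and a fixed bound of 1412 bytes, so the buggy check would accept a bytea far too short to hold the claimed items.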

Why This Matters Architecturally

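The sanity check exists to protect against catalog corruption or invalid data causing out-of-bounds memory reads during deserialization. With the incorrect check, the computed minimum does not reflect the space that ndeps items actually require, so a truncated or corrupted bytea can pass validation when it should be rejected.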

The practical impact is limited because:

  1. The data originates from PostgreSQL's own serialization code under normal circumstances.
  2. The subsequent per-item deserialization loop performs its own bounds checks (it reads nattributes per item and validates each item individually).

However, the check is defense-in-depth against catalog corruption, and having it be incorrect defeats its purpose.
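
The two layers of validation discussed above, an up-front minimum-size check followed by per-item checks inside the loop, can be sketched as a standalone validator. This is a hypothetical illustration and not PostgreSQL's code: validate_dependencies(), the macro names, and the assumed byte layout (three 32-bit header fields, then per item a double, a 16-bit attribute count, and the attribute numbers) are assumptions made for the example.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef int16_t AttrNumber;     /* stand-in for PostgreSQL's type */

#define HEADER_SIZE   (3 * sizeof(uint32_t))    /* magic, type, ndeps */
#define ITEM_SIZE(n)  (sizeof(double) + sizeof(AttrNumber) * (1 + (size_t) (n)))
#define MIN_ITEM_SIZE ITEM_SIZE(2)

/*
 * Hypothetical validator: returns 0 if buf can hold the items its header
 * claims, -1 otherwise. It checks sizes only and builds no result.
 */
int validate_dependencies(const unsigned char *buf, size_t len)
{
    uint32_t ndeps;
    size_t   pos = HEADER_SIZE;

    if (len < HEADER_SIZE)
        return -1;

    /* ndeps is the third 32-bit header field in the assumed layout. */
    memcpy(&ndeps, buf + 2 * sizeof(uint32_t), sizeof(ndeps));

    /*
     * Up-front check: every item needs at least MIN_ITEM_SIZE bytes.
     * Dividing instead of multiplying avoids overflow for a huge ndeps.
     */
    if ((len - HEADER_SIZE) / MIN_ITEM_SIZE < ndeps)
        return -1;

    for (uint32_t i = 0; i < ndeps; i++)
    {
        AttrNumber natts;

        /* Fixed part of an item: the degree plus the attribute count. */
        if (len - pos < sizeof(double) + sizeof(AttrNumber))
            return -1;
        memcpy(&natts, buf + pos + sizeof(double), sizeof(natts));
        if (natts < 2)
            return -1;

        /* Variable part: natts attribute numbers must fit as well. */
        if (len - pos < ITEM_SIZE(natts))
            return -1;
        pos += ITEM_SIZE(natts);
    }
    return 0;
}
```

The up-front check rejects obviously short input before any item is touched, while the per-item checks catch items that claim more attributes than the remaining bytes can hold. The sketch does not enforce an upper limit on natts or reject trailing bytes.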

Proposed Solution

The patch is minimal and surgical: it replaces SizeOfItem(ndeps) with MinSizeOfItems(ndeps) in the sanity check.

This aligns the function's behavior with statext_ndistinct_deserialize(), which already correctly uses MinSizeOfItems for its analogous check. The author correctly identifies this as a copy-paste or typo error rather than an intentional design choice, supported by the inconsistency with the ndistinct code path.

Consistency Argument

The extended statistics subsystem has parallel serialization/deserialization paths for its statistics kinds, including ndistinct coefficients and functional dependencies. The ndistinct path already uses MinSizeOfItems correctly, making this clearly a bug in the dependencies path rather than a deliberate architectural choice.

Risk Assessment

This is an extremely low-risk fix: