[PATCH] Remove obsolete tupDesc assignment in extended statistics

First seen: 2026-05-28 15:58:39+00:00 · Messages: 1 · Participants: 1

Latest Update

2026-06-01 · claude-opus-4-6

Technical Analysis: Remove Obsolete tupDesc Assignment in Extended Statistics

Core Problem

This patch addresses a code hygiene issue in PostgreSQL's extended statistics subsystem, specifically in the lookup_var_attr_stats() function. The function currently assigns a tupDesc (tuple descriptor) to VacAttrStats entries that are created for expressions in extended statistics objects. The patch author argues this assignment is vestigial — a remnant from an earlier code state that no longer serves a functional purpose.

Technical Context

Extended Statistics Architecture

PostgreSQL's extended statistics (introduced in PG10, with expression support added in PG14) let users create statistics objects that capture cross-column relationships: functional dependencies, multi-column MCV lists, and n-distinct counts over multiple columns and/or expressions. The relevant functions are:

  1. lookup_var_attr_stats() — Looks up or creates VacAttrStats entries for variables referenced in a statistics object
  2. make_build_data() — Constructs the data matrix used for building statistics (MCV, ndistinct, dependencies)
  3. examine_expression() — Creates VacAttrStats entries specifically for expression-based statistics
  4. statext_mcv_build() — Builds multi-column MCV lists

The Specific Issue

In lookup_var_attr_stats(), when processing expressions (as opposed to plain column references), the code copies vacatts[0]->tupDesc into the newly created VacAttrStats entry. The rationale was presumably that downstream consumers (like statext_mcv_build()) might need a tuple descriptor to interpret values.
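
The assignment described above can be sketched in isolation. This is a minimal illustration rather than the PostgreSQL source: MockTupleDesc, MockVacAttrStats, mock_examine_expr() and both lookup_expr_* helpers are hypothetical stand-ins, and the assumption that a freshly built expression entry starts with no tuple descriptor is inferred from the description above, not checked against the tree.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins; the real structs have many more fields. */
typedef struct MockTupleDesc
{
    int natts;                  /* number of columns in the relation */
} MockTupleDesc;

typedef struct MockVacAttrStats
{
    int            attnum;      /* column number; 0 here marks an expression */
    MockTupleDesc *tupDesc;     /* relation's descriptor, or NULL if unset */
} MockVacAttrStats;

/* Assumption: building an expression entry leaves tupDesc unset. */
static MockVacAttrStats *
mock_examine_expr(void)
{
    MockVacAttrStats *stats = malloc(sizeof(*stats));

    if (stats == NULL)
        return NULL;
    stats->attnum = 0;
    stats->tupDesc = NULL;
    return stats;
}

/* Pattern under discussion: borrow the descriptor from vacatts[0]. */
static MockVacAttrStats *
lookup_expr_with_copy(MockVacAttrStats **vacatts, int nvacatts)
{
    MockVacAttrStats *stats;

    assert(vacatts != NULL && nvacatts > 0);
    stats = mock_examine_expr();
    if (stats != NULL)
        stats->tupDesc = vacatts[0]->tupDesc;
    return stats;
}

/* Shape after the proposed cleanup: the entry is returned untouched. */
static MockVacAttrStats *
lookup_expr_without_copy(void)
{
    return mock_examine_expr();
}
```

In this sketch the only observable difference between the two variants is whether the expression entry's tupDesc is NULL, which is why the removal is safe only if no consumer reads that field for expression entries.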

However, in the current code path this assignment is obsolete: per the patch author, no downstream consumer depends on it.

Why This Matters Architecturally

  1. Misleading code: The assignment suggests a dependency that doesn't exist, making future maintenance harder
  2. Fragile implicit coupling: Copying vacatts[0]->tupDesc (the tuple descriptor held by the first column's stats entry) into an expression's stats entry is semantically wrong, because an expression's result is computed rather than stored as a column in the relation's tuple format
  3. Defense against future bugs: If future code incorrectly relies on this field being set for expression entries, it would be using a tuple descriptor that doesn't actually describe the expression's output, potentially leading to subtle data corruption or crashes
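
The mismatch behind points 2 and 3 can be made concrete with a toy model. Everything below is hypothetical and greatly simplified: the MockType enum, both structs, the convention that attnum 0 marks an expression, and desc_describes_entry() are assumptions made for illustration, not PostgreSQL definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical value types standing in for real type OIDs. */
enum MockType { MOCK_INT4, MOCK_TEXT, MOCK_FLOAT8 };

typedef struct MockTupleDesc
{
    int                  natts;     /* number of columns */
    const enum MockType *atttypes;  /* one type per column */
} MockTupleDesc;

typedef struct MockVacAttrStats
{
    int                  attnum;    /* 1-based column; 0 marks an expression */
    enum MockType        valtype;   /* type of the values being analyzed */
    const MockTupleDesc *tupDesc;   /* possibly borrowed descriptor */
} MockVacAttrStats;

/*
 * Returns true only if the descriptor has a column slot for this entry
 * and that slot's type matches the entry's value type.
 */
static bool
desc_describes_entry(const MockVacAttrStats *stats)
{
    assert(stats != NULL);

    if (stats->tupDesc == NULL)
        return false;
    if (stats->attnum < 1 || stats->attnum > stats->tupDesc->natts)
        return false;
    return stats->tupDesc->atttypes[stats->attnum - 1] == stats->valtype;
}
```

A column entry passes this check, while an expression entry that merely borrowed the relation's descriptor does not, since the descriptor has no slot for the computed value. Code that trusted the borrowed pointer would therefore be interpreting values through a description of something else.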

Proposed Solution

The patch is minimal: it removes the tupDesc assignment for expression entries in lookup_var_attr_stats().

Risk Assessment

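This is a low-risk cleanup patch: it removes an assignment that, according to the patch author, nothing downstream relies on, so behavior is expected to stay the same.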

Classification

This is a code cleanup / dead code removal patch. It does not change behavior, improve performance, or fix a user-visible bug. Its value is in improving code clarity and maintainability for the extended statistics subsystem.