Dump Statistics Bug: Cross-Index Attribute Name Leakage in pg_dump
Background and Architectural Context
PostgreSQL 18 introduced the ability for pg_dump to dump and restore planner statistics (pg_statistic / pg_statistic_ext data) alongside schema and data. This is a significant feature because prior to it, post-restore databases required a manual ANALYZE before the planner could produce reasonable plans — a painful operational gap during major-version upgrades and logical migrations. The --statistics-only mode and the companion pg_restore_relation_stats() / pg_restore_attribute_stats() functions form the plumbing.
For expression indexes, the situation is more subtle than for plain indexes. An expression index (e.g., btree(upper(source_system::text))) gets its own row in pg_statistic keyed by the index's OID and an implicit attribute numbering (the expression columns become attnum 1..N on the index relation itself). Dumping these stats therefore requires pg_dump to know the "attribute names" of the index — which for expression indexes are synthesized/derived column names rather than references to the underlying table columns.
The Core Bug
The reporter (Maksim Melnikov, Postgres Pro) observed that when a table has:
- An expression index (e.g.,
test_table_stats_source_system_textonupper(source_system::text)), AND - A following plain index on real columns (e.g., unique index on
(source_system, source_id, model_name)),
pg_dump --statistics-only fails with:
pg_dump: error: could not find index attname "source_system"
The diagnostic is telling: pg_dump is trying to resolve the table column name source_system against the expression index's attribute list (where the only attribute is the synthesized expression column, not source_system). In other words, state belonging to a previously-processed expression index is leaking into the processing of the subsequent plain index.
The root cause, as Melnikov identifies and Korotkov confirms, lies in how indAttNames / nindAttNames (the per-index cached attribute name array) are populated during the index iteration in the stats-dumping path. If the code path only initializes these fields for indexes with expressions (because only those have stats rows to dump), then a following plain index — which reuses the same IndxInfo-like carrier or a shared buffer — inherits stale pointers from the prior expression index. When downstream code then tries to match the plain index's real column names against that stale attribute-name array, the lookup fails.
Proposed Fix
Melnikov attached a "simple patch" (not included inline in the quoted messages, but described). The fix almost certainly does one of:
- Unconditionally initialize
indAttNames/nindAttNamesfor every index during the gather phase, not just those with expressions, OR - Explicitly clear/reset these fields per-index iteration so that stale data from a prior iteration cannot be read.
The first approach is more defensive and matches standard pg_dump idioms where IndxInfo fields should be fully populated at read time rather than lazily filled.
Korotkov's Response and the Reproduction Question
Alexander Korotkov (committer, author of much of the statistics-dump infrastructure and a core developer) acknowledges the bug on inspection of the code — "an index with no expression can get its indAttNames and nindAttNames from the previous index" — which validates the reporter's diagnosis at the source-code level.
However, Korotkov cannot reproduce the user-visible error. His key observation is architecturally important:
dumpRelationStats_dumper()only iterates indexes with pg_stats entry, and those are indexes with expressions.
This is the crux: if the iteration filter in dumpRelationStats_dumper() skips non-expression indexes entirely (because they have no synthesized pg_statistic rows of their own — their stats live on the base table's columns), then the stale-pointer bug Melnikov identified should be latent rather than observable. The plain index would never reach the code that reads indAttNames.
This mismatch between "bug is real in code" and "cannot reproduce user error" suggests one of several possibilities:
- The reporter was running a slightly different git commit where the iteration filter was broader (or absent).
- Statistics on non-expression indexes can exist in some circumstances (e.g., partial indexes, covering indexes with
INCLUDEexpressions, or when extended statistics interact). - The error path is through a different code site than Korotkov is inspecting — perhaps
getIndexes()or the dependency-resolution phase that runs beforedumpRelationStats_dumper().
Korotkov's request for the exact commit and reproduction steps is the correct committer instinct: before committing a "defensive" fix, confirm the actual failure path so the fix targets the right call site and so a regression test can be written.
Implications and Tradeoffs
Even if the user-visible error turns out to be reproducible only on a specific commit or edge case, the underlying code pattern — conditionally populating per-iteration fields without clearing them — is a latent bug. The defensive fix is cheap and should go in regardless. The more interesting question Korotkov implicitly raises is test coverage: the stats-dump feature evidently lacks a regression test exercising the combination of an expression index followed by a plain index on the same table. That gap should be closed alongside the fix.
The ~2.5 month gap between report (Feb 25) and first committer response (May 10) is longer than ideal for a --statistics-only bug, but not unusual for a targeted bug report against a newly-landed feature; it likely waited for Korotkov to cycle back to stats-dump issues.
Open Questions at Thread End
- Exact git commit used by Melnikov.
- Whether the database needed any specific fill/ANALYZE sequence to trigger the error (Korotkov asks explicitly).
- Whether the fix should be purely defensive (clear fields) or structural (rework how
IndxInfostats metadata is populated).