Analysis: Are Extended Statistics Intentionally Not Propagated to Partitions?
Core Problem
This thread raises a fundamental architectural question about how PostgreSQL's extended statistics system interacts with declarative partitioning. The issue is straightforward but has significant planning implications:
When a user creates extended statistics (CREATE STATISTICS) on a partitioned parent table, the statistics object is NOT recursively created on child partitions — unlike CREATE INDEX, which does recurse.
Why This Matters Architecturally
PostgreSQL's query planner, after performing partition pruning, plans scans against individual child partitions. When it does so, it consults the statistics available on those child relations. If extended statistics (e.g., multivariate n-distinct, MCV lists, or functional dependencies) exist only on the parent but not on the children, the planner cannot leverage them for child partition scans. This leads to:
- Suboptimal cardinality estimates on partition scans where column correlations exist
- Manual operational burden — DBAs must explicitly CREATE STATISTICS on every partition, including newly created ones (via attach or automatic range partition creation)
- Behavioral inconsistency with CREATE INDEX, which established a precedent of recursive propagation in PostgreSQL 11
Technical Details of the Gap
The code path difference is clear:
- CREATE INDEX →
DefineIndex()insrc/backend/commands/indexcmds.cexplicitly checks if the target relation is partitioned and recurses into each partition, creating child indexes that reference the parent index. - CREATE STATISTICS →
CreateStatistics()insrc/backend/commands/statscmds.chas no such recursion. It creates the statistics object solely on the specified relation.
This means there is no catalog relationship (akin to pg_inherits for indexes via pg_index.indisvalid / partition index linkage) that ties a parent statistics object to child statistics objects.
The Cross-Table Statistics Consideration
The thread references a prior discussion where Tom Lane explained that statistics objects are not necessarily tied to a single table, and that cross-table statistics are a long-term design direction. This architectural vision explains why statistics objects have their own owner (independent of the table owner) — they're conceptually more like standalone schema objects than table-bound metadata.
However, this design philosophy creates tension with the partition propagation use case. If statistics objects are meant to eventually span multiple tables, the "recurse into partitions" semantic becomes less natural — a partition-propagated statistic would be a different pattern than a cross-table statistic.
Possible Design Approaches (Not Yet Proposed as Patches)
-
Full recursion (like indexes): CREATE STATISTICS on a partitioned table automatically creates equivalent statistics on all partitions, and new partitions inherit them on ATTACH. This would require:
- Catalog changes to track parent-child statistics relationships
- DDL propagation logic in
statscmds.c - Handling of ATTACH/DETACH PARTITION for statistics
- ANALYZE coordination to populate statistics on children
-
Planner-side solution: The planner could consult the parent's extended statistics when planning child partition scans, similar to how some partition-level optimizations already reference parent metadata. This is architecturally complex given how
get_relation_statistics()works. -
Status quo with documentation: Simply document that users must create statistics per-partition manually, and perhaps provide tooling or event triggers to automate it.
Current Status
This is an inquiry thread with no patch proposed. The poster is seeking confirmation of whether this is an intentional design decision or an unimplemented feature. No committer has responded yet in this thread. The question is well-formed and identifies a genuine gap between the partitioning infrastructure's handling of indexes versus statistics.
Implications for Users
Until this is addressed, users of partitioned tables who need extended statistics must:
- Manually create statistics on each partition
- Remember to create statistics on newly attached partitions
- Potentially use event triggers or external tooling to automate propagation