Overview
This thread proposes exposing autovacuum/autoanalyze skip events—cases where the worker cannot acquire the required lock on a relation and gives up—as first-class statistics in pg_stat_all_tables, rather than leaving them buried in server log messages. The design conversation quickly broadens to cover (a) whether timestamps are worth their shared-memory cost, (b) whether manual VACUUM/ANALYZE (SKIP_LOCKED) should be counted too, (c) an API-shape question for the reporting function, and (d) a tangential but architecturally significant idea from Michael Paquier to split PgStat_StatTabEntry so indexes get their own stats kind.
The Core Problem
When an autovacuum worker picks up a relation from its work list, it tries to lock it with ConditionalLockRelationOid() rather than a blocking lock request, because blocking would stall the whole per-database worker. If the lock is unavailable (e.g. an ACCESS EXCLUSIVE DDL is in flight, or another autovacuum worker already holds it), the relation is skipped and a LOG line is emitted. The same pattern applies to manual VACUUM (SKIP_LOCKED) / ANALYZE (SKIP_LOCKED), which emit WARNINGs.
The operational consequence is invisible bloat: a hot table can be chronically skipped (e.g. interaction with long-running transactions or frequent DDL) and the DBA has no efficient signal short of scraping logs. Because the existing pg_stat_all_tables already exposes last_autovacuum, last_autoanalyze, autovacuum_count, etc., the natural place to surface skip events is alongside them—making the "why isn't this table being vacuumed?" diagnosis a single view query.
Yugo Nagata's initial patch adds four fields:
- last_skipped_autovacuum / last_skipped_autoanalyze (timestamps)
- skipped_autovacuum_count / skipped_autoanalyze_count (counters)
Design Decisions and Tradeoffs
Timestamps vs counters only
Sami Imseih pushed back on timestamps initially: in a high-churn workload the timestamp gets overwritten on every skip, so it doesn't help historical analysis any more than a delta-over-time on the counter. His preferred pattern is dashboards that diff counters over a scraping interval.
Nagata's counter-argument—endorsed by Michael Paquier—is that the timestamp answers a different question: "when was autovacuum last attempting this table?" Without it, a flat counter cannot distinguish "autovacuum is trying hard but always losing the lock race" from "autovacuum stopped noticing this table at all." Paquier explicitly noted the value for databases with many relations of widely varying sizes, where the cadence between skips is diagnostic. Paquier's opinion carried weight here (he is a stats-subsystem committer) and timestamps were retained.
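To make the "stuck versus forgotten" distinction concrete, here is a minimal sketch of how a monitoring agent might read a single snapshot of the proposed columns. The struct, the enum, the function name diagnose_skips, and the recent_secs threshold are all hypothetical illustrations, not part of the patch. Only the column names skipped_autovacuum_count and last_skipped_autovacuum come from the thread.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical snapshot of one table's skip statistics, as a monitoring
 * agent might hold it after a single query. Not PostgreSQL code.
 */
typedef struct SkipSample
{
    int64_t skipped_autovacuum_count;  /* cumulative skip counter */
    int64_t last_skipped_autovacuum;   /* epoch seconds; 0 if never skipped */
    int64_t sampled_at;                /* epoch seconds of this snapshot */
} SkipSample;

typedef enum SkipDiagnosis
{
    DIAG_NO_SKIPS,          /* autovacuum was never skipped on this table */
    DIAG_STILL_TRYING,      /* skipped recently: losing the lock race */
    DIAG_ATTEMPTS_STOPPED   /* skipped in the past, but not recently */
} SkipDiagnosis;

/*
 * Classify one snapshot. The counter alone says skips happened; only the
 * timestamp says whether the most recent attempt was recent.
 */
SkipDiagnosis
diagnose_skips(const SkipSample *s, int64_t recent_secs)
{
    assert(recent_secs >= 0);

    if (s->skipped_autovacuum_count == 0)
        return DIAG_NO_SKIPS;

    if (s->sampled_at - s->last_skipped_autovacuum <= recent_secs)
        return DIAG_STILL_TRYING;

    return DIAG_ATTEMPTS_STOPPED;
}
```

With a counter of 42 in both cases, a snapshot whose last skip was 100 seconds ago and one whose last skip was 900 seconds ago land in different buckets under a 600-second threshold. A dashboard that diffs counters reaches the same answer only once it has two scrapes to compare.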
Manual VACUUM/ANALYZE coverage
The v1 patch only instrumented autovacuum. Sami argued—correctly—that scripts using SKIP_LOCKED for throttled maintenance are a real production pattern and deserve equal observability. Accepting this doubled the column count (separate auto vs manual counters/timestamps), which triggered the next tradeoff.
The "too many columns in pg_stat_all_tables" concern and the index-split detour
With 8 new columns on the horizon, Sami raised view bloat. Michael Paquier used this to advance a longer-standing cleanup: PgStat_StatTabEntry is today shared between tables and indexes, but almost all the vacuum/analyze fields are table-only, while indexes only really use numscans, lastscan, tuples_returned, tuples_fetched, stat_reset_time. He proposed a new variable-sized stats kind (PGSTAT_KIND_INDEX in Nagata's paraphrase) dedicated to indexes.
Architecturally the benefit is twofold:
- Shared-memory footprint: every index in the cluster currently carries storage for ~a dozen table-only fields it never uses. With dshash-backed pgstat in shared memory (post-v15), this is not free.
- Schema hygiene: the view definitions would stop cross-referencing fields that are always zero for one relkind.
This is flagged as v20 material and is not part of this patch—but it is the strategic backdrop for accepting further column growth in the table-oriented view.
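The footprint argument is simple arithmetic, sketched below under loud assumptions. The two struct layouts are toy stand-ins, not the real PgStat_StatTabEntry. The index fields are the five named in the thread, and the figure of 12 table-only fields is an illustrative reading of "~a dozen". The helper toy_wasted_bytes is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef int64_t ToyCounter;    /* stand-in for a 64-bit stats counter */
typedef int64_t ToyTimestamp;  /* stand-in for a 64-bit timestamp */

/* Toy index-only entry: just the fields the thread says indexes use. */
typedef struct ToyIndexEntry
{
    ToyCounter   numscans;
    ToyTimestamp lastscan;
    ToyCounter   tuples_returned;
    ToyCounter   tuples_fetched;
    ToyTimestamp stat_reset_time;
} ToyIndexEntry;

/* Assumed number of table-only fields ("~a dozen" in the thread). */
#define TOY_TABLE_ONLY_FIELDS 12

/* Toy shared entry: the index fields plus the table-only slots. */
typedef struct ToySharedEntry
{
    ToyIndexEntry common;
    int64_t       table_only[TOY_TABLE_ONLY_FIELDS];
} ToySharedEntry;

/* Bytes of shared memory spent on table-only fields across n_indexes. */
size_t
toy_wasted_bytes(size_t n_indexes)
{
    assert(sizeof(ToySharedEntry) > sizeof(ToyIndexEntry));
    return n_indexes * (sizeof(ToySharedEntry) - sizeof(ToyIndexEntry));
}
```

Under these assumptions each index carries 96 unused bytes, so a cluster with 100,000 indexes would spend roughly 9.6 MB of stats memory on fields that are never read. The real saving depends on the actual struct layout.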
API shape of the reporting function
Nagata's v2 unified reporting into pgstat_report_skipped_vacuum_analyze(Oid relid, bool vacuum, bool analyze, bool autovacuum). Paquier objected that three independent booleans in a row is a classic bug magnet at call sites (easy to swap, easy to pass a constant-true that silently aggregates into the wrong counter). He recommended a bitmask. Nagata converted to bits8 flags; Sami then suggested four orthogonal flags (SKIPPED_VACUUM, SKIPPED_ANALYZE, SKIPPED_AUTOVACUUM, SKIPPED_AUTOANALYZE) to flatten the nested if (AUTOVAC) { if (VACUUM) ... } logic in the report path. Nagata agreed, moving the vac/analyze vs auto/manual composition into the callers.
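The flattened report path that the four-flag design allows can be sketched as follows. This is a standalone model, not the patch. The flag names come from the thread, but their bit values, the SkipCounters struct, its field names, and the function report_skipped are assumptions made for illustration. The bits8 typedef is redefined locally so the block compiles outside the PostgreSQL tree.

```c
#include <assert.h>
#include <stdint.h>

typedef uint8_t bits8;  /* local stand-in for PostgreSQL's bits8 */

/* Flag names from the thread; the bit assignments are illustrative. */
#define SKIPPED_VACUUM       (1 << 0)
#define SKIPPED_ANALYZE      (1 << 1)
#define SKIPPED_AUTOVACUUM   (1 << 2)
#define SKIPPED_AUTOANALYZE  (1 << 3)
#define SKIPPED_ALL_FLAGS    (SKIPPED_VACUUM | SKIPPED_ANALYZE | \
                              SKIPPED_AUTOVACUUM | SKIPPED_AUTOANALYZE)

/* Hypothetical per-table counters, one per flag. */
typedef struct SkipCounters
{
    int64_t skipped_vacuum;
    int64_t skipped_analyze;
    int64_t skipped_autovacuum;
    int64_t skipped_autoanalyze;
} SkipCounters;

/*
 * Each flag maps to exactly one counter, so the report path is four
 * independent tests with no nesting. Callers compose the flags.
 */
void
report_skipped(SkipCounters *c, bits8 flags)
{
    assert((flags & ~SKIPPED_ALL_FLAGS) == 0);  /* reject unknown bits */

    if (flags & SKIPPED_VACUUM)
        c->skipped_vacuum++;
    if (flags & SKIPPED_ANALYZE)
        c->skipped_analyze++;
    if (flags & SKIPPED_AUTOVACUUM)
        c->skipped_autovacuum++;
    if (flags & SKIPPED_AUTOANALYZE)
        c->skipped_autoanalyze++;
}
```

A call site then reads as report_skipped(&c, SKIPPED_AUTOVACUUM | SKIPPED_AUTOANALYZE). That is harder to get wrong than a positional (true, true, true) argument list, because each counter bucket is named where it is requested.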
How to get the Oid before the lock attempt (manual path)
This is the subtlest correctness issue in the patch. To report a skip you need a relid, but in the manual-VACUUM path the relid is normally obtained via RangeVarGetRelidExtended() with the lock folded in. If the lock isn't available, the existing code returns InvalidOid and emits the warning with only the relation name.
Nagata's v3/v4 approach: when VACOPT_SKIP_LOCKED is set, call RangeVarGetRelid(vrel->relation, NoLock, false) first to capture the Oid, then ConditionalLockRelationOid(); if the lock fails, report the skip using the pre-captured Oid. SearchSysCacheExists1() is used after successful locking to handle a drop racing between the two steps.
Sami objected to this structure in v5 review: calling ConditionalLockRelationOid() on a relid obtained with NoLock opens a small race window where the Oid may have been recycled to an unrelated object by the time the conditional lock is taken. His v6 counter-proposal keeps the existing RangeVarGetRelidExtended() call (with rvr_opts selecting skip-locked behavior) as the primary path, and falls back to a second RangeVarGetRelid() call inside the skip branch solely to obtain an Oid for reporting. The lock level quoted for the primary call, AccessExclusiveLock, is probably imprecise: VACUUM itself runs under ShareUpdateExclusiveLock, and the existing lookup in expand_vacuum_rel() takes only a transient AccessShareLock. The v6 structure is safer because the failure path does not act on a possibly-stale Oid; it only reports it. This v6 structure appears to be what Nagata accepts at the end of the visible thread.
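The v6 control flow can be modelled outside PostgreSQL as below. This is a toy, not the patch: ToyRel, toy_lookup_locked, toy_lookup_nolock, SkipOutcome, and vacuum_one_skip_locked are all hypothetical names, and the two lookup functions only imitate the behavior the thread attributes to RangeVarGetRelidExtended() with skip-locked options and to an unlocked RangeVarGetRelid().

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

typedef unsigned int Oid;
#define InvalidOid ((Oid) 0)

/* Toy catalog row standing in for a relation. */
typedef struct ToyRel
{
    const char *name;
    Oid         oid;
    bool        locked_by_other;  /* a conflicting lock is held elsewhere */
} ToyRel;

/* Locked lookup, skip-locked style: InvalidOid if missing or lock busy. */
Oid
toy_lookup_locked(const ToyRel *rels, int nrels, const char *name)
{
    for (int i = 0; i < nrels; i++)
        if (strcmp(rels[i].name, name) == 0)
            return rels[i].locked_by_other ? InvalidOid : rels[i].oid;
    return InvalidOid;
}

/* Unlocked lookup: resolves the name regardless of who holds locks. */
Oid
toy_lookup_nolock(const ToyRel *rels, int nrels, const char *name)
{
    for (int i = 0; i < nrels; i++)
        if (strcmp(rels[i].name, name) == 0)
            return rels[i].oid;
    return InvalidOid;
}

typedef struct SkipOutcome
{
    Oid processed;      /* relation we locked and would vacuum */
    Oid reported_skip;  /* relation we only reported as skipped */
} SkipOutcome;

SkipOutcome
vacuum_one_skip_locked(const ToyRel *rels, int nrels, const char *name)
{
    SkipOutcome out = {InvalidOid, InvalidOid};
    Oid         relid;

    assert(name != NULL);

    /* Primary path: lookup and lock together, as in the existing code. */
    relid = toy_lookup_locked(rels, nrels, name);
    if (relid != InvalidOid)
    {
        out.processed = relid;
        return out;
    }

    /* Skip branch: unlocked lookup, used for reporting and nothing else. */
    out.reported_skip = toy_lookup_nolock(rels, nrels, name);
    return out;
}
```

The property worth noticing is that the unlocked Oid never reaches any code that operates on the relation. If the name has vanished by the time of the second lookup, the model reports nothing at all.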
Asymmetry on partitioned/inherited parents
Sami flagged a real semantic wart: vacuum_count on a partitioned parent is always zero because VACUUM never processes the parent relation directly—it only dispatches to leaves. But SKIP_LOCKED on the parent must still be reported somewhere, and we cannot recursively enumerate partitions without locking (which defeats SKIP_LOCKED). Consensus: report the skip on the parent row and document the asymmetry. Nagata drafted doc wording:
"When a manual vacuum or analyze on a parent table in an inheritance or partitioning hierarchy is skipped, the statistics are recorded only for the parent table, not for its children."
Naming
Two naming patterns were in tension:
- last_skipped_autovacuum / skipped_autovacuum_count (modifier-first, matches last_vacuum, last_autovacuum)
- autovacuum_last_skip / autovacuum_skip_count (noun-first, matches slotsync_last_skip, slotsync_skip_count)
Sami's final vote (2026-05-04) was for last_skipped_* on the grounds of local consistency with the existing last_vacuum/last_autovacuum fields in the same view. This is the right call: consistency within a single view matters more than cross-view pattern matching.
Scope expansion that was deferred
Sami floated an autovacuum_started_count so that started − completed would detect in-flight autovacuum failures (corrupt indexes, checksum failures, SIGTERM from the launcher, etc.). Nagata acknowledged this captures a strictly different signal ("never started" via skip, vs "started but died" via started-minus-completed) but treated it as out of scope. This is a good factoring—that feature lands in its own patch.
Key Technical Insights
- Skip is a first-class lifecycle state for autovacuum, not an error. The lack of stats coverage was an accident of history; the log-only signal is hostile to DBAs and monitoring stacks.
- Timestamp + counter are complementary, not redundant. The counter tells you how much; the timestamp (even if overwritten) tells you autovacuum is still trying. Distinguishing "stuck" from "forgotten" requires both.
- Oid-before-lock is a race hazard. The safe pattern (v6) is: attempt the normal locked lookup; on failure, do a separate unlocked Oid lookup only for reporting, never for acting.
- Parent/child asymmetry in partition VACUUM is intrinsic, not a patch flaw: SKIP_LOCKED semantics prevent traversal into children without locks.
- The patch is a wedge for a larger refactor. Column-count pressure on pg_stat_all_tables is accelerating the long-discussed split of index stats into their own PgStat_StatEntry kind, a change with meaningful shared-memory and code-clarity benefits.
- Bitmask flags beat multiple booleans in pgstat reporting APIs for the usual reason: call-site legibility and reduced risk of silent misattribution between counter buckets.
Disposition
Targeted at PostgreSQL v20 (explicitly acknowledged by Paquier and Nagata as too late for v19). By the end of the visible thread the patch has converged on: 8 columns (a timestamp and a counter for each of manual vacuum, manual analyze, autovacuum, and autoanalyze), a bitmask-flag API, last_skipped_* naming, v6's safer Oid-lookup structure, isolation test updates to vacuum-skip-locked.spec, and documented parent-table asymmetry.