Monthly Summary: Avoid calling SetMatViewPopulatedState if possible (May 2026)
Overview
This thread covers a focused optimization patch that eliminates unnecessary pg_class catalog updates during REFRESH MATERIALIZED VIEW. The discussion progressed from the initial submission through a substantive technical challenge to a partial resolution; the key debate concerned the actual scope of the optimization's benefit.
The Problem
SetMatViewPopulatedState() is called unconditionally in ExecRefreshMatView() to mark pg_class.relispopulated = true. For already-populated materialized views (the overwhelmingly common case), the flag does not change, yet the call still performs a heap_update that leaves a dead row version behind and so bloats pg_class, a catalog read by virtually every query during planning. The submitter demonstrated this by showing the ctid of the matview's pg_class row changing across repeated REFRESH MATERIALIZED VIEW CONCURRENTLY calls.
The Proposed Fix
Guard the SetMatViewPopulatedState() call with a check of the current relispopulated flag. If the flag is already true, skip the catalog write entirely. This follows the established PostgreSQL pattern of "avoid no-op catalog updates" (as seen in ATExecChangeOwner and various ALTER paths).
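The guard described above can be sketched as a small standalone C model. This is not the patch itself and does not use PostgreSQL's internal API: MockPgClassRow, mock_catalog_update, set_populated_unconditional, and set_populated_guarded are hypothetical names introduced only for illustration, and the version counter stands in for the new row version (and new ctid) that a real catalog update would create.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a pg_class row; not PostgreSQL's Form_pg_class. */
typedef struct MockPgClassRow
{
    bool relispopulated;    /* mirrors pg_class.relispopulated */
    int  version;           /* bumped on every simulated catalog update */
} MockPgClassRow;

/* Simulated catalog write: every call produces a new row version. */
void
mock_catalog_update(MockPgClassRow *row, bool newstate)
{
    row->relispopulated = newstate;
    row->version++;
}

/* Unguarded behaviour: always writes, even when the flag is unchanged. */
void
set_populated_unconditional(MockPgClassRow *row, bool newstate)
{
    mock_catalog_update(row, newstate);
}

/*
 * Guarded behaviour: skip the write when the stored flag already matches.
 * Returns true if a write was performed, false if it was skipped.
 */
bool
set_populated_guarded(MockPgClassRow *row, bool newstate)
{
    if (row->relispopulated == newstate)
        return false;       /* no-op update avoided */

    mock_catalog_update(row, newstate);
    return true;
}
```

In this model, calling set_populated_guarded on a row whose flag is already true leaves version untouched, while set_populated_unconditional increments it on every call. The guarded variant still writes when the flag actually changes, for example when an unpopulated matview is refreshed for the first time.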
Key Technical Debate
David Geier raised a fundamental objection: REFRESH MATERIALIZED VIEW already rewrites pg_class via finish_heap_swap() (updating relfilenode, relpages, reltuples, etc.), so skipping one additional write shouldn't eliminate ctid churn. This surfaced a critical distinction:
- CONCURRENTLY mode: Implements refresh as DELETE + INSERT against the existing heap and does not swap relfilenode. SetMatViewPopulatedState() is the only pg_class write in this path. The patch fully eliminates per-refresh pg_class bloat.
- Non-CONCURRENTLY mode: Goes through finish_heap_swap(), which writes pg_class regardless. The patch reduces dead tuples from two to one per refresh (eliminating the redundant second heap_update), but doesn't eliminate bloat entirely.
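The per-mode accounting above can be checked with a toy C model. It is a deliberately simplified sketch: RefreshModel, model_update, model_refresh, and dead_tuples_after are hypothetical names, each pg_class update is assumed to leave exactly one dead row version, and cleanup by VACUUM or HOT pruning is ignored.

```c
#include <stdbool.h>

/* Hypothetical model of one matview's pg_class row across refreshes. */
typedef struct RefreshModel
{
    bool relispopulated;    /* current value of the flag */
    int  dead_tuples;       /* superseded pg_class row versions so far */
} RefreshModel;

/* Assumption: every update leaves the previous row version dead. */
void
model_update(RefreshModel *m)
{
    m->dead_tuples++;
}

/* One refresh, with or without CONCURRENTLY, with or without the guard. */
void
model_refresh(RefreshModel *m, bool concurrently, bool patched)
{
    /* Populated-state step: unpatched code always writes. */
    if (!patched || !m->relispopulated)
    {
        m->relispopulated = true;
        model_update(m);
    }

    /* The non-concurrent path also rewrites pg_class during the heap swap. */
    if (!concurrently)
        model_update(m);
}

/* Dead pg_class tuples after N refreshes of an already-populated matview. */
int
dead_tuples_after(int refreshes, bool concurrently, bool patched)
{
    RefreshModel m = { true, 0 };

    for (int i = 0; i < refreshes; i++)
        model_refresh(&m, concurrently, patched);
    return m.dead_tuples;
}
```

Under these assumptions, ten refreshes of an already-populated matview yield 10 dead tuples unpatched versus 0 patched in CONCURRENTLY mode, and 20 versus 10 in non-CONCURRENTLY mode, which matches the "two to one per refresh" reduction described in the list.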
The submitter confirmed the CONCURRENTLY mechanism and asserted the optimization has value "in all cases". That claim is technically correct, since even in non-CONCURRENTLY mode the patch avoids generating a second dead pg_class tuple within the same transaction.
Current Status
The patch is in review. Geier conceded the principle ("Avoiding the bloat seems generally reasonable") but pushed for a clearer explanation of why the demonstration worked. The submitter responded by explaining the CONCURRENTLY mechanism. No committer has weighed in. The ball is in Geier's court for either acceptance or further requests (e.g., a non-CONCURRENTLY ctid demonstration, or a suggestion to unify both pg_class writes in the non-CONCURRENTLY path).
Open Questions
- Should the non-CONCURRENTLY path be refactored to fold relispopulated into the existing finish_heap_swap() pg_class write, achieving a single write per refresh?
- Can two heap_update calls on the same tuple within one transaction be coalesced by HOT? (Likely not.)
- Should this be backpatched? (Likely no: it is an optimization, not a bug fix.)