Technical Analysis: Rebuilding CHECK Constraints After Generated Column SET EXPRESSION
Core Problem
When ALTER TABLE ... ALTER COLUMN ... SET EXPRESSION changes the generation expression of a virtual or stored generated column, PostgreSQL must validate that existing rows still satisfy all constraints that depend on the column's value. Commit f80bedd52b1 ("Allow ALTER COLUMN SET EXPRESSION on virtual columns with CHECK constraints") introduced logic to rebuild dependent CHECK constraints by calling RememberAllDependentForRebuilding(), which scans pg_depend for objects that have a dependency on the specific column's attnum.
The gap is architectural: whole-row Var references (Var.varattno = 0) do not create per-column dependencies in pg_depend. A CHECK constraint like CHECK (tab IS NOT NULL) records a dependency on the relation (attnum 0), not on any individual column. Since RememberAllDependentForRebuilding() only looks for dependencies matching the target column's attnum, these whole-row constraints are silently skipped. The result is a data integrity violation: SET EXPRESSION succeeds, changing the generation formula, but existing rows may now violate the whole-row CHECK constraint.
This is the same class of bug that affects ALTER TABLE DROP COLUMN and ALTER COLUMN SET DATA TYPE with whole-row references — those cases are being addressed in separate commitfest entries (patches 5988 and 6055).
Why This Matters Architecturally
PostgreSQL's dependency tracking system (pg_depend) is the backbone of DDL safety. The system assumes that if an object depends on a column, there's a recorded dependency edge from the object to (relation_oid, attnum). Whole-row Vars break this assumption because they represent an implicit dependency on every column but are recorded only as a relation-level dependency. Any DDL operation that uses pg_depend to find affected objects must separately account for whole-row references, creating a recurring pattern of bugs whenever new DDL operations are added that modify column semantics.
The affected object types go beyond CHECK constraints:
- CHECK constraints — whole-row expressions like
CHECK (row IS NOT NULL)won't be revalidated - Indexes — expression indexes or partial index predicates using whole-row Vars need rebuilding after the generation expression changes
- Row-Level Security policies — RLS policies with whole-row references in their USING/WITH CHECK clauses need to be flagged as incompatible (since policies can't simply be "rebuilt" the way constraints and indexes can)
- Triggers — BEFORE triggers with whole-row
WHENclauses are already blocked byCreateTriggerFiringOn, and AFTER triggers with whole-row references work correctly because they fire after the row is materialized, so no special handling is needed
Proposed Solutions
v1 (Ayush Tiwari) — Conservative "Rebuild All" Approach
The initial patch takes a simple approach: during SET EXPRESSION, remember all CHECK constraints on the relation for rebuilding, regardless of whether they contain whole-row references. This is safe because RememberConstraintForRebuilding() already deduplicates, so constraints found both via the per-column scan and the "all constraints" scan are only rebuilt once.
Tradeoff: Simple and correct, but rebuilds constraints that don't need rebuilding (e.g., a CHECK on an unrelated column). The performance cost is typically negligible since constraint validation is dominated by the table scan, and rebuilding an extra constraint just adds an expression evaluation per row.
v2 (Jian He) — Targeted Whole-Row Detection Across Object Types
The v2 patch is significantly more comprehensive. Rather than rebuilding everything, it introduces a function RememberWholeRowDependentForRebuilding() that:
- Queries
pg_dependfor objects dependent on the relation at attnum=0 - For each dependent object (CHECK constraint, index, policy), inspects the actual expression tree using
pull_varattnos()and checksbms_is_member(0 - FirstLowInvalidHeapAttributeNumber, expr_attrs)to confirm whole-row Var presence - For CHECK constraints and indexes: adds them to the rebuild lists (
changedConstraintOids,changedIndexOids) - For RLS policies: raises an ERROR, since there's no mechanism to "rebuild" a policy — the user must drop and recreate it
The patch also introduces exprContainWholeRow() in var.c as a utility for checking whole-row Var existence (though reviewers debated naming conventions).
Key design decision: The v2 patch correctly identifies that RLS policies require different handling than constraints/indexes. A policy with a whole-row reference that includes a generated column cannot be automatically revalidated; the ALTER must be rejected.
Important Nuances Identified in Review
-
RememberAllDependentForRebuilding(attnum=0)— Jian He points out this approach (querying for relation-level dependencies) would catch whole-row objects but would also pick up all relation-level dependencies including ones that don't involve whole-row Vars, potentially causing unexpected side effects with RLS policies that have both column-level and relation-level dependency entries. -
Locking correctness — Ayush correctly identified that
GetRelAssociatedPolicies()openspg_dependwithRowExclusiveLockwhen it only performs reads;AccessShareLockis sufficient and avoids unnecessary lock contention. -
STORED vs VIRTUAL test error — Zsolt Parragi caught that a test case in
generated_virtual.sqlincorrectly creates aSTOREDgenerated column, which is a copy-paste error from the stored column test file.
Design Tensions
-
Breadth vs. precision: v1 rebuilds all CHECK constraints (simple, potentially over-broad); v2 inspects expression trees to rebuild only those with whole-row Vars (precise, more complex code). The community leans toward v2's precision, especially since the pattern extends to indexes and policies.
-
Helper function factoring: Ayush suggested extracting
expr_has_wholerow_var()to reduce code duplication across the CHECK/index-expr/index-pred paths. Jian He declined, arguing the function is already expected to be large given it handles three distinct object types, and a helper wouldn't meaningfully improve readability. This is a minor style disagreement. -
Error message quality for policies: The v1-era error message said a policy "depends on column X", which is misleading for whole-row references. The v2 patch improves this to mention "whole row references", though the final wording ("ALTER TABLE / SET EXPRESSION is not supported for generated columns in tables that are part of a policy definition") is somewhat verbose and could be further refined.
-
Naming conventions:
exprContainWholeRowdoesn't follow PostgreSQL's establishedunderscore_lowercasenaming style (cf.pull_varattnos,contain_var_clause). The review suggestsexpr_contains_wholerow_var. Similarly,GeRelAssociatedPolicieshas a typo (missing 't').
Relationship to Broader Work
This patch is part of an ongoing effort to close whole-row Var gaps across DDL operations. The same fundamental issue — pg_depend not tracking per-column dependencies for whole-row references — affects:
ALTER TABLE DROP COLUMN(commitfest patch 5988)ALTER COLUMN SET DATA TYPE(commitfest patch 6055)ALTER COLUMN SET EXPRESSION(this patch)
All three require the same pattern: scan for relation-level dependencies, inspect expression trees for whole-row Vars, then either rebuild the dependent object or reject the DDL. A long-term architectural fix would be to change how pg_depend records whole-row dependencies (e.g., recording a dependency on every column, or adding a special "all columns" marker), but that would be a much larger change to the dependency system.