Technical Analysis: Avoiding Table Rewrite When Setting Column Type to Constrained Domain
Core Problem
When PostgreSQL executes ALTER TABLE ... ALTER COLUMN ... SET DATA TYPE, it traditionally performs a full table rewrite — creating a new physical copy of the entire table with the column stored in the new type. This is expensive for large tables (acquiring AccessExclusiveLock, copying all rows, rebuilding indexes).
However, there's an important class of type changes where a rewrite is unnecessary: changing a column to a constrained domain over the same base type. For example:
CREATE DOMAIN positive_int AS INT CHECK(VALUE > 0) NOT NULL;
ALTER TABLE big_table ALTER COLUMN x SET DATA TYPE positive_int;
Here, the physical storage representation is identical (int4 → int4). The only semantic difference is that new domain constraints must be validated against existing data. This validation requires a table scan (to check each row against the domain's CHECK constraints), but NOT a table rewrite (no data needs to be physically transformed or copied).
The distinction matters enormously for large tables: a scan is O(n) reads, while a rewrite is O(n) reads + O(n) writes + index rebuilds + WAL generation.
Architectural Context
PostgreSQL's ALTER TABLE pipeline has three phases:
- ATPrepAlterColumnType (Phase 1): Determines what coercion expression is needed, calls
ATColumnChangeRequiresRewrite()to decide if a rewrite is necessary. - ATRewriteTables (Phase 2): Orchestrates table rewrites.
- ATRewriteTable (Phase 3): Actually scans/rewrites the table, evaluating
NewColumnValueexpressions.
The key data structure is NewColumnValue, which holds the expression to transform old column values into new ones. The existing code treats the presence of any NewColumnValue entry as implying a table rewrite is needed.
Proposed Solution (Evolution Across Versions)
v1-v2: coerce_to_domain / scan_only flag
The initial approach added a boolean to NewColumnValue indicating that only a scan (not a rewrite) is needed. The field was first called coerce_to_domain, then renamed to scan_only for generality.
v3: need_compute flag (inverted semantics)
Renamed to need_compute — when true, the expression result is written back to the tuple (rewrite); when the coercion is purely a domain constraint check, the expression is evaluated only for its side effect (raising an error if constraints are violated), and the existing tuple data is left unchanged.
v5-v6: Refined scope
After realizing that ArrayCoerceExpr (casting array element types) genuinely requires a rewrite, the patch narrowed its optimization: only when the new type is a constrained domain directly over the old base type (where ATColumnChangeRequiresRewrite() already returns false) does it apply the scan-only path. The expression tree must be a simple CoerceToDomain wrapping a Var reference to the original column.
Key Implementation Details
-
Expression evaluation without writing back: In the scan-only path,
ExecEvalExpr()is called on theCoerceToDomainexpression. If domain constraints are violated, this raises an ERROR (aborting the ALTER). If it succeeds, the return value is discarded — the tuple is not modified. -
Volatility check: The patch restricts scan-only behavior to domains where all constraints are non-volatile. A volatile constraint (e.g.,
CHECK(VALUE > random(...)) could produce different results on re-evaluation, so a rewrite might be semantically necessary to "lock in" the validated state. This is somewhat debatable (see below). -
Tuple descriptor manipulation: The patch temporarily swaps the tuple descriptor during constraint evaluation because the expression expects to operate on the old column type, but the scan context may already reflect metadata changes.
-
Unconstrained domain edge case: Viktor identified that for
CREATE DOMAIN mydom AS int(no constraints), the patch'stab->newvalscheck causes an unnecessary scan where previously nothing was done.
Key Technical Disagreements and Open Issues
Volatility Requirement
Viktor questioned why volatile constraints require a rewrite: "A rewrite doesn't make things more deterministic right?" This is a valid point — if the constraint uses random(), it's non-deterministic whether you check during a scan or during a rewrite. The author's conservative stance avoids subtle correctness issues but may be overly restrictive.
USING Expression Bug
Viktor discovered a critical correctness bug: when a USING clause provides a constant (e.g., USING 5), the scan-only path was incorrectly applied, leaving the original value unchanged instead of writing the constant. This is because the optimization must ONLY apply when the expression is purely a domain constraint check over the existing value — any actual value transformation requires a rewrite.
ExecEvalExpr vs ExecEvalExprNoReturn
The code uses (void) ExecEvalExpr(...) rather than ExecEvalExprNoReturn() because the expression was compiled with ExecInitExpr (which sets up EEOP_DONE_RETURN as the final step). Using ExecEvalExprNoReturn would be incorrect for an expression compiled this way. This was poorly explained in comments and confused reviewers.
Unconstrained Domain Regression
For unconstrained domains (no CHECK, no NOT NULL), the previous behavior was to skip both rewrite and scan entirely (just a catalog update). The patch's addition of || tab->newvals to the ATRewriteTable trigger condition may cause an unnecessary full-table scan for these cases.
Design Tradeoffs
| Aspect | Conservative (current patch) | Aggressive alternative |
|---|---|---|
| Volatile constraints | Force rewrite | Allow scan-only |
| USING clause | Force rewrite if expression isn't identity coercion | Could try to detect pure-constraint expressions |
| Unconstrained domains | Need fix to avoid unnecessary scan | Skip entirely |
| Domain over arrays | Force rewrite | Complex analysis of element coercion |
Current Status (as of v6)
The patch is functional for the core case (constrained domain over same base type, no USING override). Outstanding issues:
- The USING constant bug was reported but addressed in v6
- Unconstrained domain regression identified but not confirmed fixed
- Test cleanup needed (spelling, unnecessary error output)
- The patch needs verification that partitioned tables behave correctly
- Solai confirmed basic functional correctness with testing up to 1M rows