Adding a Stored Generated Column Without Long-Lived Locks
The Core Problem
When adding a stored generated column to an existing table in PostgreSQL, the operation requires a full table rewrite under AccessExclusiveLock. For tables of nontrivial size, this means the table is completely inaccessible for the duration of the rewrite — potentially minutes or hours for large tables. This is a significant operational pain point in production environments where downtime windows are minimal or nonexistent.
Currently, PostgreSQL supports:
SET EXPRESSION— change the expression of an existing generated columnDROP EXPRESSION— turn a generated column into a regular column
But the reverse operation — turning a regular column into a stored generated column — does not exist. This means there's no incremental migration path to add a stored generated column without taking a heavyweight lock for the entire rewrite duration.
Proposed Solution: Two-Patch Approach
Patch 1: ALTER TABLE t ALTER COLUMN c ADD GENERATED ALWAYS AS (expr) STORED
This introduces a new DDL command that converts an existing column into a stored generated column. By default, this still performs a table rewrite (overwriting existing data with the computed expression), but it provides the syntactic foundation for the optimization in patch 2.
Key implementation details:
- The command is placed in
AT_PASS_SET_EXPRESSION, which is the same pass used bySET EXPRESSION. This ordering matters because generated column expressions can depend on other columns, and the pass ordering ensures dependencies are handled correctly. - Partitioning/inheritance semantics mirror
DROP EXPRESSION: the command must be applied to the entire inheritance tree at once (cannot be applied toONLYthe parent or directly to individual partitions). This follows the precedent set in commit8bf6ec3ba3a44448817af47a080587f3b71bee08. - The implementation uses
RememberAllDependentForRebuilding(called before updatingpg_attributefor early validation) and leveragesAddRelationNewConstraintsto obtain cooked default definitions rather than rebuilding them separately.
Patch 2: Skip Rewrite When a Valid Check Constraint Proves Correctness
The key optimization: if a validated CHECK constraint of the form (c = <expr>) exists on the table, and <expr> structurally matches the generated column expression, the table rewrite can be skipped entirely. The constraint serves as proof that all existing rows already satisfy the generated column's expression.
Matching criteria:
- The constraint must use a mergejoinable equality operator for the column's type (this ensures it's a true equality semantics, not a user-defined
=operator with unusual behavior) - The expressions are matched structurally (exact AST match), not semantically — they must be identical in form
The Online Migration Pattern
With both patches, users can perform the following multi-step online migration:
ALTER TABLE t ADD COLUMN c <type>— add a nullable column (lightweight lock)- Add a trigger:
SET c = exprfor new/updated rows (ensures new writes are correct) ALTER TABLE t ADD CONSTRAINT chk CHECK (c = expr) NOT VALID(no validation yet)- Backfill existing rows at controlled pace (batched updates, no long lock)
ALTER TABLE t VALIDATE CONSTRAINT chk(ShareUpdateExclusiveLock, not AccessExclusive)ALTER TABLE t ALTER COLUMN c ADD GENERATED ALWAYS AS (expr) STORED— skips rewrite due to valid constraint- Drop the trigger and constraint (cleanup)
This pattern is well-established in the Rails/Django ecosystem for adding NOT NULL constraints or changing column types, and this patch extends it to stored generated columns.
Design Decisions and Tradeoffs
Why STORED is Required Explicitly
The author deliberately restricts this to stored generated columns only. Virtual generated columns don't have the same operational problem since they don't store data and don't require a rewrite. Requiring STORED makes it explicit to the user that the command (absent the optimization) would overwrite existing data.
Structural vs. Semantic Expression Matching
Using structural matching (exact AST comparison) is conservative but safe. Two expressions that are semantically equivalent but syntactically different (e.g., a + b vs b + a) won't match. This avoids complex semantic equivalence checking while still being useful for the intended workflow where the user controls both the constraint and the generated expression.
Mergejoinable Operator Requirement
Requiring the = operator to be mergejoinable ensures it has proper equality semantics. This prevents edge cases where a user-defined equality operator might not truly prove that c contains the value of expr.
Depth Limitation with Partition Trees
The author notes a pre-existing limitation: DROP EXPRESSION doesn't work with partition trees deeper than one level. Rather than scope-creeping the patch to fix this, the new command inherits the same limitation, with a separate fix proposed independently.
Syntax Alternatives Considered
Three syntax forms were discussed:
ALTER COLUMN c ADD GENERATED ALWAYS AS (expr) STORED(chosen — mirrorsADD GENERATED AS IDENTITY)SET GENERATED ALWAYS AS (expr) STOREDADD EXPRESSION (expr) STORED(mirrors existingDROP EXPRESSION)
The author chose form 1 for consistency with identity column syntax but notes no strong attachment to any particular form.
Current Status
As of v4, the patches are rebased against current master with refinements:
RememberAllDependentForRebuildingmoved earlier for better validation ordering- Leverages return value of
AddRelationNewConstraintsinstead of redundant reconstruction - Cleaned up spurious includes and reorganized tests
The patch is seeking its first reviewer and has been proposed for the in-person commitfest at pgconf.dev. It is explicitly not targeting the current release cycle given proximity to feature freeze.