2026-06-04 · claude-opus-4-6

Deep Technical Analysis: Soft DROP TABLE / Recoverable Drops for PostgreSQL

Core Problem

DROP TABLE in PostgreSQL is an irreversible DDL operation that permanently removes the relation's heap files, indexes, sequences, toast tables, and catalog entries. The recovery options after an accidental drop are limited to:

Point-in-Time Recovery (PITR) — requires a standby or base backup, replaying WAL to a point before the drop. This is cluster-wide, disruptive, and can take hours for large databases.
Logical backup restoration — requires pg_dump snapshots that may be stale, and restoring a single table from a logical dump of a large database is operationally painful.

Both recovery mechanisms are disproportionately expensive relative to the simplicity of the mistake. The proposal argues that a "soft delete" layer — intercepting DROP TABLE and preserving the relation in a reserved schema — would provide a lightweight, non-disruptive first line of defense analogous to features in Oracle (Recycle Bin / FLASHBACK TABLE), IBM Db2 (dropped table recovery via rollforward), and Snowflake (UNDROP TABLE).

Proposed Architecture

Mechanism: ProcessUtility_hook Interception

The design builds on Peter Eisentraut's pgtrashcan extension, which uses PostgreSQL's ProcessUtility_hook to intercept DROP TABLE utility commands and rewrite them as ALTER TABLE ... SET SCHEMA "Trash". This is architecturally significant because:

It operates entirely at the utility command processing layer, not in the storage engine or WAL subsystem.
ALTER TABLE ... SET SCHEMA is a metadata-only operation (updates pg_class.relnamespace) — the heap files, indexes, and TOAST data remain untouched on disk.
It's implementable as a contrib extension without core patches, though with limitations.

Key Design Decisions

1. OID-Derived Naming for Uniqueness

Tables moved to the reserved schema are renamed to trash_$<OID> (e.g., trash_$16384). This solves the collision problem: if a user drops table foo, recreates it (gets a new OID), and drops it again, both versions coexist in the trash schema with distinct names. OIDs are unique within a cluster's lifetime (until wraparound, which is a potential edge case worth noting). The name is also guaranteed to fit within NAMEDATALEN (63 bytes), avoiding truncation issues.

2. Metadata Catalog Table

A catalog table inside the reserved schema records:

Original relation name and schema
OID
Drop timestamp
dropped_by user
Object type
JSONB metadata for dependent objects

This is necessary because ALTER TABLE ... SET SCHEMA doesn't preserve the original namespace context, and dependent objects (views, FK constraints, RLS policies) that cannot ride along with the schema move need their DDL recorded for potential reconstruction.

3. CASCADE Handling

This is the most architecturally complex aspect. When DROP TABLE t CASCADE is issued:

Indexes and owned sequences: Move with the table (they follow ALTER TABLE ... SET SCHEMA automatically via dependency tracking).
Foreign keys, views, materialized views, rules, RLS policies: Cannot be safely moved. The proposal captures their DDL definitions (via pg_get_constraintdef(), pg_get_viewdef(), etc.) as JSONB metadata, then permanently drops them.

This is a significant tradeoff: CASCADE recovery is partial. The table data is preserved, but the dependent object graph is only preserved as reconstructable DDL strings, not as live catalog objects. This means restoration of CASCADE-dropped tables requires replaying DDL, which may fail if the environment has changed (e.g., referenced columns altered, functions dropped).

4. Trigger Preservation

Triggers survive the schema move because pg_trigger.tgrelid references the relation by OID, not by schema-qualified name. However, trigger function bodies (pg_proc) may contain hardcoded schema-qualified references to other tables. After restore, these could reference stale or nonexistent objects. The proposal acknowledges this as an open design question.

5. Session/System-wide GUC Bypass

A GUC (e.g., soft_drop.enabled = off) allows bypassing the trash mechanism. This is critical for:

Automated maintenance scripts that intentionally drop tables (partition rotation, temp table cleanup)
Performance-sensitive DDL operations where the metadata bookkeeping adds overhead
Compatibility with existing tooling that expects standard DROP TABLE semantics

Architectural Concerns and Open Questions

Extension vs. Core Feature

The proposal explicitly asks whether this should be a contrib extension or a core feature. This is a fundamental design question with significant implications:

As an extension (via ProcessUtility_hook):

Cannot intercept all code paths that drop relations (e.g., DROP SCHEMA CASCADE, DROP OWNED BY, temp table cleanup on session end)
Cannot prevent concurrent DDL from breaking the trash schema invariants without core locking support
The hook approach has ordering issues with other extensions that also hook ProcessUtility
Cannot integrate with transactional DDL — if the ALTER TABLE SET SCHEMA is inside a transaction that rolls back, the trash state is inconsistent

As a core feature:

Would require changes to performDeletion() / RemoveRelations() in src/backend/commands/tablecmds.c
Could properly integrate with the dependency system (pg_depend)
Could handle transactional semantics correctly
Much higher bar for acceptance; significant maintenance burden

Storage and Bloat Implications

Tables in the trash schema still consume storage. Without an automated purge mechanism (e.g., based on retention time or total trash size), the trash schema can grow unboundedly. The proposal doesn't yet describe:

A background worker for time-based purging
Storage quota management
Integration with pg_stat_user_tables for monitoring trash consumption
Vacuum/autovacuum behavior for trashed tables (should they continue to be vacuumed?)

Replication and Logical Decoding

The proposal doesn't address:

How this interacts with logical replication (the ALTER TABLE SET SCHEMA will be replicated; subscribers may not have the trash schema)
Physical replication (transparent, since it's just catalog changes)
Logical decoding / CDC systems that track schema changes

Partitioned Tables

Mentioned as a "misc" extension point. Partitioned tables add complexity because:

ALTER TABLE ... SET SCHEMA on a partitioned table moves the parent but partitions have their own pg_class entries
Partition relationships (pg_inherits) are OID-based so should survive, but detached partitions in different schemas create edge cases
DROP TABLE on a partition vs. the parent has different semantics

Concurrency and Locking

The proposal doesn't discuss:

What lock level is needed for the trash move (likely AccessExclusiveLock, same as DROP TABLE)
Race conditions between concurrent drops and restores
What happens if a restore conflicts with a table that now exists with the same name in the target schema

Assessment

This is a design proposal / RFC without a patch. The idea addresses a genuine operational pain point, and the Oracle/Db2/Snowflake precedent demonstrates industry demand. However, as a PostgreSQL contribution, it faces several challenges:

The extension approach is inherently limited — ProcessUtility_hook cannot cover all drop paths, and transactional DDL semantics are difficult to preserve.
Core integration has a very high bar — the PostgreSQL community tends to prefer mechanisms that compose well with existing features (replication, dependencies, transactional DDL).
The CASCADE metadata-as-JSONB approach is fragile — storing DDL strings for later replay is inherently schema-version-dependent and can silently produce incorrect results.
No patch yet — community engagement typically requires a working implementation to evaluate.

The most likely path forward would be as a contrib extension (similar to pg_stat_statements or auto_explain), with the understanding that it provides best-effort recovery rather than guaranteed consistency.

[DESIGN] Soft DROP TABLE, recoverable drops for PostgreSQL

Latest Update