Deep Technical Analysis: Soft DROP TABLE / Recoverable Drops for PostgreSQL
Core Problem
DROP TABLE in PostgreSQL is an irreversible DDL operation that permanently removes the relation's heap files, indexes, sequences, toast tables, and catalog entries. The recovery options after an accidental drop are limited to:
- Point-in-Time Recovery (PITR) — requires a standby or base backup, replaying WAL to a point before the drop. This is cluster-wide, disruptive, and can take hours for large databases.
- Logical backup restoration — requires
pg_dumpsnapshots that may be stale, and restoring a single table from a logical dump of a large database is operationally painful.
Both recovery mechanisms are disproportionately expensive relative to the simplicity of the mistake. The proposal argues that a "soft delete" layer — intercepting DROP TABLE and preserving the relation in a reserved schema — would provide a lightweight, non-disruptive first line of defense analogous to features in Oracle (Recycle Bin / FLASHBACK TABLE), IBM Db2 (dropped table recovery via rollforward), and Snowflake (UNDROP TABLE).
Proposed Architecture
Mechanism: ProcessUtility_hook Interception
The design builds on Peter Eisentraut's pgtrashcan extension, which uses PostgreSQL's ProcessUtility_hook to intercept DROP TABLE utility commands and rewrite them as ALTER TABLE ... SET SCHEMA "Trash". This is architecturally significant because:
- It operates entirely at the utility command processing layer, not in the storage engine or WAL subsystem.
ALTER TABLE ... SET SCHEMAis a metadata-only operation (updatespg_class.relnamespace) — the heap files, indexes, and TOAST data remain untouched on disk.- It's implementable as a contrib extension without core patches, though with limitations.
Key Design Decisions
1. OID-Derived Naming for Uniqueness
Tables moved to the reserved schema are renamed to trash_$<OID> (e.g., trash_$16384). This solves the collision problem: if a user drops table foo, recreates it (gets a new OID), and drops it again, both versions coexist in the trash schema with distinct names. OIDs are unique within a cluster's lifetime (until wraparound, which is a potential edge case worth noting). The name is also guaranteed to fit within NAMEDATALEN (63 bytes), avoiding truncation issues.
2. Metadata Catalog Table
A catalog table inside the reserved schema records:
- Original relation name and schema
- OID
- Drop timestamp
dropped_byuser- Object type
- JSONB metadata for dependent objects
This is necessary because ALTER TABLE ... SET SCHEMA doesn't preserve the original namespace context, and dependent objects (views, FK constraints, RLS policies) that cannot ride along with the schema move need their DDL recorded for potential reconstruction.
3. CASCADE Handling
This is the most architecturally complex aspect. When DROP TABLE t CASCADE is issued:
- Indexes and owned sequences: Move with the table (they follow
ALTER TABLE ... SET SCHEMAautomatically via dependency tracking). - Foreign keys, views, materialized views, rules, RLS policies: Cannot be safely moved. The proposal captures their DDL definitions (via
pg_get_constraintdef(),pg_get_viewdef(), etc.) as JSONB metadata, then permanently drops them.
This is a significant tradeoff: CASCADE recovery is partial. The table data is preserved, but the dependent object graph is only preserved as reconstructable DDL strings, not as live catalog objects. This means restoration of CASCADE-dropped tables requires replaying DDL, which may fail if the environment has changed (e.g., referenced columns altered, functions dropped).
4. Trigger Preservation
Triggers survive the schema move because pg_trigger.tgrelid references the relation by OID, not by schema-qualified name. However, trigger function bodies (pg_proc) may contain hardcoded schema-qualified references to other tables. After restore, these could reference stale or nonexistent objects. The proposal acknowledges this as an open design question.
5. Session/System-wide GUC Bypass
A GUC (e.g., soft_drop.enabled = off) allows bypassing the trash mechanism. This is critical for:
- Automated maintenance scripts that intentionally drop tables (partition rotation, temp table cleanup)
- Performance-sensitive DDL operations where the metadata bookkeeping adds overhead
- Compatibility with existing tooling that expects standard
DROP TABLEsemantics
Architectural Concerns and Open Questions
Extension vs. Core Feature
The proposal explicitly asks whether this should be a contrib extension or a core feature. This is a fundamental design question with significant implications:
As an extension (via ProcessUtility_hook):
- Cannot intercept all code paths that drop relations (e.g.,
DROP SCHEMA CASCADE,DROP OWNED BY, temp table cleanup on session end) - Cannot prevent concurrent DDL from breaking the trash schema invariants without core locking support
- The hook approach has ordering issues with other extensions that also hook
ProcessUtility - Cannot integrate with transactional DDL — if the
ALTER TABLE SET SCHEMAis inside a transaction that rolls back, the trash state is inconsistent
As a core feature:
- Would require changes to
performDeletion()/RemoveRelations()insrc/backend/commands/tablecmds.c - Could properly integrate with the dependency system (
pg_depend) - Could handle transactional semantics correctly
- Much higher bar for acceptance; significant maintenance burden
Storage and Bloat Implications
Tables in the trash schema still consume storage. Without an automated purge mechanism (e.g., based on retention time or total trash size), the trash schema can grow unboundedly. The proposal doesn't yet describe:
- A background worker for time-based purging
- Storage quota management
- Integration with
pg_stat_user_tablesfor monitoring trash consumption - Vacuum/autovacuum behavior for trashed tables (should they continue to be vacuumed?)
Replication and Logical Decoding
The proposal doesn't address:
- How this interacts with logical replication (the
ALTER TABLE SET SCHEMAwill be replicated; subscribers may not have the trash schema) - Physical replication (transparent, since it's just catalog changes)
- Logical decoding / CDC systems that track schema changes
Partitioned Tables
Mentioned as a "misc" extension point. Partitioned tables add complexity because:
ALTER TABLE ... SET SCHEMAon a partitioned table moves the parent but partitions have their ownpg_classentries- Partition relationships (
pg_inherits) are OID-based so should survive, but detached partitions in different schemas create edge cases DROP TABLEon a partition vs. the parent has different semantics
Concurrency and Locking
The proposal doesn't discuss:
- What lock level is needed for the trash move (likely
AccessExclusiveLock, same asDROP TABLE) - Race conditions between concurrent drops and restores
- What happens if a restore conflicts with a table that now exists with the same name in the target schema
Assessment
This is a design proposal / RFC without a patch. The idea addresses a genuine operational pain point, and the Oracle/Db2/Snowflake precedent demonstrates industry demand. However, as a PostgreSQL contribution, it faces several challenges:
- The extension approach is inherently limited — ProcessUtility_hook cannot cover all drop paths, and transactional DDL semantics are difficult to preserve.
- Core integration has a very high bar — the PostgreSQL community tends to prefer mechanisms that compose well with existing features (replication, dependencies, transactional DDL).
- The CASCADE metadata-as-JSONB approach is fragile — storing DDL strings for later replay is inherently schema-version-dependent and can silently produce incorrect results.
- No patch yet — community engagement typically requires a working implementation to evaluate.
The most likely path forward would be as a contrib extension (similar to pg_stat_statements or auto_explain), with the understanding that it provides best-effort recovery rather than guaranteed consistency.