2026-05-25 · claude-opus-4-6

ERROR during COMMIT PREPARED Can Leave Orphaned Locks

Problem Statement

This thread identifies a subtle but serious bug in PostgreSQL's two-phase commit (2PC) implementation: an ERROR raised during the callback phase of COMMIT PREPARED or ROLLBACK PREPARED can leave orphaned locks in shared memory that only a server restart can clear.

Technical Deep Dive

The Two-Phase Commit Lifecycle

PostgreSQL's two-phase commit protocol (used for distributed transactions and implemented via PREPARE TRANSACTION / COMMIT PREPARED) maintains global transaction state (GlobalTransaction or gxact) in shared memory. The critical flow during COMMIT PREPARED is:

1. Write the commit WAL record (making the commit durable)
2. Mark gxact->valid = false (preventing anyone else from trying to commit/rollback)
3. Run post-commit callbacks (release locks, update relation caches, etc.)
4. Clean up shared memory state

The Race Condition / Failure Window

The bug exists in the window that opens once step 2 has completed and stays open while step 3 runs. The code deliberately marks the gxact as invalid before running callbacks:

gxact->valid = false;

This is intentional: the code comment explains that it is a safety measure, so that if a callback fails, no one else can attempt to commit or roll back the already-committed transaction. The gxact remains locked by the current backend, so it is not recycled immediately.

The Failure Scenario

If an ERROR is thrown during callback execution (step 3):

AtAbort_Twophase is invoked as part of the abort handling. Since gxact->valid == false, it simply deletes the gxact entry from shared memory.
Callbacks that haven't run yet are skipped. Critically, this includes the lock release callbacks that would normally release the transaction's heavyweight locks.
Result: The prepared transaction disappears from pg_prepared_xacts (since the gxact is gone), but its locks persist in pg_locks with no owning transaction.

These orphaned locks are effectively permanent until server restart because:

There's no prepared transaction to commit/rollback (it's already committed in WAL)
There's no backend holding the locks (the executing backend has moved on)
The lock manager has no mechanism to clean up locks without an associated transaction or backend
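The failure sequence above can be illustrated with a small toy model. This is only a sketch under stated assumptions: ToyGxact, toy_commit_prepared, toy_at_abort_twophase and the callbacks are hypothetical names invented for illustration, setjmp/longjmp stands in for PostgreSQL's ERROR handling, and a plain counter stands in for the lock table. It is not the real twophase.c code.

```c
#include <assert.h>
#include <setjmp.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a prepared transaction's shared state (hypothetical). */
typedef struct {
    bool in_use;     /* entry exists, i.e. would show in pg_prepared_xacts */
    bool valid;      /* others may still try to commit or roll it back */
    int  locks_held; /* heavyweight locks still registered (pg_locks) */
} ToyGxact;

static jmp_buf error_jump;         /* stand-in for the ERROR unwind target */
static ToyGxact *my_locked_gxact;  /* gxact this backend is finishing */

/* Stand-in for raising ERROR: unwind to the handler. */
static void toy_error(void) { longjmp(error_jump, 1); }

/* A callback that fails, e.g. on a simulated out-of-memory condition. */
static void failing_callback(ToyGxact *g) { (void) g; toy_error(); }

/* The callback that would normally release the transaction's locks. */
static void lock_release_callback(ToyGxact *g) { g->locks_held = 0; }

/* Abort-time cleanup: an already-invalid gxact is simply dropped. */
static void toy_at_abort_twophase(void)
{
    if (my_locked_gxact == NULL)
        return;
    if (!my_locked_gxact->valid)
        my_locked_gxact->in_use = false; /* entry vanishes, locks untouched */
    my_locked_gxact = NULL;
}

/* Returns true on a clean finish, false if an error interrupted step 3. */
static bool toy_commit_prepared(ToyGxact *g, bool inject_failure)
{
    assert(g != NULL && g->in_use);
    my_locked_gxact = g;

    if (setjmp(error_jump) != 0) {
        toy_at_abort_twophase();   /* error path: abort handling runs */
        return false;
    }

    /* Step 1: the commit record is written (not modelled here). */
    g->valid = false;              /* step 2 */
    if (inject_failure)
        failing_callback(g);       /* step 3: an early callback throws */
    lock_release_callback(g);      /* step 3: never reached on failure */
    g->in_use = false;             /* step 4 */
    my_locked_gxact = NULL;
    return true;
}
```

Calling toy_commit_prepared with inject_failure set to true on an entry that holds three locks returns false and leaves in_use false while locks_held is still 3. That is the model's version of a transaction missing from pg_prepared_xacts whose locks are still listed in pg_locks.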

Why This Is Architecturally Significant

Violates the fundamental guarantee that locks are always associated with and cleaned up by their owning transaction lifecycle
Silent data availability issue: Tables could stay locked until the next server restart (blocking DDL or even DML depending on lock type) with no visible cause
Breaks monitoring expectations: pg_prepared_xacts shows nothing wrong, but pg_locks shows phantom locks
Affects HA systems: Any system using 2PC (foreign data wrappers, distributed transaction managers, logical replication in some configurations) is potentially affected

Why Reproduction Is Difficult

The only realistic way to trigger an ERROR during the callback phase is via an out-of-memory (OOM) condition. The callbacks are generally designed to be simple operations (lock releases, shared memory updates) that don't allocate significant memory. This makes the bug extremely rare in practice but theoretically possible under memory pressure.

The author acknowledges this difficulty and provides a test using an injection point, a mechanism for artificially triggering errors at specific code locations, since simulating OOM from Perl TAP tests is impractical.
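The idea behind an injection point can be sketched in a few lines. The names below (toy_injection_attach, toy_injection_detach, toy_injection_point, toy_run_callbacks and the point name "toy-commit-prepared-callbacks") are all hypothetical and chosen for illustration; this does not reproduce PostgreSQL's actual injection point API or the test from the thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define TOY_MAX_POINTS 8

/* Hypothetical registry of armed injection points. */
static const char *armed_points[TOY_MAX_POINTS];
static int n_armed = 0;

/* Arm a named point; returns false if the registry is full. */
static bool toy_injection_attach(const char *name)
{
    assert(name != NULL);
    if (n_armed >= TOY_MAX_POINTS)
        return false;
    armed_points[n_armed++] = name;
    return true;
}

/* Disarm a named point, if it is currently armed. */
static void toy_injection_detach(const char *name)
{
    for (int i = 0; i < n_armed; i++) {
        if (strcmp(armed_points[i], name) == 0) {
            armed_points[i] = armed_points[--n_armed];
            return;
        }
    }
}

/* Placed at the code location of interest; true means "fail here". */
static bool toy_injection_point(const char *name)
{
    for (int i = 0; i < n_armed; i++)
        if (strcmp(armed_points[i], name) == 0)
            return true;
    return false;
}

/*
 * Code under test: a callback phase that fails only when a test has armed
 * the point. Returns the number of locks left behind.
 */
static int toy_run_callbacks(int locks_held)
{
    if (toy_injection_point("toy-commit-prepared-callbacks"))
        return locks_held;  /* simulated ERROR before the lock release */
    return 0;               /* normal path: every lock is released */
}
```

The design point is that production code pays only for a cheap lookup, while a test can force the rare failure deterministically instead of trying to exhaust memory.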

Potential Solution Directions (Not Yet Proposed)

Several architectural approaches could address this:

1. PANIC instead of ERROR: If a callback fails, escalate to PANIC since the system is in an inconsistent state anyway. This is heavy-handed but guarantees crash recovery will clean up properly.
2. Critical section protection: Mark the callback execution phase as a critical section where ERRORs are promoted to PANICs, similar to how WAL insertion is protected.
3. Deferred cleanup mechanism: Register the lock cleanup work in a way that survives the ERROR, perhaps via a background worker or a persistent queue that's checked on startup.
4. Retry mechanism: On abort, if we detect we were in the middle of finishing a prepared transaction, re-attempt the remaining callbacks rather than simply deleting the gxact.
5. Make callbacks non-failable: Ensure all post-commit callbacks are coded to never throw errors (pre-allocate any needed memory, etc.). This is fragile but might be combined with critical section protection.

The most likely community-acceptable fix would be approach #1 or #2: treating failure during this critical window as unrecoverable and relying on crash recovery to complete the transaction properly on restart.
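The critical-section idea can be sketched as follows. In PostgreSQL, START_CRIT_SECTION() and END_CRIT_SECTION() adjust a counter, and error reporting promotes ERROR to PANIC while that counter is nonzero. The toy below models only that promotion rule; ToyLevel and the toy_* functions are hypothetical names, and nothing here is a proposed patch.

```c
#include <assert.h>

/* Simplified severity levels, in increasing order (hypothetical). */
typedef enum { TOY_WARNING, TOY_ERROR, TOY_PANIC } ToyLevel;

/* Nesting depth of critical sections, modelled on a simple counter. */
static int toy_crit_section_count = 0;

static void toy_start_crit_section(void)
{
    toy_crit_section_count++;
}

static void toy_end_crit_section(void)
{
    assert(toy_crit_section_count > 0);
    toy_crit_section_count--;
}

/*
 * Severity that is actually acted upon: inside a critical section an
 * ERROR cannot be handled by ordinary abort processing, so it becomes
 * a PANIC and crash recovery takes over.
 */
static ToyLevel toy_effective_level(ToyLevel requested)
{
    if (requested == TOY_ERROR && toy_crit_section_count > 0)
        return TOY_PANIC;
    return requested;
}
```

Wrapping the callback phase this way would trade a silent lock leak for a loud restart, which is why approaches #1 and #2 end up behaving alike in practice.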

Assessment

This is a well-identified edge case in a critical subsystem. The reporter demonstrates deep understanding of the 2PC internals and has provided a reproducible test case (via injection points). The issue is real but extremely unlikely to occur in production without contrived conditions. However, for systems where 2PC correctness is essential (financial systems, distributed databases), even this theoretical window is concerning.

The thread is in its initial reporting phase with no responses yet, awaiting review from committers with 2PC expertise (likely Michael Paquier or Heikki Linnakangas, based on historical code ownership).
