Add “FOR UPDATE NOWAIT” lock details to the log.

First seen: 2024-09-13 11:49:36+00:00 · Messages: 39 · Participants: 5

Latest Update

2026-05-14 · claude-opus-4-6

Technical Analysis: Adding Lock Failure Details to PostgreSQL Logs (FOR UPDATE NOWAIT)

Core Problem

When PostgreSQL encounters a lock conflict with FOR UPDATE NOWAIT, the operation fails immediately with a generic error message but provides no information about who holds the conflicting lock. This stands in contrast to the FOR UPDATE case where, if log_lock_waits is enabled and deadlock_timeout elapses, the system logs the PID of the lock holder, the transaction XID, and the wait queue.

This asymmetry creates a significant operational blind spot: DBAs using NOWAIT (often in latency-sensitive applications) cannot diagnose lock contention without resorting to external monitoring or pg_locks queries at the moment of failure. The information exists within the lock manager but is simply not surfaced.

Architectural Context

PostgreSQL's lock manager operates at multiple levels:

  1. Heavyweight locks (relation-level, transaction-level) managed in shared memory via LOCK and PROCLOCK structures
  2. Row-level locks implemented via tuple header xmax/infomask bits, with heavyweight XactLockTable waits for conflict resolution

The existing log_lock_waits mechanism works by logging after deadlock_timeout expires during an active wait in ProcSleep(). For NOWAIT, the process never enters ProcSleep(): LockAcquireExtended() returns LOCKACQUIRE_NOT_AVAIL immediately when dontWait=true and the lock cannot be granted. This means the existing logging infrastructure is architecturally inapplicable to NOWAIT failures.
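The control flow described above can be illustrated with a toy model. This is only a sketch under stated assumptions: ToyLock, toy_acquire() and toy_sleep_and_log() are hypothetical stand-ins, not PostgreSQL's real lock manager, and the "sleep" simply pretends the holder released the lock. It shows that a conditional (dont_wait) attempt returns before reaching the only place where wait logging happens.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-ins; none of these are PostgreSQL definitions. */
typedef enum { TOY_ACQUIRE_OK, TOY_ACQUIRE_NOT_AVAIL } ToyAcquireResult;

typedef struct {
    int holder_pid;      /* 0 means the lock is free */
    int wait_log_calls;  /* how many times the wait-logging path ran */
} ToyLock;

/* Models the sleeping path: the only place where wait logging occurs.
 * The toy "wait" ends by handing the lock to the waiter. */
static void toy_sleep_and_log(ToyLock *lock, int my_pid)
{
    lock->wait_log_calls++;
    printf("LOG: process %d still waiting; lock held by %d\n",
           my_pid, lock->holder_pid);
    lock->holder_pid = my_pid;  /* pretend the holder released the lock */
}

static ToyAcquireResult toy_acquire(ToyLock *lock, int my_pid, bool dont_wait)
{
    if (lock->holder_pid == 0) {
        lock->holder_pid = my_pid;
        return TOY_ACQUIRE_OK;
    }
    if (dont_wait)
        return TOY_ACQUIRE_NOT_AVAIL;  /* early return: no sleep, no log */

    toy_sleep_and_log(lock, my_pid);
    return TOY_ACQUIRE_OK;
}
```

In this model, a failed dont_wait call leaves wait_log_calls untouched, which mirrors why a separate logging hook is needed on the failure path.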

Design Evolution

Phase 1: Initial Approach (Refactoring Existing Code)

The original patch attempted to extract the lock holder/waiter collection logic from ProcSleep() into a separate CollectLockHoldersAndWaiters() function and call it from the NOWAIT failure path. This had several problems:

Phase 2: GUC Design Debate

A significant design discussion centered on how to control this logging:

  1. Extend log_lock_waits to an enum type with values: off, on (existing behavior), fail (log on lock failure), all (both)
  2. New dedicated GUC (log_lock_failure)
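To make the first option concrete, here is a rough sketch of how a four-valued setting might be decoded. The type ToyLogLockWaits and the helper functions are hypothetical names invented for illustration; they do not reflect PostgreSQL's GUC machinery or any code from the thread.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical enum mirroring the proposed values: off, on, fail, all. */
typedef enum {
    TOY_LLW_OFF,   /* log nothing */
    TOY_LLW_ON,    /* existing behavior: log long lock waits */
    TOY_LLW_FAIL,  /* log only lock acquisition failures */
    TOY_LLW_ALL    /* log both waits and failures */
} ToyLogLockWaits;

/* Parse a setting string; returns false for an unknown value. */
static bool toy_parse_log_lock_waits(const char *s, ToyLogLockWaits *out)
{
    if (strcmp(s, "off") == 0)  { *out = TOY_LLW_OFF;  return true; }
    if (strcmp(s, "on") == 0)   { *out = TOY_LLW_ON;   return true; }
    if (strcmp(s, "fail") == 0) { *out = TOY_LLW_FAIL; return true; }
    if (strcmp(s, "all") == 0)  { *out = TOY_LLW_ALL;  return true; }
    return false;
}

/* Should long waits be logged under this setting? */
static bool toy_logs_waits(ToyLogLockWaits v)
{
    return v == TOY_LLW_ON || v == TOY_LLW_ALL;
}

/* Should immediate lock failures be logged under this setting? */
static bool toy_logs_failures(ToyLogLockWaits v)
{
    return v == TOY_LLW_FAIL || v == TOY_LLW_ALL;
}
```

A dedicated boolean GUC, the second option, avoids this decoding step entirely because each behavior gets its own on/off switch.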

The initial preference was to extend log_lock_waits, but this was reconsidered because:

Phase 3: SKIP LOCKED Exclusion

A critical design decision was to exclude SKIP LOCKED from the logging feature. The reasoning:

Phase 4: Final Architecture (logLockFailure argument)

The committed solution adds a logLockFailure boolean argument to LockAcquireExtended():

LockAcquireResult
LockAcquireExtended(const LOCKTAG *locktag, LOCKMODE lockmode,
                    bool sessionLock, bool dontWait,
                    bool reportMemoryError, LOCALLOCK **locallockp,
                    bool logLockFailure)

When dontWait=true, lock acquisition fails, logLockFailure=true, and the log_lock_failure GUC is enabled:

  1. The partition lock is re-acquired in shared mode
  2. CollectLockHoldersAndWaiters() iterates lock->procLocks to build PID lists of holders and waiters
  3. A LOG-level message is emitted with lock type, mode, holder PIDs, and waiter PIDs
  4. The partition lock is released
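Steps 2 and 3 can be sketched with a simplified model. Everything here is an assumption made for illustration: ToyProcLock is a hypothetical flat array entry rather than the real PROCLOCK list, the message wording is invented and is not the text PostgreSQL emits, and the partition locking of steps 1 and 4 is omitted.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical, flattened view of one entry in a lock's process list. */
typedef struct {
    int  pid;
    bool holds_lock;  /* true: holder, false: waiter */
} ToyProcLock;

/* Append a PID to a comma-separated list held in buf. */
static void toy_append_pid(char *buf, size_t buflen, int pid)
{
    size_t used = strlen(buf);
    snprintf(buf + used, buflen - used, "%s%d", used > 0 ? ", " : "", pid);
}

/* Step 2: walk the entries once and split PIDs into holders and waiters. */
static void toy_collect_holders_and_waiters(const ToyProcLock *entries, size_t n,
                                            char *holders, size_t holders_len,
                                            char *waiters, size_t waiters_len)
{
    holders[0] = '\0';
    waiters[0] = '\0';
    for (size_t i = 0; i < n; i++) {
        if (entries[i].holds_lock)
            toy_append_pid(holders, holders_len, entries[i].pid);
        else
            toy_append_pid(waiters, waiters_len, entries[i].pid);
    }
}

/* Step 3: build one log line with lock mode, lock type, and both PID lists.
 * The wording is illustrative only. */
static int toy_format_lock_failure(char *out, size_t outlen,
                                   const char *mode, const char *locktype,
                                   const ToyProcLock *entries, size_t n)
{
    char holders[128];
    char waiters[128];

    toy_collect_holders_and_waiters(entries, n,
                                    holders, sizeof holders,
                                    waiters, sizeof waiters);
    return snprintf(out, outlen,
                    "LOG: could not acquire %s on %s; holders: %s; waiters: %s",
                    mode, locktype, holders, waiters);
}
```

In the real lock manager the list being walked lives in shared memory, which is why the steps above bracket the collection with a partition lock.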

This approach:

The Four Lock Failure Paths for Row Locks

The patch covers all four code paths in heap tuple locking where NOWAIT can fail:

  1. heapam.c:4946 — Simple case: single lock holder on tuple
  2. heapam.c:4905 — MultiXact case: multiple SHARE lockers, new UPDATE NOWAIT conflicts
  3. heapam.c:5211 — Wait queue case: existing waiters on tuple
  4. heapam_handler.c:463 — EPQ (EvalPlanQual) recheck path during concurrent updates

Each path calls ConditionalXactLockTableWait(), which in turn calls LockAcquireExtended() with the new logLockFailure parameter.

Post-Commit Extension Attempt

After the initial commit, Fujii proposed extending log_lock_failures to cover LOCK TABLE ... NOWAIT, ALTER TABLE ... NOWAIT, etc. However, this hit a design obstacle: ConditionalLockRelationOid() is also used by autovacuum, and distinguishing user-initiated NOWAIT from internal conditional lock attempts would require additional plumbing. This extension was withdrawn pending further design work.

GUC Naming Correction

Peter Eisentraut identified an inconsistency: log_lock_waits uses plural but log_lock_failure was singular. This was corrected to log_lock_failures for consistency, committed as a follow-up.

Key Technical Tradeoffs

Decision                             | Tradeoff
New GUC vs. extending log_lock_waits | Clarity vs. GUC proliferation
Shared partition lock for collection | Correctness vs. slight contention
Excluding SKIP LOCKED                | Reduces noise but limits utility
logLockFailure argument approach     | Simple extension model but adds parameter to core function
Row-level locks only (initial)       | Limits scope but avoids autovacuum noise