2026-06-01 · claude-opus-4-6

Technical Analysis: pg_stat_lock Blocker Mode Dimension

Core Problem

PostgreSQL's pg_stat_lock view (introduced in recent versions to provide cumulative lock wait statistics) currently aggregates lock contention data only by locktype. This means operators can see that lock waits are occurring on relations, but cannot distinguish what kind of operation is causing those waits without resorting to parsing log_lock_waits output or deploying external sampling extensions like pg_wait_sampling.

Lock contention diagnostics lack a critical dimension: the lock mode of the blocker. In production, the difference between waits caused by ShareUpdateExclusiveLock (VACUUM) and AccessExclusiveLock (DDL) is operationally crucial: the former suggests autovacuum tuning issues, while the latter points to migration or deployment problems. Today, making that distinction requires correlating logs or sampling pg_locks in real time, both of which are lossy and operationally expensive.

Proposed Solution

Architecture

The patch adds a mode column to pg_stat_lock, expanding the statistics aggregation key from [locktype] to [locktype, mode]. This means PgStatShared_Lock structures in shared memory are expanded to cover the cross-product of lock types and lock modes.
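The effect of widening the key can be sketched with a toy aggregator. This is only an illustration of the [locktype, mode] keying, not PostgreSQL's shared-memory layout; the class LockStats, its methods, and the millisecond unit are hypothetical names and choices made for the example.

```python
from collections import defaultdict


class LockStats:
    """Toy stand-in for the cumulative counters; not PostgreSQL's actual layout."""

    def __init__(self):
        # Key is (locktype, mode); before the patch it would be locktype alone.
        self.waits = defaultdict(int)
        self.wait_time_ms = defaultdict(float)

    def record_wait(self, locktype, mode, elapsed_ms):
        key = (locktype, mode)
        self.waits[key] += 1
        self.wait_time_ms[key] += elapsed_ms

    def waits_by_locktype(self):
        """Sum over mode to recover the coarser per-locktype totals."""
        totals = defaultdict(int)
        for (locktype, _mode), count in self.waits.items():
            totals[locktype] += count
        return dict(totals)


stats = LockStats()
stats.record_wait("relation", "ShareUpdateExclusiveLock", 12.0)
stats.record_wait("relation", "AccessExclusiveLock", 850.0)
stats.record_wait("relation", "AccessExclusiveLock", 400.0)
stats.record_wait("transactionid", "ExclusiveLock", 3.0)
```

Summing over mode reproduces the per-locktype totals, so existing dashboards can keep their current numbers by aggregating, while the new rows separate the VACUUM-style waits from the DDL-style ones.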

Blocker Mode Capture Algorithm

The mode is determined at the point where a lock requester joins the wait queue, under the lock partition LWLock (so no additional locking is required):

Primary rule: Among all modes m where lock->granted[m] > 0 that conflict with the requester's mode, select the strongest (highest-numbered in the lock mode ordering).
Fallback rule: If no held mode conflicts (a pure queue-priority wait), select the strongest mode in lock->waitMask that conflicts.

This is architecturally sound because:

The lock partition LWLock is already held at this point in ProcSleep() / the wait-queue insertion path
lock->granted[] and lock->waitMask are already maintained and available
The "strongest conflicting" heuristic captures a mode whose release is necessary for the waiter to proceed (every conflicting held mode must be released; the strongest serves as the representative)
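The two rules can be sketched as a small simulation. The conflict table is transcribed from PostgreSQL's documented table-level lock conflict matrix; everything else is an assumption made for illustration: pick_blocker_mode is a hypothetical name, a dict of holder counts stands in for lock->granted[], and a set stands in for the lock->waitMask bitmask. It is not the patch's code.

```python
from enum import IntEnum


class LockMode(IntEnum):
    # Numbering follows PostgreSQL's lock mode ordering (weakest to strongest).
    AccessShareLock = 1
    RowShareLock = 2
    RowExclusiveLock = 3
    ShareUpdateExclusiveLock = 4
    ShareLock = 5
    ShareRowExclusiveLock = 6
    ExclusiveLock = 7
    AccessExclusiveLock = 8


M = LockMode

# CONFLICTS[m] is the set of modes that conflict with a request for mode m.
CONFLICTS = {
    M.AccessShareLock: {M.AccessExclusiveLock},
    M.RowShareLock: {M.ExclusiveLock, M.AccessExclusiveLock},
    M.RowExclusiveLock: {M.ShareLock, M.ShareRowExclusiveLock,
                         M.ExclusiveLock, M.AccessExclusiveLock},
    M.ShareUpdateExclusiveLock: {M.ShareUpdateExclusiveLock, M.ShareLock,
                                 M.ShareRowExclusiveLock, M.ExclusiveLock,
                                 M.AccessExclusiveLock},
    M.ShareLock: {M.RowExclusiveLock, M.ShareUpdateExclusiveLock,
                  M.ShareRowExclusiveLock, M.ExclusiveLock,
                  M.AccessExclusiveLock},
    M.ShareRowExclusiveLock: {M.RowExclusiveLock, M.ShareUpdateExclusiveLock,
                              M.ShareLock, M.ShareRowExclusiveLock,
                              M.ExclusiveLock, M.AccessExclusiveLock},
    M.ExclusiveLock: set(M) - {M.AccessShareLock},
    M.AccessExclusiveLock: set(M),
}


def pick_blocker_mode(requested, granted, wait_mask):
    """Return the mode a new waiter would be attributed to, or None.

    granted:   dict mode -> holder count (stand-in for lock->granted[])
    wait_mask: set of modes with queued waiters (stand-in for lock->waitMask)
    """
    conflicting = CONFLICTS[requested]
    # Primary rule: strongest held mode that conflicts with the request.
    held = [m for m, count in granted.items() if count > 0 and m in conflicting]
    if held:
        return max(held)
    # Fallback rule: strongest queued mode that conflicts (queue-priority wait).
    queued = [m for m in wait_mask if m in conflicting]
    return max(queued) if queued else None
```

For example, a RowExclusiveLock request against a lock held in both AccessShareLock and ShareLock is attributed to ShareLock, while an AccessShareLock request queued behind a waiting AccessExclusiveLock falls through to the fallback rule.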

Shared Memory Cost

The expansion adds approximately 2.3 kB per cluster: one additional dimension for the lock modes (PostgreSQL defines eight standard modes, and MAX_LOCKMODES is 10) across the existing lock type array. This is negligible.
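A back-of-the-envelope check of that order of magnitude follows. The inputs are assumptions rather than figures from the patch: 12 lock types, the eight standard lock modes, and three 8-byte counters per entry (waits, wait_time, fastpath_exceeded). The helper stats_array_bytes is a hypothetical name.

```python
def stats_array_bytes(n_locktypes, n_modes, n_counters, counter_bytes=8):
    """Size of a dense counter array indexed by (locktype, mode)."""
    return n_locktypes * n_modes * n_counters * counter_bytes


# Assumed inputs, not taken from the patch.
N_LOCKTYPES = 12
N_MODES = 8
N_COUNTERS = 3  # waits, wait_time, fastpath_exceeded

before = stats_array_bytes(N_LOCKTYPES, 1, N_COUNTERS)        # keyed by locktype only
after = stats_array_bytes(N_LOCKTYPES, N_MODES, N_COUNTERS)   # keyed by (locktype, mode)
print(before, after)
```

Under these assumptions the array grows from 288 bytes to 2304 bytes, which is in the same range as the quoted figure; padding, headers, or a differently sized mode dimension would shift the exact number.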

Fast Path Handling

No new instrumentation is added to the fast-path lock acquisition code. The blocker-mode snapshot logic runs only when a request would otherwise wait, so the uncontended hot path incurs no added overhead and fast-path locking keeps its performance characteristics.

Key Design Decisions and Tradeoffs

1. Blocker Mode vs. Requester Mode vs. Both

The author explicitly considered three alternatives:

Requester mode only: Simpler but less operationally useful — you can often infer what the requester was doing from context
Both modes: Most informative but explodes the view's row count (modes × modes × locktypes) and likely overlaps with pg_wait_sampling use cases
Blocker mode only (chosen): Answers the operational question "what is causing contention" directly

This is a pragmatic middle ground. The blocker mode is the actionable information — knowing that VACUUM is blocking your workload tells you to tune autovacuum, while knowing DDL is blocking tells you to fix your migration strategy.

2. Dual Semantics of the Mode Column

The most architecturally awkward aspect is that the mode column has different semantics depending on which counter is being examined:

For waits/wait_time: mode = the blocker's lock mode
For fastpath_exceeded: mode = the requester's lock mode (because slot exhaustion has no blocker)

The author acknowledges this tension and proposes documenting it rather than splitting views or NULLing values. The column is deliberately named mode (not blocker_mode) to accommodate this dual use. This is a defensible choice: splitting into separate views would complicate monitoring queries, and NULLing would lose the useful per-mode breakdown of fast-path exhaustion.
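How a consumer would have to read such a row can be sketched as follows. The Row class and interpret function are hypothetical, and only the counters named above are modeled; the point is that the same mode value is read as "blocked by" for waits and as "requested" for fastpath_exceeded.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Row:
    """One hypothetical pg_stat_lock row under the proposed [locktype, mode] key."""
    locktype: str
    mode: str
    waits: int
    fastpath_exceeded: int


def interpret(row):
    """Describe a row, applying the counter-specific meaning of mode."""
    notes = []
    if row.waits:
        # For waits, mode is the blocker's lock mode.
        notes.append(
            f"{row.waits} waits on {row.locktype} locks were blocked by {row.mode}"
        )
    if row.fastpath_exceeded:
        # For fastpath_exceeded, mode is the requester's lock mode.
        notes.append(
            f"{row.fastpath_exceeded} {row.mode} requests on {row.locktype} locks "
            "exceeded the fast-path slots"
        )
    return notes
```

A row with both counters non-zero yields two statements about different parties, which is the documentation burden the dual-use column creates.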

3. Chained Wait Attribution

The open question about chained waits reveals a fundamental limitation of per-event attribution in any cumulative statistics system:

TX1 holds AccessShareLock (long SELECT)
TX2 requests AccessExclusiveLock → blocked by TX1 (attributed to AccessShareLock)  
TX3 requests AccessShareLock → blocked by TX2 (attributed to AccessExclusiveLock)

TX3's wait is proximately caused by TX2's AccessExclusiveLock (which causes queue-priority blocking), but ultimately caused by TX1's long SELECT. The patch correctly attributes to the proximate blocker, which is:

Consistent with how pg_stat_lock already works (per-waiter attribution)
The only option that doesn't require expensive transitive-closure computation under the partition LWLock
Individually accurate for each waiter's experience

Walking the full blocker chain would require either holding multiple partition LWLocks or accepting stale data, both unacceptable for a statistics increment path.
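The gap between proximate and root attribution in the TX1/TX2/TX3 scenario can be made concrete with a toy wait-for map. This is a simplification for illustration only: each waiter has exactly one blocker here, whereas a real waiter can have several, and WAITS_FOR, proximate_mode, and root_mode are hypothetical names.

```python
# Proximate wait-for edges from the scenario: waiter -> (blocker, blocker's mode).
WAITS_FOR = {
    "TX2": ("TX1", "AccessShareLock"),      # TX1 holds the lock
    "TX3": ("TX2", "AccessExclusiveLock"),  # TX2 is queued ahead of TX3
}


def proximate_mode(tx):
    """Mode recorded by per-waiter attribution: the immediate blocker's mode."""
    return WAITS_FOR[tx][1]


def root_mode(tx):
    """Mode found by following the chain to a transaction that is not waiting."""
    mode = None
    seen = set()
    while tx in WAITS_FOR and tx not in seen:  # 'seen' guards against cycles
        seen.add(tx)
        tx, mode = WAITS_FOR[tx]
    return mode
```

Here proximate_mode("TX3") is AccessExclusiveLock while root_mode("TX3") is AccessShareLock. The chain walk needs a consistent view of every link at once, which is the cost the per-waiter approach avoids.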

Relationship to Existing Infrastructure

The patch leverages GetLockHoldersAndWaiters(), which already computes holder modes for log_lock_waits. This means the core logic is already battle-tested in production — the patch is essentially promoting information that's already computed in one code path into a persistent statistics aggregation.

The implementation sits at the intersection of:

Lock manager (lock.c, proc.c): Where the blocker mode is determined
Cumulative statistics system (pgstat_lock.c): Where the aggregation occurs
System views (pg_stat_lock): Where results are exposed

Potential Concerns for Review

Catalog version bump: Adding a column to a system view requires a catversion bump
pg_stat_reset() behavior: The expanded statistics keys need proper reset handling
Backward compatibility: Monitoring tools querying pg_stat_lock will see schema changes
"Strongest mode" heuristic correctness: When multiple conflicting modes are held simultaneously, "strongest" may not always correspond to the last one to be released — but it's a reasonable approximation without tracking per-holder grant order
Statistics naming: Whether mode adequately communicates the dual semantics without confusing users who expect it to always mean "blocker mode"

[PATCH] pg_stat_lock: add blocker mode dimension
