[PATCH] Prevent repeated deadlock-check signals in standby buffer pin waits

First seen: 2026-04-19 05:46:57+00:00 · Messages: 5 · Participants: 4

Latest Update

2026-05-20 · claude-opus-4-6

Technical Analysis: Preventing Repeated Deadlock-Check Signals in Standby Buffer Pin Waits

Core Problem

In PostgreSQL's hot standby recovery conflict resolution, there is a behavioral asymmetry between two conflict paths that share similar architecture. When the startup process on a standby needs to replay WAL that requires exclusive access to a buffer page (e.g., during VACUUM cleanup), it must resolve conflicts with backends that hold buffer pins on that page. This is handled by ResolveRecoveryConflictWithBufferPin().

The Signal Storm Problem

The control flow works as follows:

  1. LockBufferForCleanup() calls ResolveRecoveryConflictWithBufferPin()
  2. A deadlock_timeout timer is armed
  3. When the timeout fires, got_standby_deadlock_timeout is set to true
  4. The function sends PROCSIG_RECOVERY_CONFLICT_BUFFERPIN (the deadlock-check signal) to the conflicting backend
  5. The function returns to LockBufferForCleanup()
  6. The caller loops back, re-enters the function, arms another deadlock_timeout
  7. The signal fires again after the next interval

This creates a pattern where the startup process repeatedly signals the same backend every deadlock_timeout interval (default 1s), even though the backend has already been notified. The patch author demonstrated via strace that this results in 9 SIGUSR1 broadcasts over a 10-second window on unpatched code.

Historical Context and the XXX Comment

The lock-conflict path (ResolveRecoveryConflictWithLock) had the identical problem and was fixed in commit 8900b5a9d59a. The fix there was to add a second ProcWaitForSignal() call after sending the deadlock-check signal, causing the startup process to sleep until the actual conflict resolves (i.e., the backend unpins the buffer) rather than returning to the caller's loop.

The buffer-pin path was knowingly left unfixed, marked with an XXX comment asking whether the same fix should be applied. This patch finally addresses that TODO.

Proposed Solution

The fix is architecturally simple and mirrors the established pattern from the lock-conflict path:

  1. After sending the RECOVERY_CONFLICT_BUFFERPIN_DEADLOCK signal, reset got_standby_deadlock_timeout = false
  2. Call ProcWaitForSignal() to block the startup process
  3. The startup process now waits until UnpinBuffer() wakes it (via ProcSendSignal), rather than looping back to the caller

This ensures only one deadlock-check signal is sent per conflict episode, reducing unnecessary signal traffic and making the behavior consistent with the lock-conflict resolution path.

v2 Additions

The second version also initializes got_standby_delay_timeout = false before enabling the STANDBY_TIMEOUT timer. This is a defensive coding improvement for symmetry with how got_standby_deadlock_timeout is already initialized. While likely harmless in current code (the timeout handler sets it to true, and the flag is only checked after the wait), it prevents potential issues if code paths change in the future.

Architectural Significance

Why This Matters Beyond "Just Signals"

  1. Signal coalescing risk: SIGUSR1 is used as a multiplexed notification mechanism in PostgreSQL. Excessive signals create unnecessary wake-ups and processing overhead in the target backend's signal handler, which must check multiple flag variables.

  2. Recovery conflict escalation: After max_standby_streaming_delay, the standby will cancel the conflicting query. The repeated signaling doesn't change this behavior but adds useless overhead during the wait window.

  3. Consistency principle: Having two parallel code paths (lock conflicts vs. buffer-pin conflicts) with different behaviors for the same logical pattern creates maintenance burden and confusion for developers.

Classification Debate: Bug vs. Improvement

The thread contains an interesting classification discussion:

This classification matters operationally: if it's a bug, it should be backpatched to supported versions; if it's an improvement, it waits for the next major release. The repeated signaling doesn't cause incorrect behavior (the conflict still resolves properly), but it does cause unnecessary work. The "bug" interpretation likely relates to whether the repeated signals could cause visible user-facing issues (e.g., log spam, performance degradation under heavy conflict scenarios).

Testing Approach

The patch author validated using the existing recovery conflict test infrastructure (src/test/recovery/t/031_recovery_conflict.pl) with a specific scenario: