[PATCH] Fix LISTEN startup race with direct advancement

First seen: 2026-05-19 20:37:56+00:00 · Messages: 1 · Participants: 1

Latest Update

2026-05-20 · claude-opus-4-6

Deep Technical Analysis: LISTEN Startup Race with Direct Advancement

The Core Problem

This thread identifies a subtle race condition in PostgreSQL's asynchronous notification system (async.c) introduced by a rework committed in 282b1cd. The race exists in the LISTEN registration path during transaction commit, specifically in the window between two commit phases:

  1. PreCommit_Notify() — where a first-time LISTEN registers the backend in the shared listener data structure and records its queue position (the tail of the notification queue at that moment).
  2. AtCommit_Notify() — where the staged listen action is finalized by setting listening = true in the shared channel map.
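
The two phases above can be pictured with a small state model. This is a minimal sketch, not the async.c code: the ListenerEntry struct, its fields, and the model_* functions are hypothetical names introduced for illustration, and only the listening flag corresponds to something named in the thread.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-in for one entry in the shared channel map. */
typedef struct ListenerEntry {
    int  backend_id;   /* which backend staged the LISTEN */
    bool registered;   /* set in the pre-commit phase */
    bool listening;    /* set only in the at-commit phase */
    long queue_pos;    /* queue position recorded at registration */
} ListenerEntry;

/* Phase 1 (models PreCommit_Notify): stage the entry and record the position. */
static void model_precommit_listen(ListenerEntry *e, int backend_id, long recorded_pos)
{
    e->backend_id = backend_id;
    e->registered = true;
    e->listening = false;      /* staged, not yet committed */
    e->queue_pos = recorded_pos;
}

/* Phase 2 (models AtCommit_Notify): finalize the staged entry. */
static void model_atcommit_listen(ListenerEntry *e)
{
    assert(e->registered);     /* phase 2 only makes sense after phase 1 */
    e->listening = true;
}

/* True while the entry sits in the window the race depends on. */
static bool model_in_race_window(const ListenerEntry *e)
{
    return e->registered && !e->listening;
}
```

In this model, an entry is in the race window from the return of model_precommit_listen() until model_atcommit_listen() runs; any notifier that inspects the entry in between sees listening = false.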

The Race Window

Between these two phases, a concurrent session can execute NOTIFY and commit. When SignalBackends() runs on behalf of that NOTIFY, it iterates over registered listeners to determine which backends should be woken. If it encounters the staged listener entry with listening = false, it skips that backend entirely, concluding it is "not yet committed" and therefore not interested.

Why Direct Advancement Makes This Dangerous

The critical architectural detail is direct advancement, a mechanism that moves a backend's queue read pointer forward without that backend actually reading the notifications it skips. This is an optimization to avoid waking backends unnecessarily. In this race, however, two things combine:

  1. Skipping the staged listener (because listening = false)
  2. Direct advancement moving the queue pointer past the notification

Together these mean the notification is permanently lost. The backend's queue pointer has already advanced past the position where the notification was inserted, so the backend never reads that entry, even after AtCommit_Notify() sets listening = true.
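
The loss can be traced in a toy model of a single backend. This is a sketch under stated assumptions rather than the real logic: ModelBackend and the model_* functions are hypothetical, and the condition used for direct advancement (the read pointer equals the queue head before the notifier wrote) is an assumption made for illustration, not something the thread specifies.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of one backend as seen by a notifier. */
typedef struct ModelBackend {
    bool listening;  /* false while the LISTEN is only staged */
    long pos;        /* next queue position this backend would read */
    bool woken;      /* did the notifier decide to signal it? */
} ModelBackend;

/*
 * Pre-patch behaviour in miniature, for a notification occupying
 * [old_head, new_head) on the channel this backend has a LISTEN entry for.
 */
static void model_notify_prepatch(ModelBackend *b, long old_head, long new_head)
{
    if (b->listening) {
        b->woken = true;       /* committed listener: signal it, leave pos alone */
        return;
    }
    /* Staged entry is treated as uninterested, so it may be advanced directly. */
    if (b->pos == old_head)    /* assumed eligibility condition */
        b->pos = new_head;
}

/* A notification at position `at` is lost once the read pointer has passed it. */
static bool model_is_lost(const ModelBackend *b, long at)
{
    return b->pos > at;
}
```

Starting from a staged backend with pos = 40 and a notification written at position 40, model_notify_prepatch() leaves woken = false and pos = 41, so model_is_lost() reports the entry as unreachable. A committed listener in the same position is signaled and keeps pos = 40.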

Distinction from the Documented Race

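The thread carefully distinguishes this from the documented LISTEN startup race described in listen.sgml. The documented race belongs to a harmless class of behavior, whereas the race described here silently drops a notification.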

Proposed Fix

The fix is minimal and elegant: remove the early-continue for listening = false entries in SignalBackends():

-           if (!listeners[j].listening)
-               continue;       /* ignore not-yet-committed listeners */

By treating staged (not-yet-committed) LISTEN entries as possible listeners, SignalBackends() either wakes the backend or, at minimum, does not advance its queue pointer past the notification. This converts the false-negative race (a missed notification) into a false-positive one (the backend may be signaled for a notification that it already handles during its initial queue scan), which is the same class of harmless behavior already documented.
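
The effect of removing the early-continue can be sketched with a standalone loop over a listener array. This is an illustrative model only: ModelListener, model_collect_wakeups(), and the skip_staged flag are hypothetical, and the real SignalBackends() does considerably more than collect pids.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one listener entry for a channel. */
typedef struct ModelListener {
    int  pid;
    bool listening;   /* false while the LISTEN is only staged */
} ModelListener;

/*
 * Collect the pids to wake for one channel.
 * skip_staged = true mirrors the removed lines; false mirrors the patched loop.
 * Returns the number of pids written to out (caller provides room for n).
 */
static size_t model_collect_wakeups(const ModelListener *listeners, size_t n,
                                    bool skip_staged, int *out)
{
    size_t count = 0;

    for (size_t j = 0; j < n; j++) {
        if (skip_staged && !listeners[j].listening)
            continue;           /* pre-patch: ignore not-yet-committed listeners */
        out[count++] = listeners[j].pid;
    }
    return count;
}
```

With three entries of which the middle one is staged, the pre-patch mode returns two pids and the patched mode returns all three, so the staged backend is treated as interested and is no longer a candidate for being skipped.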

Tradeoffs

The fix introduces a potential for spurious wakeups: backends that are in the process of committing a LISTEN (but haven't fully committed yet) may be woken for notifications they haven't technically subscribed to yet. However:

  1. The window is extremely small (between PreCommit and AtCommit phases)
  2. Spurious wakeups are benign — the backend will simply find nothing actionable in its queue scan
  3. The alternative (missed notifications) violates the fundamental contract of LISTEN/NOTIFY
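
Point 2 can be illustrated with a toy queue scan. This is a sketch assuming a woken backend filters queue entries by channel name; ModelNotification and model_scan_queue() are hypothetical and do not reproduce the actual queue-reading code in async.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical queue entry carrying only a channel name. */
typedef struct ModelNotification {
    const char *channel;
} ModelNotification;

/*
 * Toy queue scan: read entries in [*pos, head), count those whose channel
 * matches one the backend listens on, and leave *pos at head.
 */
static int model_scan_queue(const ModelNotification *queue, size_t head, size_t *pos,
                            const char *const *channels, size_t nchannels)
{
    int delivered = 0;

    for (; *pos < head; (*pos)++) {
        for (size_t c = 0; c < nchannels; c++) {
            if (strcmp(queue[*pos].channel, channels[c]) == 0) {
                delivered++;    /* entry is for a channel we listen on */
                break;
            }
        }
    }
    return delivered;
}
```

A backend woken for entries on channels it does not listen on gets a return value of 0 and simply ends up with its read position at the head of the queue, which is the sense in which the extra wakeup costs a scan but changes nothing visible.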

Patch Structure

The submission is structured as a three-patch series. This structure allows reviewers to verify that the new test fails without the fix and passes with it, and it separately confirms that the documented behavior is also covered by tests.

Architectural Context

The async.c subsystem manages LISTEN/NOTIFY through a shared-memory circular queue (SLRU-based) and a shared channel map. The two-step commit integration (the PreCommit_Notify()/AtCommit_Notify() split, unrelated to two-phase commit via PREPARE TRANSACTION) is necessitated by PostgreSQL's requirement that shared state changes be atomic with respect to the transaction's visibility. The 282b1cd rework likely introduced the direct advancement optimization to reduce IPC overhead, but inadvertently created this correctness gap by not accounting for the intermediate state in which a listener is registered but not yet marked active.