Startup process deadlock: WaitForProcSignalBarriers vs aux process

First seen: 2026-04-22 11:21:02+00:00 · Messages: 13 · Participants: 4

Latest Update

2026-06-01 · claude-opus-4-6

Monthly Summary: Startup Process Deadlock — WaitForProcSignalBarriers vs Aux Process (May 2026)

Overview

This thread resolved a long-latent race condition in PostgreSQL's ProcSignalBarrier (PSB) machinery that can deadlock the startup process. The bug exists in v15–master but was recently unmasked on master by commit 67c20979c (Dec 2025), which added an unconditional EmitProcSignalBarrier call to every StartupXLOG, so a race window that was previously reachable only in rare cases now opens on every startup.

The Race Condition

The ProcSignalBarrier protocol uses per-slot pss_pid (occupancy flag) and pss_barrierGeneration (catch-up counter) fields. The critical interleaving:

  1. Newcomer sets pss_barrierGeneration = global_gen under spinlock; pss_pid still 0.
  2. Emitter bumps global_gen to global_gen+1, scans slots, sees pss_pid == 0, skips this slot.
  3. Newcomer writes pss_pid = MyProcPid, becomes visible.
  4. Waiter sees a live slot with stale pss_barrierGeneration and waits forever — the newcomer has no pending barrier flag and will never advance its generation.

The fundamental issue: the emitter's lock-free PID check can observe a slot as empty after its owner has already snapshotted a generation, creating a window in which no one will ever notify the newcomer.
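The interleaving above can be replayed deterministically in a single-threaded toy model. This is only a sketch under stated assumptions: the ToySlot/ToyState types, the toy_* function names, and the barrier_pending flag are hypothetical stand-ins, not PostgreSQL's data structures, and the steps run in a fixed order rather than concurrently.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model of one ProcSignal slot; field names follow the summary above. */
typedef struct {
    uint32_t pss_pid;               /* 0 means the slot looks unoccupied */
    uint64_t pss_barrierGeneration; /* generation this process has absorbed */
    bool     barrier_pending;       /* hypothetical stand-in for the pending flag */
} ToySlot;

typedef struct {
    uint64_t global_gen;
    ToySlot  slot;
} ToyState;

/* Step 1: the newcomer snapshots the generation; pss_pid is still 0. */
static void toy_capture_generation(ToyState *st)
{
    st->slot.pss_barrierGeneration = st->global_gen;
}

/* Step 2: the emitter bumps the generation and flags only occupied slots. */
static uint64_t toy_emit_barrier(ToyState *st)
{
    uint64_t gen = ++st->global_gen;

    if (st->slot.pss_pid != 0)
        st->slot.barrier_pending = true;
    return gen;
}

/* Step 3: the newcomer publishes its PID and becomes visible. */
static void toy_publish_pid(ToyState *st, uint32_t pid)
{
    st->slot.pss_pid = pid;
}

/*
 * Step 4: a waiter is stuck if the slot is live, behind the target
 * generation, and has nothing pending that would make it advance.
 */
static bool toy_waiter_stuck(const ToyState *st, uint64_t target_gen)
{
    return st->slot.pss_pid != 0
        && st->slot.pss_barrierGeneration < target_gen
        && !st->slot.barrier_pending;
}

/* Replays steps 1-2-3 in the buggy order and reports whether step 4 hangs. */
static bool toy_replay_buggy_interleaving(void)
{
    ToyState st = { .global_gen = 7 };
    uint64_t target;

    toy_capture_generation(&st);
    target = toy_emit_barrier(&st);
    assert(st.slot.pss_pid == 0);   /* the emitter ran while the slot looked empty */
    toy_publish_pid(&st, 4242);
    return toy_waiter_stuck(&st, target);
}
```

In this model toy_replay_buggy_interleaving() returns true: the slot ends up live with a stale generation and nothing pending, which is the state the waiter can never leave.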

Fix Approach

Sawada's patch reorders PID publication relative to generation capture in ProcSignalInit, ensuring the emitter's lock-free pss_pid == 0 check is safe. The key invariant enforced: any emitter that skips a slot (because pss_pid == 0) is guaranteed the newcomer will later read a generation ≥ the emitted one.
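One way to picture the invariant is with C11 atomics, where the newcomer publishes its PID before reading the global generation and the emitter bumps the generation before checking the PID. This is a sketch of the general pattern, assuming that ordering; it is not the committed patch, and SketchSlot, sketch_register, sketch_emit, and sketch_invariant_holds are hypothetical names. The default sequentially consistent operations stand in for the full barriers that the real code needs.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    _Atomic uint32_t pid;        /* 0 means unoccupied */
    _Atomic uint64_t generation; /* generation captured at registration */
} SketchSlot;

static _Atomic uint64_t sketch_global_gen;

/* Newcomer: publish the PID first, then capture the current generation. */
static uint64_t sketch_register(SketchSlot *slot, uint32_t pid)
{
    uint64_t gen;

    assert(pid != 0);                 /* 0 is reserved for "empty" */
    atomic_store(&slot->pid, pid);    /* seq_cst store acts as the full barrier */
    gen = atomic_load(&sketch_global_gen);
    atomic_store(&slot->generation, gen);
    return gen;
}

/* Emitter: bump the generation first, then decide whether to signal. */
static bool sketch_emit(SketchSlot *slot, uint64_t *emitted)
{
    *emitted = atomic_fetch_add(&sketch_global_gen, 1) + 1;
    return atomic_load(&slot->pid) != 0;   /* true means the slot gets signalled */
}

/* Runs one sequential ordering and checks the invariant from the text. */
static bool sketch_invariant_holds(bool emitter_first)
{
    SketchSlot slot;
    uint64_t emitted, seen;
    bool signalled;

    atomic_init(&slot.pid, 0);
    atomic_init(&slot.generation, 0);
    atomic_store(&sketch_global_gen, 10);

    if (emitter_first) {
        /* The emitter skips the empty slot; the newcomer must see its bump. */
        signalled = sketch_emit(&slot, &emitted);
        seen = sketch_register(&slot, 4242);
        return !signalled && seen >= emitted;
    }

    /* The newcomer is already visible, so the emitter must not skip it. */
    (void) sketch_register(&slot, 4242);
    signalled = sketch_emit(&slot, &emitted);
    return signalled;
}
```

The helper only exercises the two sequential orderings. The concurrent case rests on the memory-ordering argument: with full barriers on both sides, the emitter cannot miss the PID while the newcomer also misses the bumped generation.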

Patch Refinement: pg_atomic_write_membarrier_u32()

The revised patch applied a small refinement, replacing:

pg_atomic_write_u32(&slot->pss_pid, MyProcPid);
pg_memory_barrier();

with the combined:

pg_atomic_write_membarrier_u32(&slot->pss_pid, MyProcPid);

pg_atomic_write_membarrier_u32() is an atomic write with full barrier semantics, so the combined call drops the separate fence while preserving identical correctness semantics. This change applies to the master and v18 versions of the patch only.
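In portable C11 terms, the two forms correspond roughly to a store followed by a separate full fence versus a single fully ordered atomic operation. The sketch below is a standard-C analogue, not PostgreSQL's implementation; both function names are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Two-step form: a relaxed atomic store followed by a separate full fence. */
static void publish_with_separate_fence(_Atomic uint32_t *pid_field, uint32_t pid)
{
    assert(pid != 0);   /* 0 is reserved for "empty" */
    atomic_store_explicit(pid_field, pid, memory_order_relaxed);
    atomic_thread_fence(memory_order_seq_cst);
}

/* Combined form: one sequentially consistent exchange, result discarded. */
static void publish_with_exchange(_Atomic uint32_t *pid_field, uint32_t pid)
{
    assert(pid != 0);
    (void) atomic_exchange(pid_field, pid);
}
```

Either function leaves the same value in the field and orders the store before any later loads by the same thread. The difference is whether the ordering is expressed as one operation or two.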

Secondary Fix: InitializeProcessXLogLogicalInfo Ordering

A related correctness bug was identified: InitializeProcessXLogLogicalInfo() was called before ProcSignalInit(), allowing a process to read stale logical-info state and then register as "caught up." The fix moves this initialization after ProcSignalInit, matching the pattern already used for InitLocalDataChecksumState.

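Backpatching Scope

The race exists in v15 through master, and the fix was ultimately pushed to master and backpatched down to v15. The pg_atomic_write_membarrier_u32() refinement applies to the master and v18 versions only.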

Scheduling Decision

Sawada deferred pushing the fix until after the May minor releases, judging the bug "not very visible in practice" on stable branches (only triggered by rare DROP DATABASE/TABLESPACE smgr barriers). The plan: commit to master soon after minor releases, with backbranch variants getting longer soak time.

Rejected Hypothesis

Matthias initially hypothesized that the race was between slot registration and signal handler installation in AuxiliaryProcessMainCommon. Andres rebutted this: postmaster children fork with all signals blocked (BlockSig), unblocking only after handlers are installed — signals in the window are kernel-pended, not lost. Matthias conceded.

Reproduction

Alexander Lakhin confirmed the diagnosis by injecting pg_usleep(10000) between cancel-key initialization and PID publication in ProcSignalInit, turning a rare buildfarm flake into a deterministic failure across multiple test scenarios (DROP DATABASE/TABLESPACE redo paths).

History (1 prior analysis)
2026-06-01 · claude-opus-4-6

Round Update: Patch committed — no new technical content

The only new message is from Sawada, confirming that the fix has been pushed (committed) down to v15. This is a brief administrative acknowledgment thanking Matthias for the review. There is no new technical discussion, no further patch revision, and no design changes.

The thread appears to be concluded: the race condition fix has been committed to master and backpatched through v15, as previously planned.