Monthly Summary: Fix ProcKill lock-group vs procLatch recycle race (May 2026)
Overview
This thread addresses a genuine concurrency bug in ProcKill() that has existed since PostgreSQL 9.6 (when lock groups were introduced for parallel query). The bug manifests as a PANIC: latch already owned by PID ... when a newly-started backend inherits a PGPROC slot whose procLatch.owner_pid has not yet been cleared. The month saw the fix go from initial submission through expert review, partial commit, and iterative test refinement.
The Bug
The race window exists between releasing leader_lwlock (after lock-group dissolution) and calling DisownLatch(). During this window, a concurrent last-member follower can observe the leader outside the lock group, push the leader's PGPROC onto the freelist, and a new backend can pop that slot and PANIC on OwnLatch() because owner_pid is still set.
Cross-fork confirmation (Greenplum, Yugabyte) establishes this is a production-observable bug, not merely theoretical — forks exercising lock groups more aggressively hit it regularly.
The Fix (Two-Part)
- Reorder: Move
SwitchBackToLocalLatch()andDisownLatch()to before the lock-group dissolution block, so the latch is disowned before any other member can observe the PGPROC as available. - Single decision point under
leader_lwlock: Computepush_selfandpush_leaderbooleans atomically while holding the lock, then perform a single consolidated freelist update. This eliminates double-push and inconsistent-ownership hazards.
Key Developments This Month
Michael Paquier's Review (Critical Milestone)
Michael provided the first independent expert review, confirming the bug is real and the fix direction correct. His key feedback:
- Split into two commits: (1) the
push_leader/push_selfrefactoring as a code-clarity improvement, (2) theDisownLatch()reorder as the actual fix - Backport minimalism: Keep back-branch patches minimal; only include the refactoring if strictly necessary for correctness
- Test accuracy: The test comment claimed to verify a PANIC doesn't occur, but it actually verifies the ordering invariant holds — needs correction
- Test raciness: Needs synchronization tweaks to avoid failures on overloaded buildfarm machines
Test Infrastructure Evolution
- Packaging resolved:
prockill_racecallbacks merged into the existinginjection_pointsmodule per Michael's direction (no standalone test module) injection_point_condition.hcommitted: Michael applied this header extraction patch — the first piece of the thread to land in tree- Multi-wakeup rejected: Vlad proposed waking all waiters on a single
injection_points_wakeup()call; Michael rejected this as the single-wakeup behavior was intentional - Separate injection points adopted: Vlad reworked the test to use distinct injection points for leader and follower synchronization
Final Patchset Structure (vVL2)
- 0001:
injection_point_condition.hextraction — COMMITTED - 0002: Multi-wakeup semantics — REJECTED, reworked out
- 0003: The test (reworked to use separate injection points, properly fails without fix)
- 0004:
push_leader/push_selfboolean refactoring - 0005: The actual bug fix (moving
DisownLatch()earlier)
Current Status
The fix direction is validated by a committer. One preparatory patch is committed. The remaining patches (test, refactoring, fix) await Michael's detailed review. The test has been reworked to address the multi-wakeup rejection. Backport scope for the refactoring patch remains an open question.