synchronized_standby_slots behavior inconsistent with quorum-based synchronous replication

First seen: 2026-02-24 22:08:37+00:00 · Messages: 74 · Participants: 10

Latest Update

2026-05-14 · claude-opus-4-6

synchronized_standby_slots Behavior Inconsistent with Quorum-Based Synchronous Replication

Core Problem

PostgreSQL's synchronized_standby_slots GUC, introduced to support logical replication failover, enforces ALL-of-N semantics: every physical replication slot listed in the parameter must have caught up before a logical failover slot is permitted to proceed with decoding. This creates a fundamental availability mismatch with synchronous_standby_names, which supports ANY M-of-N (quorum) semantics.

The Architectural Inconsistency

In a typical 3-node HA deployment configured for quorum-based synchronous replication:

synchronous_standby_names = 'ANY 1 (standby1, standby2)'
synchronized_standby_slots = 'sb1_slot, sb2_slot'

If standby1 goes down, synchronous commits continue to succeed because standby2 satisfies the quorum. However, logical decoding blocks indefinitely in WaitForStandbyConfirmation(), waiting for sb1_slot to catch up — even though the transaction is already durably committed on a quorum of synchronous standbys. This defeats the availability guarantee the DBA intended by choosing quorum commit, and worse, can cause silent WAL accumulation on the primary leading to disk-full scenarios.

The root issue is that the two GUCs govern related but distinct concerns — commit durability (synchronous_standby_names) vs. logical slot advancement safety (synchronized_standby_slots) — yet have incompatible availability models.
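The availability gap comes down to the shape of the wait predicate. A minimal sketch (illustrative helper names, not the actual PostgreSQL code) contrasts the ALL-of-N check that synchronized_standby_slots enforces today with the ANY M-of-N quorum check that synchronous_standby_names already supports:

```c
#include <stdbool.h>

/* Sketch only: caught_up[i] records whether listed slot i has confirmed
 * the target LSN. Neither function is PostgreSQL source code. */
static bool
all_of_n_satisfied(const bool caught_up[], int n)
{
    for (int i = 0; i < n; i++)
        if (!caught_up[i])
            return false;       /* a single lagging slot blocks decoding */
    return true;
}

static bool
any_m_of_n_satisfied(const bool caught_up[], int n, int m)
{
    int confirmed = 0;

    for (int i = 0; i < n; i++)
        if (caught_up[i])
            confirmed++;
    return confirmed >= m;      /* any m of the n listed slots suffice */
}
```

With standby1 down, the ALL predicate never becomes true, while an `ANY 1` predicate is already satisfied by standby2, matching the commit-side quorum.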

Proposed Solution

Extending the GUC Syntax

The proposal extends synchronized_standby_slots to accept ANY M (slot1, slot2, ...) and FIRST N (slot1, slot2, ...) syntax, mirroring the grammar of synchronous_standby_names.
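For illustration, the extended forms might look like this (a sketch based on the thread's description; the exact accepted spellings are defined by the patch):

```
synchronized_standby_slots = 'ANY 1 (sb1_slot, sb2_slot)'    # quorum: any one slot caught up
synchronized_standby_slots = 'FIRST 1 (sb1_slot, sb2_slot)'  # priority: first available slot
synchronized_standby_slots = 'sb1_slot, sb2_slot'            # bare list: ALL (unchanged)
```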

The two GUCs remain separate because the set of slots to synchronize can differ from the synchronous standby list (e.g., a DBA might want to ensure a geo-distant standby catches up before allowing logical consumers to read changes).

Key Technical Debates

1. Quorum Safety and Failover Correctness

Ashutosh Sharma raised a critical concern early in the thread: with ANY 1 (sync_standby1, sync_standby2), if sync_standby1 is ahead and confirms WAL that gets forwarded to the logical replica, and then sync_standby1 dies forcing failover to sync_standby2, the new primary could be at a lower LSN than the logical replica. The logical replication slot would be stale.

Amit Kapila countered that this is the same situation that exists for synchronous_standby_names with quorum commit — the failover orchestrator is responsible for selecting the most-caught-up standby. The documentation at logical-replication-failover.html provides steps to identify which replica is safe for subscriber switchover. This argument carried the day, with Shveta Malik and Satya concurring that failover correctness is the orchestrator's responsibility, not the GUC's.

2. Defaulting to synchronous_standby_names

An earlier thread (referenced by Amit Kapila) proposed having synchronized_standby_slots default to SAME_AS_SYNCREP_STANDBYS. Alexander Kukushkin and Ashutosh Sharma both identified problems with coupling the two settings, and the idea was conclusively rejected, with quick consensus that the two GUCs must remain independently configured.

3. Parser Reuse vs. Local Helper Function

A significant design disagreement emerged between Ashutosh Sharma and Hou Zhijie about how to distinguish plain lists from explicit FIRST N (...) syntax.

The problem: The existing syncrep_yyparse grammar treats a bare list slot1, slot2 as FIRST 1 (slot1, slot2). But for synchronized_standby_slots, a bare list must mean ALL-mode (wait for all). The parser output is ambiguous.

Ashutosh's approach: Keep the shared parser untouched and add a local helper IsPrioritySyncStandbySlotsSyntax() that inspects the raw string to detect explicit FIRST keyword presence. This keeps changes localized and avoids risk to synchronous_standby_names behavior.

Hou Zhijie's approach: Modify the shared syncrep grammar to emit a new method SYNC_REP_IMPLICIT (later SYNC_REP_DEFAULT) for bare lists, making the parser itself distinguish the three forms. This eliminates the need for redundant string-parsing logic and avoids bugs like the one Ajin Cherian found where slot names starting with "first" (e.g., firstsub1) were misidentified as the FIRST keyword.
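The fragility of raw-string keyword detection is easy to demonstrate. The sketch below is hypothetical code, not the patch's actual helper: a naive prefix test misclassifies a slot named firstsub1, while a correct check must also demand a word boundary after the keyword.

```c
#include <stdbool.h>
#include <ctype.h>
#include <strings.h>            /* strncasecmp (POSIX) */

/* Naive check: any string starting with "first" looks like the keyword. */
static bool
naive_has_first_keyword(const char *s)
{
    return strncasecmp(s, "first", 5) == 0;
}

/* Boundary-aware check: "first" must be followed by whitespace or '('. */
static bool
boundary_has_first_keyword(const char *s)
{
    return strncasecmp(s, "first", 5) == 0 &&
           (isspace((unsigned char) s[5]) || s[5] == '(');
}
```

Even the boundary-aware version duplicates knowledge the grammar already has, which is the core of Hou Zhijie's argument for fixing the parser instead.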

Amit Kapila suggested splitting the patch to evaluate both approaches independently, which led to the final 3-patch series where the parser refactoring came first as 0001.

4. Slot State Tracking and Reporting

The patch introduces a SyncStandbySlotsState enum to classify slot conditions:

typedef enum {
    SS_SLOT_NOT_FOUND,          /* slot does not exist */
    SS_SLOT_LOGICAL,            /* slot is logical, not physical */
    SS_SLOT_INVALIDATED,        /* slot has been invalidated */
    SS_SLOT_INACTIVE_LAGGING,   /* inactive and behind */
    SS_SLOT_ACTIVE_LAGGING,     /* active but hasn't caught up */
} SyncStandbySlotsState;

A behavioral regression was caught by Shveta Malik: the initial patch treated all inactive slots as blocking, but HEAD code correctly allowed inactive slots that had already caught up (restart_lsn >= wait_for_lsn) to be counted as caught up. The fix split the inactive state into SS_SLOT_INACTIVE_LAGGING (blocking) and regular caught-up (non-blocking).
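The corrected rule can be sketched as follows (hypothetical function and a hypothetical SS_SLOT_CAUGHT_UP name for the non-blocking case; the lagging states mirror the patch's enum):

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t XLogRecPtr;    /* WAL position (LSN), as in PostgreSQL */

typedef enum {
    SS_SLOT_CAUGHT_UP,          /* non-blocking (hypothetical name) */
    SS_SLOT_INACTIVE_LAGGING,   /* inactive and behind: blocks */
    SS_SLOT_ACTIVE_LAGGING,     /* active but behind: blocks */
} SlotWaitState;

/* A slot that has already reached wait_for_lsn never blocks, even when it
 * is inactive -- the behavior the initial patch accidentally regressed. */
static SlotWaitState
classify_slot(bool active, XLogRecPtr restart_lsn, XLogRecPtr wait_for_lsn)
{
    if (restart_lsn >= wait_for_lsn)
        return SS_SLOT_CAUGHT_UP;
    return active ? SS_SLOT_ACTIVE_LAGGING : SS_SLOT_INACTIVE_LAGGING;
}
```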

Reporting was moved to a dedicated ReportUnavailableSyncStandbySlots() function with actionable messages including LSN gap information. The log level for lagging slots was set to DEBUG1 rather than WARNING or ERROR, since during shutdown the walsender legitimately waits for standbys to catch up and WARNING messages would be noisy without being actionable.

5. Testing the SS_SLOT_ACTIVE_LAGGING Path

Creating a deterministic test for the "active but lagging" slot state proved surprisingly difficult.

The winning approach, suggested by Hou Zhijie and implemented by Ajin Cherian, uses psql as a replication client, issuing START_REPLICATION SLOT <slot_name> PHYSICAL <lsn>. This acquires the slot (making it active), but unlike a real WAL receiver, psql sends no standby status feedback, so restart_lsn never advances, creating a deterministic active-but-lagging condition. This reduced test execution time from 60-140 seconds to roughly 6 seconds.
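Roughly, the test does something like this (a sketch; the connection string, slot name, and LSN are illustrative):

```
# Open a replication-protocol connection with psql and start physical
# streaming on the slot. psql never sends standby status feedback, so the
# slot becomes active while restart_lsn stays put.
psql "dbname=postgres replication=true" \
     -c "START_REPLICATION SLOT sb1_slot PHYSICAL 0/1"
```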

Final Patch Structure

After extensive iteration, the patch was split into three parts at Amit Kapila's suggestion:

  1. 0001: Refactors the syncrep parser to introduce SYNC_REP_DEFAULT for bare standby lists, enabling callers to distinguish FIRST N (...), ANY N (...), and plain list forms
  2. 0002: Adds ANY N quorum semantics to synchronized_standby_slots
  3. 0003: Adds FIRST N and N (...) priority syntax support

This ordering ensures each patch is independently functional and reviewable.

Implications

This change is architecturally significant for PostgreSQL's logical replication failover story. Without quorum-aware synchronized_standby_slots, any deployment using quorum synchronous replication is forced to choose between reduced availability (listing every standby's slot, so a single failed standby blocks logical decoding and lets WAL accumulate) and weakened failover safety (leaving slots out of the list, so a logical subscriber can get ahead of the standby that is eventually promoted).

The patch resolves this by allowing the logical slot advancement policy to match the commit durability policy, which is the only configuration that makes operational sense in quorum-based HA deployments.