synchronized_standby_slots behavior inconsistent with quorum-based synchronous replication

First seen: 2026-02-24 22:08:37+00:00 · Messages: 78 · Participants: 10

Latest Update

2026-06-04 · claude-opus-4-6

Monthly Summary: synchronized_standby_slots Behavior Inconsistent with Quorum-Based Synchronous Replication (May 2026)

Overview

This thread addresses a fundamental availability mismatch in PostgreSQL's logical replication failover infrastructure. The synchronized_standby_slots GUC enforces ALL-of-N semantics (every listed physical slot must catch up before logical decoding proceeds), which conflicts with synchronous_standby_names' support for ANY M-of-N (quorum) semantics. In quorum-based HA deployments, a single standby failure blocks all logical consumers indefinitely even though synchronous commits continue to succeed.

Problem Statement

In a typical 3-node HA deployment:

synchronous_standby_names = 'ANY 1 (standby1, standby2)'
synchronized_standby_slots = 'sb1_slot, sb2_slot'

If standby1 goes down, synchronous commits succeed via standby2, but logical decoding blocks in WaitForStandbyConfirmation() waiting for sb1_slot — causing silent WAL accumulation and potential disk-full scenarios.
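
The mismatch can be illustrated with a small model. This is only a sketch under simplifying assumptions: the function names are hypothetical, and "caught up" is reduced to a boolean per standby or slot rather than an LSN comparison.

```python
def all_of_n(slots, caught_up):
    """ALL-of-N rule described above: every listed slot must have caught up."""
    return all(caught_up.get(s, False) for s in slots)


def any_m_of_n(m, names, caught_up):
    """Quorum rule in the style of 'ANY m (...)': at least m distinct names caught up."""
    return sum(1 for s in set(names) if caught_up.get(s, False)) >= m


# standby1 is down, standby2 is healthy.
sync_ok = any_m_of_n(1, ["standby1", "standby2"],
                     {"standby1": False, "standby2": True})
decoding_ok = all_of_n(["sb1_slot", "sb2_slot"],
                       {"sb1_slot": False, "sb2_slot": True})

print("synchronous commit can proceed:", sync_ok)
print("logical decoding can proceed:", decoding_ok)
```

In this model the commit path is satisfied by standby2 alone, while the decoding path stays blocked until sb1_slot advances.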

Proposed Solution

Extend synchronized_standby_slots to accept the quorum (ANY N) and priority (FIRST N) syntax already used by synchronous_standby_names.

Key Technical Debates Resolved

Quorum Safety and Failover Correctness

Ashutosh Sharma raised concerns about logical replicas ending up ahead of a new primary after failover with quorum semantics. Amit Kapila argued (with consensus) that this mirrors the existing synchronous_standby_names situation — failover orchestrators are responsible for selecting the most-caught-up standby.
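
As a rough illustration of the orchestrator's responsibility, the sketch below picks the standby with the highest WAL position. The standby names, the LSN values, and both function names are hypothetical. How an orchestrator actually gathers these positions is outside the scope of the thread.

```python
def parse_lsn(text):
    """Convert PostgreSQL's 'hi/lo' hexadecimal LSN text into a 64-bit integer."""
    hi, lo = text.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)


def pick_promotion_candidate(positions):
    """Return the standby whose reported LSN is the most advanced."""
    return max(positions, key=lambda name: parse_lsn(positions[name]))


# Hypothetical positions collected from each standby before promotion.
candidate = pick_promotion_candidate({
    "standby1": "0/3000148",
    "standby2": "0/3000A20",
})
print(candidate)
```

LSNs are compared numerically here because comparing them as strings would rank "0/9000000" above "0/10000000".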

Independent GUC Configuration

A proposal to default synchronized_standby_slots to match synchronous_standby_names was rejected because they operate in different namespaces (slot names vs. application names), synchronous replication doesn't require slots, and tools like pg_receivewal can appear in sync standby names.

Parser Design

A disagreement between Ashutosh Sharma (local helper function to detect syntax) and Hou Zhijie (modify shared syncrep grammar to emit SYNC_REP_DEFAULT for bare lists) was resolved in favor of the grammar approach, which avoids bugs like slot names starting with "first" being misidentified as the FIRST keyword.
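
The kind of misidentification at issue can be sketched as follows. Both functions are hypothetical stand-ins: naive_mode imitates a prefix-based local helper, and token_mode uses a regular expression to approximate what a tokenizing grammar would decide. Neither is code from the patch.

```python
import re


def naive_mode(value):
    """Hypothetical local helper: guess the mode from the leading characters."""
    lowered = value.strip().lower()
    if lowered.startswith("first"):
        return "FIRST"
    if lowered.startswith("any"):
        return "ANY"
    return "DEFAULT"


def token_mode(value):
    """Token-style check: keyword, then a count, then '(' as separate tokens."""
    m = re.match(r"\s*(first|any)\s+\d+\s*\(", value, re.IGNORECASE)
    return m.group(1).upper() if m else "DEFAULT"


bare_list = "first_slot, second_slot"
print(naive_mode(bare_list))   # the bare list is mistaken for FIRST syntax
print(token_mode(bare_list))   # the bare list is recognized as a plain list
print(token_mode("FIRST 2 (s1, s2, s3)"))
```

Reusing the shared grammar avoids maintaining a second, approximate notion of what counts as a keyword.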

Testing Active-but-Lagging Slots

After exploring recovery_min_apply_delay, pg_wal_replay_pause(), and SIGSTOP (all inadequate), the solution uses psql as a replication client via START_REPLICATION SLOT <slot_name> PHYSICAL <lsn>. The session acquires the slot without sending feedback, creating a deterministic active-but-lagging condition (~6 seconds vs. 60-140 seconds).

Bug Discovery: Duplicate Slot Entries

Shveta Malik identified a correctness bug where duplicate slot names in quorum/priority mode are counted multiple times:

ALTER SYSTEM SET synchronized_standby_slots = 'FIRST 2 (standby_1, standby_1, standby_2, standby_3)';

This allows decoding to proceed with only standby_1 caught up (counted twice). Unlike synchronous_standby_names which waits on distinct walsender processes, the slot code iterates name strings without deduplication. Ashutosh Sharma acknowledged this will be fixed in the next patch version.
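
The double counting can be reproduced in a simplified model. The function names are hypothetical, and the model ignores priority ordering and slot state: it only counts how many list entries have caught up.

```python
def can_proceed_counting_names(n, names, caught_up):
    """Buggy model: every list entry that has caught up counts, duplicates included."""
    return sum(1 for name in names if name in caught_up) >= n


def can_proceed_counting_distinct(n, names, caught_up):
    """Corrected model: each slot counts at most once."""
    return len(set(names) & caught_up) >= n


names = ["standby_1", "standby_1", "standby_2", "standby_3"]
caught_up = {"standby_1"}  # only standby_1 has confirmed the target LSN

print(can_proceed_counting_names(2, names, caught_up))
print(can_proceed_counting_distinct(2, names, caught_up))
```

With only standby_1 caught up, the name-counting variant reaches the threshold of 2 while the distinct-counting variant does not.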

Final Patch Structure

  1. 0001: Refactors syncrep parser to introduce SYNC_REP_DEFAULT for bare lists (distinguishes implicit from explicit FIRST)
  2. 0002: Adds ANY N quorum semantics to synchronized_standby_slots
  3. 0003: Adds FIRST N priority syntax support

Current Status

Awaiting next patch version addressing the duplicate slot deduplication bug and minor cosmetic feedback from Shveta Malik's review.

History (1 prior analysis)

2026-06-04 · claude-opus-4-6

Incremental Update: Design Discussion on Duplicate Slot Handling Strategy

Summary

Ashutosh Sharma responds to Shveta Malik's duplicate slot bug report with a concrete design question: should duplicates be silently deduplicated internally, or should they be rejected with an error at configuration time? He argues for the error-rejection approach and identifies what he considers a pre-existing bug in synchronous_standby_names.

New Technical Argument: Error Rejection vs. Silent Deduplication

Ashutosh presents two options for handling duplicate slot names:

  1. Silent deduplication: Internally resolve FIRST 2 (s1, s1, s1, s2) to FIRST 2 (s1, s2), but SHOW would still display the original user-specified string. This creates a disconnect between what the user sees and how the system behaves.

  2. Error in check hook: Detect duplicates during GUC validation and raise an error, forcing the user to correct their configuration.

Ashutosh advocates for option 2 (error rejection) on the grounds that silent deduplication would be confusing — the displayed GUC value wouldn't match the effective behavior. This is a UX/correctness tradeoff: silent deduplication is more forgiving but potentially misleading; error rejection is stricter but transparent.
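
The two options can be contrasted in a short sketch. All names here are hypothetical, and ValueError merely stands in for whatever error a GUC check hook would report. This is not the patch's implementation.

```python
def find_duplicates(names):
    """Return names that appear more than once, in first-seen order."""
    seen, dups = set(), []
    for name in names:
        if name in seen and name not in dups:
            dups.append(name)
        seen.add(name)
    return dups


def dedupe_slot_list(names):
    """Option 1: silently keep the first occurrence of each name."""
    return list(dict.fromkeys(names))


def check_slot_list(names):
    """Option 2: reject the setting outright when any slot name repeats."""
    dups = find_duplicates(names)
    if dups:
        raise ValueError("duplicate slot name(s): " + ", ".join(dups))
    return names


configured = ["s1", "s1", "s1", "s2"]
print(dedupe_slot_list(configured))  # effective list differs from the configured one

try:
    check_slot_list(configured)
except ValueError as err:
    print("rejected:", err)
```

Under option 1 the effective list diverges from the string the user configured, which is the disconnect Ashutosh objects to. Under option 2 the two can never differ.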

Pre-existing Bug Claim in synchronous_standby_names

Ashutosh notes that synchronous_standby_names currently accepts FIRST 2 (s1, s1, s1, s1) without error. The runtime behavior happens to be correct because it waits on distinct walsender processes, but he argues that this configuration should have been rejected at parse time, which could in turn cause a startup failure. This suggests he may propose a companion fix, or at least wants to establish the precedent that duplicate rejection is the correct approach for both GUCs.

Status

Still waiting on a new patch version. The design question about error vs. silent dedup needs resolution (likely from Shveta or Amit Kapila) before the next version is posted.