PGConf.dev CSN unconference session: notes and follow-up discussion takeaways

First seen: 2026-05-27 08:56:57+00:00 · Messages: 1 · Participants: 1

Latest Update

2026-06-01 · claude-opus-4-6

PGConf.dev 2026 CSN Unconference Session: Technical Analysis

The Core Problem: Commit Sequence Numbers and the Visibility/Durability Tension

This thread captures the current thinking on one of PostgreSQL's most architecturally significant in-progress features: Commit Sequence Numbers (CSN). CSN is a long-discussed mechanism to replace or augment PostgreSQL's current visibility system, in which each snapshot carries an array of in-progress transaction IDs and commit status is looked up in pg_xact (the CLOG), with a monotonically increasing sequence number assigned at commit time. The goals are to make taking a snapshot cheaper (O(1) rather than O(number of active transactions)), to enable more efficient replication visibility semantics, and potentially to fix consistency anomalies between primary and replica.
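To make the cost difference concrete, here is a minimal sketch contrasting the two snapshot shapes. The class and field names (XidSnapshot, CsnSnapshot, in_progress, commit_csn) are illustrative assumptions, not PostgreSQL's actual structures, and the `committed` set merely stands in for a CLOG lookup.

```python
from bisect import bisect_left
from dataclasses import dataclass


@dataclass(frozen=True)
class XidSnapshot:
    """Array-style snapshot: size grows with the number of running transactions."""
    xmin: int            # every xid below this had finished at snapshot time
    xmax: int            # every xid at or above this had not started yet
    in_progress: tuple   # sorted xids that were running at snapshot time

    def sees(self, xid: int, committed: set) -> bool:
        if xid >= self.xmax:
            return False
        if xid >= self.xmin:
            # Binary search of the copied array of in-progress xids.
            i = bisect_left(self.in_progress, xid)
            if i < len(self.in_progress) and self.in_progress[i] == xid:
                return False
        return xid in committed  # stand-in for the CLOG lookup


@dataclass(frozen=True)
class CsnSnapshot:
    """CSN-style snapshot: a single number, regardless of concurrency."""
    csn: int

    def sees(self, xid: int, commit_csn: dict) -> bool:
        # Visible only if the transaction committed at or before the snapshot.
        c = commit_csn.get(xid)
        return c is not None and c <= self.csn
```

In this sketch the array-style snapshot has to copy every in-progress xid when it is taken, while the CSN-style snapshot copies one integer. The price is that each visibility check needs a mapping from xid to commit CSN.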

The fundamental tension identified in this session, and articulated as the primary source of complications, is the disconnect between visibility semantics and durability semantics in PostgreSQL. The synchronous_commit GUC entangles the two in surprising ways: its setting (off, local, remote_write, on, or remote_apply) determines how much durability a commit waits for before it is acknowledged.

The problem is that CSN must linearize all commits into a single total order. But the current system allows transactions with wildly different durability latencies to all become visible immediately. An async commit that takes microseconds and a remote_apply commit that takes milliseconds (or more) currently coexist in the visibility order without issue because PostgreSQL uses per-transaction visibility (CLOG lookup). Once you impose a global sequence number, you must decide: does the CSN represent the moment of visibility, or the moment of durability?

The Long Fork Problem

The thread references the Long Fork consistency phenomenon, documented in Jepsen's analysis of Amazon RDS for PostgreSQL 17.4. Long Fork occurs when a primary and replica expose different visibility orderings for committed transactions. In PostgreSQL's current architecture, this happens because:

  1. On the primary, transactions become visible immediately upon commit (CLOG bit flip).
  2. On a replica, transactions become visible when their commit WAL record is replayed.
  3. WAL replay order on replicas is strictly LSN-ordered, but primary visibility order is not strictly LSN-ordered (due to concurrent commits with different WAL flush behaviors).

This means two transactions T1 and T2 might be visible in order (T1, T2) on the primary but (T2, T1) on a replica, creating a consistency anomaly.
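The reordering can be illustrated with a small model. This is only a sketch under stated assumptions: `primary_visible_at` is a hypothetical logical timestamp for when a transaction became visible on the primary, and the LSN values are made up for the example.

```python
from typing import NamedTuple


class Commit(NamedTuple):
    name: str
    commit_lsn: int          # position of the commit record in WAL
    primary_visible_at: int  # hypothetical logical time of visibility on the primary


def primary_order(commits):
    """Order in which readers on the primary see the commits appear."""
    return [c.name for c in sorted(commits, key=lambda c: c.primary_visible_at)]


def replica_order(commits):
    """Order in which readers on a replica see them: strict LSN replay order."""
    return [c.name for c in sorted(commits, key=lambda c: c.commit_lsn)]


def prefixes(order):
    """Every set of commits some reader could observe under the given order."""
    return [frozenset(order[:i]) for i in range(len(order) + 1)]


def has_long_fork(commits):
    """True if a primary reader and a replica reader can see incompatible sets."""
    p = prefixes(primary_order(commits))
    r = prefixes(replica_order(commits))
    return any(not (a <= b or b <= a) for a in p for b in r)


# T1 becomes visible first on the primary, but its commit record lands
# later in WAL than T2's, so the replica replays T2 first.
commits = [
    Commit("T1", commit_lsn=200, primary_visible_at=1),
    Commit("T2", commit_lsn=100, primary_visible_at=2),
]
```

Here one reader on the primary can observe T1 without T2, while another on the replica observes T2 without T1. No single commit order explains both observations, which is the Long Fork shape.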

Proposed Solutions

Solution 1: Commit Record LSN as CSN on Replicas (Consensus)

The session reached consensus that on replicas, the LSN of the commit WAL record is a natural CSN. Replay is already strictly LSN-ordered, so this preserves existing replica visibility semantics while enabling O(1) snapshot comparisons. This is architecturally clean and non-controversial.
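A minimal sketch of this idea follows, assuming a toy replay loop. The Replica class, its methods, and the record shape are hypothetical and do not correspond to PostgreSQL's recovery code.

```python
class Replica:
    """Toy replica that uses the commit record's LSN as the CSN."""

    def __init__(self):
        self.replayed_lsn = 0
        self.commit_lsn = {}  # xid -> LSN of its commit record

    def replay(self, lsn, record):
        # Replay is strictly LSN-ordered, so LSNs must only move forward.
        if lsn <= self.replayed_lsn:
            raise ValueError("WAL must be replayed in increasing LSN order")
        kind, xid = record
        if kind == "COMMIT":
            self.commit_lsn[xid] = lsn
        self.replayed_lsn = lsn

    def take_snapshot(self):
        # O(1): the snapshot is just the last replayed LSN.
        return self.replayed_lsn

    def is_visible(self, xid, snapshot_lsn):
        lsn = self.commit_lsn.get(xid)
        return lsn is not None and lsn <= snapshot_lsn
```

For example, a snapshot taken after replaying a commit at LSN 100 sees that transaction but not one whose commit record is replayed later at LSN 200.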

Solution 2: LSN-based CSN on Primaries (Contentious)

Using the commit record's LSN as the CSN on the primary is more contentious because it interacts badly with synchronous_commit.

Solution 3: "Commit Visible" WAL Record (Novel Proposal)

A suggested innovation is to log a separate 'commit visible' WAL record that is written only after a transaction's COMMIT record has met its durability requirement.

Tradeoff: This adds WAL volume and latency to the visibility path. The suggestion in point 2b of the session notes (making async commits wait for durability of sync commits' CSN) would impose latency on async-commit sessions that read recently-modified data from sync-commit sessions.
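One way to picture the proposal is a toy WAL in which visibility is driven only by the second record. This is a sketch of one reading of the idea, not the proposed design: the Wal class, the record kinds, and the rule that the replayer ignores COMMIT records for visibility purposes are all assumptions made for illustration.

```python
class Wal:
    """Toy append-only WAL; LSNs are simple increasing integers."""

    def __init__(self):
        self.records = []  # list of (lsn, kind, xid)
        self.next_lsn = 1

    def append(self, kind, xid):
        lsn = self.next_lsn
        self.next_lsn += 1
        self.records.append((lsn, kind, xid))
        return lsn


def commit_record_order(records):
    """Order of the COMMIT records themselves."""
    return [xid for _, kind, xid in records if kind == "COMMIT"]


def visibility_order(records):
    """Order in which a replayer would expose transactions if it acted
    only on COMMIT_VISIBLE records (assumption of this sketch)."""
    return [xid for _, kind, xid in records if kind == "COMMIT_VISIBLE"]


wal = Wal()
wal.append("COMMIT", "T_sync")           # must wait for a standby
wal.append("COMMIT", "T_async")          # has no durability wait
wal.append("COMMIT_VISIBLE", "T_async")  # requirement met immediately
wal.append("COMMIT_VISIBLE", "T_sync")   # logged once the standby confirms
```

Because the visibility order is itself in the WAL, a primary and a replica that both follow the COMMIT_VISIBLE records would expose T_async before T_sync, even though the COMMIT records appear in the opposite order.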

Solution 4: In-Memory Counter CSN (Pragmatic Fallback)

An alternative approach (point 6a of the session notes) uses a local in-memory counter to generate CSNs only at the moment of visibility, without logging them to WAL.
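A minimal sketch of such a counter, assuming a single process and a plain lock. The CsnCounter class and its method names are hypothetical, and a real implementation would live in shared memory with far more careful synchronization.

```python
import threading


class CsnCounter:
    """Toy in-memory CSN generator; nothing here is written to WAL."""

    def __init__(self):
        self._lock = threading.Lock()
        self._next = 1
        self.commit_csn = {}  # xid -> CSN assigned when it became visible

    def make_visible(self, xid):
        # The CSN is assigned at the moment of visibility, not at WAL insert.
        with self._lock:
            csn = self._next
            self._next += 1
            self.commit_csn[xid] = csn
            return csn

    def take_snapshot(self):
        # O(1): the snapshot is the highest CSN handed out so far.
        with self._lock:
            return self._next - 1

    def is_visible(self, xid, snapshot_csn):
        csn = self.commit_csn.get(xid)
        return csn is not None and csn <= snapshot_csn
```

Since the counter values exist only in memory, this sketch gives cheap snapshots on one node but gives a replica nothing to replay, so it does not by itself align primary and replica visibility order.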

Key Architectural Insight: The Two-Phase Nature of Commit

The deepest insight from this session is that PostgreSQL's commit is already implicitly two-phase in the presence of synchronous_commit:

  1. Phase 1 (Commit): Transaction writes commit record to WAL, flips CLOG bit → immediately visible.
  2. Phase 2 (Durability confirmation): WAL is flushed locally and/or confirmed by standbys → client is notified of success.

Currently, visibility happens at Phase 1 regardless of the durability level. The "commit visible" record proposal would make this two-phase nature explicit in WAL, enabling replicas to reproduce the primary's visibility semantics faithfully.

Implications for the PostgreSQL Architecture

  1. Snapshot scalability: CSN eliminates the need to copy arrays of active transaction IDs into snapshots, which is critical for high-connection-count workloads.
  2. Replication consistency: LSN-based or "commit visible"-based CSN could enable replicas to guarantee the same visibility ordering as the primary, eliminating Long Fork.
  3. WAL format changes: Any WAL-logged CSN approach requires WAL format changes, which raise major cross-version compatibility concerns.
  4. pg_xact interaction: CSN doesn't necessarily eliminate CLOG — it may run in parallel for backward compatibility, or CLOG could be derived from CSN during recovery.

Assessment

The community appears to be converging on a layered approach: use commit-record LSN as CSN on replicas (uncontroversial), while the primary CSN mechanism remains an open design question with multiple viable approaches of increasing ambition (in-memory counter → commit LSN → "commit visible" record). The thread demonstrates that the community is taking the Jepsen Long Fork findings seriously but has not committed to solving them as part of the initial CSN implementation.