Improve conflict detection when replication origins are reused

First seen: 2026-05-14 03:05:06+00:00 · Messages: 1 · Participants: 1

Latest Update

2026-05-14 · claude-opus-4-6


The Core Problem: Stale Origin IDs Causing Silent Conflict Misses

This thread addresses a subtle but architecturally significant bug in PostgreSQL's logical replication conflict detection mechanism. The issue stems from the intersection of two subsystems: replication origins (the identity tracking mechanism for replicated tuples) and commit_ts (the commit timestamp SLRU that stores per-transaction origin metadata).

How Conflict Detection Works Today

In logical replication, each subscription is assigned a ReplOriginId (a small integer, uint16) that is stored alongside the commit timestamp in the commit_ts SLRU. When the apply worker applies an incoming change to a tuple that already exists locally (e.g., an UPDATE), it looks up the origin recorded for the transaction that last modified that tuple and checks whether tuple_origin == current_origin. If they match, the system assumes the tuple was last written by this same subscription, treats it as "our own" data, and skips raising an update_origin_differ conflict. If they differ, it reports update_origin_differ.
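The check described above can be modeled in a few lines. This is a toy sketch, not PostgreSQL source: TupleMeta, INVALID_ORIGIN, and origin_differs are hypothetical names chosen for illustration, and the real apply worker does considerably more than a single comparison.

```python
from dataclasses import dataclass

INVALID_ORIGIN = 0  # assumption: 0 stands in for "no origin / local write"


@dataclass
class TupleMeta:
    """Origin and commit time recorded for the transaction that last wrote a row."""
    origin_id: int
    commit_ts: int  # microseconds since an arbitrary epoch


def origin_differs(tuple_meta: TupleMeta, current_origin: int) -> bool:
    """Return True when an update_origin_differ conflict should be reported."""
    return tuple_meta.origin_id != current_origin


# A row last written by origin 1 and updated again by origin 1: no conflict.
own_row = TupleMeta(origin_id=1, commit_ts=1_000)
print(origin_differs(own_row, current_origin=1))    # prints False

# A row last written locally and now updated by origin 1: conflict.
local_row = TupleMeta(origin_id=INVALID_ORIGIN, commit_ts=1_000)
print(origin_differs(local_row, current_origin=1))  # prints True
```

Note that the comparison uses only the integer ID; nothing in it identifies which subscription held that ID when the row was written.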

The Reuse Problem

ReplOriginId values are allocated from a limited namespace and are reused after a replication origin is dropped (via DROP SUBSCRIPTION). The dangerous sequence is:

  1. Subscription sub1 gets roident = 1, replicates rows into table t1
  2. sub1 is dropped — origin ID 1 is freed
  3. New subscription sub2 is created, gets roident = 1 (reused)
  4. Updates arrive for rows previously written by sub1
  5. Conflict detection sees tuple_origin (1) == current_origin (1), so no conflict is raised

This is a false negative: the system believes the row belongs to the current subscription when it actually belongs to a completely different (now-defunct) subscription. This becomes genuinely dangerous when sub2 connects to a different publisher than sub1 did — real data conflicts are silently swallowed.
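The five-step sequence can be replayed with a small simulation. It is a sketch under stated assumptions: OriginAllocator is a hypothetical stand-in that hands out the lowest free ID, and row_origin is a plain dictionary standing in for the origin data kept in commit_ts.

```python
class OriginAllocator:
    """Toy allocator: hands out the lowest free ID, so dropped IDs get reused."""

    def __init__(self):
        self.in_use = {}  # origin id -> subscription name

    def create(self, name):
        roident = 1
        while roident in self.in_use:
            roident += 1
        self.in_use[roident] = name
        return roident

    def drop(self, name):
        # Copy the items first because the dict shrinks while we scan it.
        for roident, owner in list(self.in_use.items()):
            if owner == name:
                del self.in_use[roident]


def replay_reuse_scenario():
    """Replay steps 1-5 and report both IDs plus whether a conflict was raised."""
    origins = OriginAllocator()
    row_origin = {}  # row key -> origin id of the last writer (toy commit_ts)

    sub1 = origins.create("sub1")   # step 1: sub1 gets an origin ID
    row_origin["t1:pk=1"] = sub1    #         and replicates a row into t1
    origins.drop("sub1")            # step 2: the ID is freed
    sub2 = origins.create("sub2")   # step 3: sub2 receives the same ID
    # Steps 4-5: an update arrives via sub2 for the row that sub1 wrote.
    conflict_raised = row_origin["t1:pk=1"] != sub2
    return sub1, sub2, conflict_raised


print(replay_reuse_scenario())  # prints (1, 1, False)
```

The row still carries the ID that sub1 used, so the equality check passes and the update is applied without any conflict being reported.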

Why This Matters Architecturally

This bug exposes a fundamental design tension: PostgreSQL uses a compact integer ID for origin tracking (good for SLRU storage efficiency) but provides no mechanism to distinguish between different temporal uses of the same ID. The commit_ts SLRU retains stale origin data indefinitely after a subscription is dropped, creating a ghost reference problem. This is particularly concerning as multi-master and bidirectional replication topologies become more common — silent conflict misses can lead to data divergence that's extremely difficult to detect and repair after the fact.

The thread also references related issues with tablesync worker origins ([1]), suggesting this is part of a broader class of problems around origin lifecycle management.

Proposed Solutions

Approach 1: Scrub Stale Origins from commit_ts SLRU on DROP SUBSCRIPTION

Mechanism: When a subscription is dropped and its replication origin is freed, scan the entire commit_ts SLRU and replace all occurrences of the old origin ID with InvalidRepOriginId (0). This ensures that any future subscription reusing the same ID will see origin 0 on old tuples, which will correctly differ from the new subscription's origin, triggering conflict detection.
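A minimal sketch of the scrub, assuming the commit_ts contents can be modeled as a dictionary from transaction ID to a (commit timestamp, origin ID) pair. scrub_origin and INVALID_ORIGIN are hypothetical names; the real SLRU is page-based and would need locking and WAL considerations that are not modeled here.

```python
INVALID_ORIGIN = 0  # assumption: 0 marks "no origin", as in the text above


def scrub_origin(commit_ts_entries, dropped_origin):
    """Rewrite every entry that carries dropped_origin to INVALID_ORIGIN.

    Returns the number of entries rewritten. Every entry is visited, so the
    cost grows with the total number of entries, not with the number that
    belong to the dropped origin.
    """
    rewritten = 0
    for xid, (commit_ts, origin) in list(commit_ts_entries.items()):
        if origin == dropped_origin:
            commit_ts_entries[xid] = (commit_ts, INVALID_ORIGIN)
            rewritten += 1
    return rewritten


entries = {100: (1_000, 1), 101: (1_001, 2), 102: (1_002, 1)}
print(scrub_origin(entries, dropped_origin=1))  # prints 2
print(entries[100], entries[101])               # prints (1000, 0) (1001, 2)
```

After the scrub, a new subscription that reuses ID 1 sees origin 0 on the old rows, so the ordinary inequality check reports a conflict.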

Technical Implications:

Approach 2: Store Origin Creation Timestamp (Preferred)

Mechanism: Add a creation timestamp to each replication origin's metadata. During conflict detection, when tuple_origin == current_origin, perform an additional check: if the tuple's commit timestamp is ≤ the origin's creation time, it must have been written by a previous incarnation of this origin ID, so raise a conflict.
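The extra check can be sketched as follows. The Origin class and its created_at field are hypothetical illustrations of the proposed metadata, not an existing catalog column, and timestamps are plain integers in microseconds.

```python
from dataclasses import dataclass


@dataclass
class Origin:
    roident: int
    created_at: int  # hypothetical new field: when this origin was created


def origin_differs(tuple_origin: int, tuple_commit_ts: int, current: Origin) -> bool:
    """Return True when a conflict should be raised under Approach 2."""
    if tuple_origin != current.roident:
        return True  # written under a different origin ID: existing behavior
    # Same ID: a commit at or before the creation time cannot have come from
    # this incarnation of the origin, so treat it as a foreign write.
    return tuple_commit_ts <= current.created_at


sub2 = Origin(roident=1, created_at=2_000)
print(origin_differs(1, 1_500, sub2))  # prints True: written by the old incarnation
print(origin_differs(1, 2_500, sub2))  # prints False: written by sub2 itself
```

Using <= rather than < means a commit that lands in the same microsecond as the origin's creation is reported as a conflict, which errs on the side of over-detection.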

Technical Implications:

Analysis of Design Tradeoffs

The two approaches represent a classic systems design tradeoff between eager cleanup (Approach 1: fix the data when the origin is freed) and lazy detection (Approach 2: detect the problem at apply time using metadata).

Approach 2 is clearly superior for several reasons:

  1. Crash safety is inherent rather than requiring additional engineering
  2. No pathological performance cases (the SLRU scan in Approach 1 has unbounded cost)
  3. The additional metadata is small (one timestamp per origin, and the origin namespace is small)
  4. The false-positive edge case is harmless in practice (microsecond-level collision is rare, and conflict over-detection is safe)

The main risk with Approach 2 is the catalog/schema change and upgrade handling, but PostgreSQL regularly handles such changes across major versions.

Relationship to Broader Issues

The referenced threads [1] and [2] discuss related problems with tablesync origins — the temporary replication origins created during initial table synchronization in logical replication. These origins can similarly cause stale-reference problems. The origin reuse fix proposed here may synergize with solutions for the tablesync issue, as both stem from the same fundamental problem: origin IDs lack temporal disambiguation.

This also touches on the broader question of whether ReplOriginId should be a richer data type or whether the origin lifecycle needs more formal state management (e.g., tombstoning rather than immediate ID reuse).