Improve conflict detection when replication origins are reused

First seen: 2026-05-14 03:05:06+00:00 · Messages: 8 · Participants: 2

Latest Update

2026-05-22 · claude-opus-4-6

Incremental Update: Brief Confirmation of LSN Infeasibility

Shveta Malik's latest message is a brief acknowledgment confirming the LSN-based alternative is not viable. She corroborates Nisha's earlier finding by noting that while TransactionIdGetCommitLSN() exists, it does not return the exact commit LSN needed for precise comparison.

This message adds no new technical arguments, no new alternatives, and no patch updates. It simply closes the loop on the LSN discussion by confirming from the reviewer side that no existing infrastructure supports the LSN-based approach.

Status

The thread remains converged on Approach 2 (origin creation timestamp) as the only viable solution. No new patch version has been posted. The review is still pending.

History (2 prior analyses)

2026-05-20 · claude-opus-4-6

Incremental Update: Alternative Approaches Explored and Dismissed

Shveta's Agreement and New Alternative Ideas

Shveta Malik formally endorsed Approach 2 (origin creation timestamp) as the most practical solution, but also explored two alternative ideas before settling on that conclusion:

Alternative Idea 1: Sequential ID Exhaustion Before Reuse

Concept: Modify replorigin_create to cycle through the full uint16 range (65K IDs) sequentially before reusing any ID, rather than eagerly reusing freed IDs.

Why dismissed: This is acknowledged to be only a probabilistic mitigation, not a fix. On a busy system with frequent subscription creation and teardown, the allocator would eventually cycle through the whole 2-byte ID space and hand out a previously dropped ID, at which point the original bug returns. A correctness fix cannot rest on a statistical argument.
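
The limitation can be illustrated with a toy allocator. This is a sketch in Python, not PostgreSQL code: the class, the helper, and the ID bounds (1 to 65534, mirroring a 2-byte ID space with the two extreme values reserved) are assumptions made for illustration. It shows that sequential allocation only delays reuse of a dropped ID by a fixed number of create/drop cycles.

```python
# Toy model, not PostgreSQL code. The bounds are assumptions that mirror a
# 2-byte origin ID space with the two extreme values reserved.
FIRST_ID = 1
LAST_ID = 65534


class SequentialAllocator:
    """Hands out IDs in increasing order and wraps only after LAST_ID."""

    def __init__(self):
        self.in_use = set()
        self.next_id = FIRST_ID

    def create(self):
        span = LAST_ID - FIRST_ID + 1
        for _ in range(span):
            candidate = self.next_id
            # Advance the cursor, wrapping to FIRST_ID at the end of the range.
            self.next_id = FIRST_ID if candidate == LAST_ID else candidate + 1
            if candidate not in self.in_use:
                self.in_use.add(candidate)
                return candidate
        raise RuntimeError("no free origin IDs")

    def drop(self, origin_id):
        self.in_use.discard(origin_id)


def creations_until_reuse(alloc, watched_id):
    """Create and drop origins until watched_id is handed out again."""
    count = 0
    while True:
        new_id = alloc.create()
        count += 1
        if new_id == watched_id:
            return count
        alloc.drop(new_id)


alloc = SequentialAllocator()
first = alloc.create()   # ID 1
alloc.drop(first)        # freed, but not handed out again immediately
reuse_after = creations_until_reuse(alloc, first)
print(reuse_after)       # 65534
```

In this model the dropped ID comes back after 65534 further creations. That is a large number for a quiet system but a reachable one for a workload that creates and drops subscriptions continuously, which is why the idea was treated as a mitigation rather than a guarantee.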

Alternative Idea 2: LSN-Based Comparison Instead of Timestamp

Concept: Store an origin_creation_lsn and compare it against the tuple's commit LSN instead of using timestamps. This would eliminate the microsecond-granularity false-positive edge case entirely, since LSN ordering is strictly monotonic.

Why dismissed: The commit LSN for a specific tuple is not available during conflict detection without extending the commit_ts SLRU structure to include an 8-byte LSN field per entry. This would significantly bloat the SLRU (currently stores timestamp + 2-byte origin per transaction). The storage overhead was deemed unjustifiable for this use case.
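
A back-of-the-envelope calculation makes the overhead concrete. The field widths below are assumptions: the 2-byte origin and 8-byte LSN come from the description above, the 8-byte timestamp is assumed, and the transaction count is a hypothetical figure chosen only to show scale.

```python
# Rough sizing only. Field widths are assumptions, see the lead-in.
TIMESTAMP_BYTES = 8
ORIGIN_ID_BYTES = 2
LSN_BYTES = 8

current_entry = TIMESTAMP_BYTES + ORIGIN_ID_BYTES   # bytes per transaction today
extended_entry = current_entry + LSN_BYTES          # bytes with an added LSN field
growth = extended_entry / current_entry


def slru_bytes(transactions, entry_bytes):
    """Total bytes needed to keep one entry per transaction."""
    return transactions * entry_bytes


tracked = 100_000_000  # hypothetical number of tracked transactions
extra_bytes = slru_bytes(tracked, extended_entry) - slru_bytes(tracked, current_entry)

print(current_entry, extended_entry, growth)  # 10 18 1.8
print(extra_bytes)                            # 800000000
```

Under these assumptions each entry grows from 10 to 18 bytes, an 80% increase paid for every transaction whether or not it is ever involved in conflict detection.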

Nisha's Technical Confirmation on LSN Infeasibility

Nisha confirmed that there is no existing mechanism to extract a tuple's commit LSN during the apply path without extending commit_ts. She specifically investigated pd_lsn (the page-level LSN returned by PageGetLSN(page)) but correctly noted that it tracks the most recent WAL record that modified the page, not the commit of any particular tuple, so it cannot be correlated with a specific tuple's xmin transaction.

Significance

This exchange solidifies Approach 2 (origin creation timestamp) as the only viable path forward. The alternatives have been systematically eliminated:

  • Approach 1 (SLRU scrubbing): crash-unsafe, unbounded performance cost
  • Sequential ID exhaustion: probabilistic, not a correctness guarantee
  • LSN-based comparison: requires unacceptable SLRU structure expansion

The false-positive edge case in Approach 2 (timestamp collision at origin creation instant) remains the only known tradeoff, and both reviewers agree it is acceptable.
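
The check implied by Approach 2 can be sketched as follows. This is an illustration, not the patch: the type names, field names, and microsecond integer timestamps are hypothetical, and the sketch assumes that a tie between the tuple's commit timestamp and the origin's creation timestamp is attributed to the earlier incarnation of the ID, which is the choice that would produce the false positive described above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TupleMeta:
    origin_id: int       # origin ID recorded for the tuple
    commit_ts_us: int    # commit timestamp, microseconds (hypothetical unit)


@dataclass(frozen=True)
class Origin:
    origin_id: int       # ID of the origin currently applying changes
    created_ts_us: int   # when this incarnation of the ID was created


def origin_differs(tup, current):
    """Return True if the tuple should be treated as written by another origin."""
    if tup.origin_id != current.origin_id:
        return True
    # Same numeric ID: the tuple belongs to the current incarnation only if
    # it was committed strictly after that incarnation was created.
    # Assumption: a tie is attributed to the earlier incarnation.
    return tup.commit_ts_us <= current.created_ts_us


current = Origin(origin_id=5, created_ts_us=2_000)

stale = origin_differs(TupleMeta(5, 1_000), current)   # written before the ID was reused
fresh = origin_differs(TupleMeta(5, 3_000), current)   # written by the current origin
tie = origin_differs(TupleMeta(5, 2_000), current)     # same microsecond as creation

print(stale, fresh, tie)  # True False True
```

The third case is the accepted tradeoff: a tuple committed in the same microsecond in which the origin was created is reported as a differing origin even if the current origin wrote it.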


2026-05-18 · claude-opus-4-6

Incremental Update: New Scenario Discussion and Scope Clarification

New Technical Discussion: ALTER SUBSCRIPTION Publisher-Switching Scenario

Shveta Malik raised a new scenario that extends the problem beyond origin ID reuse to origin ID semantic ambiguity when a subscription changes its upstream publisher:

  1. sub1 on server3 subscribes to pub1 on server1, gets roident = N
  2. Both server1 and server2 independently insert identical rows (e.g., rows with key 10, 20, 30)
  3. sub1 replicates rows from server1 → stored with origin N on server3
  4. sub1 is altered to connect to server2 (via manual slot creation)
  5. An UPDATE arrives from server2 for a row that was originally inserted by server1
  6. Conflict detection sees tuple_origin (N) == current_origin (N) → no conflict raised

This is a fundamentally different failure mode: the origin was never dropped and its ID was never reused. It is the same subscription with the same origin ID, but the upstream publisher changed. The row was written by server1, while the subscription now streams from server2.
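
The scenario can be reproduced with a small simulation of an origin-equality check. This is a toy model, not PostgreSQL code: the dictionary standing in for server3's table, the helper functions, the concrete value chosen for N, and the use of 0 for locally written rows are all assumptions for illustration.

```python
LOCAL_ORIGIN = 0  # assumption: ID recorded for rows written locally
ORIGIN_N = 7      # hypothetical roident assigned to sub1

server3 = {}      # key -> (value, origin ID recorded for the row)


def apply_insert(key, value, origin_id):
    server3[key] = (value, origin_id)


def apply_update(key, value, current_origin):
    """Apply an update; return True if the origin check would flag a conflict."""
    _, tuple_origin = server3[key]
    conflict = tuple_origin != current_origin
    server3[key] = (value, current_origin)
    return conflict


# Step 3: sub1 streams from server1; rows are stored with origin N.
for key in (10, 20, 30):
    apply_insert(key, "from server1", ORIGIN_N)

# Step 4: sub1 is re-pointed at server2; the origin ID stays N.
# Step 5: an UPDATE from server2 arrives for a row server1 inserted.
switched_conflict = apply_update(10, "from server2", ORIGIN_N)

# Contrast: a locally written row updated through sub1 is flagged.
apply_insert(40, "local", LOCAL_ORIGIN)
local_conflict = apply_update(40, "from server2", ORIGIN_N)

print(switched_conflict, local_conflict)  # False True
```

Because the check compares only origin IDs, and the ID belongs to the subscription rather than to the publisher, the update from server2 passes silently.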

Nisha's Response: Intentional Design Boundary

Nisha argued this scenario is expected behavior under the current model, not a bug, because:

  1. Origin identity tracks subscription streams, not publisher servers — the current design deliberately ties origin metadata to "which subscription applied this row," not "which server originally generated it."

  2. The inverse problem also exists: two different subscriptions replicating the same table from the same publisher will raise update_origin_differs conflicts even though the data has a common origin, which demonstrates that the abstraction is consistently subscription-level.

  3. Publisher-level tracking would require architectural changes — specifically, global node identity tracking akin to BDR/pglogical, which is a fundamentally different replication model.

This exchange is significant because it explicitly scopes the patch: the proposed fix (origin creation timestamp) addresses only the reuse of an origin ID after DROP, not the broader semantic gap in which a single subscription's meaning changes when it is re-pointed at a different publisher.

Review Status

Shveta indicated she will prioritize the review, suggesting the patch will receive formal review attention soon.