pg_rewind does not rewind diverging timelines

First seen: 2026-04-30 08:19:21+00:00 · Messages: 15 · Participants: 4

Latest Update

2026-06-04 · claude-opus-4-6

New Developments Since Last Analysis

New Reviewer Challenge: History-Based Matching as Alternative

Kyotaro Horiguchi raised a design-level question: could the existing timeline history information (TLI + switchpoint LSN) be sufficient to distinguish divergent timelines without introducing UUIDs? His concern was primarily about operational convenience — UUIDs are longer and less human-friendly — rather than runtime overhead.

Mats Kindahl's Rebuttal

Mats explained why history-file content alone is insufficient: two servers can end up with the same TLI and the same switchpoint LSN yet represent different promotions with different data. This happens when both servers start from the same timeline and promote at the same LSN, then write different data of the same length afterward. While unlikely in practice, the scenario is real, and the UUID is the only way to rule it out completely.
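A minimal sketch of the collision Mats describes, written in Python for illustration only. The HistoryEntry class, its field names, and the LSN value are hypothetical and do not come from the patch; the point is that a key built from TLI and switchpoint LSN cannot separate two independent promotions, while a per-promotion UUID can.

```python
import uuid
from dataclasses import dataclass


@dataclass(frozen=True)
class HistoryEntry:
    """Hypothetical model of one promotion recorded in a history file."""
    tli: int             # timeline ID created by the promotion
    switch_lsn: int      # LSN at which the parent timeline was left
    tl_uuid: uuid.UUID   # per-promotion identifier (assumed field name)


def same_by_history(a: HistoryEntry, b: HistoryEntry) -> bool:
    # History-only matching: TLI plus switchpoint LSN.
    return (a.tli, a.switch_lsn) == (b.tli, b.switch_lsn)


def same_by_uuid(a: HistoryEntry, b: HistoryEntry) -> bool:
    # UUID-aware matching: the history key must match and so must the UUID.
    return same_by_history(a, b) and a.tl_uuid == b.tl_uuid


# Two servers promote independently at the same LSN on the same parent timeline.
# Fixed UUID values keep the example deterministic.
node_a = HistoryEntry(tli=2, switch_lsn=0x3000060, tl_uuid=uuid.UUID(int=1))
node_b = HistoryEntry(tli=2, switch_lsn=0x3000060, tl_uuid=uuid.UUID(int=2))

print(same_by_history(node_a, node_b))  # history alone cannot tell them apart
print(same_by_uuid(node_a, node_b))     # the UUID distinguishes the promotions
```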

Horiguchi Concedes

After Mats's explanation, Horiguchi accepted the UUID approach, acknowledging that if the goal is to eliminate the possibility entirely (not just make it statistically unlikely), an additional identifier is necessary. He noted that keeping the UUID only in history files addresses his implementation concerns.

Existing Reviewers Confirm LGTM

Both Japin Li and Surya Poondla confirmed that the v6 patch looks good to them. No further code changes were requested.

Assessment

The thread has now resolved its only open design challenge (UUID necessity vs. history-only approach). The patch has three reviewer sign-offs (Japin Li, Surya Poondla, and now implicit acceptance from Horiguchi). It remains in queue for committer pickup.

History (2 prior analyses)

2026-06-01 · claude-opus-4-6

pg_rewind Does Not Rewind Diverging Timelines — May 2026 Summary

Problem Statement

Mats Kindahl discovered (via TLA+/TLC model checking) that PostgreSQL's timeline identifier (TLI) system can silently produce two physically distinct histories sharing the same TLI number. This occurs when:

A primary (TLI 1) crashes, restarts, promotes to TLI 2, writes some WAL, then crashes again before propagating its new timeline.
A standby (still on TLI 1) is independently promoted — also to TLI 2 — since it never learned of the first promotion.

pg_rewind compares timelines purely by TLI number, so it concludes that the two nodes share TLI 2 history and rewinds only to the smaller of their two LSNs on that timeline. The result is silent data corruption: blocks dirtied by the first node's TLI-2 writes are never reverted.

The Fix: Timeline UUIDs

The patch adds a UUID (UUIDv7) to each timeline history file entry. Two timelines are considered identical only if both TLI and UUID match. pg_rewind's ancestor-finding algorithm now walks back past TLI entries that match by number but differ in UUID, using the parent timeline's fork LSN as the true divergence point.
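The difference between the two matching rules can be sketched with a toy model. This is not pg_rewind's code: the Entry class, the placeholder UUID strings, and the small integer LSNs are assumptions made for illustration. Both histories are walked from the oldest entry, and the divergence point is taken from the last entry that matches on both sides.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Entry:
    tli: int      # timeline ID
    tl_uuid: str  # per-timeline identifier (placeholder strings here)
    begin: int    # LSN where this timeline starts
    end: int      # LSN where it was left, or the node's current position


def divergence_lsn(source: List[Entry], target: List[Entry],
                   match: Callable[[Entry, Entry], bool]) -> int:
    """Return the LSN at which two timeline histories diverge."""
    last_common = None
    for s, t in zip(source, target):
        if not match(s, t):
            break
        last_common = (s, t)
    if last_common is None:
        raise ValueError("no common ancestor timeline")
    s, t = last_common
    # The histories agree only up to the earlier end of the last shared entry.
    return min(s.end, t.end)


def match_by_tli(s: Entry, t: Entry) -> bool:
    return s.tli == t.tli and s.begin == t.begin


def match_by_tli_and_uuid(s: Entry, t: Entry) -> bool:
    return match_by_tli(s, t) and s.tl_uuid == t.tl_uuid


# Both nodes left TLI 1 at LSN 100 and independently created a "TLI 2".
old_primary = [Entry(1, "u-1", 0, 100), Entry(2, "u-A", 100, 150)]
promoted_standby = [Entry(1, "u-1", 0, 100), Entry(2, "u-B", 100, 130)]

naive = divergence_lsn(promoted_standby, old_primary, match_by_tli)
fixed = divergence_lsn(promoted_standby, old_primary, match_by_tli_and_uuid)
print(naive)  # 130: TLI 2 is wrongly treated as shared history
print(fixed)  # 100: falls back to the fork point on TLI 1
```

With TLI-only matching, the old primary's writes between LSN 100 and 130 are treated as common history and left in place. The UUID-aware rule rewinds to LSN 100, where the two nodes actually parted.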

Patch Evolution During May

v2 (2026-05-01)

Simplified from v1 by removing the UUID from XLOG_END_OF_RECOVERY WAL records; UUID lives only in .history files.
Added regression test exercising three-promotion divergence depth.

v3 (2026-05-24, responding to Surya Poondla's review)

Fixed UUIDv7 epoch bug: Switched from GetCurrentTimestamp() (PostgreSQL epoch) to gettimeofday() (Unix epoch) for RFC 9562 conformance.
Removed dead code: All vestiges of the v1 EOR-record UUID approach cleaned up.
FATAL error level: UUID parsing errors in readTimeLineHistory() now raise FATAL consistently.
New semantic rule: Zero-UUID (legacy) vs non-zero-UUID (new) on the same TLI are treated as different timelines, tightening safety during mixed-version scenarios.
Test improvements: Deeper divergence scenarios, wal_keep_size instead of fragile restore_command, consolidated into single patch file.
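The epoch fix in v3 comes down to a constant offset. PostgreSQL timestamps count microseconds from 2000-01-01 UTC, whereas RFC 9562 requires the leading 48 bits of a UUIDv7 to hold Unix-epoch milliseconds. The following Python sketch is not the patch's C code, and the function names are hypothetical; it only illustrates the bit layout and the conversion.

```python
import os
import uuid
from typing import Optional

# Seconds between the Unix epoch (1970-01-01) and the PostgreSQL epoch (2000-01-01).
PG_EPOCH_OFFSET_S = 946_684_800


def pg_us_to_unix_ms(pg_us: int) -> int:
    """Convert PostgreSQL-epoch microseconds to Unix-epoch milliseconds."""
    return pg_us // 1000 + PG_EPOCH_OFFSET_S * 1000


def uuid7_from_unix_ms(unix_ms: int, rand: Optional[bytes] = None) -> uuid.UUID:
    """Build a UUIDv7 following the RFC 9562 field layout."""
    if rand is None:
        rand = os.urandom(10)               # 80 random bits
    rand_int = int.from_bytes(rand, "big")
    rand_a = rand_int >> 68                 # top 12 random bits
    rand_b = rand_int & ((1 << 62) - 1)     # low 62 random bits

    value = (unix_ms & ((1 << 48) - 1)) << 80  # 48-bit Unix timestamp in ms
    value |= 0x7 << 76                         # version 7
    value |= rand_a << 64
    value |= 0b10 << 62                        # RFC variant bits
    value |= rand_b
    return uuid.UUID(int=value)


# A PostgreSQL-epoch timestamp of 0 corresponds to 2000-01-01 in Unix time.
u = uuid7_from_unix_ms(pg_us_to_unix_ms(0), rand=bytes(10))
print(u.version, u.int >> 80)
```

Embedding the PostgreSQL-epoch value without the conversion would still produce a syntactically valid UUID, but its timestamp field would read roughly thirty years in the past.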

v4 (late May)

Refactored inline memcmp UUID comparison to use existing matchingTimelineUUID() helper (code deduplication).

v5 (late May, responding to Japin Li's review)

Fixed stale comment referencing nonexistent timelines_match().
Renamed TimelineHistoriesData → TimeLineHistoriesData for naming consistency.
Minor style and test comment fixes.

Open Questions for Committers

Scope beyond pg_rewind: The same TLI collision affects archive recovery, walreceiver TLI negotiation, and pg_basebackup. Does the UUID check need to be propagated into those paths?
Upgrade story: Old history files without UUIDs — the zero-UUID-as-different rule handles mixed-version scenarios but needs explicit documentation.
Replay-only detection: Since UUID is not in the WAL stream (v2 design), a standby cannot detect divergence from WAL alone — it must compare history files. Is this sufficient for all replication paths?
Test coverage: No test exercises streaming-replication-based reattach (only pg_rewind).
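The zero-UUID rule behind the upgrade-story question can be sketched as a plain comparison. The function name same_timeline is hypothetical and is not the patch's helper. Treating two zero UUIDs on the same TLI as a match (the legacy, TLI-only behavior) is an assumption of this sketch; the thread summary only states that zero versus non-zero counts as different.

```python
import uuid

ZERO_UUID = uuid.UUID(int=0)  # stands in for "history file had no UUID"


def same_timeline(tli_a: int, uuid_a: uuid.UUID,
                  tli_b: int, uuid_b: uuid.UUID) -> bool:
    """Hypothetical matching rule: TLI and UUID must both be equal.

    A legacy (zero) UUID never matches a non-zero one on the same TLI.
    Two zero UUIDs compare equal, which is an assumption of this sketch.
    """
    return tli_a == tli_b and uuid_a == uuid_b


new_uuid = uuid.UUID(int=0x0123456789ABCDEF)

print(same_timeline(2, ZERO_UUID, 2, new_uuid))  # legacy vs new: different
print(same_timeline(2, new_uuid, 2, new_uuid))   # same TLI and UUID: match
print(same_timeline(2, ZERO_UUID, 2, ZERO_UUID)) # both legacy: assumed match
```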

Status

The patch has converged through five versions with substantive review from two external reviewers. Core design is stable since v3. Remaining items are cosmetic. Awaiting committer review (likely Heikki Linnakangas, Michael Paquier, or Álvaro Herrera).

2026-06-01 · claude-opus-4-6

v6 Patch: Windows Platform Fix

The only new development is a Windows CI test failure reported by Japin Li and a corresponding fix from Mats Kindahl (v6 patch).

The Windows Path Issue

The test 005_same_timeline.pl fails on Windows because the restore_command path contains backslashes that get mangled. The CI output shows:

cp: cannot stat 'C:cirrus\build/testrun/pg_rewind/005_same_timelinedata/...'

In the quoted path, the backslash after C: and the one before data are missing, which indicates that backslashes in the Windows path are being consumed as escape characters before the command reaches cp. This is a classic Windows path-quoting issue in the Perl TAP tests.

The Fix

Mats added path-cleaning logic (modeled after existing code in Cluster.pm) to normalize paths for the restore_command on Windows. He notes that many paths in the test infrastructure are not platform-agnostic and suggests File::Spec as a broader solution, but considers that out of scope for this patch.
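The effect of the normalization can be illustrated with a Python stand-in; the actual fix is Perl code in the TAP test and is not reproduced here. In this sketch, shlex's POSIX-style parsing plays the role of whichever layer consumes the backslashes, and both the path and the helper name to_forward_slashes are hypothetical.

```python
import shlex


def to_forward_slashes(path: str) -> str:
    """Replace backslashes so that no later layer can treat them as escapes."""
    return path.replace("\\", "/")


raw = r"C:\build\archive"  # hypothetical Windows archive directory

# POSIX-style parsing drops each unquoted backslash and keeps the next character.
mangled = shlex.split(f"cp {raw}/%f %p")
safe = shlex.split(f"cp {to_forward_slashes(raw)}/%f %p")

print(mangled[1])  # C:buildarchive/%f
print(safe[1])     # C:/build/archive/%f
```

Forward slashes are accepted as path separators by Windows file APIs, which is why this kind of normalization is a common workaround in cross-platform test code.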

Assessment

This is a purely mechanical platform-compatibility fix with no algorithmic or design changes. The core patch (UUID in timeline history files, zero-UUID-vs-nonzero treated as different, FATAL on malformed UUID) remains unchanged from v5. The patch continues to converge on commit-readiness, now addressing cross-platform CI.