Bug in ALTER SUBSCRIPTION ... SERVER / ... CONNECTION with broken old server

First seen: 2026-04-22 01:51:36+00:00 · Messages: 16 · Participants: 6

Latest Update

2026-05-27 · claude-opus-4-6

Incremental Update

Amit Kapila's Brief Comment (2026-05-27)

Amit Kapila weighs in with a single short statement agreeing that the HINT message approach (from v6-0003's ReportSlotConnectionError()) is not the right solution for the reported problem. This aligns with Evan's earlier qualification that the HINT ("Use ALTER SUBSCRIPTION ... DISABLE...") could be misleading since fixing the user mapping directly is also a valid recovery path.

However, the message is terse and does not propose an alternative, nor does it clarify whether Amit objects to the subtransaction approach itself or only to the wording/presence of the HINT. No new patch version, no new technical argument, and no concrete alternative is offered.

Assessment

This is a minor acknowledgment-level message. It signals that a committer (Amit Kapila) has eyes on the thread and has a concern about the HINT approach, but does not yet constitute actionable review feedback beyond what Evan already raised. The thread awaits either a revised patch from Jeff addressing the HINT concern, or further elaboration from Amit on what he considers "the right solution."

History (1 prior analysis)
2026-05-18 · claude-opus-4-6

New Developments (Since Prior Analysis #3)

Jeff Davis posts v5 — a 4-patch series addressing Evan's tablesync-slot concern (2026-05-14)

Jeff restructured his approach into a more granular series:

  1. v5-0001: Fixes check_pub_rdt (retain dead tuples publisher-side check) for the foreign server case — a previously unidentified bug where ALTER SUBSCRIPTION ... SERVER/CONNECTION paths didn't properly handle the retaindeadtuples property check.

  2. v5-0002: Refactors AlterSubscription() so all option parsing happens first, enabling precise determination of which paths need conninfo and which don't. This is the core fix for the ALTER case.

  3. v5-0003: Moves connection string building in DropSubscription() after the early exit (slotname == NULL && rstates == NIL), directly addressing Evan's earlier concern about the predicate needing to mirror the function's own early-exit logic.

  4. v5-0004: Handles the remaining case — slotname == NULL but rstates != NIL — using a subtransaction to catch connection errors during DROP SUBSCRIPTION. If the connection to the publisher fails (e.g., broken user mapping), the DROP still succeeds but tablesync slots may be leaked. Errors are routed through ReportSlotConnectionError() which adds a HINT about disabling the subscription and disassociating the slot.

Evan catches an Assert contradiction in v5 (2026-05-15)

Evan identifies that v5-0001 sets check_pub_rdt = sub->retaindeadtuples for both ALTER_SUBSCRIPTION_SERVER and ALTER_SUBSCRIPTION_CONNECTION paths, while v5-0002 sets conninfo_needed = false for those same paths. But later code asserts Assert(conninfo_needed) when check_pub_rdt is true. This means if a subscription has retaindeadtuples enabled, ALTER ... SERVER or ALTER ... CONNECTION would fire the Assert — a clear bug in the patch series' internal consistency.

Jeff posts v6 fixing the Assert issue (2026-05-15)

Key changes:

  • Assert changed to Assert(new_conninfo || orig_conninfo_needed) — acknowledging that when switching servers/connections, the new conninfo (not the old) should be used for check_pub_rdt validation.
  • Added missing handling for SUBOPT_ORIGIN which could also set check_pub_rdt.
  • Adopted a more conservative approach: orig_conninfo_needed=false only for three specific cases (ALTER ... SERVER, ALTER ... CONNECTION, ALTER ... SET (slot_name=NONE)), rather than trying to enumerate all settings that might need conninfo.

Evan approves v6 approach, provides minor feedback (2026-05-18)

  • Endorses the subtransaction approach in 0003: Agrees it's not over-engineered given the need to catch connection errors during DROP while still allowing the DROP to succeed.
  • Qualifies the HINT message: Notes that the HINT ("Use ALTER SUBSCRIPTION ... DISABLE...") is useful but potentially misleading — the error already says "user mapping not found," so fixing the user mapping is also a valid (and possibly more direct) solution.
  • Suggests a helper function: Points out that construct_subserver_conninfo() now duplicates the foreign server lookup + ACL check + ForeignServerConnectionString() pattern that exists in GetSubscription(). Proposes extracting this into a shared helper to reduce code duplication.

Open design question from Jeff

Jeff explicitly asks reviewers whether v6-0003's subtransaction approach is over-engineered, whether the subtransaction should live at a lower level, or whether there's an alternative. Evan's response validates the current placement (inside construct_subserver_conninfo()) as appropriate because the error-capture behavior is specific to the DROP path.