Fix Race During Concurrent Logical Decoding Activation
Background: Dynamic Logical Decoding Toggle
This thread addresses a race condition in a newer PostgreSQL feature that allows logical decoding to be dynamically enabled/disabled based on the presence of logical replication slots — the "Toggle logical decoding dynamically based on logical slot presence" feature. This feature allows effective_wal_level to be promoted from replica to logical at runtime when a logical slot is created, and demoted back when the last logical slot is dropped, without requiring a server restart.
The activation process (EnableLogicalDecoding()) involves:
- Setting shared memory flags (
xlog_logical_info,logical_decoding_enabled) - Issuing a WAL barrier to ensure all backends start including logical decoding information in WAL records
- Writing a logical-decoding status WAL record
- Emitting process barriers so all backends pick up the new state
The deactivation path (DisableLogicalDecoding()) reverses this process, while abort_logical_decoding_activation() handles cleanup when activation is interrupted (e.g., by query cancellation).
The Race Condition
The bug manifests when two sessions concurrently create logical replication slots when no slots previously existed (i.e., effective_wal_level is replica):
- Session 1 begins creating a logical slot and enters
EnableLogicalDecoding(). It setsxlog_logical_info = truebut has not yet setlogical_decoding_enabled = true(blocked at injection point mid-activation). - Session 2 creates another logical slot. Since Session 1 hasn't completed, Session 2 also runs
EnableLogicalDecoding(), which completes successfully, settinglogical_decoding_enabled = true. - Session 1 is cancelled. It enters
abort_logical_decoding_activation(), which contains the assertion:
This assertion fails because Session 2 already setAssert(!LogicalDecodingCtl->logical_decoding_enabled);logical_decoding_enabled = true.
The fundamental problem is that abort_logical_decoding_activation() assumes it is the only activator and that no one else could have completed activation in the meantime.
Proposed Solutions
Solution 1: Serialization with Condition Variable (Evan's initial patch)
Evan's first approach adds an activation_in_progress flag to shared memory, protected by LogicalDecodingControlLock, with a condition variable. When a second session wants to activate logical decoding, it waits on the CV until the first activation completes or aborts.
Rejected because: It adds a new synchronization mechanism that could introduce its own race conditions, and is more complex than necessary. Sawada noted this was similar to older discarded approaches.
Solution 2: Delegate Abort Cleanup to DisableLogicalDecoding() (Sawada's approach, accepted)
The key insight: the complexity arises because abort_logical_decoding_activation() and DisableLogicalDecoding() clear the xlog_logical_info flag in different ways. The simpler fix is to make the abort path not immediately clean up xlog_logical_info. Instead:
- When activation is interrupted, the system is left in a transient state where
xlog_logical_info == truebutlogical_decoding_enabled == false. DisableLogicalDecoding()is modified to handle this transient state: when it detectsxlog_logical_info == truewithlogical_decoding_enabled == false, it simply resetsxlog_logical_infowithout going through the full disable sequence.
This eliminates the race because:
- If another session completed activation,
logical_decoding_enabledis alreadytrue, and the aborted session simply does nothing on cleanup. - If no other session completed activation,
DisableLogicalDecoding()will eventually clean up the leftoverxlog_logical_infoflag when it runs (triggered by the slot count going to zero or another appropriate event).
Key Technical Design Decisions
Critical Sections and Shared Memory Updates
A discussion emerged about whether shared memory assignments (LogicalDecodingCtl->xlog_logical_info = false, etc.) need to be inside critical sections. Evan argued they don't strictly need to be (critical sections primarily protect against elog interruptions). Sawada argued for keeping them inside critical sections for consistency with EnableLogicalDecoding(). The final patch keeps them in critical sections for consistency.
Log Message Ordering Under Lock
A subtle design trade-off: should the "logical decoding is disabled" log message be emitted while still holding LogicalDecodingControlLock?
- Argument for (Sawada's position): Guarantees correct log ordering. Without the lock, a scenario exists where the "disabled" message could appear after a subsequent "enabled" message, confusing operators.
- Argument against (Evan's position): The race window is tiny in practice (the enable path has significant work — WAL write+flush — between acquiring the lock and emitting its log). Holding an LWLock during
ereport(LOG)is generally undesirable.
Sawada prevailed, adding comments explaining why the log is emitted under the lock. Evan acknowledged this is a reasonable trade-off.
pending_disable Flag Cleanup
Evan caught a bug in Sawada's initial v2: in the new else branch of DisableLogicalDecoding() that handles the transient state, LogicalDecodingCtl->pending_disable also needs to be cleared. Without this, a stale pending_disable = true could cause incorrect behavior on subsequent operations.
Test Infrastructure
The fix includes regression tests in 051_effective_wal_level.pl using injection points:
- Session 1 blocks mid-activation via
logical-decoding-activationinjection point (set towait) - Session 2 concurrently creates a slot (succeeds)
- Session 1 is cancelled via
pg_cancel_backend() - Assertions verify
effective_wal_levelremains correct
Evan's final review caught a test reliability issue: wait_for_log("canceling statement due to user request") could match a log line from a previous cancellation in the same test file. The fix: capture the log offset before the cancel and pass it to wait_for_log().
Current Status
The patch is at v3, with only minor cosmetic fixes remaining (grammar: "appear" → "appears", and the log offset fix). The core design is agreed upon and the patch appears ready for committer review.