effective_wal_level is not decreasing after using REPACK (CONCURRENTLY)

First seen: 2026-05-21 16:32:36+00:00 · Messages: 1 · Participants: 1

Latest Update

2026-05-22 · claude-opus-4-6

Analysis: effective_wal_level Not Decreasing After REPACK (CONCURRENTLY)

Core Problem

This thread identifies a bug in the interaction between PostgreSQL's new dynamic WAL level toggling feature and the REPACK (CONCURRENTLY) command. The issue sits at the intersection of two recent commits:

  1. 67c2097: introduced dynamic toggling of logical decoding, allowing effective_wal_level to decrease automatically from logical to replica when no logical replication slots exist.
  2. 28d534e: related infrastructure for the REPACK (CONCURRENTLY) feature, which uses temporary logical replication slots internally.

The Architectural Problem

When REPACK (CONCURRENTLY) executes, it creates a temporary logical replication slot to perform online table reorganization using logical decoding. This slot creation triggers the system to elevate effective_wal_level to logical (as seen in the log message: "logical decoding is enabled upon creating a new logical replication slot").

However, when REPACK completes and drops the temporary slot during cleanup (repack_cleanup_logical_decoding), it does not call RequestDisableLogicalDecoding() to signal the checkpointer that logical decoding is no longer needed. The result is that effective_wal_level remains stuck at logical even though no logical slots exist anymore, requiring a full server restart to restore the lower WAL level.

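This matters architecturally because it defeats the purpose of the dynamic toggling introduced in 67c2097: a command that needs a logical slot only for its own duration leaves the server at the higher WAL level until it is restarted.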

Proposed Solution

The patch is straightforward: add a call to RequestDisableLogicalDecoding() inside repack_cleanup_logical_decoding() immediately after the replication slot is dropped. This signals the checkpointer process to evaluate whether logical decoding can be disabled (i.e., check if any logical slots remain). If none exist, the checkpointer will lower effective_wal_level back to replica without requiring a restart.
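
The difference between the current cleanup path and the proposed one can be sketched with a small self-contained model. This is a toy under stated assumptions, not PostgreSQL source: ToyServer and every toy_* function are hypothetical names, toy_request_disable() merely stands in for RequestDisableLogicalDecoding(), and toy_checkpointer_run() stands in for whatever the checkpointer does when it receives that request.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: all names here are hypothetical, not PostgreSQL internals. */
typedef enum { LEVEL_REPLICA, LEVEL_LOGICAL } WalLevel;

typedef struct {
    int      logical_slots;     /* number of logical slots that exist */
    WalLevel effective_level;   /* stands in for effective_wal_level */
    bool     disable_requested; /* pending request for the checkpointer */
} ToyServer;

void toy_init(ToyServer *s)
{
    s->logical_slots = 0;
    s->effective_level = LEVEL_REPLICA;
    s->disable_requested = false;
}

/* Creating a logical slot raises the effective level. */
void toy_create_logical_slot(ToyServer *s)
{
    s->logical_slots++;
    s->effective_level = LEVEL_LOGICAL;
}

/* Dropping a slot only removes it; the level is not touched here. */
void toy_drop_logical_slot(ToyServer *s)
{
    assert(s->logical_slots > 0);
    s->logical_slots--;
}

/* Stand-in for RequestDisableLogicalDecoding(): record the request. */
void toy_request_disable(ToyServer *s)
{
    s->disable_requested = true;
}

/* Stand-in for the checkpointer: it acts only on a pending request. */
void toy_checkpointer_run(ToyServer *s)
{
    if (!s->disable_requested)
        return;
    s->disable_requested = false;
    if (s->logical_slots == 0)
        s->effective_level = LEVEL_REPLICA;
}

/* Cleanup as the thread describes it today: drop, but never ask. */
void toy_repack_cleanup_buggy(ToyServer *s)
{
    toy_drop_logical_slot(s);
}

/* Cleanup with the proposed change: drop, then request re-evaluation. */
void toy_repack_cleanup_fixed(ToyServer *s)
{
    toy_drop_logical_slot(s);
    toy_request_disable(s);
}
```

In this model, a server that goes through toy_create_logical_slot(), toy_repack_cleanup_buggy() and toy_checkpointer_run() stays at LEVEL_LOGICAL because the checkpointer never sees a request, while the same sequence with toy_repack_cleanup_fixed() returns to LEVEL_REPLICA.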

This follows the same pattern that should be used by any code path that drops logical replication slots — it must coordinate with the dynamic toggling infrastructure by requesting a re-evaluation of the WAL level.

Key Design Considerations

  1. Symmetry of enable/disable: The slot creation path already calls the enable side (automatically via ReplicationSlotCreate or similar), but the REPACK cleanup path was missing the corresponding disable request. This is a classic symmetry bug in resource lifecycle management.

  2. Checkpointer-mediated approach: The disable request is asynchronous: it only asks the checkpointer to handle the transition. This avoids holding up the REPACK command and leaves it to the checkpointer to lower the WAL level at a point where doing so is safe.

  3. Broader audit needed: This bug raises the question of whether other code paths that create and drop temporary logical slots (e.g., sessions that create a temporary slot with pg_create_logical_replication_slot(), or custom extensions using the logical decoding API) also need similar fixes.
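
The asynchronous hand-off in point 2 can be illustrated with a minimal sketch. It assumes a single pending-request flag and a background pass that re-checks the slot count; MiniState, mini_request_disable() and mini_checkpointer_tick() are hypothetical names chosen for the example and do not correspond to PostgreSQL's actual data structures.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model; names are illustrative, not PostgreSQL internals. */
typedef struct {
    int  logical_slots;   /* logical slots still present */
    bool logical_enabled; /* true ~ effective_wal_level is logical */
    bool disable_pending; /* set by a backend, consumed by the checkpointer */
} MiniState;

/* Called by a backend after it drops a slot. Returns immediately and
 * changes nothing except the pending flag. */
void mini_request_disable(MiniState *st)
{
    st->disable_pending = true;
}

/* One pass of the background process. Returns true only if this pass
 * actually lowered the level. */
bool mini_checkpointer_tick(MiniState *st)
{
    assert(st->logical_slots >= 0);

    if (!st->disable_pending)
        return false;               /* nobody asked: leave the level alone */
    st->disable_pending = false;

    if (st->logical_slots > 0 || !st->logical_enabled)
        return false;               /* still needed, or already lowered */

    st->logical_enabled = false;
    return true;
}
```

Two properties fall out of this shape. The requesting backend never waits, since the level is unchanged when mini_request_disable() returns. A request issued while another logical slot still exists is harmless, because the background pass re-checks the slot count before lowering anything. The second property is what would make it safe to add such a request to every drop path turned up by an audit.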

Technical Context

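The effective_wal_level GUC was introduced as a runtime-computed value distinct from the configured wal_level to support dynamic transitions. Creating a logical replication slot raises it to logical; once a disable request has been made and the checkpointer finds no remaining logical slots, it lowers the value back to replica.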

The REPACK (CONCURRENTLY) feature uses logical decoding internally to capture changes made to a table while it is being reorganized. It is similar in purpose to the pg_reorg/pg_repack extensions, which capture changes with triggers rather than logical decoding, but is now built in. It creates a slot, decodes changes, applies them to the new table copy, then drops the slot. That pattern should be fully transparent to the WAL level management system.
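
The copy, capture, and apply pattern described above can be sketched in a few lines. This is a deliberately simplified illustration, not how REPACK is implemented: the "table" is a fixed-size int array, the change queue stands in for the stream decoded from the temporary slot, and all toy_* names and TOY_* constants are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_ROWS        4
#define TOY_MAX_CHANGES 8

/* One captured change: which row was updated and its new value. */
typedef struct {
    size_t row;
    int    new_value;
} ToyChange;

/* Stand-in for the changes decoded from the temporary slot. */
typedef struct {
    ToyChange items[TOY_MAX_CHANGES];
    size_t    count;
} ToyChangeQueue;

/* Initial copy of the table as it looked when the copy started. */
void toy_copy_snapshot(const int *live, int *copy)
{
    memcpy(copy, live, TOY_ROWS * sizeof(int));
}

/* A concurrent writer updates the live table; the change is captured too. */
void toy_concurrent_update(int *live, ToyChangeQueue *q, size_t row, int value)
{
    assert(row < TOY_ROWS && q->count < TOY_MAX_CHANGES);
    live[row] = value;
    q->items[q->count].row = row;
    q->items[q->count].new_value = value;
    q->count++;
}

/* Replay captured changes on the copy in order, then empty the queue
 * (the analogue of dropping the slot once it is no longer needed). */
void toy_apply_changes(int *copy, ToyChangeQueue *q)
{
    for (size_t i = 0; i < q->count; i++)
        copy[q->items[i].row] = q->items[i].new_value;
    q->count = 0;
}
```

After toy_copy_snapshot(), any number of toy_concurrent_update() calls followed by one toy_apply_changes() leaves the copy identical to the live array, which is the property the new table copy has to have before it replaces the old one.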