pg_recvlogical: Send Final Feedback on SIGINT/SIGTERM Exit
Core Problem
pg_recvlogical is PostgreSQL's command-line tool for consuming logical replication streams. It connects to a logical replication slot, receives decoded changes, and writes them to local output (stdout or a file). A critical aspect of logical replication is feedback: the client must periodically inform the server how far it has successfully consumed data, so the server can advance the replication slot's confirmed_flush_lsn and reclaim WAL/catalog resources.
The bug/deficiency lies in the shutdown path. When pg_recvlogical receives SIGINT or SIGTERM:
- It writes decoded output to the local destination (file/stdout)
- It proceeds to send CopyDone to terminate the streaming session
- But it does NOT send a final feedback message reflecting the latest written LSN position
This creates a window where the local consumer has already persisted decoded changes, but the server's replication slot still points to an older LSN. Upon restart, the server resends all changes from the slot's last confirmed position, resulting in duplicate decoded data.
Architectural Context
Logical Replication Slot Mechanics
A logical replication slot tracks three key positions (two LSNs and one transaction ID):
- restart_lsn: Where WAL replay must start from if the slot consumer reconnects
- confirmed_flush_lsn: The LSN up to which the consumer has confirmed receipt
- catalog_xmin: The oldest transaction whose catalog changes must be preserved
The confirmed_flush_lsn is advanced only when the client sends a standby status update (feedback) message via the replication protocol. Without feedback, the slot cannot advance, and the server will re-decode and resend the same changes on reconnection.
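The effect of missing feedback can be illustrated with a toy model. The sketch below is not PostgreSQL code: ToySlot, its methods, and the integer LSNs are hypothetical, and it simplifies the real rules (which work per transaction commit) down to "resend everything past the confirmed position".

```python
from dataclasses import dataclass


@dataclass
class ToySlot:
    """Hypothetical stand-in for a logical slot; not PostgreSQL code."""
    changes: dict[int, str]          # LSN -> decoded change
    confirmed_flush_lsn: int = 0

    def stream(self) -> list[tuple[int, str]]:
        # On (re)connect, send every change past the confirmed position.
        return [(lsn, change) for lsn, change in sorted(self.changes.items())
                if lsn > self.confirmed_flush_lsn]

    def feedback(self, flush_lsn: int) -> None:
        # The confirmed position moves only on feedback, and only forward.
        self.confirmed_flush_lsn = max(self.confirmed_flush_lsn, flush_lsn)


slot = ToySlot(changes={10: "INSERT a", 20: "INSERT b", 30: "INSERT c"})
first_run = slot.stream()    # consumer writes all three changes locally
slot.feedback(10)            # only LSN 10 was acknowledged before exit
second_run = slot.stream()   # after restart, LSN 20 and 30 arrive again
duplicates = [c for c in second_run if c in first_run]
```

Had the consumer sent feedback(30) before exiting, second_run would be empty.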
The Feedback Protocol
In the streaming replication protocol, feedback is sent as StandbyStatusUpdate messages within the CopyBoth mode. The client reports:
- writeLSN: data written to local buffer/disk
- flushLSN: data flushed/synced to durable storage
- applyLSN: data applied/consumed by downstream
pg_recvlogical sends these periodically (controlled by --status-interval), but the gap between the last periodic feedback and process termination can represent significant decoded output that was written but never acknowledged.
Proposed Solution
The patch adds a final feedback send after the last data has been written locally but before sending CopyDone to the server. This is a minimal, targeted change to the shutdown sequence:
Previous flow: [write output] → [CopyDone] → [exit]
Patched flow: [write output] → [final feedback] → [CopyDone] → [exit]
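The ordering change can be sketched as follows. This is only an illustration of the two flows, not the patch itself (pg_recvlogical is written in C); RecordingConn and both shutdown functions are hypothetical names.

```python
class RecordingConn:
    """Hypothetical connection that records what would be sent."""

    def __init__(self) -> None:
        self.sent: list[tuple] = []

    def send_feedback(self, write_lsn: int, flush_lsn: int) -> None:
        self.sent.append(("feedback", write_lsn, flush_lsn))

    def send_copy_done(self) -> None:
        self.sent.append(("CopyDone",))


def previous_shutdown(conn: RecordingConn) -> None:
    # Output is already written locally; the session is closed right away.
    conn.send_copy_done()


def patched_shutdown(conn: RecordingConn, written_lsn: int,
                     flushed_lsn: int) -> None:
    # Report the latest local positions first, then close the session.
    conn.send_feedback(written_lsn, flushed_lsn)
    conn.send_copy_done()


old, new = RecordingConn(), RecordingConn()
previous_shutdown(old)
patched_shutdown(new, written_lsn=30, flushed_lsn=30)
```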
Design Characteristics
- Best-effort semantics: The author explicitly acknowledges this does not provide exactly-once delivery guarantees. It merely narrows the window for duplicates.
- No protocol changes: The fix uses the existing StandbyStatusUpdate message; no wire protocol modifications needed.
- Signal timing dependency: If a signal arrives at certain points in execution (e.g., mid-write, or before the output is fully flushed), the feedback may still not reflect all written data. The at-least-once delivery guarantee of logical replication remains unchanged.
- Targets v20: The author considers this an improvement rather than a bug fix, indicating it doesn't warrant backpatching to stable branches. This is a reasonable assessment since the behavior has always existed and applications consuming logical replication output should already handle duplicates (logical decoding provides at-least-once, not exactly-once semantics).
Technical Implications
Why This Matters in Practice
For users running pg_recvlogical in production pipelines (ETL, CDC, audit logging), graceful restarts (e.g., during maintenance, log rotation, or deployment) are common. Each restart without final feedback means:
- Re-decoding potentially large volumes of already-consumed changes
- Extra CPU/IO on the server for WAL reading and logical decoding
- Downstream consumers must deduplicate or be idempotent
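One common way for a downstream consumer to tolerate replays is to remember the last position it applied and skip anything at or before it. The sketch below assumes the consumer sees whole transactions tagged with a commit LSN and delivered in commit order; that record shape and the function name are assumptions for illustration, since what pg_recvlogical actually emits depends on the output plugin.

```python
def apply_new_transactions(transactions, last_commit_lsn):
    """Apply only transactions committed after last_commit_lsn.

    transactions: iterable of (commit_lsn, [changes]) in commit order
                  (hypothetical record shape).
    Returns the changes applied and the new high-water mark.
    """
    applied = []
    for commit_lsn, changes in transactions:
        if commit_lsn <= last_commit_lsn:
            continue  # already applied in a previous run; skip the replay
        applied.extend(changes)
        last_commit_lsn = commit_lsn
    return applied, last_commit_lsn


# First run applies two transactions.
run1, mark = apply_new_transactions([(10, ["a"]), (20, ["b", "c"])], 0)
# After a restart the server replays the transaction at LSN 20.
run2, mark = apply_new_transactions([(20, ["b", "c"]), (30, ["d"])], mark)
```

In a real pipeline the high-water mark would need to be stored durably together with the applied data, otherwise a crash between the two brings the duplicates back.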
Potential Edge Cases
- Network failure during final feedback: If the connection drops before the feedback reaches the server, the improvement has no effect — same as today.
- Server-side confirmation timing: Even if feedback is sent, the server may not process it before CopyDone terminates the session. The ordering within the replication protocol should handle this (feedback is processed in-order), but there could be subtle race conditions.
- Synchronous vs. asynchronous feedback: The feedback message is fire-and-forget in the protocol; the client doesn't wait for an ACK. So sending it just before CopyDone should be safe from a deadlock perspective.
Assessment
This is a straightforward, low-risk improvement that reduces operational friction for pg_recvlogical users. The patch is small in scope and doesn't alter the fundamental at-least-once semantics of logical decoding. It's the kind of quality-of-life fix that reduces unnecessary work on restart without promising stronger guarantees than the system can deliver.
The main review considerations will likely be:
- Whether the feedback LSN values are correctly computed at the point of sending
- Whether there are any ordering concerns with feedback followed immediately by CopyDone
- Whether this should also apply to --endpos-triggered exits or other exit paths