Analysis: Missing EndCopyFrom() in Logical Replication Tablesync
Core Problem
During logical replication's initial table synchronization, each tablesync worker runs a COPY from the publisher into the local table via copy_table() in src/backend/replication/logical/tablesync.c. This function constructs a CopyFromState with BeginCopyFrom() and drives row ingestion with CopyFrom(), but never invokes the paired teardown routine EndCopyFrom().
The asymmetry matters because EndCopyFrom() is not just memory and resource cleanup: it also calls pgstat_progress_end_command(), which resets the backend's PgBackendStatus.st_progress_command field to PROGRESS_COMMAND_INVALID. Progress reporting for COPY is started in BeginCopyFrom() (via pgstat_progress_start_command(PROGRESS_COMMAND_COPY, …)), so the Begin/End boundary on the CopyFromState owns the progress-reporting lifecycle, and skipping EndCopyFrom() leaves the progress entry active.
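The pairing contract can be illustrated with a minimal, self-contained C model. It is a sketch, not PostgreSQL source: ToyBackendStatus, toy_begin_copy_from(), toy_copy_from(), toy_end_copy_from(), and toy_copy_table() are hypothetical stand-ins for the backend status slot and the COPY routines, and the call_end flag is an assumption used to model whether the teardown call is made.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy stand-ins; none of these are PostgreSQL's real definitions. */
typedef enum {
    TOY_PROGRESS_INVALID = 0,   /* models PROGRESS_COMMAND_INVALID */
    TOY_PROGRESS_COPY           /* models PROGRESS_COMMAND_COPY */
} ToyProgressCommand;

typedef struct {
    ToyProgressCommand command; /* models st_progress_command */
    int64_t tuples_processed;   /* models one progress counter */
} ToyBackendStatus;

/* Begin: marks the slot as running a COPY and resets the counter. */
void toy_begin_copy_from(ToyBackendStatus *st)
{
    st->command = TOY_PROGRESS_COPY;
    st->tuples_processed = 0;
}

/* Ingest: advances the counter once per row while the COPY is active. */
void toy_copy_from(ToyBackendStatus *st, int64_t nrows)
{
    assert(st->command == TOY_PROGRESS_COPY);
    for (int64_t i = 0; i < nrows; i++)
        st->tuples_processed++;
}

/* End: the only place in this model where the slot is cleared. */
void toy_end_copy_from(ToyBackendStatus *st)
{
    st->command = TOY_PROGRESS_INVALID;
}

/*
 * Models the copy step of a tablesync worker. Returns what the slot
 * reports at the moment the worker would move on to its next phase.
 */
ToyProgressCommand toy_copy_table(ToyBackendStatus *st, int64_t nrows,
                                  bool call_end)
{
    toy_begin_copy_from(st);
    toy_copy_from(st, nrows);
    if (call_end)
        toy_end_copy_from(st);
    return st->command;
}
```

In this model, toy_copy_table(&st, 3, false) leaves the slot reporting a COPY with its counter fixed at 3, which is the shape of the stale entry discussed here. Passing true returns the slot to the invalid state before the next phase starts.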
Architectural Significance
After copy_table() returns, the tablesync worker does not exit — it transitions into the WAL catchup phase (SUBREL_STATE_CATCHUP/SYNCWAIT), where it applies changes from the publisher up to the synchronization point before handing the relation back to the main apply worker. This phase can be arbitrarily long on busy systems.
Because the progress slot is never cleared, pg_stat_progress_copy continues to report a phantom, in-progress COPY for the entire catchup window. The row/byte counters are frozen at their final values, making the view actively misleading:
- Monitoring false positives: Operators watching pg_stat_progress_copy to gauge initial sync progress see a COPY that appears stuck: bytes_processed stops advancing even though the tablesync is making forward progress in a different phase.
- Lifecycle leak across phases: A per-command progress entry outlives the command that created it, violating the invariant that progress views reflect currently executing commands. If any later code path in the same backend were to start another progress command without first ending this one, assertions in pgstat_progress_start_command() would fire (in assert-enabled builds, the sentinel check st_progress_command == PROGRESS_COMMAND_INVALID is expected on entry in several call sites).
- Symmetry with normal COPY: Top-level COPY FROM via DoCopy() in copy.c always pairs BeginCopyFrom/EndCopyFrom. The tablesync path is the outlier.
The Fix
The patch inserts a single EndCopyFrom(cstate) call immediately after CopyFrom(cstate) in copy_table(). This is the minimal, obviously-correct change:
- EndCopyFrom() closes the input source (for tablesync, the COPY data arrives from the walsender over the libpq connection wrapped as a custom source), releases the CopyFromState memory context resources, and ends the progress command.
- There is no functional risk: the existing code has already consumed all rows by the time CopyFrom() returns, so EndCopyFrom() has nothing left to read; its job here is almost entirely bookkeeping and progress cleanup.
Considerations Not Raised Yet
A reviewer is likely to ask:
- Backpatch scope: pg_stat_progress_copy was introduced in PG14 (commit 8a4f618e7). The bug exists on every branch since, so this is a backpatchable bug fix rather than a master-only cleanup.
- Error paths: If CopyFrom() throws, EndCopyFrom() would be skipped, but in that case the tablesync worker exits and the backend dies, which clears st_progress_command implicitly via pgstat_beshutdown_hook / backend exit. So the leak is specifically a successful-path leak into the catchup phase, which is exactly the case the patch addresses.
- Resource owner / memory: EndCopyFrom() frees the attribute_buf and line buffer, and detaches the copy source. Not calling it leaks these until the tablesync worker eventually exits at end of sync. The progress-entry staleness is the user-visible symptom, but the memory hygiene argument is additional justification.
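The error-path argument can be sketched with a small C model. This is an illustration under stated assumptions, not PostgreSQL code: standard setjmp/longjmp stands in for error unwinding, and ToyBackend, toy_copy_from(), toy_backend_shutdown(), and toy_tablesync_worker() are hypothetical names. The model shows that when the copy step fails, the teardown call is skipped but process shutdown clears the slot, whereas on success only the explicit teardown clears it while the worker stays alive.

```c
#include <assert.h>
#include <setjmp.h>
#include <stdbool.h>

/* Toy stand-ins; none of these are PostgreSQL's real definitions. */
typedef enum {
    TOY_PROGRESS_INVALID = 0,
    TOY_PROGRESS_COPY
} ToyProgressCommand;

typedef struct {
    ToyProgressCommand command; /* models st_progress_command */
    bool alive;                 /* false once the toy backend has exited */
} ToyBackend;

/* Jump target that models where a thrown error lands. */
static jmp_buf toy_error_context;

/* Models the ingestion step; "throws" by jumping out when fail is set. */
void toy_copy_from(bool fail)
{
    if (fail)
        longjmp(toy_error_context, 1);
}

/* Models backend exit: the slot is cleared as a side effect of dying. */
void toy_backend_shutdown(ToyBackend *b)
{
    b->command = TOY_PROGRESS_INVALID;
    b->alive = false;
}

/*
 * Models a patched tablesync worker. Returns true if the worker
 * survives the copy step and would continue into catchup.
 */
bool toy_tablesync_worker(ToyBackend *b, bool fail)
{
    b->alive = true;

    if (setjmp(toy_error_context) != 0) {
        /* Error path: the explicit end call below was never reached. */
        toy_backend_shutdown(b);
        return false;
    }

    b->command = TOY_PROGRESS_COPY;     /* begin: progress starts */
    toy_copy_from(fail);                /* may jump to the error path */
    b->command = TOY_PROGRESS_INVALID;  /* end: the call the patch adds */
    return true;
}
```

In both outcomes the slot ends up invalid, but for different reasons: shutdown on failure, the explicit end call on success. Removing the end call from the success path in this model would leave a live worker with a COPY entry still set, which is the case the patch targets.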
Verdict
This is a small, surgical, clearly-correct bug fix in the logical replication tablesync path. The author's diagnosis of the pgstat_progress_end_command() omission is precise and the fix restores the documented Begin/End pairing contract of the COPY API.