Adding pg_dump flag for parallel export to pipes

First seen: 2025-04-07 17:16:58+00:00 · Messages: 19 · Participants: 6

Latest Update

2026-05-06 · opus 4.7

Core Problem

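pg_dump's text format composes trivially with Unix pipelines (pg_dump ... | lz4 | pv | ssh ...), but the directory format (-Fd), the only format that supports parallel dump (-j N), requires a filesystem directory as the output destination. (Parallel restore also works from the custom format, but that format cannot be produced in parallel.) This creates a capability gap: a dump can be streamed or it can be parallel, but not both.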

For very large databases this forces operators into an unhappy choice: spend hours on a single-threaded piped dump, or spend 2× the disk to stage a parallel directory dump before shipping/compressing it. Andrew Jackson highlights the practical pain: you cannot stream a parallel pg_dump directly into a parallel pg_restore today.

Proposed Solution

Introduce --pipe-command (later renamed to --pipe) to pg_dump and pg_restore. Rather than opening files with fopen/fclose, the archiver opens a subprocess with popen/pclose whose command template contains a %f placeholder expanded per-file (matching the directory-format filenames: toc.dat, <dumpid>.dat, blob_NNN.toc, blob_<oid>.dat).

Example round-trip with FIFOs enabling streaming parallel dump→restore:

pg_dump  -j4 -Fd src --pipe-command="mkfifo dumpdir/%f; cat >> dumpdir/%f"
pg_restore -j4 -Fd --dbname=dst ./dumpdir

Implementation shape

  1. A new boolean fSpecIsPipe on _archiveHandle (and analogous locals in pg_dump.c/pg_restore.c) flags the output/input target as a program rather than a path.
  2. The existing filename-carrying fields are overloaded to carry the command template when the flag is set.
  3. A helper (replace_percent_placeholders) expands %f against the per-entry filename that the directory format would otherwise have used.
  4. All fopen sites in the directory archiver are routed through a conditional that picks popen when the pipe flag is set. Close paths likewise route to pclose.
  5. Mutually exclusive with --file, and (per v4+) incompatible with the builtin --compress (since compression is delegated to the user's pipeline).
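
The steps above can be sketched in a few lines of C. This is a minimal illustration, not the patch's code: expand_pipe_template, DumpTarget, open_target and close_target are hypothetical names (the summary says the patch itself uses replace_percent_placeholders and an fSpecIsPipe flag), and treating "%%" as a literal percent sign is an assumption.

```c
#define _POSIX_C_SOURCE 200809L   /* for popen/pclose under strict -std= modes */
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: replace each "%f" in tmpl with fname; "%%" becomes "%".
 * Any other "%x" sequence is copied through unchanged.
 * Returns a malloc'd string the caller must free, or NULL on allocation failure. */
char *expand_pipe_template(const char *tmpl, const char *fname)
{
    assert(tmpl != NULL && fname != NULL);
    size_t flen = strlen(fname);
    size_t cap = 1;                         /* room for the terminating NUL */

    /* First pass: compute the exact output size. */
    for (const char *p = tmpl; *p; p++) {
        if (p[0] == '%' && p[1] == 'f') { cap += flen; p++; }
        else if (p[0] == '%' && p[1] == '%') { cap += 1; p++; }
        else cap += 1;
    }

    char *out = malloc(cap);
    if (out == NULL)
        return NULL;

    /* Second pass: copy, substituting placeholders. */
    char *q = out;
    for (const char *p = tmpl; *p; p++) {
        if (p[0] == '%' && p[1] == 'f') { memcpy(q, fname, flen); q += flen; p++; }
        else if (p[0] == '%' && p[1] == '%') { *q++ = '%'; p++; }
        else *q++ = *p;
    }
    *q = '\0';
    return out;
}

/* Hypothetical handle remembering which close function to use. */
typedef struct {
    FILE *fp;
    bool  is_pipe;
} DumpTarget;

/* spec is a directory path, or a command template when spec_is_pipe is set. */
bool open_target(DumpTarget *t, const char *spec, const char *fname, bool spec_is_pipe)
{
    t->fp = NULL;
    t->is_pipe = spec_is_pipe;
    if (spec_is_pipe) {
        char *cmd = expand_pipe_template(spec, fname);
        if (cmd == NULL)
            return false;
        t->fp = popen(cmd, "w");            /* runs the command via the shell */
        free(cmd);
    } else {
        char path[4096];
        int n = snprintf(path, sizeof path, "%s/%s", spec, fname);
        if (n < 0 || (size_t) n >= sizeof path)
            return false;                   /* path would be truncated */
        t->fp = fopen(path, "wb");
    }
    return t->fp != NULL;
}

/* Returns the pclose wait status for pipes, or the fclose result for files. */
int close_target(DumpTarget *t)
{
    int rc = t->is_pipe ? pclose(t->fp) : fclose(t->fp);
    t->fp = NULL;
    return rc;
}
```

With a template such as "lz4 > out/%f.lz4" and the entry name toc.dat, the sketch would run "lz4 > out/toc.dat.lz4" through the shell; with the flag unset, the same call falls back to a plain file inside the directory.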

Key Design Tensions

Append-mode for LO TOC

The existing code opens blob_NNN.toc in append mode. popen has no append semantics—a child process is spawned once, writes, and exits. Nitin's v4+ response is to change the LO TOC open mode to PG_BINARY_W unconditionally, even in the non-pipe case, so the two code paths converge. This is a subtle behavioral change to the existing directory format and deserves scrutiny: any caller relying on append semantics (e.g., resumable dumps, though pg_dump doesn't really support that) would be affected. Nitin flags it explicitly: "If there is a concern, we can revert to the older version."

Shell injection surface

popen invokes /bin/sh -c, so the %f substitution is the critical point. Directory-format filenames are generated internally (not user data), but the command string itself is user-supplied and executed via the shell—this matches COPY ... PROGRAM semantics, which already sets the precedent for "superuser/operator trusts themselves." v7 adds "shell escaping in the command before setting it as the file path," addressing paths with spaces/quotes in %f expansion.
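
For illustration, one common way to make a substituted value safe for a POSIX shell is to wrap it in single quotes and rewrite each embedded single quote as '\''. The sketch below shows that technique only; shell_quote is a hypothetical name, and whether v7 escapes this way is not stated in the thread summary. It also assumes a POSIX sh, so it would not carry over to Windows cmd.exe.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return a malloc'd copy of s wrapped in single quotes,
 * with every embedded ' rewritten as '\'' so a POSIX shell sees one word.
 * Returns NULL on allocation failure; the caller frees the result. */
char *shell_quote(const char *s)
{
    assert(s != NULL);

    size_t cap = 3;                          /* opening quote, closing quote, NUL */
    for (const char *p = s; *p; p++)
        cap += (*p == '\'') ? 4 : 1;         /* '\'' is four characters */

    char *out = malloc(cap);
    if (out == NULL)
        return NULL;

    char *q = out;
    *q++ = '\'';
    for (const char *p = s; *p; p++) {
        if (*p == '\'') {
            memcpy(q, "'\\''", 4);           /* close quote, escaped quote, reopen */
            q += 4;
        } else {
            *q++ = *p;
        }
    }
    *q++ = '\'';
    *q = '\0';
    return out;
}
```

Quoting the value before it is spliced into the template keeps a name containing spaces or quotes as a single shell word, while the user-supplied command around it is still interpreted by the shell as written.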

Flag naming

Hannu anchors the naming in the COPY grammar: COPY ... TO { 'filename' | PROGRAM 'command' | STDOUT }. Candidates floated: --pipe-command, --to-pipe/--from-pipe, --to-program/--from-program, --pipe-command-pattern. Mahendra pushes for the terse --pipe; Hannu concedes. v7 settles on --pipe. The --to-program/--from-program option would have been the most consistent with COPY but was rejected implicitly.

Why no - / stdout convention?

Thomas Munro notes the obvious: the POSIX convention of - meaning stdout cannot work here because the directory format produces multiple files concurrently (especially under -j), so a template with a placeholder is unavoidable.

The defunct-shell test failure

A significant chunk of the thread (April–June 2025, then periodic revisits through early 2026) is consumed by a TAP test problem: commands like --pipe-command="cat > $tempdir/%f" work manually but leave a <defunct> sh child inside 002_pg_dump.pl, with cat reporting "No such file or directory" on a path that demonstrably exists. The symptom pattern—works in shell, fails under Perl's IPC::Run—strongly suggests quoting/argv-vs-shell-string confusion in how the test harness passes the argument, possibly compounded by the embedded > being interpreted by an outer shell layer rather than the inner popen shell. Nitin's eventual v7 fix avoids the problem by using perlbin (making the test portable to Windows at the same time) instead of relying on cat + shell redirection, which sidesteps rather than diagnoses the quoting issue.
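
As background for the symptom, and not as a diagnosis of the TAP failure: a popen child stays <defunct> until its parent reaps it, which pclose does, and the wait status pclose returns is where a failed shell redirection becomes visible to the caller. The sketch below is a hedged illustration with a hypothetical run_sink helper; it is not taken from the patch or its tests.

```c
#define _POSIX_C_SOURCE 200809L   /* for popen/pclose under strict -std= modes */
#include <assert.h>
#include <stdio.h>
#include <sys/wait.h>

/* Hypothetical helper: run cmd through the shell, feed it payload on stdin,
 * and return the child's exit code. Returns -1 if the child could not be
 * started or did not exit normally (for example, it was killed by a signal). */
int run_sink(const char *cmd, const char *payload)
{
    assert(cmd != NULL && payload != NULL);

    FILE *fp = popen(cmd, "w");
    if (fp == NULL)
        return -1;

    /* Caveat: if the child has already exited, flushing a non-empty payload
     * can raise SIGPIPE in this process; real code would need to handle that. */
    fputs(payload, fp);

    int status = pclose(fp);                 /* waits for the child, so no zombie remains */
    if (status == -1 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

A command such as "cat > /nonexistent_dir/out" makes the shell fail the redirection and exit non-zero, so run_sink reports a positive code instead of silently dropping the data.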

pg_dumpall interaction

With PostgreSQL 19's directory-mode support in pg_dumpall, Mahendra asks for --pipe there too. Nitin defers this to a follow-up patch and explicitly skips it for global restore, keeping the initial scope tractable.

Architectural Significance

The patch is small but opens a meaningful extension point: once popen is an accepted sink/source, the directory archiver effectively becomes a pluggable transport.

The design is deliberately minimal: no new transport abstraction, no URI scheme, no plugin API. Just popen with a filename template. This is philosophically in line with PostgreSQL's existing COPY ... PROGRAM and archive_command—leveraging the shell as the extension mechanism.

Review Status

By v7 (May 2026) the patch has had one substantive external reviewer (Mahendra) and light committer-level attention (Thomas Munro weighed in only on the - question; Dilip Kumar asked for a rebase and TODO cleanup but hasn't posted a code-level review). The patch has been rebased repeatedly against HEAD churn. Remaining open items per the latest message: squashing the first three commits, potential revert of the LO TOC open-mode change, and pg_dumpall integration as a follow-up.