Analysis: Parallel DML (Write-Side) Support in PostgreSQL
Core Technical Problem
This thread raises a fundamental architectural question about PostgreSQL's parallel execution framework: why is parallelism limited to read-side operations, and what would it take to extend it to write-side DML (INSERT, UPDATE, DELETE)?
Current State of Parallel Execution
PostgreSQL's parallel query infrastructure, introduced incrementally since version 9.6, supports:
- Parallel Sequential Scans — workers cooperatively scan heap pages
- Parallel Index Scans — workers traverse index structures concurrently
- Parallel Aggregation — partial aggregates computed by workers, combined by leader
- Parallel JOINs — hash joins and nested loops in parallel workers
- CREATE TABLE AS SELECT (CTAS) — parallel read feeding into a single-writer creation path
- CREATE INDEX — parallel sort phases for B-tree index builds
However, for standard DML statements (INSERT ... SELECT, UPDATE, DELETE), the ModifyTable executor node sits above the Gather node in the plan tree, meaning all actual tuple modifications are serialized through the single leader process.
Why This Is Architecturally Hard
The restriction is not arbitrary — it stems from deep architectural constraints:
-
Transaction Visibility (MVCC): All tuples written within a single statement must share the same transaction ID (XID). PostgreSQL's XID allocation and CLOG/commit infrastructure assumes a 1:1 mapping between backends and active transactions. Parallel workers share the leader's XID, but the locking and visibility protocols for concurrent writes under a single XID are not fully developed.
-
WAL Serialization: Write-Ahead Logging requires that log records be inserted in LSN order. While WAL insertion has been parallelized at the infrastructure level (WAL insertion locks, per-backend WAL buffers since 9.5), having multiple workers generate WAL for the same transaction creates ordering and recovery challenges.
-
Lock Management: Row-level locks (via
xmaxand multixact mechanisms) assume a single writer per tuple per transaction. Parallel workers modifying different tuples is theoretically possible, but concurrent updates to the same page's line pointers, page headers, and visibility map bits require careful synchronization. -
Index Maintenance: Each heap tuple modification triggers index updates. Concurrent index insertions from multiple workers into the same B-tree pages would create contention at leaf-page level and complicate the page-split protocol.
-
Trigger and Constraint Execution: AFTER triggers, deferred constraint checks, foreign key validations, and TOAST operations all assume single-threaded execution context within a statement.
-
Command Completion Counting: The executor's tuple count reporting (
INSERT 0 N) must accurately reflect all modified rows — a coordination challenge with parallel workers.
Existing Work and Related Efforts
While no responses appear in this thread yet, there is significant prior art in the community:
- Amit Kapila's parallel DML patches (discussed periodically since ~2018) explored allowing
INSERT ... SELECTto run the SELECT portion in parallel workers that feed tuples to the leader for insertion, and later, having workers perform insertions directly. - The
enable_parallel_insertGUC concept has been floated in various forms. - Parallel COPY has been discussed as a related but simpler problem (no visibility complications since COPY is bulk-load).
- Partitioned table parallel INSERT — workers each write to different partitions, avoiding page-level contention.
Recommended Workarounds (Current State)
For practitioners facing this bottleneck today:
- Application-level partitioning: Split large DML into range-based chunks executed by separate sessions (separate XIDs, no coordination needed)
- Parallel COPY via external tools:
pg_bulkload, partition-aware parallel loaders - Partitioned tables with partition-wise operations: Can sometimes enable more parallelism in the scan phase
- Batch operations with
pg_backgroundordblink: Launch multiple autonomous transactions - Unlogged tables for intermediate ETL stages: Eliminates WAL overhead for staging data
Significance
This inquiry touches on one of PostgreSQL's most significant remaining scalability gaps compared to commercial databases (Oracle's parallel DML, SQL Server's parallel INSERT). The problem is well-understood in the community but requires careful incremental engineering to solve without compromising PostgreSQL's ACID guarantees or crash-recovery correctness.
The most likely path forward is incremental: first enabling parallel INSERT...SELECT (workers perform heap inserts), then parallel UPDATE/DELETE (which additionally requires parallel tuple locking and concurrent HOT chain maintenance).