Review observations for COPY ON_ERROR_TABLE patch

First seen: 2026-05-18 06:55:43+00:00 · Messages: 1 · Participants: 1

Latest Update

2026-05-20 · claude-opus-4-6

Technical Analysis: Review Observations for COPY ON_ERROR_TABLE Patch

Core Problem

The COPY ON_ERROR_TABLE feature extends PostgreSQL's existing COPY ... ON_ERROR mechanism (introduced in PostgreSQL 17 with the ignore option) to redirect malformed or rejected rows into a separate error table rather than simply discarding them. This is a significant usability improvement for ETL/data loading workflows where operators need to inspect and potentially remediate failed rows without losing them entirely.

The reviewer (Vellaipandiyan) raises several architectural concerns about how this feature interacts with existing COPY internals, particularly around batched insertion, transactional semantics, and failure recursion.

Technical Issues Raised

1. Multi-Insert Buffer Consistency

PostgreSQL's COPY path uses CopyMultiInsertBuffer to batch rows via table_multi_insert() for performance. This batching accumulates tuples in memory and flushes them in groups (typically up to 1000 rows per buffer, across multiple buffers for partitioned targets). The concern is:

The suggestion to force CIM_SINGLE (single-row insertion mode) when ON_ERROR_TABLE is active is architecturally sound as a first implementation step. It sacrifices COPY throughput for correctness simplicity — each row can be individually wrapped in a subtransaction or have its failure cleanly caught and redirected.

2. Transaction Semantics for Error Table Rows

This is a fundamental design question: should the error table insertions be:

The most likely correct answer for PostgreSQL's architecture is transactional consistency — error rows commit or rollback with the COPY statement. This aligns with PostgreSQL's strong transactional guarantees and avoids introducing autonomous transaction machinery. However, it means that if COPY hits a fatal error after successfully redirecting 1000 soft errors to the error table, all 1000 error records are lost on rollback.

3. Recursive Failure Handling

If the error table itself is unavailable (e.g., full tablespace, permissions revoked, its own constraint violation, or a trigger on the error table fails), the system enters a recursive error state. The implementation must have a clear policy:

The safest approach is to treat error-table insertion failure as a fatal COPY error, since the user explicitly requested error capture and silent loss of error records would violate that contract.

Architectural Implications

This patch touches several critical subsystems:

  1. copy.c / copyfrom.c — The main COPY FROM execution path, including the CopyMultiInsertBuffer management
  2. execPartition.c — If the error table is partitioned or if COPY targets a partitioned table
  3. heapam.c / tableam.h — Table access method layer for multi-insert
  4. xact.c — Subtransaction management if savepoints are used per-row
  5. catalog modifications — New syntax parsing and option validation for ON_ERROR_TABLE

The multi-insert concern is particularly important because disabling batching has a measurable performance impact (2-5x slower for large COPY operations), so while CIM_SINGLE is a good starting point, the feature would eventually need to support batched mode to be production-viable for large datasets.

Current Status

This appears to be an early-stage review of a patch in development. The reviewer has not identified concrete reproducible failures but is raising design-level concerns that need resolution before the patch can be committed. These are the types of architectural questions that typically need to be resolved in the thread before a committer will pick up the patch.