COPY FROM ON_ERROR SET_NULL bypasses domain NOT NULL with partial column list

First seen: 2026-04-16 17:09:40+00:00 · Messages: 5 · Participants: 4

Latest Update

2026-05-20 · claude-opus-4-6

Core Technical Problem

This thread addresses a memory safety bug in PostgreSQL's COPY FROM implementation. When the ON_ERROR SET_NULL option is combined with a partial column list that targets high-numbered columns, an out-of-bounds array read can let NULL be stored silently in a column whose domain type is declared NOT NULL.

Root Cause: Array Allocation/Indexing Mismatch

In BeginCopyFrom(), the domain_with_constraint[] array was allocated with list_length(attnumlist) elements — i.e., only enough slots for the columns explicitly listed in the COPY command. However, the consuming code in copyfromparse.c indexes this array using attnum - 1 (the physical attribute index in the relation).

When a partial column list targets columns with high physical attribute numbers (e.g., column 10 out of 10), the code performs an out-of-bounds read on the undersized array. Whenever the value read from beyond the array boundary happens to be false, it is interpreted as "no domain constraint exists," and the ON_ERROR SET_NULL path silently inserts NULL into a column governed by a NOT NULL domain constraint.
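The index arithmetic behind this mismatch can be checked with a small standalone C model. This is only a sketch and not PostgreSQL code: index_in_bounds is a hypothetical helper introduced for illustration, and the element counts in the usage note mirror a two-entry column list against a ten-column table.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper (not a PostgreSQL function): reports whether
 * reading flags[attnum - 1] stays inside an array of n_elems elements.
 */
bool
index_in_bounds(size_t n_elems, int attnum)
{
    /* Attribute numbers of user columns are 1-based. */
    assert(attnum >= 1);

    return (size_t) (attnum - 1) < n_elems;
}
```

With the array sized by the column list, index_in_bounds(2, 10) is false, which corresponds to the out-of-bounds read described above. Sized by the number of physical attributes, index_in_bounds(10, 10) is true.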

Why This Matters Architecturally

  1. Silent constraint violation: Domain NOT NULL constraints are a fundamental integrity mechanism. Bypassing them without any error or warning means applications relying on domain constraints for data quality have no indication of corruption.

  2. Memory safety: The out-of-bounds read is undefined behavior. While in practice it reads uninitialized/adjacent memory as a false negative for the constraint check, it could theoretically cause crashes or other unpredictable behavior depending on memory layout.

  3. Inconsistency with established patterns: All other per-column arrays in BeginCopyFrom() (e.g., defmap, typioparams, in_functions) are allocated with num_phys_attrs elements and indexed by attnum - 1. The domain_with_constraint[] array was an outlier that broke this convention, suggesting it was introduced without fully following the existing indexing pattern.

The Fix

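The fix is straightforward and minimal: allocate domain_with_constraint[] with num_phys_attrs elements instead of list_length(attnumlist) elements, so that indexing by attnum - 1 always stays in bounds.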

This aligns the array with all other per-column arrays in BeginCopyFrom(), eliminating the out-of-bounds access.
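The corrected sizing convention can be sketched in standalone C. The actual patch is not reproduced in this summary, so everything here is an assumption made for illustration: build_domain_flags and its parameters are hypothetical, calloc stands in for PostgreSQL's own allocator, and the way the flags are populated is a guess at the general pattern rather than the real code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/*
 * Hypothetical stand-in for the corrected allocation: one zero-initialized
 * flag per physical attribute, set only for the listed columns.
 *
 * attnums[] holds the 1-based attribute numbers from the column list;
 * has_constraint[i] says whether the i-th listed column has a constrained
 * domain type. Returns NULL on allocation failure; the caller frees the
 * result with free().
 */
bool *
build_domain_flags(int num_phys_attrs, const int *attnums,
                   const bool *has_constraint, int n_listed)
{
    bool *flags;

    assert(num_phys_attrs > 0);

    /* Size by the relation's attribute count, not by the list length. */
    flags = calloc((size_t) num_phys_attrs, sizeof(bool));
    if (flags == NULL)
        return NULL;

    for (int i = 0; i < n_listed; i++)
    {
        int attnum = attnums[i];

        /* Every listed attribute number now maps to a valid slot. */
        assert(attnum >= 1 && attnum <= num_phys_attrs);
        flags[attnum - 1] = has_constraint[i];
    }

    return flags;
}
```

For a ten-column table with the column list (c1, c10), the flag for c10 lands in flags[9], and unlisted columns keep a well-defined false value instead of whatever lies past the end of a two-element array.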

Reproduction Scenario

The bug is triggered by a specific combination:

  1. A table with many columns where a high-numbered column has a NOT NULL domain type
  2. A COPY command with a partial column list that includes that high-numbered column
  3. The ON_ERROR SET_NULL option is active
  4. Input data contains a value that would trigger a conversion error for that column

CREATE DOMAIN d_notnull_int AS int NOT NULL;
CREATE TABLE t (
    c1 text, c2 text, c3 text, c4 text, c5 text,
    c6 text, c7 text, c8 text, c9 text,
    c10 d_notnull_int
);

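-- Text-format COPY data is tab-delimited: the two fields on the data line
-- must be separated by a single tab. 'bad' is not a valid integer.
COPY t(c1, c10) FROM stdin WITH (on_error set_null);
hello	bad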
\.

-- On an affected build this returns true; the COPY above should have raised an error instead:
SELECT c10 IS NULL FROM t;

The key is that c10 has attnum = 10, but the column list has only 2 entries. The array is therefore allocated with 2 elements (valid indexes 0 and 1), yet the code reads index 9 (attnum 10 - 1).

Design Observations

The fix required no design discussion or tradeoffs — it's a clear bug with an obvious correct solution. The array allocation was simply inconsistent with the established convention used throughout the same function. The interesting aspect is that this class of bug (allocation size vs. indexing strategy mismatch) is a recurring pattern in C code that maintains parallel arrays for different attribute properties, and PostgreSQL's COPY code has several such arrays that must all follow the same convention.

Backport Considerations

This bug exists in every version that includes the ON_ERROR SET_NULL functionality for COPY. The fix should be backported to all supported branches containing that feature, as the bug is both a constraint violation and a memory safety issue.