Deep Technical Analysis: DELETE/UPDATE Corruption in Partitioned Foreign Tables
The Core Problem
This thread addresses a data corruption bug in PostgreSQL's Foreign Data Wrapper (FDW) infrastructure that has been known since at least 2018. The bug manifests when a foreign table (created via postgres_fdw) references a remote partitioned table, and a non-pushed-down UPDATE or DELETE is executed against it.
Root Cause: ctid Ambiguity Across Partitions
PostgreSQL's FDW modification path works as follows for non-direct modifications:
- A Foreign Scan fetches rows from the remote, retrieving
ctidas a "junk" column for row identification - The local executor processes these rows and issues UPDATE/DELETE statements using
WHERE ctid = $1
The critical flaw: ctid is only unique within a single physical table (heap). In a partitioned table, different partitions can have rows with identical ctid values (e.g., (0,1) in both plt_p1 and plt_p2). When the remote SQL UPDATE public.plt SET b = $2 WHERE ctid = $1 is issued against the parent partitioned table, it may match and modify a row in the wrong partition — one that doesn't even satisfy the original query's WHERE clause.
Why This Is Architecturally Significant
This is not merely an edge case — it represents a fundamental mismatch between:
- The local executor's assumption that
ctiduniquely identifies a row in the target relation - The reality that for partitioned tables,
ctidalone is insufficient without also knowing which partition (child table) the row belongs to
The bug is particularly insidious because:
- It silently corrupts data (wrong rows modified/deleted)
- The
FOR UPDATElock acquired during the scan locks the correct row, but the subsequent DML targets a different row - It affects any foreign table pointing to a remote partitioned table, inherited table, or even an updatable view on such tables
Proposed Solutions
Solution 1: Error on Multi-Row ctid Matches (Original 2018 Patch, Rebased)
The rebased patch from the original 2018 discussion forbids postgresExecForeignDelete and postgresExecForeignUpdate from proceeding if more than one row would be affected.
Weakness (demonstrated by Etsuro Fujita): This check is insufficient. With a BEFORE trigger that returns NULL on one partition, the UPDATE can still affect exactly one row — but it's the wrong row. The trigger suppresses the correct row's update, and the ctid collision causes the wrong partition's row to be modified instead. Single-row-affected doesn't guarantee correctness.
Solution 2: Table Option inherited (Etsuro Fujita's WIP Patch)
Fujita proposes adding a postgres_fdw table option called inherited that users must set to indicate the foreign table references a partitioned/inherited table. When set, postgresPlanForeignModify() would throw an error for UPDATE/DELETE operations.
Tradeoffs:
- Back-patchable (minimal invasive change)
- Relies on user discipline to set the option correctly
postgresImportForeignSchema()could auto-detect this in some cases, but not all (e.g., updatable views on partitioned tables)- Fujita acknowledges this is a "leave it to the user" approach for back-branches
Criticism (from jgdr@dalibo.com): Users can still easily corrupt data by overlooking documentation — it's an opt-in safety mechanism rather than a structural fix.
Solution 3: WHERE CURRENT OF with Cursors (Originally Proposed by Ashutosh Bapat in 2018)
This approach would use cursor-based positioning (WHERE CURRENT OF) instead of ctid-based identification for UPDATE/DELETE. Since the cursor maintains position on the exact physical row regardless of partition, it's semantically correct.
Tradeoff: Requires fetching one row at a time from the remote server, causing severe performance degradation for bulk UPDATE/DELETE operations. This is why it was initially discarded in 2018.
Solution 4: Remote tableoid as Additional Junk Column (hukutoc's Patch Set, 2026)
The most comprehensive approach, developed in a three-patch series:
-
Patch 1 - Extended copy slot/tuple mechanics: Modifies PostgreSQL's internal
copy_slot/copy_tuplemachinery to support source and destination tuples with differing numbers of attributes. Previously, these functions assumed equal attribute counts. This enables carrying an additionaltableoidattribute through the executor. -
Patch 2 - Core remote tableoid parameter: Introduces a remote table OID concept in the executor, passing it through to the FDW engine. This is the architectural enabler — the executor now understands that a foreign modification target may need partition-level identification.
-
Patch 3 - FDW engine usage: Teaches postgres_fdw to incorporate the remote
tableoidinto its UPDATE and DELETE queries, resolving the ctid ambiguity by qualifying modifications to the correct partition.
Analysis: This is the most correct long-term solution but requires significant core changes (executor, tuple machinery) making it unsuitable for backpatching. The earlier 2018 discussion rejected adding tableoid as a junk clause, but hukutoc's approach revisits this with proper infrastructure changes.
Key Design Tensions
-
Correctness vs. Performance: The cursor-based approach is correct but slow; the ctid approach is fast but broken for partitioned tables.
-
Backpatability vs. Completeness: Simple fixes (error out, user option) can be backpatched but don't solve the problem. Complete fixes (remote tableoid) require core changes unsuitable for stable branches.
-
User Responsibility vs. Safety by Default: Fujita's option-based approach puts the burden on users; a complete fix would make the system safe by default.
-
Scope of the Problem: Fujita noted the issue extends beyond just partitioned tables to inherited tables and updatable views on partitioned tables, making auto-detection in
postgresImportForeignSchema()incomplete.
Current State
The bug remains unfixed in all PostgreSQL versions. The most promising path forward appears to be:
- For back-branches: Fujita's
inheritedtable option (error on known-dangerous operations) - For HEAD: hukutoc's tableoid-based approach (correct behavior with proper executor support)
The patch set by hukutoc was submitted to CommitFest 2026 (CF entry 6770) and awaits review.