[PATCH] Doc: document standard_conforming_strings dump/restore incompatibility

First seen: 2026-05-11 22:21:57+00:00 · Messages: 1 · Participants: 1

Latest Update

2026-05-14 · claude-opus-4-6

Documentation Patch: standard_conforming_strings Dump/Restore Incompatibility in PostgreSQL 19

Core Problem

PostgreSQL 19 made a significant backward-compatibility-breaking change via commit 45762084: the GUC parameter standard_conforming_strings is now forced to always be ON and cannot be set to OFF. This change aligns PostgreSQL fully with the SQL standard's treatment of backslash characters in string literals (where backslashes are literal characters, not escape prefixes).

The architectural significance is that this is a data fidelity issue during migration. When dump files produced by older pg_dump versions (from servers running with standard_conforming_strings = off) are restored into PostgreSQL 19, two failure modes emerge:

Failure Mode 1: COPY Format (Default)

The dump file contains SET standard_conforming_strings = off; which now fails in v19. However, COPY operations have their own escape processing rules that are independent of the standard_conforming_strings GUC — COPY always interprets backslash sequences in its own way regardless of this setting. So:

This is architecturally sound because COPY's text format has always had its own escape convention (documented in the COPY documentation), making it independent of the SQL-level string literal parsing controlled by standard_conforming_strings.

Failure Mode 2: --inserts Format (Data Corruption)

This is the more dangerous case. When --inserts format is used, data is encoded as SQL INSERT statements with string literals. If the source server had standard_conforming_strings = off, backslashes in string literals were treated as escape characters (e.g., '\n' meant newline). The dump file contains these escaped forms AND a SET standard_conforming_strings = off to ensure the target interprets them the same way.

In v19, the SET fails silently (or with error), but parsing continues with standard_conforming_strings = on, meaning those backslash sequences are now treated as literal backslashes — resulting in double-backslash corruption in the restored data. This is a silent data corruption scenario that is particularly insidious.

Proposed Solution

The patch is purely documentation — it adds advisory notes to three locations:

  1. pg_dump reference page (Notes section) — warns users producing dumps that may be restored into v19+
  2. pg_restore reference page (Notes section) — warns users restoring older dumps into v19+
  3. release-19 migration section — documents the upgrade path concern

The recommended workaround is a two-step migration:

  1. Restore the old dump into a pre-v19 server (where SET standard_conforming_strings = off still works)
  2. Produce a fresh dump using the v19 pg_dump client (which forces scs=on in the source session, ensuring all string literals are encoded for standard-conforming interpretation)

Technical Analysis

Why This Matters Architecturally

The standard_conforming_strings parameter controlled a fundamental aspect of the SQL parser: how single-quoted string literals interpret backslash characters. When OFF (the historical PostgreSQL default before v9.1 made it default to ON), '\t' in a string literal meant a tab character. When ON, it means a literal backslash followed by 't'.

pg_dump has always emitted a SET standard_conforming_strings statement matching the source server's setting to ensure round-trip fidelity. This worked because the target server could always honor both settings. Commit 45762084 broke this assumption for the first time.

The COPY vs INSERT Asymmetry

The fact that COPY format survives this change while INSERT format doesn't is a subtle but important architectural detail. COPY's text format has always defined its own escape processing (backslash-based, documented in the COPY command reference) that operates at a different layer than SQL string literal parsing. This is why COPY format is more robust for data migration in general — it doesn't depend on SQL parser state.

Scope of Impact

The practical impact is limited because:

However, the --inserts case represents silent data corruption, which makes documentation critical even for edge cases.

Assessment

This is a straightforward, well-motivated documentation patch. The author (Baji Shaik) has clearly verified the behavior empirically and identified both failure modes precisely. The patch targets the right documentation locations and the workaround is sound. The reference to the original discussion thread (Tom Lane's message based on the @sss.pgh.pa.us address) suggests this issue was identified during broader community discussion of the commit's implications.