Fix array-element quoting in postgres_fdw import statistics

First seen: 2026-04-12 03:12:58+00:00 · Messages: 4 · Participants: 2

Latest Update

2026-05-18 · claude-opus-4-6

Fix Array-Element Quoting in postgres_fdw Import Statistics

Core Problem

The postgres_fdw extension's statistics import feature (ANALYZE on foreign tables) constructs a remote query against pg_stats that filters columns using a text array literal in a WHERE attname = ANY('{...}') clause. The function build_remattrmap() used quote_identifier() to format column names within this array literal, which is fundamentally incorrect due to a mismatch between SQL identifier quoting semantics and PostgreSQL array literal parsing semantics.

The Quoting Mismatch Explained

quote_identifier() applies SQL identifier quoting rules:

However, inside a PostgreSQL array literal (the '{...}' text), the array parser treats backslash (\) as an escape character. This means a column name like a\b would be placed into the array literal as a\b, but the array parser would interpret \b as an escape sequence, yielding ab — silently mangling the column name.

The consequence is that the WHERE attname = ANY(...) filter fails to match columns with backslashes in their names, causing the statistics import to miss those columns entirely. PostgreSQL then emits a WARNING about missing attribute statistics rather than successfully importing them.

Architectural Significance

This bug illustrates a subtle but important principle in PostgreSQL's internal string handling: context-specific quoting matters. The same string value must be escaped differently depending on where it appears — as a SQL identifier, as a string literal, or as an element within an array literal. Using the wrong quoting function in the wrong context leads to silent data corruption bugs that are extremely hard to diagnose.

While the bug's practical impact is limited (backslashes in column names are uncommon), it represents a correctness issue in the FDW infrastructure that could cause silent failures in production for users with unusual naming conventions.

Proposed Solutions

Original Patch (Satya)

The initial patch addressed the specific problem of backslash escaping within array literals. The exact approach in the draft patch is not detailed, but it targeted build_remattrmap() to properly handle the escaping.

Preferred Solution (Etsuro Fujita)

Fujita identified a simpler and more architecturally correct solution: replace quote_identifier() with deparseStringLiteral() for formatting column names in this context.

deparseStringLiteral() is the standard postgres_fdw utility for producing properly-escaped string literals for use in deparsed queries sent to remote servers. It correctly handles:

This approach is superior because:

  1. It reuses existing, well-tested FDW deparsing infrastructure
  2. It produces values that are safe in all string-parsing contexts (including array element parsing)
  3. It's consistent with how other parts of postgres_fdw construct remote queries

The fix was ultimately committed as 5107398e6d5e, resolving the issue through the thread referenced by Fujita.

Design Considerations

The key design insight here is that when constructing array literals for remote queries, each element needs to be escaped for two levels of parsing:

  1. The PostgreSQL array literal parser (which interprets backslashes)
  2. The SQL string/identifier context within the array

Using deparseStringLiteral() handles both levels correctly because it produces output that is safe against backslash interpretation, while quote_identifier() only addresses SQL identifier rules and is unaware of the array-parsing layer.