[Patch] Omit virtual generated columns from test_decoding output

First seen: 2026-05-05 01:11:11+00:00 · Messages: 3 · Participants: 2

Latest Update

2026-05-06 · opus 4.7

Analysis: Omit virtual generated columns from test_decoding output

Core Problem

Virtual generated columns (introduced in PostgreSQL 18) are computed on read and never materialized in the heap tuple or the WAL stream. This creates a representational mismatch for logical decoding consumers: the column appears in the table's definition, yet a decoded change never carries a value for it.

This is not a WAL bug — WAL correctly carries only the base columns. The defect is purely in how test_decoding renders the reassembled tuple: it treats a virtual generated column as if it were a normal heap attribute that happened to be NULL, rather than recognizing that the column has no physical representation at all.

Why It Matters Architecturally

Logical decoding output plugins are the contract between the WAL replay pipeline and downstream consumers (CDC, logical replication, test harnesses). The in-core plugin pgoutput already resolves this correctly: logicalrep_should_publish_column() excludes virtual generated columns from the column list sent over the wire, irrespective of the publication's publish_generated_columns option (which only applies to stored generated columns, since only those have real values to publish).

test_decoding, historically a debugging/regression-testing plugin, has diverged. While it is documented as an example plugin, it is in practice used:

  1. As the canonical reference for output plugin authors.
  2. By the PostgreSQL test suite itself to validate decoding behavior.
  3. By some third-party CDC solutions (as Euler notes) that consume its text output directly.

Emitting a spurious null for a virtual column therefore reaches all three audiences, and it diverges from what pgoutput sends for the same table.

Proposed Fix

The patch filters out attributes with attgenerated == ATTRIBUTE_GENERATED_VIRTUAL in tuple_to_stringinfo() before calling heap_getattr(). Stored generated columns (attgenerated == ATTRIBUTE_GENERATED_STORED, i.e. 's') are preserved because their values are genuinely in the heap tuple and in WAL, so skipping them would be a regression.
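A minimal, self-contained sketch of that filter follows. It is not the patch itself: SketchColumn and sketch_tuple_to_string() are hypothetical stand-ins for the tuple descriptor and tuple_to_stringinfo(), and the name[type]:value rendering is only an approximation of the plugin's text format.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define GEN_STORED  's'
#define GEN_VIRTUAL 'v'

typedef struct
{
    const char *name;
    const char *type;
    char        generated;  /* '\0' for a regular column */
    const char *value;      /* NULL when nothing is stored for the column */
} SketchColumn;

/* Render columns as name[type]:value, skipping virtual generated ones. */
static void
sketch_tuple_to_string(const SketchColumn *cols, int ncols,
                       char *out, size_t outlen)
{
    size_t used = 0;

    assert(out != NULL && outlen > 0);
    out[0] = '\0';

    for (int i = 0; i < ncols; i++)
    {
        int n;

        /* The filter runs before any attempt to fetch the value. */
        if (cols[i].generated == GEN_VIRTUAL)
            continue;

        n = snprintf(out + used, outlen - used, "%s%s[%s]:%s",
                     used > 0 ? " " : "",
                     cols[i].name, cols[i].type,
                     cols[i].value ? cols[i].value : "null");
        if (n < 0 || (size_t) n >= outlen - used)
            break;              /* buffer full: stop rather than overrun */
        used += (size_t) n;
    }
}
```

With a regular column a, a stored generated column b, and a virtual generated column c, the sketch renders only a and b; without the filter, c would appear with the value null.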

The v1 patch added a new regression test file; v2 folds the test into the existing contrib/test_decoding/sql/ddl.sql per Euler's feedback about test execution time and proliferation of small test files.

Key Design Tension: Backward Compatibility

Euler Taveira (a long-time logical replication contributor and author of the wal2json output plugin) raises the central design question: is the current output a bug, or expected behavior that downstream tools depend on?

His framing — "I wouldn't say misleading but expected" — is technically defensible: given that virtual columns aren't in WAL, NULL is the mechanically consistent output. Downstream CDC tools parsing test_decoding output may already special-case or ignore this NULL, and silently dropping the column changes the column count / positional layout of the text output.
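The layout concern can be made concrete with a small helper that reports where a column sits in a rendered column list. This is an illustrative sketch only: sketch_column_position() is a hypothetical name, and it assumes space-separated name[type]:value tokens whose values contain no spaces, which real output does not guarantee.

```c
#include <assert.h>
#include <string.h>

/*
 * Return the zero-based position of column `name` in a space-separated list
 * of name[type]:value tokens, or -1 if the column is absent.
 * Assumption: values contain no spaces.
 */
static int
sketch_column_position(const char *cols, const char *name)
{
    size_t      namelen;
    int         pos = 0;
    const char *p = cols;

    assert(cols != NULL && name != NULL);
    namelen = strlen(name);

    while (*p != '\0')
    {
        const char *end = strchr(p, ' ');
        size_t      toklen = end ? (size_t) (end - p) : strlen(p);

        /* A token matches when it starts with the name followed by '['. */
        if (toklen > namelen &&
            strncmp(p, name, namelen) == 0 && p[namelen] == '[')
            return pos;

        if (end == NULL)
            break;
        p = end + 1;
        pos++;
    }
    return -1;
}
```

For a table whose virtual column c sits between a and b, column b is at position 2 in "a[integer]:1 c[integer]:null b[integer]:2" and at position 1 once c is omitted. A consumer that looks columns up by name is unaffected; one that indexes by position is not.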

Two options are on the table:

  1. Unconditional fix (Satya's preference): Just omit the columns. Cleaner, matches pgoutput, treats the current output as a bug.
  2. Opt-in via plugin option (Euler's suggestion): Add a parameter like include-virtual-generated-columns defaulting to the old behavior, preserving compatibility.
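The second option could look roughly like the sketch below. Everything here is hypothetical: the thread only floats a parameter "like" include-virtual-generated-columns, so the SketchOptions struct, the accepted value spellings, and the treatment of a missing value as "on" are assumptions made for illustration, not behavior taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

typedef struct
{
    /* true keeps the pre-patch output (virtual columns shown as null) */
    bool        include_virtual_gencols;
} SketchOptions;

/*
 * Apply one name/value option pair.
 * Returns false for an unknown option name or an unrecognized value.
 */
static bool
sketch_apply_option(SketchOptions *opts, const char *name, const char *value)
{
    assert(opts != NULL && name != NULL);

    if (strcmp(name, "include-virtual-generated-columns") != 0)
        return false;

    /* Assumption: an option given without a value means "on". */
    if (value == NULL || strcmp(value, "on") == 0 || strcmp(value, "true") == 0)
        opts->include_virtual_gencols = true;
    else if (strcmp(value, "off") == 0 || strcmp(value, "false") == 0)
        opts->include_virtual_gencols = false;
    else
        return false;
    return true;
}

/* Decide whether a column with the given attgenerated marker is emitted. */
static bool
sketch_emit_column(const SketchOptions *opts, char generated)
{
    return generated != 'v' || opts->include_virtual_gencols;
}
```

Initializing include_virtual_gencols to true models "defaulting to the old behavior"; the unconditional fix corresponds to dropping the flag and always skipping the 'v' case.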

Satya pushes back that the old output is always NULL and therefore carries no information — making an option feel like over-engineering. This is a reasonable point: the "compatibility" being preserved is compatibility with a constant null token, not with any real data.

The thread does not resolve this; Euler explicitly defers to broader community input ("I am open to this idea if others feel the same").

Technical Subtleties

Assessment

The patch is small, correct, and aligns test_decoding with the semantics already established in pgoutput. The only open question is compatibility policy. Three factors favor the unconditional fix: (a) the removed output is informationless (always NULL), (b) test_decoding is a contrib example plugin with weaker stability guarantees than core wire protocols, and (c) virtual generated columns are new in PostgreSQL 18, so the window in which tools could have come to depend on the old output is narrow. That policy judgment still needs a committer to weigh in.