Refactor code around GUC default_toast_compression

First seen: 2026-05-01 07:50:46+00:00 · Messages: 5 · Participants: 3

Latest Update

2026-05-14 · claude-opus-4-6

Incremental Update: v3 Patch — GUC Encoding Reversal and Naming Changes

Key Development: Paquier Abandons the Separate GUC Integer Encoding

The most significant change in this round is Michael Paquier's reversal on the GUC representation strategy. After reflecting on Evan Chao's feedback, he concludes that the separate _GUC integer encoding (TOAST_PGLZ_COMPRESSION_GUC = 0, TOAST_LZ4_COMPRESSION_GUC = 1) "serves no actual purpose" and the GUC can simply continue storing the attcompression char values directly. This eliminates an entire layer of translation that v2 introduced.

This is a meaningful simplification: the registry no longer has to translate among three distinct value domains (on-disk ID, catalog char, GUC int). It now translates between only two (on-disk ID and catalog char), and the GUC reuses the catalog char as before.
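The two-domain arrangement can be illustrated with a minimal sketch. It is not taken from the patch: the struct, the table, and every sketch_* name are hypothetical, and the value 1 for the lz4 on-disk ID and the invalid marker are assumptions (the thread only quotes 0 for pglz and the chars 'p' and 'l').

```c
#include <stddef.h>

/* Catalog char domain (attcompression); 'p' and 'l' are the values the
 * thread says the GUC keeps storing. */
#define TOAST_PGLZ_COMPRESSION 'p'
#define TOAST_LZ4_COMPRESSION  'l'

/* On-disk ID domain, named with the _ID suffix. Only 0 for pglz is quoted
 * in the thread; the other two values are assumptions for illustration. */
typedef enum ToastCompressionId
{
    TOAST_PGLZ_COMPRESSION_ID = 0,
    TOAST_LZ4_COMPRESSION_ID = 1,
    TOAST_INVALID_COMPRESSION_ID = 2
} ToastCompressionId;

/* Hypothetical registry entry: one row per method, pairing both domains. */
typedef struct SketchCompressionEntry
{
    ToastCompressionId id;      /* on-disk ID */
    char               method;  /* catalog char */
} SketchCompressionEntry;

static const SketchCompressionEntry sketch_registry[] = {
    {TOAST_PGLZ_COMPRESSION_ID, TOAST_PGLZ_COMPRESSION},
    {TOAST_LZ4_COMPRESSION_ID, TOAST_LZ4_COMPRESSION},
};

#define SKETCH_REGISTRY_LEN \
    (sizeof(sketch_registry) / sizeof(sketch_registry[0]))

/* Catalog char -> on-disk ID; unknown chars yield the invalid marker. */
static ToastCompressionId
sketch_id_from_method(char method)
{
    for (size_t i = 0; i < SKETCH_REGISTRY_LEN; i++)
    {
        if (sketch_registry[i].method == method)
            return sketch_registry[i].id;
    }
    return TOAST_INVALID_COMPRESSION_ID;
}

/* On-disk ID -> catalog char; unknown IDs yield '\0'. */
static char
sketch_method_from_id(ToastCompressionId id)
{
    for (size_t i = 0; i < SKETCH_REGISTRY_LEN; i++)
    {
        if (sketch_registry[i].id == id)
            return sketch_registry[i].method;
    }
    return '\0';
}

/* Stand-in for the GUC: it holds the catalog char directly, so no third
 * encoding and no extra translation step is needed. */
static int sketch_default_toast_compression = TOAST_PGLZ_COMPRESSION;
```

With this shape, resolving the GUC to an on-disk ID is a single lookup, sketch_id_from_method((char) sketch_default_toast_compression), rather than a GUC-int to catalog-char to on-disk-ID chain.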

Naming Concession: _ID Suffix Adopted

Paquier accepts Chao's naming suggestion, but with a different suffix from the one originally proposed. The on-disk compression ID macros will use _ID (e.g., TOAST_PGLZ_COMPRESSION_ID) to distinguish them from the catalog char macros. This addresses the readability hazard where TOAST_COMPRESS_PGLZ (value 0) and TOAST_PGLZ_COMPRESSION (value 'p') were confusingly similar despite belonging to entirely different value domains.

Forward-Looking Architectural Notes

Paquier signals two future directions beyond this patch:

  1. CompressionIdIsValid() could become smarter — potentially handling cases where the same 2-bit ID values map to different compression methods across different vartag_external or varlena types. This hints at a possible path to expand beyond the 4-slot ceiling without changing the on-disk bit width.
  2. ToastCompressionId could become uint32 with values moved to varatt.h — but Paquier explicitly says he wants to be "more ambitious" than that and considers this patch a stepping stone.
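
The first direction can be pictured with a speculative sketch. Nothing below comes from the patch or from PostgreSQL itself: the kind enum, the slot table, the placeholder method names, and the two-argument validity check are all hypothetical, and serve only to show how the same 2-bit value could resolve to different methods depending on the surrounding datum type.

```c
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_ID_BITS  2
#define SKETCH_ID_SLOTS (1u << SKETCH_ID_BITS)   /* 4 slots per context */

/* Hypothetical stand-in for "which vartag_external or varlena type". */
typedef enum SketchVarlenaKind
{
    SKETCH_KIND_A = 0,
    SKETCH_KIND_B = 1,
    SKETCH_KIND_COUNT
} SketchVarlenaKind;

/* Per-context slot tables. A NULL entry means the slot is unassigned.
 * "method_x" and "method_y" are placeholders, not real methods. */
static const char *const sketch_slots[SKETCH_KIND_COUNT][SKETCH_ID_SLOTS] = {
    /* kind A */ {"pglz", "lz4", NULL, NULL},
    /* kind B */ {"method_x", "method_y", "lz4", NULL},
};

/* Context-aware validity check: the same ID can be valid in one kind
 * and invalid in another. */
static bool
sketch_compression_id_is_valid(SketchVarlenaKind kind, unsigned int id)
{
    if ((unsigned int) kind >= SKETCH_KIND_COUNT)
        return false;
    if (id >= SKETCH_ID_SLOTS)
        return false;           /* does not fit in the 2-bit field */
    return sketch_slots[kind][id] != NULL;
}

/* Resolve an ID to a method name within a context, or NULL if invalid. */
static const char *
sketch_compression_name(SketchVarlenaKind kind, unsigned int id)
{
    if (!sketch_compression_id_is_valid(kind, id))
        return NULL;
    return sketch_slots[kind][id];
}
```

Under these assumptions each context still has only four slots, but the total number of addressable methods grows with the number of contexts, which is one reading of how the 4-slot ceiling could be lifted without widening the on-disk field.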

Implications for the ABI-Break Concern

By dropping the separate GUC encoding entirely, the ABI-break concern Chao raised is largely neutralized. If the GUC continues to store 'p'/'l' as before, out-of-tree extensions reading default_toast_compression directly will see the same values they always have. The rename/suffix question for the GUC variable itself becomes moot. The remaining naming changes (_ID suffix on on-disk macros) will still cause compile errors for code using the old macro names, which is desirable: stale references fail loudly at build time instead of silently using a value from the wrong domain.
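
To make the extension-side concern concrete, here is a minimal sketch of the kind of out-of-tree code that would have been affected. It is an illustration under stated assumptions: the variable is a local stand-in rather than the server's real declaration, and sketch_describe_toast_default is a hypothetical helper.

```c
#include <stddef.h>

/* Catalog char values the GUC is described as storing. */
#define TOAST_PGLZ_COMPRESSION 'p'
#define TOAST_LZ4_COMPRESSION  'l'

/* Local stand-in for the server's GUC variable. A real extension would
 * get the declaration from a server header instead of defining it. */
static int default_toast_compression = TOAST_PGLZ_COMPRESSION;

/* Hypothetical extension helper that interprets the GUC value as a
 * catalog char. Returns NULL for values it does not recognize. */
static const char *
sketch_describe_toast_default(int guc_value)
{
    switch (guc_value)
    {
        case TOAST_PGLZ_COMPRESSION:
            return "pglz";
        case TOAST_LZ4_COMPRESSION:
            return "lz4";
        default:
            /* Under a 0/1 integer encoding, every setting would land
             * here, with no compiler diagnostic to flag the change. */
            return NULL;
    }
}
```

Because the variable keeps holding 'p' or 'l', a call such as sketch_describe_toast_default(default_toast_compression) keeps working unchanged. Had the GUC switched to 0 and 1, the same code would have compiled cleanly and silently taken the default branch.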