Making the ENUM operators LEAKPROOF

First seen: 2026-04-29 15:32:44+00:00 · Messages: 2 · Participants: 1

Latest Update

2026-05-11 · opus 4.7

Making ENUM Operators LEAKPROOF — Technical Analysis

The Core Problem

PostgreSQL's LEAKPROOF function attribute is a promise to the planner: the function will not leak information about its arguments via any side channel (error messages, timing, etc.) beyond its return value. This attribute is critical because it governs whether the optimizer may push qualifications below security barriers — specifically below Row-Level Security (RLS) policies and security_barrier views. A non-leakproof qual must be evaluated after the security filter, which frequently prevents index usage and blocks plan shapes that would otherwise be efficient.
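
To make the stakes concrete, here is a minimal toy model in Python of why evaluation order matters. It is not PostgreSQL code: the table, the policy, and the function names (visible_to, leaky_qual, scan) are all hypothetical. A predicate whose error message mentions its argument reveals a hidden row if it runs before the security filter, and reveals nothing if it runs after.

```python
# Toy model of a security barrier; names and data are illustrative only.
ROWS = [
    {"owner": "alice", "secret": 10},
    {"owner": "bob", "secret": 0},  # bob's row must stay hidden from alice
]


def visible_to(user, row):
    """Stand-in for an RLS policy: users see only their own rows."""
    return row["owner"] == user


def leaky_qual(row):
    """A non-leakproof predicate: its error message exposes the argument."""
    if row["secret"] == 0:
        raise ZeroDivisionError(f"division by zero for owner={row['owner']}")
    return 100 // row["secret"] > 5


def scan(user, qual, pushed_down):
    """Return visible rows passing qual.

    With pushed_down=True the qual runs before the security filter,
    which is what the planner must avoid for non-leakproof functions.
    """
    out = []
    for row in ROWS:
        if pushed_down:
            keep = qual(row) and visible_to(user, row)
        else:
            keep = visible_to(user, row) and qual(row)
        if keep:
            out.append(row)
    return out


def leaked_message(user, pushed_down):
    """Return the error text the user would observe, or None if no error."""
    try:
        scan(user, leaky_qual, pushed_down)
    except ZeroDivisionError as exc:
        return str(exc)
    return None
```

In this sketch, leaked_message("alice", pushed_down=False) returns None, while pushed_down=True surfaces an error naming bob's row. A leakproof qual is one for which the two orders are indistinguishable to the user, so the planner is free to pick the cheaper one.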

For ENUMs, the practical consequence is severe: users with RLS policies or security barrier views on tables containing enum columns cannot get index scans on equality/ordering predicates against those columns, because enum_eq, enum_lt, enum_cmp, etc. are currently not marked leakproof. This has been a recurring complaint (see prior threads [1], [2], [3] cited in the first email). The code change itself is a trivial catalog tweak (setting the leakproof flag on the relevant pg_proc entries); the hard part is the proof obligation.

Why ENUMs Are Architecturally Tricky for LEAKPROOF

Unlike most built-in scalar types, enum comparison is not always a pure function of its OID arguments. Because ALTER TYPE ... ADD VALUE ... BEFORE/AFTER can insert values anywhere in the sort order, the ordering cannot in general be read off the OIDs. Outside the trivial cases (equal OIDs, or two even-numbered OIDs, which are assigned so that they compare correctly by OID), enum_cmp_internal():

  1. Does a SearchSysCache1(ENUMOID, ...) to fetch the pg_enum row for the value and, via its enumtypid, locate the type cache entry for the enum type.
  2. Calls compare_values_of_enum(), which, when the sort order cannot be determined from the OIDs alone, consults a per-type sorted cache materialized by load_enum_cache_data(), an actual index scan on pg_enum.

This means enum comparison touches the syscache, the relcache, can allocate memory, and can emit several error messages. Each of these is a potential leak vector that must be reasoned about. This is why the discussion is less "flip a bit in pg_proc" and more "establish a methodology for LEAKPROOF proofs."
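
The shape of that comparison can be sketched as a toy Python model. This is an illustration, not the C implementation: the catalog contents and OIDs are made up, the lookups list is a hypothetical instrumentation hook, and the even-OID fast path reflects a reading of the PostgreSQL source rather than anything stated in the thread.

```python
# Toy model of enum comparison; the catalog contents are invented.
# PG_ENUM maps value OID -> (enumtypid, enumsortorder).
PG_ENUM = {
    1000: (500, 1.0),  # 'low'
    1002: (500, 2.0),  # 'high'
    1005: (500, 1.5),  # 'medium', added later BEFORE 'high'
}

lookups = []  # records catalog accesses, to show which path was taken


def enum_cmp(oid1, oid2):
    """Return -1, 0 or 1, loosely mimicking the shape of enum_cmp_internal()."""
    if oid1 == oid2:
        return 0
    # Fast path (assumption): two even OIDs compare correctly by OID alone.
    if oid1 % 2 == 0 and oid2 % 2 == 0:
        return -1 if oid1 < oid2 else 1
    # Slow path: consult the catalog for the sort order.
    lookups.append((oid1, oid2))
    for oid in (oid1, oid2):
        if oid not in PG_ENUM:
            # Simplified stand-in for the "invalid internal value" report.
            raise LookupError(f"invalid internal value for enum: {oid}")
    s1, s2 = PG_ENUM[oid1][1], PG_ENUM[oid2][1]
    return (s1 > s2) - (s1 < s2)
```

Comparing 1005 with 1002 yields -1 even though 1005 is the larger OID, and it is the only kind of call here that records a catalog lookup. That catalog dependency is exactly the part a leakproof argument has to cover.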

Laurenz's Proposed Methodology (the Real Contribution)

Laurenz Albe's central proposal is meta-technical: establish the goal posts for what LEAKPROOF requires. He argues the bar should be calibrated against what RLS and security barrier views actually promise, not against a theoretical ideal. His framework:

  1. Threat model = application-supplied parameters, not arbitrary SQL. The RLS/security-barrier documentation already concedes that a user able to run arbitrary SQL can defeat these mechanisms (e.g., via EXPLAIN (ANALYZE), custom functions, side-channel observation). Therefore LEAKPROOF need only defend against attackers who inject values into a parameterized query executed by a trusted application. This excludes timing attacks (swamped by network/app jitter) and anything requiring catalog manipulation.

  2. OOM errors can be disregarded. Several allocators (dshash.c, mbutils.c, dsa.c, mcxt.c) emit the requested allocation size in their error message, which is technically data-dependent. But triggering a specific allocation failure through an application-level parameter requires engineering memory pressure at exactly the right moment — not feasible as an application-level attack. Laurenz notes he'd alternatively prefer stripping sizes from these messages to remove the concern entirely.

  3. Data-corruption-only error paths don't count. Errors reachable only when catalog contents are inconsistent (e.g., an enum OID not present in pg_enum) are excluded, because a user cannot induce corruption via SQL parameters.
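
Point 2 can be illustrated with a small Python sketch. The allocator, its message wording, and the report_size switch are hypothetical and only loosely modeled on the kind of report described above; the point is that a message carrying the request size varies with the data, while a message without it does not.

```python
class OutOfMemory(Exception):
    """Stand-in for an out-of-memory error report."""


def alloc(size, available, report_size=True):
    """Pretend allocator: fails when size exceeds the available bytes."""
    if size > available:
        if report_size:
            detail = f"Failed on request of size {size}."
        else:
            detail = "Failed on memory request."
        raise OutOfMemory(detail)
    return bytearray(size)


def observe(size, available, report_size):
    """Return the error text a client would see, or None on success."""
    try:
        alloc(size, available, report_size)
    except OutOfMemory as exc:
        return str(exc)
    return None
```

With report_size=True, failures for requests of 100 and 200 bytes produce different messages; with report_size=False they are identical. This mirrors the alternative Laurenz mentions of stripping sizes from the messages, as opposed to declaring the path out of scope.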

Analysis of the ENUM Error Paths

Applying his framework, Laurenz enumerates the three user-visible ereports in the enum comparison stack:

Error | Location | Reachability via parameterized SQL
----- | -------- | ----------------------------------
"invalid internal value for enum: %u" | enum_cmp_internal | Requires passing a non-enum to the comparator; impossible through SQL's type system
"%s is not an enum" | load_enum_cache_data | Same; unreachable via SQL
"enum value %u not found in cache for enum %s" | compare_values_of_enum | Requires comparing an enum OID not in pg_enum for that type; only reachable via corruption

The syscache/index-scan memory allocations cache information (pg_enum contents) that is already publicly readable (pg_enum has no ACL restriction), so even if allocation behavior were observable, no secret is exposed.

SearchCatCacheInternal has DEBUG2 messages, but at normal log_min_messages settings they emit nothing, and even at DEBUG2 they do not print catalog values, so they cannot leak the compared values.

Conclusion: all error paths are either SQL-unreachable or corruption-only, so the enum comparison operators meet the proposed LEAKPROOF bar.
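
The reasoning behind this conclusion can be written down as a small Python checklist. This is a sketch of the proposed methodology, not an existing tool: the ErrorPath type, its field names, and the helper functions are hypothetical, and the flags simply restate the analysis of the three enum error reports above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ErrorPath:
    message: str
    location: str
    sql_reachable: bool    # can a SQL parameter value trigger it?
    corruption_only: bool  # does it require inconsistent catalogs?
    oom: bool              # is it an out-of-memory report?


def counts_as_leak(path):
    """Apply the proposed exclusions; True means the path is a real leak."""
    if path.oom or path.corruption_only:
        return False
    return path.sql_reachable


ENUM_PATHS = [
    ErrorPath("invalid internal value for enum: %u",
              "enum_cmp_internal", False, False, False),
    ErrorPath("%s is not an enum",
              "load_enum_cache_data", False, False, False),
    ErrorPath("enum value %u not found in cache for enum %s",
              "compare_values_of_enum", False, True, False),
]


def is_leakproof(paths):
    """A function passes if none of its error paths counts as a leak."""
    return not any(counts_as_leak(p) for p in paths)
```

Under these assumptions is_leakproof(ENUM_PATHS) is True. Adding any path with sql_reachable=True that is neither OOM nor corruption-only flips the result, which is the test any other candidate function would have to pass under the same framework.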

The text Precedent (Follow-up)

In the May 10 follow-up, Laurenz strengthens the case via a precedent argument: texteq/text_lt/etc. are already marked leakproof. But text has a worse leak surface than enum, because comparing against a TOASTed value triggers detoasting, which can OOM in a size-dependent way — so an attacker supplying a short probe value could in principle learn something about the length of the stored value via OOM timing/error. The project has collectively decided this risk is acceptable for text. Since enum_cmp has no detoast path and no value-dependent allocation, it is strictly safer than an already-blessed leakproof function. This is a strong "a fortiori" argument.

Design Decisions & Tradeoffs

Participant Weight

Only Laurenz Albe has posted in this thread so far. Laurenz is a well-known Cybertec contributor with significant work on RLS, security, and documentation; his opinions on the RLS threat model carry weight because he authored much of the current thinking on [4] (the referenced security-list discussion). The thread is effectively an RFC awaiting committer input — the eventual arbiter will almost certainly be Tom Lane, who has historically gated LEAKPROOF decisions and was the one who marked text operators leakproof, establishing the precedent Laurenz leans on.

Open Questions Posed to the List

  1. Is the "application-parameter-only" threat model the right bar for LEAKPROOF?
  2. Can OOM errors be categorically disregarded?
  3. Given (1) and (2), are the enum comparators leakproof?

The thread is currently awaiting answers to (1) and (2) — these are the real deliverables. A "yes" to both would unlock not just enum operators but a broader category of functions currently blocked by nominal-but-unexploitable leak paths.