Should IGNORE NULLS cache nullness for volatile arguments?

First seen: 2026-05-14 04:14:54+00:00 · Messages: 1 · Participants: 1

Latest Update

2026-05-14 · claude-opus-4-6

Technical Analysis: Should IGNORE NULLS Cache Nullness for Volatile Arguments?

Core Problem

The thread raises a subtle correctness concern with PostgreSQL's implementation of IGNORE NULLS support for window functions (a feature that was added in a recent release). The issue lies at the intersection of expression evaluation caching and volatile function semantics.

How IGNORE NULLS Works Internally

When IGNORE NULLS is specified for a window function (e.g., LEAD, LAG, FIRST_VALUE, LAST_VALUE, NTH_VALUE), the executor must skip over rows where the argument expression evaluates to NULL. The implementation optimizes this by caching the nullness determination (recording whether a given argument was NULL or NOT NULL when first checked), so that later lookups over the same rows avoid redundant evaluation.
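
The caching described above can be illustrated with a small toy model. This is only a sketch in Python, not PostgreSQL's C code: the class `NullnessCache`, the three state constants, and the helper `lead_ignore_nulls` are all hypothetical names, and the choice to fetch the value with a second evaluation is an assumption based on the thread's description.

```python
from typing import Callable, Optional

# Hypothetical cache states for one row's argument.
UNKNOWN, IS_NULL, NOT_NULL = 0, 1, 2


class NullnessCache:
    """Toy per-partition cache: remembers whether the argument was NULL at each row."""

    def __init__(self, rows: list, arg: Callable[[object], Optional[object]]):
        self.rows = rows
        self.arg = arg
        self.state = [UNKNOWN] * len(rows)
        self.evaluations = 0  # counts calls to the argument expression

    def evaluate(self, pos: int):
        self.evaluations += 1
        return self.arg(self.rows[pos])

    def is_null(self, pos: int) -> bool:
        # Evaluate only the first time a row is checked; reuse the flag afterwards.
        if self.state[pos] == UNKNOWN:
            value = self.evaluate(pos)
            self.state[pos] = IS_NULL if value is None else NOT_NULL
        return self.state[pos] == IS_NULL


def lead_ignore_nulls(cache: NullnessCache, current: int):
    """Return the argument value of the next row after `current` that is not NULL."""
    for pos in range(current + 1, len(cache.rows)):
        if not cache.is_null(pos):
            return cache.evaluate(pos)  # the value comes from a second evaluation
    return None


rows = [1, None, None, 4]
cache = NullnessCache(rows, arg=lambda r: r)
results = [lead_ignore_nulls(cache, i) for i in range(len(rows))]
```

In this model the two NULL rows are tested once each and then skipped from the cached flags on every later lookup. Note that only the flag is stored; the value itself is always produced by evaluating the argument again.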

The Volatility Problem

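The optimization assumes that the nullness of an expression is stable across evaluations within the same row context. This assumption holds for non-volatile arguments, such as plain column references and expressions built only from immutable or stable functions, because re-evaluating them against the same row yields the same result.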

However, it breaks for volatile expressions (e.g., random(), functions with side effects, or functions that depend on external state). A volatile argument could:

  1. Evaluate as NOT NULL during the nullness check phase
  2. Be marked as "not null" in the cache
  3. Later, when the actual value is needed, be re-evaluated and return NULL

This creates a semantic inconsistency: the system believes it has a non-null value (based on the cached nullness determination), but the actual re-evaluation produces NULL. The result is that a NULL value could "leak through" the IGNORE NULLS filter, or conversely, the system could attempt to use a value that has silently become NULL.
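
The three-step failure can be reproduced in a toy model. The sketch below is an illustration only: `FlipFlop` is a deterministic stand-in for a volatile function (chosen so the outcome does not depend on random numbers), and `first_value_ignore_nulls` with its `not_null` dictionary is a hypothetical two-phase lookup, not the actual executor code.

```python
from typing import Optional


class FlipFlop:
    """Stand-in for a volatile function: non-NULL on odd calls, NULL on even calls."""

    def __init__(self):
        self.calls = 0

    def __call__(self, row) -> Optional[int]:
        self.calls += 1
        return row if self.calls % 2 == 1 else None


def first_value_ignore_nulls(rows, arg):
    """Two-phase lookup: record nullness first, then re-evaluate to fetch the value."""
    not_null = {}  # hypothetical nullness cache: position -> bool
    for pos, row in enumerate(rows):
        if pos not in not_null:
            not_null[pos] = arg(row) is not None  # phase 1: nullness check
        if not_null[pos]:
            return arg(row)  # phase 2: value retrieval evaluates the argument again
    return None


# Volatile argument: the first call says "not null", the second call returns NULL.
leaked = first_value_ignore_nulls([10, 20, 30], FlipFlop())

# Non-volatile argument: both phases agree, so the NULL row is skipped correctly.
stable = first_value_ignore_nulls([None, 20], lambda r: r)
```

With the volatile stand-in, `leaked` is `None` even though the lookup was supposed to ignore NULLs; with the plain column-like argument, `stable` is `20`.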

Why This Matters Architecturally

PostgreSQL has a long-standing contract that volatile functions are re-evaluated on every reference. The caching optimization violates this contract by splitting evaluation into two phases (nullness check and value retrieval) that can disagree for volatile expressions. While users of volatile functions in window contexts might expect non-deterministic results, they should not get internally inconsistent results where the system's own invariants are violated.

Proposed Solution

The patch takes a conservative approach:

  1. Check expression cacheability: Before relying on the cached nullness result, verify that the argument expression is safe to reuse (i.e., it contains no volatile components).
  2. Fall back for volatile arguments: When the argument is not cacheable (volatile), treat the nullness as unknown and re-evaluate the argument expression, ensuring the nullness determination and value retrieval are consistent.
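
The two steps above might look roughly like the following toy sketch. The function `next_non_null`, its `arg_is_volatile` flag, and the `NullFirst` stand-in are hypothetical. PostgreSQL does provide `contain_volatile_functions()` for testing an expression tree for volatility, but whether the patch uses it to make this decision is an assumption, not something the thread states.

```python
from typing import Callable, Optional


class NullFirst:
    """Stand-in for a volatile function: NULL on odd calls, non-NULL on even calls."""

    def __init__(self):
        self.calls = 0

    def __call__(self, row) -> Optional[int]:
        self.calls += 1
        return None if self.calls % 2 == 1 else row


def next_non_null(rows, arg: Callable, start: int, cache: dict, arg_is_volatile: bool):
    """Return the first non-NULL argument value at or after `start`, or None."""
    for pos in range(start, len(rows)):
        if arg_is_volatile:
            # Fallback: treat nullness as unknown, evaluate once, and use that
            # single result for both the NULL test and the returned value.
            value = arg(rows[pos])
            if value is not None:
                return value
            continue
        # Cacheable argument: the stored nullness flag is safe to reuse.
        if pos not in cache:
            cache[pos] = arg(rows[pos]) is None
        if not cache[pos]:
            return arg(rows[pos])
    return None


# Trusting the cache with a volatile argument lets a NULL through.
leaky = next_non_null([10, 20, 30], NullFirst(), 0, {}, arg_is_volatile=False)

# Declaring the argument volatile keeps the NULL test and the value consistent.
safe = next_non_null([10, 20, 30], NullFirst(), 0, {}, arg_is_volatile=True)
```

Here `leaky` is `None` while `safe` is `20`. The volatile path gives up the cache entirely, which is the cost the patch accepts in exchange for consistency.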

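This approach keeps the nullness cache for arguments that are safe to reuse and gives up the optimization only for volatile arguments, where the cached flag cannot be trusted.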

Design Tradeoffs

 Approach                      | Pros                  | Cons
-------------------------------+-----------------------+----------------------------------------
 Current (cache always)        | Fast for all cases    | Incorrect for volatile args
 Patch (cache if safe)         | Correct for all cases | Slightly slower for volatile args
 Cache both nullness AND value | Correct and fast      | Higher memory usage, more complex code

The author notes an alternative: if both the nullness determination and the value itself were cached together (so the cached value is used instead of re-evaluating), the inconsistency would disappear. However, the current implementation only caches the nullness flag, not the actual value, which creates the window for divergence.
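
A minimal sketch of that alternative, again as a toy model with hypothetical names (`ValueCache`, `first_value_ignore_nulls`, `FlipFlop`): the evaluated value is stored on first use, so the NULL test and the returned value necessarily come from the same evaluation.

```python
from typing import Optional


class FlipFlop:
    """Stand-in for a volatile function: non-NULL on odd calls, NULL on even calls."""

    def __init__(self):
        self.calls = 0

    def __call__(self, row) -> Optional[int]:
        self.calls += 1
        return row if self.calls % 2 == 1 else None


class ValueCache:
    """Hypothetical cache that stores the evaluated value, not just a NULL flag."""

    _MISSING = object()  # sentinel, so a cached NULL differs from "not evaluated yet"

    def __init__(self, rows, arg):
        self.rows = rows
        self.arg = arg
        self.values = [self._MISSING] * len(rows)

    def get(self, pos):
        if self.values[pos] is self._MISSING:
            self.values[pos] = self.arg(self.rows[pos])  # evaluated at most once per row
        return self.values[pos]


def first_value_ignore_nulls(cache: ValueCache):
    """Return the first cached argument value that is not NULL."""
    for pos in range(len(cache.rows)):
        value = cache.get(pos)
        if value is not None:
            return value
    return None


volatile_arg = FlipFlop()
cache = ValueCache([10, 20, 30], volatile_arg)
first = first_value_ignore_nulls(cache)
again = first_value_ignore_nulls(cache)
```

Both lookups return the same value and the argument is evaluated only once, at the cost of keeping one stored value per row. It also means a volatile argument is evaluated once per row rather than on each reference, which is a semantic choice in its own right.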

Technical Assessment

This is a real correctness issue, albeit one that only manifests in edge cases (volatile expressions used as arguments to window functions with IGNORE NULLS), so its severity is moderate.

The patch appears to be a minimal, targeted fix that respects PostgreSQL's existing patterns for handling volatile expressions (similar to how the planner declines to use volatile expressions as index scan conditions, and to other contexts where a previously computed result is reused).

Open Questions

  1. Should the fix instead cache the evaluated value alongside the nullness flag, avoiding re-evaluation entirely?
  2. Are there other places in the window function code where similar volatility assumptions are made?
  3. Should this be back-patched to whatever version introduced IGNORE NULLS support?