Fix HAVING-to-WHERE pushdown with nondeterministic collations

First seen: 2026-03-31 03:41:09+00:00 · Messages: 7 · Participants: 3

Latest Update

2026-05-06 · opus 4.7

Core Problem: Collation-Aware Semantics vs. HAVING-to-WHERE Pushdown

PostgreSQL has a long-standing planner optimization that moves HAVING quals to the WHERE clause when they don't reference any aggregate and don't depend on grouping semantics. This is a meaningful win: WHERE quals are evaluated before grouping, dramatically reducing the number of rows fed into aggregation. The transformation is performed in subquery_planner() / preprocess_grouping_sets() area and has been safe for decades under the assumption that a predicate like x = 'a' produces the same answer regardless of whether it runs pre- or post-grouping.

Nondeterministic collations (introduced in PG12) break that assumption. With a nondeterministic collation, multiple distinct byte sequences can be considered equal. When GROUP BY x uses collation case_insensitive, the rows 'a' and 'A' collapse into a single group represented by one of the two values. A HAVING clause such as x = 'a' COLLATE "C" applied post-grouping tests the single surviving representative; applied pre-grouping (after pushdown to WHERE) it tests each individual row. These give different answers:

SELECT x, count(*) FROM t GROUP BY x HAVING x = 'a' COLLATE "C";
-- Correct (post-group): count=2 for the merged 'a'/'A' group (if representative is 'a')
-- Incorrect (pushed):   count=1, because 'A' is filtered out before grouping

This is a correctness bug, not a performance issue — the planner silently changes aggregate results.

Why This Is Architecturally Tricky

The pushdown safety check traditionally only asked: "does this qual reference aggregates or grouping-set columns?" It had no concept of collation identity of the grouping operation. Under deterministic collations, equality is bit-identical, so the representative row is interchangeable with any other row in the group, and pre- vs. post-group filtering are equivalent. Nondeterministic collations destroy that equivalence.

The fix requires the planner to understand, per HAVING sub-expression, whether the collation used to evaluate a comparison matches the collation that defined the group. If they differ, the qual must stay in HAVING.

The v18+ Solution: Leveraging RTE_GROUP Vars

Richard Guo's approach exploits a v18 infrastructure change: GROUP Vars — Vars that reference a synthetic RTE_GROUP range table entry representing the grouped relation. A Var with varno == grouped RTE carries the collation of the grouping expression in its varcollid. This is the key hook: when walking a HAVING clause, encountering a GROUP Var tells you "this value has grouping-collation semantics"; the surrounding operator node's inputcollid tells you what collation the comparison will actually use. A mismatch where the grouping collation is nondeterministic means pushdown is unsafe for that sub-clause.

Critically, this is per-clause granularity: safe clauses still get pushed; only unsafe ones stay in HAVING. A blanket opt-out would regress performance for the common case.

Evolution of the Walker

  1. v1 — Bottom-up: at each collation-aware node, call pull_var_clause() on children to find GROUP Vars. Wenhui Qiu flagged the O(n²) risk of repeated subtree traversal.
  2. v2 — Fixed a real bug: pull_var_clause() must recurse through Aggref nodes because havingQual still contains them at this planning stage, and GROUP Vars can appear inside Aggref direct arguments.
  3. v3 (committed) — Inverted to top-down with a LIFO collation stack: push each node's inputcollid onto the stack on descent, pop on ascent; when a GROUP Var is reached, compare its varcollid against the stack. Single-pass, O(n).

The RowCompareExpr Special Case

exprInputCollation() returns a single collation, but RowCompareExpr has a per-column inputcollids[] array. The walker must manually descend into each (largs[i], rargs[i]) pair pushing the matching per-column collation. Without this, ROW(x, 1) < ROW('ABC' COLLATE case_sensitive, 1) over a case-insensitive group gives wrong results. This foreshadows the post-commit defect.

The Backpatch Decision

Richard declined to backpatch before v18 because pre-v18 branches lack RTE_GROUP / GROUP Vars entirely. A pre-v18 fix would require a fundamentally different, coarser mechanism — likely "disable pushdown if any GROUP BY expression uses a nondeterministic collation." Carrying a divergent, more-invasive fix on stable branches against zero field reports was judged higher-risk than leaving the bug. This is a defensible stability-over-correctness call for an obscure corner case, though it leaves v15–v17 silently wrong.

Post-Commit Defect: CaseExpr Bypass (Satya's Report)

Satya Narlapuram identified that the top-down walker only recognizes collation-bearing node types covered by exprInputCollation() plus the manually-handled RowCompareExpr. CaseExpr is not covered. A CASE x WHEN 'abc' COLLATE cs THEN ... END performs an equality test between x and the WHEN expression using the WHEN's collation, but because the walker never pushes anything onto the collation stack for CaseExpr, the GROUP Var x sees an empty/matching stack and the clause is deemed safe for pushdown. Result: the case-sensitive filter runs pre-grouping and eliminates 'ABC' before it can be counted alongside 'abc'.

The proposed fix mirrors the RowCompareExpr handling: add an explicit CaseExpr branch that, for the shorthand CASE expr WHEN ... form, pushes the comparison collation while descending into arg and each CaseWhen.expr.

Architectural Lesson

This defect exposes a completeness fragility in the design: the walker is a whitelist of node types that introduce comparison collations, and any node type not in the whitelist silently creates a correctness hole. Candidates worth auditing similarly include:

A more robust design might invert the logic: enumerate every node that performs a collation-sensitive comparison and require explicit handling, with an assertion failure for unrecognized comparison-bearing nodes in assert builds.

Performance Considerations

Wenhui Qiu's early concern about pull_var_clause() overhead was legitimate in principle (repeated subtree scans → quadratic), though Richard correctly noted HAVING clauses are human-written and small. The v3 rewrite to single-pass top-down walking resolved the theoretical concern cleanly and is the kind of refactor that's obviously correct once pointed out. The exchange is a small case study in taking paranoid review feedback seriously even when the practical impact is bounded.

Key Design Tradeoffs Summary

Decision Alternative Rationale
Per-clause conflict detection via GROUP Vars Disable pushdown whenever any GROUP key has nondeterministic collation Preserves optimization for safe clauses
Top-down walker with collation stack Bottom-up with pull_var_clause per node Single-pass, avoids quadratic behavior
Backpatch to v18 only Backpatch different fix to all supported branches No field reports; pre-v18 fix would be invasive
Whitelist collation-bearing nodes Blacklist / assertion-based completeness Simpler but fragile — CaseExpr bug demonstrates