Core Problem: Collation-Aware Semantics vs. HAVING-to-WHERE Pushdown
PostgreSQL has a long-standing planner optimization that moves HAVING quals to the WHERE clause when they don't reference any aggregate and don't depend on grouping semantics. This is a meaningful win: WHERE quals are evaluated before grouping, dramatically reducing the number of rows fed into aggregation. The transformation is performed in subquery_planner() / preprocess_grouping_sets() area and has been safe for decades under the assumption that a predicate like x = 'a' produces the same answer regardless of whether it runs pre- or post-grouping.
Nondeterministic collations (introduced in PG12) break that assumption. With a nondeterministic collation, multiple distinct byte sequences can be considered equal. When GROUP BY x uses collation case_insensitive, the rows 'a' and 'A' collapse into a single group represented by one of the two values. A HAVING clause such as x = 'a' COLLATE "C" applied post-grouping tests the single surviving representative; applied pre-grouping (after pushdown to WHERE) it tests each individual row. These give different answers:
SELECT x, count(*) FROM t GROUP BY x HAVING x = 'a' COLLATE "C";
-- Correct (post-group): count=2 for the merged 'a'/'A' group (if representative is 'a')
-- Incorrect (pushed): count=1, because 'A' is filtered out before grouping
This is a correctness bug, not a performance issue — the planner silently changes aggregate results.
Why This Is Architecturally Tricky
The pushdown safety check traditionally only asked: "does this qual reference aggregates or grouping-set columns?" It had no concept of collation identity of the grouping operation. Under deterministic collations, equality is bit-identical, so the representative row is interchangeable with any other row in the group, and pre- vs. post-group filtering are equivalent. Nondeterministic collations destroy that equivalence.
The fix requires the planner to understand, per HAVING sub-expression, whether the collation used to evaluate a comparison matches the collation that defined the group. If they differ, the qual must stay in HAVING.
The v18+ Solution: Leveraging RTE_GROUP Vars
Richard Guo's approach exploits a v18 infrastructure change: GROUP Vars — Vars that reference a synthetic RTE_GROUP range table entry representing the grouped relation. A Var with varno == grouped RTE carries the collation of the grouping expression in its varcollid. This is the key hook: when walking a HAVING clause, encountering a GROUP Var tells you "this value has grouping-collation semantics"; the surrounding operator node's inputcollid tells you what collation the comparison will actually use. A mismatch where the grouping collation is nondeterministic means pushdown is unsafe for that sub-clause.
Critically, this is per-clause granularity: safe clauses still get pushed; only unsafe ones stay in HAVING. A blanket opt-out would regress performance for the common case.
Evolution of the Walker
- v1 — Bottom-up: at each collation-aware node, call
pull_var_clause()on children to find GROUP Vars. Wenhui Qiu flagged the O(n²) risk of repeated subtree traversal. - v2 — Fixed a real bug:
pull_var_clause()must recurse throughAggrefnodes because havingQual still contains them at this planning stage, and GROUP Vars can appear inside Aggref direct arguments. - v3 (committed) — Inverted to top-down with a LIFO collation stack: push each node's
inputcollidonto the stack on descent, pop on ascent; when a GROUP Var is reached, compare itsvarcollidagainst the stack. Single-pass, O(n).
The RowCompareExpr Special Case
exprInputCollation() returns a single collation, but RowCompareExpr has a per-column inputcollids[] array. The walker must manually descend into each (largs[i], rargs[i]) pair pushing the matching per-column collation. Without this, ROW(x, 1) < ROW('ABC' COLLATE case_sensitive, 1) over a case-insensitive group gives wrong results. This foreshadows the post-commit defect.
The Backpatch Decision
Richard declined to backpatch before v18 because pre-v18 branches lack RTE_GROUP / GROUP Vars entirely. A pre-v18 fix would require a fundamentally different, coarser mechanism — likely "disable pushdown if any GROUP BY expression uses a nondeterministic collation." Carrying a divergent, more-invasive fix on stable branches against zero field reports was judged higher-risk than leaving the bug. This is a defensible stability-over-correctness call for an obscure corner case, though it leaves v15–v17 silently wrong.
Post-Commit Defect: CaseExpr Bypass (Satya's Report)
Satya Narlapuram identified that the top-down walker only recognizes collation-bearing node types covered by exprInputCollation() plus the manually-handled RowCompareExpr. CaseExpr is not covered. A CASE x WHEN 'abc' COLLATE cs THEN ... END performs an equality test between x and the WHEN expression using the WHEN's collation, but because the walker never pushes anything onto the collation stack for CaseExpr, the GROUP Var x sees an empty/matching stack and the clause is deemed safe for pushdown. Result: the case-sensitive filter runs pre-grouping and eliminates 'ABC' before it can be counted alongside 'abc'.
The proposed fix mirrors the RowCompareExpr handling: add an explicit CaseExpr branch that, for the shorthand CASE expr WHEN ... form, pushes the comparison collation while descending into arg and each CaseWhen.expr.
Architectural Lesson
This defect exposes a completeness fragility in the design: the walker is a whitelist of node types that introduce comparison collations, and any node type not in the whitelist silently creates a correctness hole. Candidates worth auditing similarly include:
ScalarArrayOpExpr(covered byexprInputCollation, likely OK)MinMaxExpr(LEAST/GREATEST — uses aninputcollid, should be covered)NullIfExpr(covered)CaseExprwith implicit equality (the bug)CoerceToDomainwrapping comparisons- Subplans / SubLinks containing correlated comparisons
A more robust design might invert the logic: enumerate every node that performs a collation-sensitive comparison and require explicit handling, with an assertion failure for unrecognized comparison-bearing nodes in assert builds.
Performance Considerations
Wenhui Qiu's early concern about pull_var_clause() overhead was legitimate in principle (repeated subtree scans → quadratic), though Richard correctly noted HAVING clauses are human-written and small. The v3 rewrite to single-pass top-down walking resolved the theoretical concern cleanly and is the kind of refactor that's obviously correct once pointed out. The exchange is a small case study in taking paranoid review feedback seriously even when the practical impact is bounded.
Key Design Tradeoffs Summary
| Decision | Alternative | Rationale |
|---|---|---|
| Per-clause conflict detection via GROUP Vars | Disable pushdown whenever any GROUP key has nondeterministic collation | Preserves optimization for safe clauses |
| Top-down walker with collation stack | Bottom-up with pull_var_clause per node |
Single-pass, avoids quadratic behavior |
| Backpatch to v18 only | Backpatch different fix to all supported branches | No field reports; pre-v18 fix would be invasive |
| Whitelist collation-bearing nodes | Blacklist / assertion-based completeness | Simpler but fragile — CaseExpr bug demonstrates |