Wrong results from inner-unique joins caused by collation mismatch

2026-05-06 · opus 4.7

Core Problem: Uniqueness Proofs Broken by Nondeterministic Collations

The planner relies on proving uniqueness of a relation's output to enable several significant optimizations:

Inner-unique join execution (match_first_tuple_only in the executor, which stops scanning the inner side after the first match)
Left-join removal (dropping a left join entirely when the inner side is unique and no columns from it are used)
Semijoin → inner-join reduction
Self-join elimination (a newer PG17-era optimization)
DISTINCT/GROUP BY proofs via query_is_distinct_for() used by subquery pullup / unique-ness proofs for join inputs

All of these ultimately consult relation_has_unique_index_for(), which walks a rel's index list looking for a unique index whose columns are each pinned to constants or matched by a mergejoinable equality restriction clause. Historically, this function carried an explicit XXX comment punting on collation:

/*
 * XXX at some point we may need to check collations here too.
 * For the moment we assume all collations reduce to the same
 * notion of equality.
 */

That assumption was tenable pre-PG12: all textual collations were deterministic, so = ultimately reduced to byte-identity regardless of which collation was in scope. The assumption silently broke when PG12 introduced nondeterministic collations (ICU locales with e.g. ks=level2 for case-insensitivity, or stripping accents). Under a nondeterministic collation 'A' = 'a' can be true, even though a unique btree built under the default (deterministic) collation happily stores both rows.

The minimal reproducer is damning:

create collation ci (provider=icu, locale='und-u-ks-level2', deterministic=false);
create table t (a text);
insert into t values ('A'),('a');
create unique index on t(a);  -- built under default deterministic collation

select * from t t1 join t t2 on t1.a = t2.a collate ci;

The correct answer is 4 rows (A=A, A=a, a=A, a=a — all equal under ci). The planner, however, sees a unique index on t.a, declares the inner side inner-unique, and the executor returns only 2 rows because it stops after the first match on the inner. This is a silent wrong-results bug across every supported branch since PG12.

First Attempt and Why It Was Too Strict

Richard Guo's initial fix was to reuse the existing IndexCollMatchesExprColl() predicate:

if (!IndexCollMatchesExprColl(ind->indexcollations[c],
                              exprInputCollation((Node *) rinfo->clause)))
    continue;

This passed the new test but regressed an existing join.sql case: left-join removal over a (name, text) join key combination stopped working. name and text carry different but both deterministic collations (C-collation for name, default for text). Refusing the uniqueness proof on any collation mismatch is overkill — for deterministic collations, = is byte-equality regardless of locale, so mixing two deterministic collations is perfectly safe for uniqueness reasoning.

The correct predicate is therefore narrower: reject only when at least one side is nondeterministic and the two collation OIDs differ.

Tom Lane's Architectural Guidance

Tom Lane (core committer, de-facto planner maintainer) pushed back on two grounds:

Factor the logic. He predicted — correctly — that the same test would be needed elsewhere in the planner. Open-coding it at the one call site would lead to drift.
Readability. The inline boolean expression mixing determinism flags and OID comparisons was hard to audit, particularly the asymmetric cases (one deterministic, one not).

Both points were vindicated almost immediately: Richard found that query_is_distinct_for() — which proves uniqueness of DISTINCT and GROUP BY subqueries — has the identical hazard:

select * from t t1 join (select distinct a from t) t2
  on t1.a = t2.a COLLATE "ci";     -- returns 2 rows, should be 4
select * from t t1 join (select a from t group by a) t2
  on t1.a = t2.a COLLATE "ci";     -- same bug

Here the subquery's DISTINCT/GROUP BY collapses 'A' and 'a' under the default deterministic collation, producing a "unique" output, but the outer join's equality operates under ci, under which the collapsed row matches both rows in t1.

The Committed Solution

The final design introduced a named helper. Richard initially called it collations_are_compatible(), but renamed it to collations_agree_on_equality() in response to his own critique that the earlier name overclaimed — the predicate says nothing about ordering (<), and two deterministic collations with different sort orders still agree on =. This is a useful distinction: btree's uniqueness enforcement depends on the equality equivalence class, not on the full sort order.

Semantics of collations_agree_on_equality(coll_a, coll_b):

If either side is InvalidOid → compatible. This generalizes the asymmetric behavior in IndexCollMatchesExprColl(), treating absence of a specified collation as "no constraint." Richard flagged this as the one slightly uncertain design choice — an absent collation on an expression plausibly inherits from context, and treating it as universally compatible is the safe-for-uniqueness-proofs choice but could mask bugs elsewhere.
If both sides are deterministic → compatible regardless of OID (byte-equality is locale-independent).
Otherwise require exact OID match.

The helper is applied at two sites:

relation_has_unique_index_for() — fixing inner-unique, left-join removal, semijoin reduction, and self-join elimination in one shot.
query_is_distinct_for() — fixing the DISTINCT/GROUP BY subquery case.

Back-Patching and the ABI Problem

The fix is a correctness bug affecting all branches back to PG12, so it must be back-patched. But query_is_distinct_for() needed a new argument (the list of clause collations to check against). Changing an exported function's signature on stable branches is an ABI break that could crash extensions linked against the old signature.

The committer resolved this cleanly:

On master: change the signature directly.
On back-branches: introduce a new internal function query_is_distinct_for_with_collations() carrying the new logic, and keep query_is_distinct_for() as a thin wrapper preserving the old signature. Extensions continue to link, though they'd be calling the non-collation-aware variant (acceptable because extensions almost certainly aren't calling this planner internal, and if they were, they'd get pre-existing behavior rather than a crash).

This is textbook PostgreSQL back-branch discipline.

Why This Matters Architecturally

The bug illustrates a recurring pattern: when a new feature (nondeterministic collations in PG12) weakens a long-standing invariant (equality ↔ byte-equality), the code needs to be audited for every place that silently assumed the invariant. The XXX comment in relation_has_unique_index_for() shows that at least one developer foresaw the hazard; it took six years and a user-visible wrong-results report to close the loop. The companion thread on GROUP BY / havingQual collation mismatches suggests more dusty corners remain.

The fix is also a good example of planner-side equivalence-class reasoning being finer-grained than full collation equality: uniqueness proofs care about the equality partition, not the order. Bundling that distinction into a named helper (collations_agree_on_equality) makes the reasoning reusable and auditable rather than hidden inside boolean soup.