Add a greedy join search algorithm to handle large join problems

First seen: 2025-12-02 03:48:26+00:00 · Messages: 31 · Participants: 7

Latest Update

2026-05-06 · opus 4.7

GOO (Greedy Operator Ordering) as a Potential GEQO Replacement

The Core Problem

PostgreSQL's join-order search has two regimes: exhaustive dynamic programming (DP) in make_rel_from_joinlist / standard_join_search, and GEQO (Genetic Query Optimizer) which kicks in once the number of baserels exceeds geqo_threshold (default 12). Both regimes have well-known limitations:

This thread proposes a third option: GOO (Greedy Operator Ordering) based on Fegaras's 1998 DEXA paper. GOO's algorithm is simple:

  1. Enumerate all currently-legal join pairs among remaining relations.
  2. Rank them by some greedy key (cost, output rows, result bytes, selectivity).
  3. Merge the cheapest/smallest pair into a single "super-relation."
  4. Repeat until one relation remains.

Each iteration is O(N²) in the worst case; total complexity is roughly O(N³), far cheaper than DP or GEQO's configured generations × population.

Architectural Significance

GOO is not merely a performance tweak. It probes a structural question in the planner: how much of the join-ordering problem can be solved by a deterministic, single-pass contraction versus full DP enumeration? The answer matters because:

The Patch's Evolution

The patch series went through several iterations, and the trajectory is itself informative:

Key Technical Findings

1. Greedy keys have qualitatively different failure modes

On TPC-H SF=1, the submitter identified two distinct pathological patterns:

Tomas's response is sharp and correct: Q20 is "garbage in, garbage out" and cannot be solved at the search level. Q7 is the textbook limitation of greedy algorithms and is also not solvable by picking a different scalar key.

2. Selectivity is scale-free and therefore brittle

The v5 JOB-Complex run produced the most damning evidence: GOO(selectivity) causes 6 timeouts and is 6000× slower than GEQO on JOB 24b and 29a. The submitter's diagnosis is precise — selectivity measures relative reduction (join_size / (left × right)) and discards absolute magnitude. A highly selective join between two huge inputs still produces a huge intermediate. This rules out selectivity as a primary key but still leaves it useful as one of several candidates.

3. Plan-diversity via multiple greedy runs is more robust than tuning one key

GOO(combined) on JOB: gmean 0.953 (beats DP on geometric mean), no regressions ≥10×, max 8.68×. Compare GOO(cost) alone: max 431×, 15 queries ≥5×. The lesson: since different greedy keys fail for uncorrelated reasons, running two or three and picking the cheapest final plan converts the individual worst cases into a near-best-of ensemble at modest extra cost.

4. The selector is itself a weakness

In JOB-Complex q28, q12, and 15a, a better GOO candidate existed (result_size) but the total_cost-based final selector chose the worse one. This exposes that any final selection based on the same cost model that failed to predict the good plan's quality is self-limiting. The CIDR 2021 "Simplicity Done Right" paper (Hertzschuch et al.), flagged by Tomas, proposes upper-bound (pessimistic / worst-case) estimates precisely for this reason — they're more resilient to the common nested-loop catastrophe where a very optimistic cardinality wins a planner contest and then the plan never completes.

5. Planning-time results are compelling but contextual

On JOB-Complex the v5 run shows GOO(combined) is ~5.2× faster to plan than GEQO. Lakshmi's synthetic 20-table chain joins show GOO at ~1.5ms vs DP ~48ms and GEQO ~51ms. However, Tomas correctly punctures these synthetic numbers: with every join on the same id column, there's one giant equivalence class, no order is actually forced, and the join_collapse_limit=8 default reduces DP's real problem size. Synthetic linear chains without meaningful selectivity variation tell us almost nothing about plan quality.

Key Disagreements and Framing

The most important framing intervention comes from Tomas and later John Naylor: GOO should not be evaluated against DP. DP is the gold standard when it's feasible; the relevant comparison is GEQO. The submitter gradually internalizes this — the v4 analysis separates the "GEQO range" (≥12 relations) subset of JOB, and v5 focuses entirely on JOB + JOB-Complex. John reinforces this in the final message: reporting small-join regressions is a distraction since those cases wouldn't trigger GEQO anyway.

A secondary disagreement concerns what "good enough" means. The submitter's v5 JOB-Complex numbers show GOO(combined) beats GEQO on execution-time sum (292s vs 430s, 1.47× faster) but has 4 queries ≥10× slower than GEQO. The submitter concludes this is not sufficient to replace GEQO as default. This is a mature judgment: a heuristic with severe tail risk cannot be the default, regardless of average wins.

The Pivot to Effort-Budgeted Hybrid Planning

The final message (2026-05-03) marks a direction change. Rather than pushing pure-GOO further, the submitter proposes a fixed-effort planner that runs exact DP while a budget allows, then completes the remaining problem with GOO contraction. This echoes prior community discussion (linked CAFBsxsE3Sb...) about smoother degradation instead of the current hard cliff at geqo_threshold. This direction is architecturally more promising because:

Interaction with Other Planner Machinery

Several subtle integration points surfaced:

Assessment

The thread is in a healthy state but not close to commit. The core GOO implementation appears stable (v5 passes make check-world), but:

  1. The value proposition as a direct GEQO replacement is unproven given the tail regressions.
  2. The more promising direction (effort-budgeted DP+greedy hybrid) is only prototyped.
  3. No committer has signaled intent to pick up the patch. Tomas is engaged as a reviewer and has domain expertise (he authored the star-join thread that inspired this) but hasn't endorsed commit.
  4. Integration questions (plan advice, parallelism, deterministic tie-breaking across platforms) are only partially addressed.

The intellectual contribution — the systematic evaluation of greedy keys, the identification of selectivity's scale-freeness, the diagnosis that the final selector is itself a weakness, and the pivot toward hybrid DP/greedy — is substantial regardless of whether this specific patch lands.