Multi-Entry Indexing for GiST & SP-GiST

First seen: 2026-05-20 17:44:52+00:00 · Messages: 5 · Participants: 3

Latest Update

2026-06-04 · claude-opus-4-6

Incremental Analysis: Memory Boundedness Concern for TID Deduplication

Matthias van de Meent joins the thread and raises a fundamental correctness/resource concern about the TID deduplication mechanism. Andrey Borodin, who reviewed the patch earlier, offers a brief rebuttal.

Matthias van de Meent's Review (June 1)

Matthias raises several pointed questions, the most technically significant being:

1. Unbounded Memory for TID Hash (Correctness/Resource Concern)

The core challenge: to deduplicate correctly, the simplehash-based TID deduplication must remember every TID seen so far during a scan. A TID is 48 bits wide (a 32-bit block number plus a 16-bit offset), so in the worst case a scan touching many multi-entry tuples could accumulate up to 2^48 entries in the hash table. Nothing caps that growth below the size of the TID space itself.

Matthias contrasts this with GIN's approach: GIN traverses TID space linearly (posting lists are TID-sorted) and can use lossy bitmap pages that bound memory usage. When memory pressure rises, GIN downgrades from exact TID tracking to page-level tracking, then rechecks at the heap. This lossy approach is safe for bitmap scans but not for amgettuple(), which must return exact, non-duplicate results without a higher-level dedup layer.

This is a genuine architectural concern: GiST/SP-GiST scans via amgettuple() cannot use lossy dedup without breaking the AM contract. The current implementation implicitly assumes the TID hash fits in memory.
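To make the exact-versus-lossy distinction concrete, here is a minimal sketch of the two tracking styles. It is a toy model, not PostgreSQL code: ToyTid, ToyExactSet, ToyLossySet and the fixed TOY_CAPACITY are hypothetical names invented for illustration, and linear arrays stand in for the real hash and bitmap structures.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy TID: heap block number plus line-pointer offset (hypothetical type). */
typedef struct { uint32_t block; uint16_t offset; } ToyTid;

#define TOY_CAPACITY 16

/* Exact tracking: remembers every (block, offset) pair it was given. */
typedef struct { ToyTid tids[TOY_CAPACITY]; size_t n; } ToyExactSet;

/* Lossy tracking: remembers only block numbers, like a lossy bitmap page. */
typedef struct { uint32_t blocks[TOY_CAPACITY]; size_t n; } ToyLossySet;

bool toy_exact_seen(const ToyExactSet *s, ToyTid t)
{
    for (size_t i = 0; i < s->n; i++)
        if (s->tids[i].block == t.block && s->tids[i].offset == t.offset)
            return true;
    return false;
}

/* Returns false when the toy set is full; a real structure would grow or spill. */
bool toy_exact_add(ToyExactSet *s, ToyTid t)
{
    if (s->n >= TOY_CAPACITY)
        return false;
    s->tids[s->n++] = t;
    return true;
}

bool toy_lossy_seen(const ToyLossySet *s, ToyTid t)
{
    for (size_t i = 0; i < s->n; i++)
        if (s->blocks[i] == t.block)
            return true;
    return false;
}

bool toy_lossy_add(ToyLossySet *s, ToyTid t)
{
    if (toy_lossy_seen(s, t))
        return true;            /* block already tracked, nothing to store */
    if (s->n >= TOY_CAPACITY)
        return false;
    s->blocks[s->n++] = t.block;
    return true;
}
```

After recording TID (7,1), the exact set still reports (7,2) as unseen, while the lossy set reports it as seen. A bitmap scan tolerates that because the heap recheck sorts it out; a tuple-at-a-time scan using the lossy answer for dedup would silently drop a distinct row.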

2. Compression Removal Safety

Matthias challenges the assertion that compress can be omitted when extractValue is present, stating "I don't think it's generally safe to remove compression even when extractValue is present." This echoes Borodin's earlier API unification suggestion but from a safety angle — compress may serve purposes beyond just producing leaf keys (e.g., page-level space management).

3. Index-Only Scan Disabling Mechanism

Matthias asks how index-only scans are disabled — specifically whether this requires planner awareness of specific GiST opclasses, which would be an undesirable coupling.

4. Skepticism About AM-Level vs. Specialized AM

Matthias expresses a higher-level architectural reservation: whether bolting multi-entry capability onto GiST/SP-GiST is preferable to building a specialized access method (analogous to how GIN is a specialized decomposition AM rather than a btree extension). This is a design philosophy question about where decomposition logic belongs in the AM hierarchy.

Andrey Borodin's Response (June 2)

Borodin's response to the memory concern is brief but substantive:

Practical dismissal: For realistic IndexScan usage (limited result sets, LIMIT clauses), the TID hash stays small.
Fallback proposal: For pathological cases, spill to a unique tuplestore when the hash/ART exceeds work_mem. This mirrors the general PostgreSQL pattern of memory-bounded hash operations (hash joins, hash aggregation) that spill to disk.

This is a reasonable but incomplete answer — it acknowledges the problem exists but doesn't detail how a tuplestore-based fallback would integrate with the ordered (KNN) scan path, where dedup happens at dequeue time from the pairing heap.

Significance

The memory-boundedness question is the first potential blocking concern raised in the thread. If the TID hash can grow without bound and there's no spill mechanism, the feature could cause OOM in production under adversarial or simply large-scale workloads. The tuplestore suggestion is viable but would need actual implementation and testing, particularly for the KNN path.

History (2 prior analyses)

2026-06-01 · claude-opus-4-6

Incremental Analysis: Andrey Borodin's Architectural Review

Andrey Borodin provides the first substantive review of the GiST-side architecture, raising three distinct technical concerns and broadening the discussion to PostGIS stakeholders.

Key Technical Points

1. TID Hash Overhead on Hot Path (Performance Concern)

Borodin identifies that the current implementation unconditionally routes every leaf entry through the simplehash TID deduplication, even when extractValue returned nentries == 1 (meaning no duplicates are possible). Since GiST scans are CPU-bound (consistency checks on every tuple per page), this hash probe sits on the critical path unnecessarily for single-entry values.

Proposed optimization: Use INDEX_AM_RESERVED_BIT (0x2000 in t_info) as a per-tuple flag, set at insert/build time only when extractValue returns nentries > 1. Borodin describes the bit as currently unused in the backend; nbtree does build INDEX_ALT_TID_MASK on top of it, but the bit is reserved per access method and GiST leaves it free. During scan, entries without this bit skip the hash entirely. This is feasible because multi-entry opclasses are new non-default opclasses with no existing on-disk format constraints.

This is a practical micro-optimization proposal that acknowledges the new leaf format has no backward-compatibility baggage.
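The flag idea can be sketched in a few lines. The two mask values match PostgreSQL's itup.h; everything prefixed with toy_ or TOY_ is a hypothetical name for illustration, and the sketch says nothing about how the patch would actually wire this in.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit layout of IndexTupleData.t_info, as defined in PostgreSQL's itup.h. */
#define INDEX_SIZE_MASK        0x1FFF
#define INDEX_AM_RESERVED_BIT  0x2000

/* Hypothetical alias: "the heap tuple behind this entry has other entries". */
#define TOY_MULTI_ENTRY_FLAG   INDEX_AM_RESERVED_BIT

/* Insert/build time: set the flag only when extractValue produced > 1 entry. */
uint16_t toy_make_t_info(uint16_t size, int nentries)
{
    uint16_t t_info = (uint16_t) (size & INDEX_SIZE_MASK);

    if (nentries > 1)
        t_info = (uint16_t) (t_info | TOY_MULTI_ENTRY_FLAG);
    return t_info;
}

/* Scan time: only flagged entries need a TID hash probe. */
bool toy_needs_dedup(uint16_t t_info)
{
    return (t_info & TOY_MULTI_ENTRY_FLAG) != 0;
}

/* The size bits are untouched by the flag. */
uint16_t toy_tuple_size(uint16_t t_info)
{
    return (uint16_t) (t_info & INDEX_SIZE_MASK);
}
```

A scan loop would call toy_needs_dedup() before touching the hash, so single-entry values pay one bit test instead of a probe.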

2. extractValue as Generalized compress (API Design Critique)

Borodin observes that multirange_me_ops drops the compress support proc (proc 3) and adds extractValue (proc 13), while multirange_ops does the reverse. This means extractValue already functionally supplants compress — it produces leaf-typed values directly, with compress being merely the degenerate case of extractValue constrained to nentries == 1.

He questions whether the two should be unified into a single "produce leaf entries" entry point, with a backward-compatible 1→1 shim over compress for existing opclasses. This would:

Eliminate the insert/build path branching on extractValue existence
Frame multi-entry as a generalization of compress rather than a parallel mechanism
Reduce catalog complexity

This is an API-cleanliness argument that could affect the long-term maintenance burden.
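A rough sketch of what such a unified entry point might look like, assuming a simplified callback shape. ToyDatum, ToyOpclass and all toy_ functions are hypothetical; real support procedures are invoked through fmgr with Datum arguments, which this model leaves out.

```c
#include <stddef.h>

typedef long ToyDatum;      /* stand-in for PostgreSQL's Datum */

typedef ToyDatum (*ToyCompressFn) (ToyDatum value);
/* Writes entries to out (capacity cap) and returns how many were written. */
typedef size_t (*ToyExtractFn) (ToyDatum value, ToyDatum *out, size_t cap);

typedef struct
{
    ToyCompressFn compress;     /* legacy 1 -> 1 support function, may be NULL */
    ToyExtractFn  extract;      /* multi-entry support function, may be NULL */
} ToyOpclass;

/*
 * Single "produce leaf entries" entry point. Opclasses with extract use it
 * directly; everything else goes through a 1 -> 1 shim over compress, so the
 * caller never branches on which function exists.
 */
size_t toy_leaf_entries(const ToyOpclass *opc, ToyDatum value,
                        ToyDatum *out, size_t cap)
{
    if (opc->extract != NULL)
        return opc->extract(value, out, cap);
    if (cap == 0)
        return 0;
    out[0] = (opc->compress != NULL) ? opc->compress(value) : value;
    return 1;
}

/* Example legacy compress: negates the value. */
ToyDatum toy_negate(ToyDatum value)
{
    return -value;
}

/* Example extract: splits a two-digit number into its tens and units. */
size_t toy_tens_and_units(ToyDatum value, ToyDatum *out, size_t cap)
{
    if (cap < 2)
        return 0;
    out[0] = value / 10;
    out[1] = value % 10;
    return 2;
}
```

With this shape the insert and build paths call toy_leaf_entries() unconditionally, and compress is simply the case that always yields one entry.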

3. Single-Column Restriction Concern

Borodin pushes back on the single-key-column restriction, arguing that features "should be orthogonal to the rest of the AM" and that error-on-multiple-columns tends to become permanent rather than temporary. This suggests he'd prefer at minimum a design that doesn't architecturally preclude future multi-column support.

4. Sorting Build Gap

A minor note: the sorting build path ignores extractValue, meaning bulk index creation via sorted build won't produce multi-entry indexes correctly. Borodin acknowledges this isn't critical for the current stage.

Broadening to PostGIS Stakeholders

Borodin explicitly CC's Darafei and Paul (PostGIS maintainers), asking whether PostGIS would adopt extractValue for multi-part geometries (MultiPolygon with holes, routes, regions with exclaves). He frames this as the key adoption question: multiranges alone are a narrow audience, but PostGIS adoption would validate the AM-level machinery investment.

He specifically asks whether the single-column restriction or per-entry dedup cost would be problematic for PostGIS use cases.

2026-05-22 · claude-opus-4-6

Multi-Entry Indexing for GiST & SP-GiST: Deep Technical Analysis

Core Problem

PostgreSQL's GiST (Generalized Search Tree) and SP-GiST (Space-Partitioned GiST) access methods fundamentally assume a one-to-one mapping between heap tuples and index entries. This constraint becomes a severe performance liability for composite data types like multiranges, where the indexed value has internal structure that is lost when compressed into a single bounding representation.

The Multirange Bounding Box Problem

The current GiST multirange_ops opclass handles multiranges by computing their bounding union range — collapsing a multirange like {[1,10), [100000,100010)} into a single range [1, 100010). This bounding box is stored as the index key. The fundamental issue is that this bounding representation loses all information about gaps between component ranges.

For containment operators (@>, <@), this gap information is critical. For the example multirange above, a query like mr @> 50000 must check whether 50000 falls within any component range. The bounding box [1, 100010) trivially contains 50000, so the index returns a match, but the actual multirange has a gap from 10 to 100000, making it a false positive. In the benchmark scenario with 100k rows, this results in 99,990 false-positive heap rechecks, making the index scan slower than a sequential scan.
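The information loss is easy to reproduce in a small model. The sketch below uses hypothetical ToyRange helpers over half-open integer ranges; it illustrates the bounding-union effect only and is not the opclass code.

```c
#include <stdbool.h>
#include <stddef.h>

typedef struct { long lo; long hi; } ToyRange;   /* half-open [lo, hi) */

bool toy_range_contains(ToyRange r, long x)
{
    return x >= r.lo && x < r.hi;
}

/* Bounding union of n >= 1 component ranges: what a single index key keeps. */
ToyRange toy_bounding_range(const ToyRange *parts, size_t n)
{
    ToyRange b = parts[0];

    for (size_t i = 1; i < n; i++)
    {
        if (parts[i].lo < b.lo)
            b.lo = parts[i].lo;
        if (parts[i].hi > b.hi)
            b.hi = parts[i].hi;
    }
    return b;
}

/* Exact test: x must fall inside at least one component. */
bool toy_multirange_contains(const ToyRange *parts, size_t n, long x)
{
    for (size_t i = 0; i < n; i++)
        if (toy_range_contains(parts[i], x))
            return true;
    return false;
}
```

For the components [1,10) and [100000,100010), a probe in the gap such as 50000 passes toy_range_contains() on the bounding range but fails toy_multirange_contains(), which is exactly the candidate a heap recheck has to discard.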

SP-GiST has an even more fundamental limitation: it has no multirange opclass at all, because its quad-tree partitioning scheme has no natural way to handle a composite range value.

Architectural Significance

This problem generalizes beyond multiranges. Any composite data type that can be meaningfully decomposed — arrays of geometric points, route geometries (polylines decomposed into segments), temporal data with gaps — suffers the same bounding-box information loss in GiST. The solution establishes infrastructure for a whole class of future opclasses.

Proposed Solution: Multi-Entry Decomposition

The patch introduces an optional extractValue support function (mirroring GIN's approach) that decomposes a single datum into multiple index entries, each pointing back to the same heap TID. This gives GiST and SP-GiST the decomposition capability that GIN has always had, but within the R-tree and quad-tree frameworks that support range queries and KNN ordering.

Key Design Decisions

1. extractValue as Optional Support Function

Datum *extractValue(Datum value, int32 *nentries, bool **nullFlags)

GiST: Registered as proc 13
SP-GiST: Registered as proc 8

The function is entirely optional; existing opclasses are completely unaffected. This is a crucial backward-compatibility decision — it means zero risk to existing installations and a clean upgrade path.

2. TID Deduplication via simplehash

Since one heap tuple now produces N index entries, scans must deduplicate results. The implementation uses PostgreSQL's simplehash infrastructure to maintain a TID hash during scanning. This is the same approach used in other parts of the executor and provides O(1) amortized lookup.

For ordered (KNN) scans, the design is more nuanced: the TID hash is consulted as an early filter before enqueuing leaf items into the pairing heap, but the actual dedup insertion happens at dequeue time. This ensures the pairing heap can correctly select the copy with the smallest distance — if you deduplicated at enqueue time, you might keep a farther copy and discard a nearer one that arrives later during tree traversal.
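The dequeue-time rule can be illustrated with a simplified model. In this sketch qsort stands in for the pairing heap and a linear search over already-emitted items stands in for the TID hash; ToyLeafItem and the toy_ functions are hypothetical, and the early-filter step before enqueue is omitted.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* One leaf entry queued for an ordered scan (hypothetical type). */
typedef struct { int tid; double distance; } ToyLeafItem;

static int toy_by_distance(const void *a, const void *b)
{
    double da = ((const ToyLeafItem *) a)->distance;
    double db = ((const ToyLeafItem *) b)->distance;

    return (da > db) - (da < db);
}

/*
 * Drains items in ascending distance order and emits each tid only the first
 * time it is dequeued, so the nearest copy always wins regardless of the
 * order in which copies were discovered. Returns the number of items written
 * to out, which must have room for n items.
 */
size_t toy_knn_dedup(ToyLeafItem *items, size_t n, ToyLeafItem *out)
{
    size_t emitted = 0;

    qsort(items, n, sizeof(ToyLeafItem), toy_by_distance);
    for (size_t i = 0; i < n; i++)
    {
        bool seen = false;

        for (size_t j = 0; j < emitted; j++)
        {
            if (out[j].tid == items[i].tid)
            {
                seen = true;
                break;
            }
        }
        if (!seen)
            out[emitted++] = items[i];
    }
    return emitted;
}
```

If tid 1 is discovered first at distance 9.0 and again later at distance 2.0, this ordering emits the 2.0 copy. Marking tid 1 as seen at enqueue time would have kept the 9.0 copy and returned the row in the wrong position.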

3. Separate Opclass (multirange_me_ops)

Rather than modifying the existing multirange_ops, a new multirange_me_ops opclass is introduced. This is necessary because:

Leaf consistent functions see individual sub-entries (component ranges), not the full multirange
The semantics of consistency checking change fundamentally
Strategies like OVERLAPS and CONTAINS_ELEM are exact per-component (no recheck needed)
Strategies like CONTAINS and EQ must set recheck=true because no single component proves containment of the full query

The new opclass is marked non-default, so users must explicitly request it via USING gist (col multirange_me_ops).

4. Internal Node Descent Strategy Relaxation

This is perhaps the most subtle design aspect. In a standard GiST, internal node keys represent bounding unions of their subtree's entries. With multi-entry indexing, a single multirange's components may be scattered across multiple subtrees.

For containment strategies (CONTAINS, EQ), the consistent function at internal nodes must relax to OVERLAPS during descent. If it required the union key to fully contain the query, it could incorrectly prune subtrees that contain some (but not all) components of a matching multirange. Since rechecks happen at the leaf/heap level anyway, this relaxation is safe — it may visit extra subtrees but won't miss valid results.
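A small sketch of why the relaxation matters, under the simplifying assumption that the query is represented at internal nodes by its bounding range. The ToyRange and ToyStrategy types and both toy_internal_ functions are hypothetical and only model the descent decision.

```c
#include <stdbool.h>

typedef struct { long lo; long hi; } ToyRange;   /* half-open [lo, hi) */
typedef enum { TOY_OVERLAPS, TOY_CONTAINS } ToyStrategy;

bool toy_overlaps(ToyRange a, ToyRange b)
{
    return a.lo < b.hi && b.lo < a.hi;
}

bool toy_contains(ToyRange outer, ToyRange inner)
{
    return outer.lo <= inner.lo && inner.hi <= outer.hi;
}

/* Strict descent: applies the query strategy to the subtree's union key. */
bool toy_internal_strict(ToyRange union_key, ToyRange query_bounds,
                         ToyStrategy strategy)
{
    if (strategy == TOY_CONTAINS)
        return toy_contains(union_key, query_bounds);
    return toy_overlaps(union_key, query_bounds);
}

/* Relaxed descent: every strategy degrades to an overlap test. */
bool toy_internal_relaxed(ToyRange union_key, ToyRange query_bounds,
                          ToyStrategy strategy)
{
    (void) strategy;            /* intentionally ignored at internal nodes */
    return toy_overlaps(union_key, query_bounds);
}
```

Suppose a stored multirange has components [1,10) and [20,30) that landed in subtrees with union keys [0,12) and [18,40), and the query's bounding range is [2,25). Neither union key contains [2,25), so strict descent prunes both subtrees and misses the row; the relaxed test visits both and leaves the final decision to the recheck.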

5. SP-GiST: compress Made Optional

When extractValue is present, the compress function becomes optional for SP-GiST. This is because extractValue directly produces leaf-typed values — for multiranges, it outputs individual ranges that can be fed directly into the existing range quad-tree. The leafType can differ from the input type (e.g., anymultirange → anyrange), enabling elegant type-level decomposition.

6. Single-Column Restriction

Multi-entry indexing is restricted to single-key-column indexes (INCLUDE columns are permitted). This simplifies the initial implementation significantly — multi-column support would require defining semantics for how decomposition interacts with composite index keys, which can be addressed in a future patch.

7. Index-Only Scans Disabled

Since stored sub-entries don't represent the original datum, index-only scans cannot return the correct value from the index alone. This is an inherent limitation of the decomposition approach (GIN has the same constraint).

Empty Multirange Handling

A notable edge case: when extractValue returns zero entries for a non-NULL input, the AM falls back to storing a single NULL index entry (matchable only by IS NULL). However, for multirange semantics, an empty multirange should still be findable by operator queries (e.g., an empty multirange is contained by everything). The opclass handles this by returning an empty range sentinel instead of zero entries, keeping the value visible to operator scans.
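The sentinel behaviour might look roughly like the following. This is a toy model with a caller-supplied output buffer, which differs from the real extractValue signature; ToyRange, its empty flag and toy_extract_value are hypothetical names.

```c
#include <stdbool.h>
#include <stddef.h>

/* Half-open [lo, hi); lo and hi are ignored when empty is true. */
typedef struct { long lo; long hi; bool empty; } ToyRange;

/*
 * Decomposes a multirange (given as its component ranges) into index entries.
 * An empty multirange yields one empty-range sentinel instead of zero
 * entries, so the row stays reachable by operator scans.
 * Returns the number of entries written, or -1 if cap is too small.
 */
int toy_extract_value(const ToyRange *parts, size_t nparts,
                      ToyRange *out, size_t cap)
{
    if (nparts == 0)
    {
        if (cap < 1)
            return -1;
        out[0] = (ToyRange) {0, 0, true};
        return 1;
    }
    if (nparts > cap)
        return -1;
    for (size_t i = 0; i < nparts; i++)
        out[i] = parts[i];
    return (int) nparts;
}
```

Returning one sentinel rather than zero entries is what keeps the AM from falling back to a NULL index entry for this input.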

Performance Implications

The benchmark demonstrates the extreme case:

Method	Exec Time	Buffers	Rechecks
Sequential scan	7.732 ms	834	-
GiST multirange_ops	9.504 ms	2311	99,990
GiST multirange_me_ops	0.056 ms	6	0
SP-GiST multirange_me_ops	0.112 ms	27	0

The multi-entry approach is ~170x faster than the standard GiST opclass and ~138x faster than sequential scan. The buffer count (6 vs 2311) shows the dramatic I/O reduction.

The tradeoff is index size: storing N entries per multirange means the index is roughly N times larger than a standard GiST index. For multiranges with many components, this could be significant. However, for the common case of multiranges with a small number of wide-gap components, the tradeoff strongly favors multi-entry.

Relationship to GIN

This work fills a design-space gap in PostgreSQL's index AM taxonomy:

GIN: Decomposition + inverted index (exact match on keys, posting lists for TIDs)
GiST multi-entry: Decomposition + R-tree (range/spatial queries, KNN ordering)
SP-GiST multi-entry: Decomposition + quad-tree (range queries, space partitioning)

GIN cannot support KNN or efficient range-overlap queries on decomposed components. GiST/SP-GiST multi-entry enables decomposition while preserving the spatial/range query capabilities of these AMs.

Open Questions and Future Work

Multi-column support: Currently restricted; semantics need definition for how decomposition interacts with composite keys
Index size management: No discussion yet of compression or deduplication strategies for the enlarged indexes
VACUUM/HOT implications: Multiple index entries per heap tuple affect HOT chain eligibility and vacuum costs
Parallel build: Whether the multi-entry path works correctly with parallel GiST build
Generalization to other types: Arrays of geometric types, PostGIS geometries, etc.