Analysis: Cross-type integer operator support for btree_gist
The Core Problem: Operator Family Matching in the Planner
This thread addresses a long-standing usability gap in GiST's btree_gist contrib extension: the lack of cross-type operator family entries for numeric types. The issue is rooted in how PostgreSQL's planner decides whether an index is usable for a given qual.
When processing a WHERE clause like camera_id = 1189 where camera_id is int8, the parser resolves 1189 as int4 and selects the =(int8, int4) operator from pg_operator (this operator exists because btree opfamilies for integers are populated with cross-type entries). The planner then calls match_clause_to_indexcol() which requires the operator to be a member of the index's opfamily (via op_in_opfamily() / pg_amop lookup). For btree integer opfamilies, cross-type =(int8,int4) is registered, so a standard B-tree index works transparently. For gist_int8_ops, only same-type =(int8,int8) is registered, so the lookup fails and the index column is silently dropped from consideration.
The architectural consequence is subtle but painful:
- Single-column GiST index on int8: the qual becomes a filter and the whole query degenerates to a seq scan (or a scan over another qual that happens to be indexable).
- Multi-column GiST index (the more common
btree_gistuse case — mixing a scalar column with a range/geometry column): the scalar column is not used as an index quals, so the bitmap/index scan loses its most selective predicate. The user sees the range-column qual driving the scan and the integer equality applied as a Filter. This materially changes plan shape.
The same issue exists for every btree_gist opclass (float, date, timestamp, numeric, etc.), and also for core range-type opfamilies where similar cross-type situations can arise. The thread's proposal scopes the fix to integers but advertises it as scaffolding.
Proposed Design
Registering the operators
The patch extends gist_int{2,4,8}_ops via ALTER OPERATOR FAMILY ... ADD OPERATOR to include all six btree strategies (<, <=, =, >=, >, <>) plus the KNN distance operator (<->) for each cross-type pair. This is the part the planner actually cares about — once these pg_amop rows exist, op_in_opfamily() succeeds and index paths are generated.
The support-function problem
The harder architectural question is what happens at execution time. GiST's consistent and distance support functions are called by gistindex_keytest() with the query datum, and they must compare the query value against the compressed key (here, an INT{2,4,8}KEY struct containing lower/upper bounds for the subtree). The natural approach — register additional support functions per cross-type — is rejected by the author for two reasons:
amvalidateconstraint: GiST'sgistvalidate()enforces that support procedures have matching left/right argument types with the opclass's type. Cross-type support procs would either require relaxing the validator or using a workaround.- Combinatorial bloat: 3 families × 2 subtypes × (6 cmp + 1 dist) = 42 new C functions plus catalog entries.
Instead, the proposal leverages the subtype OID that GiST already passes to consistent via the StrategyNumber/scan key machinery. The existing gbt_num_consistent() is replaced with a variant that dispatches on the subtype OID:
typedef struct gbt_subtype_info {
Oid subtype;
gbt_cmp_fn lt, le, eq, ge, gt;
gbt_dist_fn dist;
} gbt_subtype_info;
Each opclass carries a static table of these entries; when subtype == InvalidOid or matches the indexed type, the fast same-type path runs (backward compatibility preserved for extensions and existing indexes). Otherwise the table is walked and the cross-type callback invoked.
Implications and tradeoffs
Positive:
- No catalog bloat; no new support procs.
- Backward compatible — existing indexes don't need REINDEX; the key storage format is unchanged (keys are still stored in the indexed type).
- Establishes a pattern trivially extensible to float/date/timestamp families.
Concerns a reviewer would raise:
- Semantic correctness of cross-type comparison within the key struct: the subtree key is stored as
int8(say), but the query isint4. The comparison must promote the query toint8or compare carefully to avoid false negatives at subtree boundaries. This is the same issue btree solved for cross-type B-tree comparisons via cross-type comparison support functions (strategyBTORDER_PROC). The author's dispatch table essentially reimplements that per-opclass, which raises a question of whether reusing btree's existing cross-type support functions (btint48cmp, etc.) viaamproclookup on the btree opfamily would be cleaner than a hand-rolled static table. - KNN distance correctness:
<->returns a float8 distance. Forint8 <-> int4, the subtraction must be done in a wide-enough type to avoid overflow. The dispatch table approach requires each callback to be audited for this. amvalidaterelaxation vs. workaround: by not registering support procs for the cross-type operators, the patch technically leaves the opfamily in a state wheregistvalidate()may warn about operators lacking matching support procs. The author should clarify whethergistvalidate()needs adjustment.- Why not fix it at the framework level? A more invasive but cleaner alternative would be to teach GiST's generic machinery to look up cross-type comparison support functions automatically (similar to how btree opfamilies are consulted). The per-opclass dispatch table is pragmatic but duplicative.
Ecosystem implications
btree_gist is heavily used precisely for the multi-column case shown in the second reproducer: GIST (scalar_col, range_col) to accelerate exclusion constraints and range overlap queries filtered by a scalar. In that idiom, the scalar equality is nearly always the most selective predicate, and losing it to an implicit-cast mismatch is a silent performance cliff that shows up in production (ORMs like SQLAlchemy, Hibernate, and parameterized queries from most drivers routinely send int4 literals). Fixing this meaningfully improves btree_gist's usability without requiring user schema changes.
Status
This is an initial RFC/"is there interest?" posting with no patch attached yet. No responses from committers or other hackers are in the thread at this point, so there is no community verdict. The real test will be whether reviewers accept the dispatch-table approach or push back in favor of leveraging btree's existing cross-type comparison infrastructure or extending GiST's generic opfamily lookup.