2026-06-04 · claude-opus-4-6

Technical Analysis: Regression Tests for B-tree Skip Scan Support Functions

Core Problem

PostgreSQL 18 introduced the B-tree skip scan optimization (commit 92fe23d93aa), which allows index scans to "skip" over values in the leading column of a multi-column index when a query only filters on a non-leading column. Previously such queries had to fall back on a sequential scan or a full scan of the index, so this is a significant enhancement.

The skip scan mechanism relies on per-type support functions that implement increment and decrement operations for each data type. These functions are critical for generating "skip array elements" — the boundary values the scan uses to jump forward or backward through the index's leading column. The relevant infrastructure in nbtcompare.c includes:

skipsupport functions: Registration functions (e.g., btoid8skipsupport, btint8skipsupport, btboolskipsupport, btcharskipsupport) that install the type-specific increment/decrement callbacks
increment helpers: Functions like oid8_increment, int8_increment, bool_increment, char_increment that compute the next value in the type's domain
decrement helpers: Functions like oid8_decrement, int8_decrement, bool_decrement, char_decrement, oid_decrement, int2_decrement that compute the previous value
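
The division of labor described above can be sketched in standalone C. This is only an illustration under stated assumptions, not the code in nbtcompare.c: the struct, the callback type, and every sketch_-prefixed name are hypothetical, and plain C bool stands in for PostgreSQL's Datum. The sketch shows a registration function installing the lowest and highest values of a type together with increment and decrement callbacks that report when no neighbouring value exists.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical callback type: returns the neighbouring value, or sets *oflow. */
typedef bool (*SketchBoolStep) (bool existing, bool *oflow);

/* Hypothetical per-type record that a registration function fills in. */
typedef struct SketchBoolSkipSupport
{
    bool           low_elem;    /* smallest value of the type */
    bool           high_elem;   /* largest value of the type */
    SketchBoolStep decrement;
    SketchBoolStep increment;
} SketchBoolSkipSupport;

/* Next value after "existing"; true is the largest, so it overflows. */
bool
sketch_bool_increment(bool existing, bool *overflow)
{
    assert(overflow != NULL);
    if (existing)
    {
        *overflow = true;
        return existing;
    }
    *overflow = false;
    return true;
}

/* Previous value before "existing"; false is the smallest, so it underflows. */
bool
sketch_bool_decrement(bool existing, bool *underflow)
{
    assert(underflow != NULL);
    if (!existing)
    {
        *underflow = true;
        return existing;
    }
    *underflow = false;
    return false;
}

/* Registration step: install the boundaries and both callbacks. */
void
sketch_bool_skipsupport(SketchBoolSkipSupport *sksup)
{
    assert(sksup != NULL);
    sksup->low_elem = false;
    sksup->high_elem = true;
    sksup->decrement = sketch_bool_decrement;
    sksup->increment = sketch_bool_increment;
}
```

A caller would invoke sketch_bool_skipsupport once and then step through values only via the installed callbacks, checking the flag after every call. Reporting overflow through a flag, rather than wrapping around, is the design choice that lets a scan know it has run out of values.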

Why This Matters Architecturally

The skip scan optimization is only as correct as its per-type support functions. If an increment or decrement function produces an incorrect boundary value, the skip scan could:

Miss rows — if it skips too far ahead
Produce duplicates — if boundaries overlap
Enter infinite loops — if increment/decrement don't make progress
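
To make these failure modes concrete, here is a toy model of a forward skip scan in C. It is not the nbtree implementation: the "index" is just an array of (a, b) pairs sorted by (a, b), int16_t stands in for the leading column's type, and all names (SketchEntry, sketch_int2_increment, sketch_lower_bound, sketch_skip_scan_forward) are hypothetical. The loop seeks once per distinct value of a and relies on the increment helper both to produce the next boundary value and to signal when to stop.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* One entry of a toy two-column index, kept sorted by (a, b). */
typedef struct SketchEntry
{
    int16_t a;
    int16_t b;
} SketchEntry;

/* Hypothetical increment helper: next int16 value, or *overflow = true. */
int16_t
sketch_int2_increment(int16_t existing, bool *overflow)
{
    assert(overflow != NULL);
    if (existing == INT16_MAX)
    {
        *overflow = true;
        return existing;
    }
    *overflow = false;
    return (int16_t) (existing + 1);
}

/* Binary search: first position in [lo, n) whose key is >= (a, b). */
size_t
sketch_lower_bound(const SketchEntry *idx, size_t lo, size_t n,
                   int16_t a, int16_t b)
{
    size_t hi = n;

    while (lo < hi)
    {
        size_t mid = lo + (hi - lo) / 2;

        if (idx[mid].a < a || (idx[mid].a == a && idx[mid].b < b))
            lo = mid + 1;
        else
            hi = mid;
    }
    return lo;
}

/* Count entries with b == target, seeking once per distinct value of a. */
size_t
sketch_skip_scan_forward(const SketchEntry *idx, size_t n, int16_t target,
                         size_t *seeks)
{
    size_t matches = 0;
    size_t pos = 0;

    assert(seeks != NULL);
    *seeks = 0;
    while (pos < n)
    {
        int16_t a = idx[pos].a; /* current value of the leading column */
        int16_t next;
        bool    overflow;

        /* Jump straight to (a, target) inside the current group. */
        pos = sketch_lower_bound(idx, pos, n, a, target);
        (*seeks)++;
        while (pos < n && idx[pos].a == a && idx[pos].b == target)
        {
            matches++;
            pos++;
        }

        /* Boundary value for the next group; stop once a cannot grow. */
        next = sketch_int2_increment(a, &overflow);
        if (overflow)
            break;
        pos = sketch_lower_bound(idx, pos, n, next, INT16_MIN);
    }
    return matches;
}
```

In this model, an increment helper that returned a value larger than a + 1 would jump past a group and miss its rows, and one that returned its input unchanged without setting the overflow flag could leave the loop repositioning to the same entry forever.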

Without regression test coverage, these functions could silently regress during refactoring or type system changes. The coverage gap is particularly concerning because:

The existing tests only exercise int4 and varchar types on leading columns
Types like oid8 (used in system catalog indexes), bool, char, and int2 have different boundary conditions (e.g., bool has only two values, and char is limited to the range of a single byte)
Edge cases in increment/decrement (overflow, underflow, wraparound) for these types remain untested
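
The kind of boundary behavior at stake can be checked mechanically. The following C sketch assumes a hypothetical 8-bit unsigned type (it is not PostgreSQL's char implementation, and all sketch_-prefixed names are invented for illustration). It walks the whole domain with the increment and decrement helpers and verifies that each step makes strict progress and that the walk ends with an overflow or underflow signal rather than a wraparound.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical increment for an 8-bit type: flags overflow at the maximum. */
uint8_t
sketch_byte_increment(uint8_t existing, bool *overflow)
{
    assert(overflow != NULL);
    if (existing == UINT8_MAX)
    {
        *overflow = true;
        return existing;
    }
    *overflow = false;
    return (uint8_t) (existing + 1);
}

/* Hypothetical decrement for an 8-bit type: flags underflow at zero. */
uint8_t
sketch_byte_decrement(uint8_t existing, bool *underflow)
{
    assert(underflow != NULL);
    if (existing == 0)
    {
        *underflow = true;
        return existing;
    }
    *underflow = false;
    return (uint8_t) (existing - 1);
}

/*
 * Walk upward from "low" until overflow is reported.  Returns the number of
 * values visited, or 0 if a step failed to advance (a scan relying on such
 * a helper could loop forever or revisit values).
 */
size_t
sketch_walk_up(uint8_t low)
{
    size_t  visited = 1;
    uint8_t cur = low;

    for (;;)
    {
        bool    overflow;
        uint8_t next = sketch_byte_increment(cur, &overflow);

        if (overflow)
            return visited;
        if (next <= cur)
            return 0;
        cur = next;
        visited++;
    }
}

/* Mirror image of sketch_walk_up, driven by the decrement helper. */
size_t
sketch_walk_down(uint8_t high)
{
    size_t  visited = 1;
    uint8_t cur = high;

    for (;;)
    {
        bool    underflow;
        uint8_t prev = sketch_byte_decrement(cur, &underflow);

        if (underflow)
            return visited;
        if (prev >= cur)
            return 0;
        cur = prev;
        visited++;
    }
}
```

For a correct pair of helpers, walking up from the smallest value and walking down from the largest should both visit every value of the type exactly once; a regression test at the SQL level can only observe this indirectly, through the rows a skip scan returns.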

Proposed Solution

The patch adds targeted regression tests to btree_index.sql that:

Create two-column indexes (a, b) for each undertested type (oid8, int8, bool, char, oid, int2)
Run queries with predicates only on column b. With no condition on the leading column a, the index is useful for such a query only if the skip scan machinery can enumerate the values of a, so an index scan chosen here exercises the skip support functions
Test both forward and backward scans (using ORDER BY ... ASC and DESC with Index Only Scans), which exercise the increment and decrement helpers respectively
Use EXPLAIN or plan-forcing to verify that an Index Only Scan with skip scan is actually chosen

The approach requires no changes to server code: it only sets up conditions under which an index scan has to invoke the skip support infrastructure for types that weren't previously covered.

Key Design Considerations

Why Index Only Scans?

Index Only Scans are used because they make the test deterministic: the scan operates within the index structure alone, so its behavior is driven by the B-tree skip support functions without heap access complications.

Why Two-Column Indexes?

The skip scan optimization is specifically designed for multi-column indexes where the query doesn't constrain the leading column. A two-column index (a, b) with a predicate on b is the minimal reproduction case.

Coverage Gap Assessment

The fact that coverage.postgresql.org shows these functions as uncovered suggests that the original skip scan commit's test suite was focused on proving the optimization works (using common types like int4) rather than proving all type-specific implementations are correct. This is a common pattern where feature tests verify the mechanism but not the full matrix of type support.

Assessment

This is a straightforward, low-risk patch that improves test coverage. There are no behavioral changes to review, only verification that:

The tests actually trigger skip scans (visible in EXPLAIN output)
The tests cover all the identified functions (verifiable via coverage tooling)
The tests are stable and don't depend on planner cost decisions that could change

The main review concern would be ensuring the test data and queries are crafted such that the planner reliably chooses skip scan over alternative plans (e.g., sequential scan might be preferred for very small tables).

[PATCH] Add regression tests for btree skip scan support functions
