Core Problem
This thread identifies a classic off-by-one boundary check bug in code that validates an index against a list length before calling list_nth_oid(). The thread title mentions "pg_repack" but the actual code in question is inside PostgreSQL proper — it uses foreach_current_index(), a macro introduced with the modern foreach_* iteration primitives (list.c/pg_list.h) that yields the 0-based position of the current element during list iteration.
The buggy pattern was:
int pos = foreach_current_index(ind_old);
if (unlikely(list_length(ind_oids_new) < pos))
elog(ERROR, "list of new indexes too short");
ident_idx_new = list_nth_oid(ind_oids_new, pos);
Why this is wrong
Given a list of length N, valid 0-based indices are 0 .. N-1. The guard list_length(new) < pos only fires when pos > N, meaning pos == N would slip through — and list_nth_oid(list, N) on a length-N list is an out-of-bounds access that will either assert-fail in list_nth_cell() or return garbage/crash in a non-assert build.
The correct guard is list_length(ind_oids_new) <= pos, i.e. any pos >= N must raise the error before the list_nth_oid call. This is the fix Lakshmi proposed.
Architectural significance (small, but real)
While the defect is one character (< → <=), the class of bug matters because:
elog(ERROR, ...)guards of this kind exist specifically as "can't happen" defensive checks. Their whole purpose is to convert a latent out-of-bounds read into a clean ereport with a stable error message before the dereference. An off-by-one in such a guard defeats its entire reason for existence — the unsafe access still happens in exactly the edge case the guard was meant to catch.foreach_current_index()is a relatively new idiom (added alongside the rewritten List implementation), and callers coming from the pre-List-rewrite world sometimes reason about indices as if they were 1-based (mirroring older SQL-level conventions). This patch is a data point showing that even committers get the boundary wrong when mixinglist_length()(a count) withforeach_current_index()(a 0-based offset).
Secondary Cleanup: removing unlikely()
John Naylor's response invokes commit 913ec71d6, which established a project-wide convention that unlikely() should not be sprinkled on cold error paths where the compiler and branch predictor will already do the right thing. elog(ERROR, ...) is a pg_attribute_noreturn path; the compiler already treats it as cold, so the unlikely() adds no optimization value and just clutters the source. Naylor's comment — "it's just a noise word here" — reflects an established stylistic ruling rather than a new debate. The final pushed version therefore both fixes the comparison and drops unlikely():
if (list_length(ind_oids_new) <= pos)
elog(ERROR, "list of new indexes too short");
Review Dynamics and Authority
- Lakshmi N (reporter) submitted a correct, minimal patch with the right diagnosis.
- Xiaopeng Wang provided a secondary +1 confirming the semantics of
foreach_current_index(). - John Naylor (committer, List/pg_list internals expertise, author of much of the modern list iteration infrastructure) layered on a style improvement grounded in prior project policy. His input carried weight both because of committer status and because he works directly in this area.
- Álvaro Herrera (senior committer) performed the final review and push, accepting both the correctness fix and Naylor's stylistic refinement in one go ("pushed that way").
There was no disagreement — the thread is a textbook example of a tiny, uncontroversial correctness fix getting a small scope expansion from a committer to align with project conventions before commit.
Implications
- No user-visible behavior change in the normal case; this only matters if the two index lists (
ind_oldvsind_oids_new) ever diverge in length by exactly one, which would have previously led to memory misbehavior instead of a cleanelog. - Serves as a reminder to audit other
list_length(x) < foreach_current_index(...)patterns in the tree for the same off-by-one.