Re-add recently-removed tests for ltree and intarray

First seen: 2026-05-14 23:38:17+00:00 · Messages: 8 · Participants: 3

Latest Update

2026-05-18 · claude-opus-4-6

Re-adding Regression Tests for ltree and intarray: Stack Depth and Platform-Specific Recursion Failures

Core Problem

Regression tests for the ltree and intarray contrib extensions were removed (commit 906ea101d0d5) because they caused intermittent failures on specific buildfarm members. The tests had been added to cover recent bug fixes, so their removal left those fixes without test coverage. That gap needed to be closed, particularly since the fixes were backpatched down to v14.

The root cause was excessive stack consumption in recursive tree-walking functions, specifically findoprnd() in intarray/_int_bool.c and ltree/ltxtquery_io.c, which tripped PostgreSQL's stack depth check. These functions traverse internal query representations recursively, and the original test cases generated right-deep (degenerate) query trees that required recursion depth proportional to the number of operands.

Why This Matters Architecturally

The findoprnd() function walks an internal operator/operand tree to reconstruct structure from a flat (Polish notation-style) representation. A right-deep tree of N nodes requires O(N) stack frames for a naive recursive descent. PostgreSQL's max_stack_depth GUC (default 2MB) provides a safety limit, but the actual stack consumption per frame is platform- and compiler-dependent, so the same query can pass on one machine and hit the limit on another.
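The relationship between tree shape and recursion depth can be sketched with a small Python model. This is a simplified illustration, not the actual findoprnd() code: the tuple-based trees, the postfix flattening, and all function names are assumptions made for the example.

```python
# Simplified model of a recursive walk over a flat postfix operator/operand
# array. All names here are hypothetical; this is not PostgreSQL source code.

def degenerate(n):
    """Build the chain 1 & (2 & (3 & ... n)) as nested tuples."""
    tree = n
    for v in range(n - 1, 0, -1):
        tree = ('&', v, tree)
    return tree

def balanced(lo, hi):
    """Build a balanced tree over the operands lo..hi."""
    if lo == hi:
        return lo
    mid = (lo + hi) // 2
    return ('&', balanced(lo, mid), balanced(mid + 1, hi))

def to_postfix(tree):
    """Flatten a tree into postfix order (operands first, operator last)."""
    out = []
    stack = [(tree, False)]
    while stack:
        node, visited = stack.pop()
        if isinstance(node, int):
            out.append(node)
        elif visited:
            out.append(node[0])
        else:
            stack.append((node, True))
            stack.append((node[2], False))
            stack.append((node[1], False))
    return out

def walk_depth(items):
    """Walk the array recursively from its last item; return the peak depth."""
    pos = len(items) - 1
    max_depth = 0

    def visit(depth):
        nonlocal pos, max_depth
        max_depth = max(max_depth, depth)
        item = items[pos]
        pos -= 1
        if item == '&':
            visit(depth + 1)  # one operand subtree
            visit(depth + 1)  # the other operand subtree

    visit(1)
    return max_depth

chain_depth = walk_depth(to_postfix(degenerate(256)))
balanced_depth = walk_depth(to_postfix(balanced(1, 256)))
print(chain_depth, balanced_depth)
```

With 256 operands, the chain needs one recursion level per operand, while the balanced tree needs only about log2(256) levels plus the leaf.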

This is a textbook example of how PostgreSQL's cross-platform support creates subtle constraints on test design: a test case that works on x86_64 with gcc can fail on ppc64 with clang, not because of any logical bug, but because of differences in calling conventions, register allocation, and compiler optimization capabilities.

The Platform-Specific Failure Mechanism

Tom Lane's investigation revealed a precise three-factor interaction:

  1. Architecture-dependent stack frame size: ppc64 and s390x use approximately 3× the stack per call frame compared to x86_64 in findoprnd(). This is attributable to the different ABIs — ppc64 ELFv2 ABI has a larger minimum stack frame (including a mandatory save area), and the calling convention preserves more registers on the stack.

  2. Compiler optimization (tail-call elimination): GCC is capable of recognizing the tail-recursive calls in findoprnd() and collapsing them into iteration, effectively making the right-deep tree case consume O(1) stack. Clang does not perform this optimization at default optimization levels on these architectures. Even GCC fails to optimize under -O0, confirming the optimization is the differentiator.

  3. Default max_stack_depth: At PostgreSQL's default 2MB max_stack_depth, the combination of large stack frames (ppc64) and no tail-call optimization (clang or -O0) causes the recursive descent to exceed the limit, triggering a stack depth check failure.
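The interaction of the three factors can be expressed as a rough budget check. The sketch below is illustrative only: the frame sizes and node count are made-up placeholders rather than measurements from the thread, the function name is hypothetical, and the model ignores stack already in use when the recursion starts.

```python
# Rough model of the three-factor interaction described above.
# Frame sizes and node counts are illustrative placeholders, not measurements.

DEFAULT_LIMIT = 2 * 1024 * 1024  # max_stack_depth default of 2MB, in bytes

def exceeds_limit(nodes, frame_bytes, tail_calls_eliminated, limit=DEFAULT_LIMIT):
    """Return True if a right-deep walk over `nodes` would exceed the limit."""
    # If the compiler turns the deep recursive call into a loop, the walk
    # needs roughly one frame regardless of tree size.
    depth = 1 if tail_calls_eliminated else nodes
    return depth * frame_bytes > limit

NODES = 10_000           # hypothetical size of a degenerate query tree
SMALL_FRAME = 100        # hypothetical per-call stack use, in bytes
LARGE_FRAME = 3 * SMALL_FRAME  # the ~3x multiplier reported for ppc64/s390x

print(exceeds_limit(NODES, SMALL_FRAME, tail_calls_eliminated=False))  # False
print(exceeds_limit(NODES, LARGE_FRAME, tail_calls_eliminated=False))  # True
print(exceeds_limit(NODES, LARGE_FRAME, tail_calls_eliminated=True))   # False
```

Under these assumed numbers, only the combination of the larger frame and no tail-call elimination crosses the limit, which mirrors the failure pattern described above.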

This explains the pattern of buildfarm failures: only ppc64 and s390x members failed, and likely only those using clang or non-optimized builds.

The Solution: Balanced Binary Trees

Tom Lane proposed the key insight: instead of generating right-deep (linear/degenerate) query trees that require O(N) recursion depth, the tests should use balanced binary trees that require only O(log N) depth. For the same number of nodes, a balanced tree of depth ~10 replaces a degenerate tree of depth ~1000, reducing stack consumption by roughly two orders of magnitude.
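The two shapes can be compared with a short generator for query text. This is a sketch under assumptions: the helper names are hypothetical, the "&" and parenthesis syntax follows the style of intarray's query_int input, and the committed tests may build their queries differently. It also assumes that nesting in the query text maps directly onto the shape of the internal tree.

```python
# Generate right-deep and balanced boolean query strings and compare their
# nesting depth. Helper names are hypothetical, for illustration only.

def right_deep_query(n):
    """Return '1 & (2 & (3 & ... (n)))' for operands 1..n."""
    expr = str(n)
    for v in range(n - 1, 0, -1):
        expr = f"{v} & ({expr})"
    return expr

def balanced_query(lo, hi):
    """Return a fully parenthesized balanced query over operands lo..hi."""
    if lo == hi:
        return str(lo)
    mid = (lo + hi) // 2
    return f"({balanced_query(lo, mid)} & {balanced_query(mid + 1, hi)})"

def paren_depth(expr):
    """Return the maximum parenthesis nesting depth of a query string."""
    depth = best = 0
    for ch in expr:
        if ch == "(":
            depth += 1
            best = max(best, depth)
        elif ch == ")":
            depth -= 1
    return best

print(balanced_query(1, 4))                    # ((1 & 2) & (3 & 4))
print(paren_depth(right_deep_query(1024)))     # 1023
print(paren_depth(balanced_query(1, 1024)))    # 10
```

Both queries contain the same 1024 operands, but the balanced form nests only 10 levels deep, in line with the depth figures quoted above.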

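Michael Paquier implemented this approach in the replacement test cases. The balanced tree structure keeps recursion depth logarithmic in the number of operands, so the tests no longer depend on per-frame stack size or on whether the compiler eliminates tail calls.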

Verification Strategy

The team used a staged rollout approach:

  1. Commit to HEAD first — Michael pushed the new balanced-tree tests to the development branch
  2. Buildfarm validation — Waited for ppc64 buildfarm members to run the tests successfully
  3. Cross-validation with -O0 — John Naylor confirmed on ppc64le that the old tests failed under -O0 while the new tests passed, providing direct evidence of the fix
  4. Backpatch after confirmation — Only after buildfarm green lights across ppc members was the backpatch to v14 through v17 executed

Key Technical Takeaway

This thread illustrates a general principle for PostgreSQL regression test design: test cases involving recursive data structures must be designed with awareness of the worst-case recursion depth across all supported platforms. The ~3× stack frame multiplier on ppc64/s390x combined with the absence of guaranteed tail-call optimization means that right-deep trees of even moderate size (a few hundred nodes) can be dangerous. Balanced constructions provide equivalent coverage with logarithmic depth, making them inherently portable.