Fix typo 586/686 in atomics/arch-x86.h

First seen: 2025-11-28 08:44:37+00:00 · Messages: 26 · Participants: 9

Latest Update

2026-05-08 · opus 4.7

Core Problem: Broken 64-bit Atomic Read/Write on 32-bit x86

The immediate issue is a trivial typo in src/include/port/atomics/arch-x86.h: preprocessor checks reference non-existent macros __i568__ and __i668__ instead of __i586__ and __i686__. These macros guard the definition of PG_HAVE_ATOMIC_U64_SIMPLE_READ_WRITE (single-copy atomicity for 64-bit values) on 32-bit x86 builds. Because of the typo, since 2017 PostgreSQL has silently been falling back to locked/emulated 64-bit atomics on any 32-bit x86 build, regardless of the underlying CPU generation.

Architecturally this matters because on 32-bit x86, a naive MOV of an 8-byte value is not guaranteed to be tear-free — two 32-bit loads/stores can be interrupted. The intent of the original code was to detect Pentium-class or newer CPUs (i586+) where specific instruction sequences (CMPXCHG8B, MOVQ via MMX/SSE, or FILD/FISTP via the x87 FPU) can provide single-copy atomicity for aligned 8-byte accesses, and enable the SIMPLE_READ_WRITE fast path.

Why the Typo Went Undetected: The -march Question

The discussion reveals that even with the typo fixed, the check itself is unreliable. Zsolt Parragi, John Naylor, and Tom Lane compared notes showing that whether __i586__/__i686__ gets defined depends entirely on distribution-level gcc configuration:

This means the code path was effectively a lottery even before the typo, and the typo simply guaranteed it never fired anywhere. Nobody has complained in ~8 years, which is a strong signal that 32-bit PostgreSQL on real Pentium-era hardware is not a meaningful deployment.

Andres Freund's Deeper Diagnosis

Andres (who authored the original buggy code) injects the critical technical observation: even if the macros were spelled correctly, gcc won't emit the required atomic instruction sequences for a naïve volatile 8-byte load/store. To actually get tear-free 64-bit access on i586 you need to compel the compiler to emit FILDQ/FISTPQ (x87) or MOVQ (MMX), which in practice only happens through the __atomic_* builtins or C11 <stdatomic.h>.

His recommended direction is therefore not "fix the typo" but "rip the hand-rolled inline assembly path and use C11 atomics with a configure probe for ATOMIC_LONG_LOCK_FREE/ATOMIC_LLONG_LOCK_FREE." The obstacle is MSVC: C11 atomics require a newer MSVC plus /experimental:c11atomics.

The Secondary Debate: Dropping 32-bit Support Entirely

Jakub Wartak escalated the thread into a broader question: should PG19 deprecate 32-bit builds altogether? His motivations are pragmatic — the 32-bit Cirrus CI job (test_world_32 on Debian Trixie) is a major time sink (~7 minutes).

Positions:

Resolution

Daniel Gustafsson produced a patch in May 2026 that removes the 32-bit atomics code path rather than fixing the typo, targeting HEAD only (v19+), leaving backbranches untouched. Nathan confirms this is right for the broken path but flags two adjacent issues:

  1. The MSVC block in the same file uses the wrong macro guard (analogous bug), so it's dead code on MSVC entirely — arguably the file should be included on MSVC with correct guards.
  2. 64-bit code in the file should be fixed/retained even as 32-bit paths are removed.

Key Design Tradeoffs

  1. Fix vs. remove: Fixing the typo would re-enable a code path nobody has exercised for 8 years, on hardware nobody uses, with compiler behavior that doesn't reliably produce atomic instructions anyway. Removal is lower-risk.
  2. Hand-rolled asm vs. C11 atomics: PostgreSQL's atomics layer predates widespread C11 atomic support and carries substantial per-architecture inline assembly. Andres's long-term direction is to replace this with C11 primitives plus configure probes, gated on dropping older MSVC.
  3. Backbranch policy: Since the code was never functional due to the typo, removing it from backbranches is not a behavioral change — yet the consensus leans toward HEAD-only out of conservatism.
  4. CI coverage vs. maintenance cost: The expensive 32-bit CI job is valuable primarily for UBSan/alignment coverage, not 32-bit-ness per se. This reframes the "drop 32-bit to speed up CI" argument.