2026-06-04 · claude-opus-4-6

Centralised Architecture Detection in PostgreSQL

The Core Problem

PostgreSQL's codebase relies on compiler-defined preprocessor macros to detect CPU architecture and conditionally compile architecture-specific optimizations. However, different compilers define different macros for the same architecture:

GCC/MinGW: __x86_64__, __aarch64__, __powerpc64__, etc.
MSVC (Visual Studio): _M_AMD64, _M_IX86, _M_ARM64, etc.
Sun Studio: __i386, __amd64, __sparc (no trailing underscores)

The detection logic is scattered across multiple files, and critically, several code paths fail to detect x86 on MSVC/Visual Studio builds. This has concrete, measurable performance consequences:

Real Performance Impact on Windows/MSVC Builds

pg_{read,write}_barrier() falls back to pg_memory_barrier() — On x86, read and write barriers are no-ops (due to x86's strong memory ordering: stores are ordered w.r.t. stores, loads w.r.t. loads). Falling back to a full memory barrier introduces unnecessary mfence or lock instructions in hot paths involving spinlocks and lightweight locks.
pg_spin_delay() compiles to nothing — On x86, this should emit a PAUSE instruction, which hints to the CPU that a spin-wait loop is executing, improving performance on hyperthreaded cores by reducing pipeline flush penalties and yielding execution resources to the sibling hyperthread.
PG_HAVE_8BYTE_SINGLE_COPY_ATOMICITY is undefined — On x86-64, aligned 8-byte reads and writes are guaranteed atomic by the architecture. Without this macro, PostgreSQL must use more expensive atomic operations or locking where a simple load/store would suffice (e.g., in certain WAL and shared memory access patterns).

The root cause is that src/include/port/atomics/arch-x86.h is never included on MSVC builds because the detection only checks GCC-style macros. Thomas Munro confirmed this empirically: adding #error to arch-x86.h produces failures on MinGW CI but passes on Visual Studio CI — proving VS never hits that header.

Proposed Solutions

Approach 1: PG_ARCH_XXX Macros (Thomas Munro)

Define PostgreSQL-native macros in a central location:

/* In c.h or port/pg_cpu.h */
#if defined(__x86_64__) || defined(_M_AMD64)
#define PG_ARCH_X86 1
#define PG_ARCH_X86_64 1
#elif defined(__i386__) || defined(_M_IX86)
#define PG_ARCH_X86 1
#define PG_ARCH_X86_32 1
#elif defined(__aarch64__) || defined(_M_ARM64)
#define PG_ARCH_ARM 1
#define PG_ARCH_ARM_64 1
/* ... etc for LOONGARCH, MIPS, PPC, RISCV, S390, SPARC ... */
#endif

The naming convention provides both a generic family macro (PG_ARCH_X86) and specific width macros (PG_ARCH_X86_64), covering architectures: ARM, LoongArch, MIPS, PPC, RISC-V, S390, SPARC, x86.

Advantages: Clean, self-documenting, PostgreSQL-idiomatic, provides both family and bitwidth detection.

Disadvantages: More code churn; future patch submitters may not know to use PG_ARCH_XXX and will submit patches using __x86_64__ directly.

Approach 2: Standardize on GCC Spellings (Tom Lane)

Instead of inventing new macros, ensure GCC's __x86_64__, __aarch64__, etc. are always defined, even on non-GCC compilers:

/* In c.h */
#if defined(_M_AMD64) && !defined(__x86_64__)
#define __x86_64__
#endif
#if defined(_M_IX86) && !defined(__i386__)
#define __i386__
#endif
/* ... */
#else
#error "unrecognized CPU architecture"
#endif

Advantages: Minimal code churn — only need to remove the few alternative spellings scattered in the tree. Patch submitters are already familiar with GCC macros. Precedent exists (the old solaris.h approach did exactly this).

Disadvantages: Defines double-underscore macros that are technically in the implementation's reserved namespace. Though as Tom notes, there's no mechanism for compilers to prevent user code from defining __ macros, and PostgreSQL has done this before.

Key Design Decision: The `#else #error` Branch

Tom Lane's proposal includes an #else #error clause that fires if no architecture is detected. This is architecturally important: it turns silent misdetection (the current bug) into a hard compile error, ensuring any new platform or compiler must explicitly be handled.

The BF_ASM Tangent

Dagfinn Ilmari Mannsåker noticed that the architecture detection consolidation reveals dead code: BF_ASM in the Blowfish implementation (src/backend/libpq/crypt.c or similar) has never been defined to anything but 0 since 2001, and the assembly function it would invoke (_BF_body_r) has never existed in the PostgreSQL tree. This is a separate cleanup opportunity exposed by the refactoring.

Architectural Significance

This issue matters beyond the immediate Windows performance fix because:

It's a class of bug, not a single bug — Any new architecture-specific optimization added in the future is at risk of the same silent failure if detection is ad-hoc.
Growing architecture diversity — With LoongArch, RISC-V, and ARM server adoption growing, PostgreSQL needs robust detection for more architectures than the traditional x86/SPARC/PPC trio.
Atomics are correctness-critical — While the current failure "merely" loses performance on Windows, incorrect atomics detection could theoretically lead to data corruption on weaker memory architectures (ARM, RISC-V, PPC) if barriers were accidentally omitted rather than accidentally over-applied.
Related work — The timing_clock_source thread and John Naylor's pg_cpu.h work both need reliable architecture detection, making centralization timely.

Current Status

The thread reached a natural discussion point between the two approaches but no patch was committed as of the last message. Tom Lane's preference for GCC spellings represents a pragmatic minimal-churn approach, while Thomas Munro's PG_ARCH macros are more principled. The discussion is likely to continue, potentially converging on Tom Lane's approach given his committer authority and the practical argument about reducing future patch friction.

Centralised architecture detection

Latest Update

Centralised Architecture Detection in PostgreSQL

The Core Problem

Real Performance Impact on Windows/MSVC Builds

Proposed Solutions

Approach 1: PG_ARCH_XXX Macros (Thomas Munro)

Approach 2: Standardize on GCC Spellings (Tom Lane)

Key Design Decision: The `#else #error` Branch

The BF_ASM Tangent

Architectural Significance

Current Status

Centralised architecture detection

Latest Update

Centralised Architecture Detection in PostgreSQL

The Core Problem

Real Performance Impact on Windows/MSVC Builds

Proposed Solutions

Approach 1: PG_ARCH_XXX Macros (Thomas Munro)

Approach 2: Standardize on GCC Spellings (Tom Lane)

Key Design Decision: The #else #error Branch

The BF_ASM Tangent

Architectural Significance

Current Status

Key Design Decision: The `#else #error` Branch