Sequence Access Methods, round two

First seen: 2023-12-01 05:00:54+00:00 · Messages: 72 · Participants: 10

Latest Update

2026-05-20 · claude-opus-4-6

Incremental Update: Independent Benchmark Validation by Lepihov

Summary

Lepihov (Andrei Lepikhov) followed up on his earlier suggestion about avoiding unnecessary cache lookups with independent benchmark results validating that the v29 patch introduces no measurable performance overhead.

New Benchmark Results

Lepihov ran two types of benchmarks on an Intel-based MacBook:

1. Pure nextval() throughput (UNLOGGED sequence, 100 iterations × 1M calls each):

The results show v29 performing essentially identically to baseline, with no statistically significant difference:

Window Baseline (ms) v29 (ms)
iter 1-20 236.92 235.28
iter 21-40 235.77 236.17
iter 41-60 238.53 234.64
iter 61-80 236.62 234.35
iter 81-100 237.13 235.24

v29 is actually marginally faster in most windows, likely within noise.

2. pgbench INSERT workload (UNLOGGED table):

Metric Baseline v29 Delta
Mean TPS 24,997 25,195 +0.79%

Again, no regression — slight improvement within noise.

Technical Significance

Lepihov's Cache Lookup Optimization

In his first message, Lepihov explicitly drops his earlier suggestion about optimizing away unnecessary cache lookups, saying "let me just leave the idea... here" — indicating he's satisfied that the current implementation doesn't warrant the complexity of a fast-path optimization for the default AM OID.

History (1 prior analysis)
2026-05-18 · claude-opus-4-6

Sequence Access Methods: A Deep Technical Analysis

The Core Problem

PostgreSQL's sequence infrastructure is monolithic: every sequence stores its state (last_value, is_called, log_cnt) in a single-page heap relation, computes values via a hardcoded local increment algorithm, and WAL-logs changes through a dedicated RMGR (RM_SEQ_ID). This architecture is fundamentally incompatible with distributed/active-active deployments where:

  1. Value uniqueness across nodes cannot be guaranteed without external coordination or schema modifications (e.g., different INCREMENT values per node, or replacing sequence defaults with custom functions).
  2. Schema transparency is broken — migrating an application from single-node to multi-node requires altering column defaults, which is unacceptable for large-scale deployments.
  3. Generated columns and SERIAL/IDENTITY are tightly coupled to the local computation, making it impossible to transparently substitute alternative value-generation strategies.

The historical context is significant: Postgres-XC (2009-2012) solved this by patching sequence.c to fetch values from a Global Transaction Manager. BDR/pgEdge explored similar approaches. The fundamental insight is that the computation method for sequence values should be pluggable, just as storage is pluggable via table AMs and indexing via index AMs.

Architectural Design

Separation of Concerns

The patch identifies three conceptual layers historically entangled in sequence.c:

  1. pg_sequence catalog properties — start, increment, min, max, cycle, cache. These are DDL-level properties shared across all AMs.
  2. Local sequence cache — per-backend caching for lastval(), tracking the last sequence accessed.
  3. Sequence metadata/computation — the actual value generation, storage, and WAL-logging.

The key design decision is that (1) and (2) remain universal across all AMs, while (3) becomes the pluggable component. This means pg_sequence continues to provide "hints" to the AM about desired behavior, cross-property validation remains centralized, and the CACHE mechanism is handled by core code — not individual AMs.

Callback Architecture

The sequence AM handler (defined in sequenceam.h) provides callbacks for:

  • init() — Create sequence storage (attributes, initial tuple), or do nothing for storage-less sequences
  • get() — Retrieve current state (last_value, is_called) for dump/restore and pg_sequence_last_value()
  • set() — Force state (for setval() and restore)
  • next() — Compute the next value (the core computation callback)
  • reset() — Handle ALTER SEQUENCE state resets
  • get_table_am() — Declare which table AM (if any) the sequence uses for storage

This plugs into nextval_internal() via function pointers cached in the relcache, achieving transparency for:

  • nextval(), currval(), lastval(), setval()
  • SERIAL and GENERATED ALWAYS AS IDENTITY columns
  • pg_dump/pg_restore

Storage Model

A critical design choice: sequences remain relations (stored in pg_class), but the AM controls whether they have physical storage and what attributes that storage contains. The patch uses the same pattern as views — DefineRelation() creates the pg_class entry with no attributes, then AlterTableInternal() adds AM-specific columns afterward. This means:

  • The "seqlocal" AM creates 3 columns (last_value, log_cnt, is_called) in a heap relation
  • A snowflake AM might store timestamp + counter in an unlogged table
  • An in-memory AM might store nothing at all

WAL and Persistence

Each AM controls its own WAL strategy:

  • The in-core "seqlocal" AM retains the existing RM_SEQ_ID RMGR (renamed to RM_SEQ_LOCAL_ID)
  • Custom AMs can use generic WAL (like the snowflake contrib module does)
  • AMs can choose to skip WAL entirely for truly ephemeral sequences
  • The fill_seq_with_data() function handles UNLOGGED logic (writing to both main and init forks)

Relcache Integration

Sequences gain rd_seqamroutine (cached AM callbacks) and optionally rd_tableam in their RelationData. A new RELKIND_HAS_SEQUENCE_AM() macro mirrors RELKIND_HAS_TABLE_AM(). The relcache initialization path branches:

if (RELKIND_HAS_TABLE_AM(relkind))
    RelationInitTableAccessMethod(rel);
else if (relkind == RELKIND_SEQUENCE)
    RelationInitSequenceAccessMethod(rel);

Dump/Restore Strategy

The patch introduces:

  • default_sequence_access_method GUC (analogous to default_table_access_method)
  • --no-sequence-access-method flag for pg_dump/pg_restore
  • SET commands emitted between CREATE/ALTER SEQUENCE in dump output
  • Dependency on pg_sequence_last_value() returning (last_value, is_called) tuple rather than scanning heap directly

Key Technical Debates

"Is this really an Access Method?"

Peter Eisentraut and others questioned whether "access method" is the right abstraction. The argument against: sequences don't "access" data in the traditional sense — they generate it. The argument for: the existing AM infrastructure (CREATE ACCESS METHOD, pg_am catalog, handler functions, GUCs) provides ready-made plumbing that would be wasteful to duplicate. Michael's position prevailed pragmatically: sequences are already relations, and the AM pattern fits.

Scope of Abstraction

Peter Eisentraut raised a fundamental design concern: the patch conflates high-level concerns (how to compute the next value) with low-level concerns (table AM, persistence, storage). He argued these should be separable layers — e.g., "local sequence of UUIDs" vs "global sequence of integers" should be independently combinable.

Michael acknowledged this but argued that:

  1. The current proposal solves real problems today
  2. Custom data types can be layered on top later via additional DDL clauses
  3. 64-bit integer values cover all practical distributed use cases he's encountered

Custom Data Types (int64 restriction)

The patch hardcodes all sequence values as int64. Peter advocated for UUID support and flexible types. Michael's counterarguments:

  • The SQL specification requires sequences to work on integer values
  • 64 bits provides sufficient space even for thousands of nodes (snowflake-style partitioning)
  • Custom types would require different SQL functions (breaking transparency with nextval())
  • An informal poll showed ~50% of respondents considered 64b sufficient

This remains an unresolved design tension — the patch explicitly punts on it.

Performance Impact

Michael's benchmarks showed negligible overhead from function pointer indirection:

  • N=5M nextval calls: HEAD=3230ms (646ns/value) vs patch=3315ms (663ns/value)
  • ~2.6% overhead, within noise

Lepihov suggested a fast-path optimization: wire the default "seqlocal" handler OID directly to avoid catalog lookups in RelationInitSequenceAccessMethod. Michael acknowledged this as worth investigating but focused on correctness first.

Reloptions vs GUCs for AM Parameters

The patch deliberately omits reloptions for sequences (e.g., for configuring snowflake machine_id). Michael argued this can be added later and isn't a strict requirement — GUCs defined within the AM module suffice for initial use cases. Peter noted that the LCG use case (Matthias's PRNG example) would need modulus/multiplier/increment parameters, which the patch doesn't directly support.

The Snowflake Contrib Module

Patch 0007 provides a concrete implementation demonstrating the API:

  • Stores state in an unlogged table (timestamps + machine IDs ensure cross-node uniqueness)
  • Uses generic WAL for init fork logging
  • Implements snowflake ID structure: timestamp bits + machine_id bits + local counter bits
  • Machine ID controlled via GUC
  • Demonstrates that lastval(), currval(), nextval() work transparently
  • Identity columns can feed from snowflake sequences without schema changes

Relationship to Logical Replication of Sequences

A parallel effort (led by Vignesh C and others) adds logical replication support for sequences. The sequence AM patch is designed to be orthogonal:

  • Logical decoding depends on relfilenodes and page LSNs — the "seqlocal" AM preserves these
  • Custom AMs that don't use local storage would need their own replication strategy
  • The get() callback provides a universal way to extract current state for initial sync
  • Non-transactional sequence behavior is deliberately unchanged

File Organization (Final State)

src/backend/access/sequence/seqlocalam.c    — "seqlocal" AM callbacks
src/backend/access/sequence/seqlocalxlog.c  — WAL redo for seqlocal
src/backend/commands/sequence.c             — Core sequence logic (30% smaller)
src/include/access/sequenceam.h             — AM callback definitions
src/include/access/sequence_page.h          — Helper macros for single-page AMs
contrib/snowflake/                          — Example snowflake ID AM

Current Status and Outlook

After 3+ years and 28+ revisions, the patch remains in a "rebase loop" without consensus from core committers. Michael acknowledges "zero feedback from other core developers" and suspects "nobody really cares about this thread." The patch is technically mature — regression tests pass, dump/restore works, the API is well-exercised — but lacks the community momentum needed for committal. The unconference at pgconf.dev 2024 did not produce consensus, with discussion drifting toward logical replication of sequences instead.

The fundamental tension remains: this is infrastructure that primarily benefits distributed PostgreSQL deployments (pgEdge, Neon, custom active-active systems), while core PostgreSQL development energy has focused on built-in logical replication features that address a subset of the same problems differently.