Row Pattern Recognition (SQL/RPR) Implementation
Overview
This thread, spanning from June 2023 to May 2026, documents the design and implementation of SQL:2016's Row Pattern Recognition (RPR) feature in PostgreSQL — specifically Feature R020 (RPR in WINDOW clause). Over nearly three years and 47+ patch revisions, Tatsuo Ishii drove an initial PoC through a near-committable state, with a transformative contribution from Henson Choi in 2026 that rewrote the core pattern matching engine from a regex-based approach to a streaming NFA.
The Core Technical Problem
RPR provides SQL syntax to search for row sequences matching regex-like patterns over ordered window frames. The canonical example is detecting a "V-shape" price pattern:
PATTERN (START DOWN+ UP+)
DEFINE
START AS TRUE,
UP AS price > PREV(price),
DOWN AS price < PREV(price)
The feature fundamentally changes window semantics: PATTERN/DEFINE reduces the window frame to the matched row subsequence. This has deep architectural implications — every window function and aggregate must see only the "reduced frame," not the full frame.
Key Design Evolution
Phase 1: Initial PoC and Architectural Framing (2023)
Ishii's initial v1 patch placed RPR logic inside window functions themselves. Vik Fearing (SQL standards expert) pushed back strongly on this architecture, arguing that RPR defines the frame, and window functions should be oblivious to how the frame was computed:
"Window functions do not need to know how the frame is defined, and indeed they should not. WinGetFuncArgInFrame should answer yes or no and the window function just works on that."
This led to the eventual design where row_is_in_reduced_frame() is embedded within frame-accessing APIs, keeping window functions unaware of RPR.
Phase 2: Regex-Based Matching Engine (2023-2025)
Ishii chose to map pattern variables to single ASCII characters (imposing a 26-variable limit from [a-z]), encode each row's matching variables as a string, and feed candidate strings to PostgreSQL's built-in regex engine. This approach had three fundamental problems:
- Combinatorial explosion: Generating all possible candidate strings was
O(V^N), causing OOM on large partitions - 26-variable limit from alphabet encoding
- No preferment/lexical-order guarantees — the regex engine doesn't honor SQL standard's lexicographic matching priority
Jacob Champion identified early on that POSIX NFA semantics (which pg's regex engine uses) conflict with SQL's traditional-NFA requirements — reluctant quantifiers cannot be expressed under POSIX "leftmost longest" rules. Vik cited the standard directly confirming Perl-style traditional NFA as the required model.
Performance optimizations like pruning candidate strings and caching regex compilation got the engine working for small cases but fundamental issues remained.
Phase 3: The NFA Rewrite (January 2026)
Henson Choi joined the thread and proposed a complete architectural replacement: a flat-array streaming NFA inspired by Apache Flink CEP. Key design decisions:
- Flat array of 16-byte
RPRPatternElementinstead of a tree — cache-friendly, index-addressable - Three-phase execution per row: Match → Absorb → Advance
- Match: Evaluate DEFINE predicates, remove states whose current VAR fails
- Absorb: Merge dominated contexts (the critical O(n²)→O(n) optimization)
- Advance: Epsilon-expand via DFS to generate states waiting for the next row
- DFS traversal order guarantees preferment (standard's lexicographic priority) naturally — no separate
altPrioritytracking needed, with early termination when FIN is reached - Context absorption: An older context whose repeat counts dominate a newer context's counts can absorb it, eliminating redundancy
The absorption optimization is the theoretical heart of the new engine. For patterns like A+ B, without absorption each row spawns a context that tracks independently, producing O(n²) states across n rows. Absorption exploits the monotonicity that an older context always contains all future matches of a younger one, reducing concurrent contexts to O(1) and total complexity to O(n).
This is formalized through a two-flag design at compile time:
RPR_ELEM_ABSORBABLEmarks the judgment point (WHERE to compare)RPR_ELEM_ABSORBABLE_BRANCHmarks the absorbable region (monotonic state tracking)
And two flags at runtime per context:
hasAbsorbableState(monotonic, true→false only): "can absorb others"allStatesAbsorbable(dynamic): "can be absorbed"
Benchmark impact was dramatic: the 100k-row START UP+ test that took 26 seconds with v25 completed in 1.2 seconds with absorption; Henson later reported that PostgreSQL's match-failure scaling was O(n) compared to Trino's apparent O(n²), a ~217,000× speedup at 100k rows.
Phase 4: Navigation Redesign (March 2026)
The original PREV/NEXT implementation used a three-slot model — current, previous, next tuples in separate ExprContext slots — with varno rewriting at plan time. Henson identified this was fundamentally limited:
- Cannot support standard's variable offset
PREV(price, N) - Cannot extend to FIRST/LAST navigation
- Plan-time varno mutation was unsafe (not copy-safe for plan cache)
Henson proposed a 1-slot swap model using two new expression opcodes:
EEOP_RPR_NAV_SET: save current slot, swapecxt_outertupleto target rowEEOP_RPR_NAV_RESTORE: restore original slot after inner expression evaluation
A new node type RPRNavExpr replaces funcid-based detection of PREV/NEXT. A dedicated nav_winobj with its own tuplestore read pointer (mark pinned at position 0) prevents truncation of rows needed by navigation. Later this was optimized: the planner computes max_offset across all PREV/FIRST calls and advances the mark to currentpos - max_offset, reclaiming tuplestore memory.
This architecture generalized naturally to FIRST/LAST and compound navigation like PREV(FIRST(col)), which flattens to a single RPRNavExpr with two offset values.
Key Technical Insights
Empty matches and the reduced frame map
RPR distinguishes four outcomes: match, unmatch, no-match (skipped), and empty match. Window semantics require input/output row counts match, so empty matches cannot insert extra rows. The executor tracks per-row status in reduced_frame_map (later replaced with a single-match scalar after analysis showed only one active match is possible per row position).
Aggregate restart requirement
Vanilla WindowAgg uses inverse transition and reuses prior aggregate state when frame boundaries shift predictably. Under RPR this optimization must be disabled — every new current row may yield a different reduced frame requiring complete aggregate recomputation from scratch. This was a significant finding that affected eval_windowaggregates.
Planner interactions
Multiple planner optimizations had to be disabled or guarded for RPR:
- Frame optimization (RANGE→ROWS conversion) — frame semantics differ
- Run condition pushdown — cannot short-circuit evaluation across reduced frames
- Window deduplication — both
transformWindowFuncCallandoptimize_window_clausesneeded RPR-aware equality remove_unused_subquery_outputs— could NULL-out columns referenced only by DEFINE, causing crashes- SQL function inlining via
expression_tree_walker— needed to visit defineClause
JIT conflict with slot swap
Henson discovered that JIT caches tuple data pointers in the expression entry block. The mid-expression slot swap invalidates these caches, producing wrong results under jit_above_cost=0. The fix generates a reload path only when navigation opcodes are present — compile-time decision keeps non-RPR expressions unaffected.
Standard compliance findings
Vik's standards expertise caught numerous subtleties:
- Traditional NFA (Perl-style), not POSIX NFA
- Lexicographic ordering takes precedence over greediness
- PREV/NEXT only allowed in DEFINE
- Range variables from FROM forbidden in DEFINE (§6.5)
- Navigation function offsets must be runtime constants (§5.6.2)
A{0,0}forbidden by standard (§7.9 SR 20)
Process Observations
The collaboration pattern between Ishii and Henson in 2026 is notable: Ishii acted as the senior committer/steward while Henson drove aggressive redesign. Ishii repeatedly integrated Henson's incremental patches, created vNN tarball drops, and provided architectural judgment (e.g., deferring MEASURES, PREFIX absorption, and MATCH_RECOGNIZE to post-commit work). Henson contributed extensive design documentation (multiline algorithm guides in multiple languages) and performance analysis.
Michael Paquier made a brief intervention flagging AI-generated mails as generally unhelpful; Ishii was accommodating given both developers' English limitations.
Chao Li and Zsolt Parragi contributed systematic code reviews catching real bugs (unchecked coerce_to_boolean result, child->max not incremented in prefix/suffix merge, crashes on specific patterns). SungJun contributed cross-DBMS validation against Oracle and Trino.