duckdb has extensible parser

First seen: 2026-05-13 13:25:52+00:00 · Messages: 1 · Participants: 1

Latest Update

2026-05-14 · claude-opus-4-6

Analysis: DuckDB's Extensible Parser as Inspiration for PostgreSQL

Core Problem

PostgreSQL's SQL parser is monolithic: the whole grammar lives in a single Bison file (gram.y) that is compiled into the server. This makes it extremely difficult to:

  1. Add new syntax without modifying core PostgreSQL source code
  2. Support domain-specific languages or SQL dialects within PostgreSQL
  3. Allow extensions to introduce new statement types that are first-class citizens in the parser

The parser is one of the least extensible parts of PostgreSQL's architecture. PostgreSQL has rich extension APIs for functions, operators, types, access methods, and even custom scan providers, but the parser remains a hard boundary: extensions cannot introduce new grammar productions at runtime.

Context: DuckDB's Approach

The referenced article (DuckDB blog, November 2024) describes DuckDB's implementation of a runtime-extensible parser. DuckDB's approach allows extensions to:

This is architecturally significant because it demonstrates a practical implementation of parser extensibility in a production analytical database system.
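
To make the idea of runtime grammar extension concrete, here is a minimal Python sketch. It is a toy, not DuckDB's implementation or API: the ExtensibleParser class, the register_statement method, and the NEIGHBORS OF statement are all hypothetical names invented for illustration. It shows only the property under discussion: statement rules live in a table that can grow after the parser has been built, so input that was a syntax error becomes valid once an extension registers a rule for it.

```python
import re
from typing import Callable, Dict, List, Optional, Tuple

# A rule receives the token list and a start position. It returns
# (ast, next_position) on a match, or None if it does not apply.
Rule = Callable[[List[str], int], Optional[Tuple[dict, int]]]


class ExtensibleParser:
    """Toy parser whose statement rules are stored in a mutable table."""

    def __init__(self) -> None:
        self._rules: Dict[str, Rule] = {}

    def register_statement(self, name: str, rule: Rule) -> None:
        # Hypothetical extension entry point: add a statement rule at runtime.
        if name in self._rules:
            raise ValueError(f"statement rule {name!r} is already registered")
        self._rules[name] = rule

    def parse(self, sql: str) -> dict:
        tokens = re.findall(r"\w+|\S", sql)
        # Ordered choice: the first rule that consumes every token wins.
        for rule in self._rules.values():
            result = rule(tokens, 0)
            if result is not None and result[1] == len(tokens):
                return result[0]
        raise SyntaxError(f"no statement rule matches: {sql!r}")


def select_rule(tokens: List[str], pos: int) -> Optional[Tuple[dict, int]]:
    """Built-in rule: SELECT <column> FROM <table>."""
    window = tokens[pos:pos + 4]
    if len(window) == 4 and window[0].upper() == "SELECT" and window[2].upper() == "FROM":
        return {"type": "select", "column": window[1], "table": window[3]}, pos + 4
    return None


def neighbors_rule(tokens: List[str], pos: int) -> Optional[Tuple[dict, int]]:
    """Hypothetical extension rule: NEIGHBORS OF <node>."""
    window = tokens[pos:pos + 3]
    if len(window) == 3 and [t.upper() for t in window[:2]] == ["NEIGHBORS", "OF"]:
        return {"type": "neighbors", "node": window[2]}, pos + 3
    return None


parser = ExtensibleParser()
parser.register_statement("select", select_rule)

# Before the extension is loaded, the new statement is a syntax error.
try:
    parser.parse("NEIGHBORS OF n1")
    accepted_before = True
except SyntaxError:
    accepted_before = False

# After registration, the same parser instance accepts it.
parser.register_statement("neighbors", neighbors_rule)
ast_after = parser.parse("NEIGHBORS OF n1")
```

A Bison-generated parser has no equivalent of register_statement: its tables are fixed when the server is compiled, which is the gap this thread is about.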

Historical PostgreSQL Context

This topic has surfaced multiple times in pgsql-hackers history:

Technical Tradeoffs

For PostgreSQL, the costs of an extensible parser (points 1 to 3) pull against the practical demand for one (point 4):

  1. Security and correctness: A monolithic, well-tested parser provides strong guarantees about what SQL is accepted. Extensible parsing introduces risk of ambiguous grammars or security bypasses.

  2. Performance: PostgreSQL's parser is generated by Bison at compile time, producing efficient LALR(1) parsing tables. Runtime extensibility could introduce overhead at parse time for every query.

  3. Compatibility: If extensions can introduce arbitrary syntax, it becomes harder to reason about SQL compatibility and portability.

  4. Practical need: Many use cases (e.g., extension-specific DDL, graph query languages like openCypher, compatibility shims for other databases) would benefit from parser extensibility.
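
Tradeoffs 1 and 2 suggest one possible shape for a design, sketched below in Python. This is an illustration only: it is not proposed in the thread and is not PostgreSQL or DuckDB code, and core_parse, FallbackParser, and both extension parsers are hypothetical. The core grammar is tried first, so statements it accepts never reach extension code. Extension parsers run only after the core parser rejects the input, and a statement claimed by more than one extension is rejected as ambiguous rather than resolved silently.

```python
from typing import Callable, List, Optional, Tuple

# A parse function returns an AST dict, or None if it does not recognise the input.
ParseFn = Callable[[str], Optional[dict]]


def core_parse(sql: str) -> Optional[dict]:
    """Stand-in for the fixed, compile-time grammar: accepts only SELECT 1."""
    if sql.strip().upper() == "SELECT 1":
        return {"type": "select", "value": 1}
    return None


class FallbackParser:
    """Core grammar first; extension parsers only on core failure."""

    def __init__(self, core: ParseFn) -> None:
        self._core = core
        self._extensions: List[Tuple[str, ParseFn]] = []
        self.extension_calls = 0  # counts how often extension code runs

    def register(self, name: str, parse_fn: ParseFn) -> None:
        self._extensions.append((name, parse_fn))

    def parse(self, sql: str) -> dict:
        ast = self._core(sql)
        if ast is not None:
            return ast  # fast path: no extension overhead for core SQL

        matches = []
        for name, parse_fn in self._extensions:
            self.extension_calls += 1
            result = parse_fn(sql)
            if result is not None:
                matches.append((name, result))

        if not matches:
            raise SyntaxError(f"syntax error: {sql!r}")
        if len(matches) > 1:
            owners = ", ".join(name for name, _ in matches)
            raise SyntaxError(f"ambiguous statement, claimed by: {owners}")
        return matches[0][1]


def graph_ext(sql: str) -> Optional[dict]:
    """Hypothetical graph extension: claims anything starting with MATCH."""
    if sql.upper().startswith("MATCH "):
        return {"type": "graph_match", "pattern": sql[6:]}
    return None


def fulltext_ext(sql: str) -> Optional[dict]:
    """Hypothetical full-text extension: claims MATCH ... AGAINST ..."""
    upper = sql.upper()
    if upper.startswith("MATCH ") and " AGAINST " in upper:
        return {"type": "fulltext_match", "text": sql}
    return None


fallback = FallbackParser(core_parse)
fallback.register("graph", graph_ext)
fallback.register("fulltext", fulltext_ext)
```

With this setup, SELECT 1 is answered by the core parser alone, a graph pattern such as MATCH (a)-->(b) is handled by the graph extension, and MATCH body AGAINST 'x' raises an ambiguity error because both extensions claim it. The sketch leaves the compatibility concern (point 3) untouched.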

Assessment

This is a very brief, link-sharing post rather than a formal proposal or patch submission. Pavel Stehule is pointing the community toward DuckDB's implementation as a reference design for a long-discussed capability. The thread generated no responses, suggesting either the community considers this a known-but-intractable problem, or the post didn't provide enough concrete proposal material to spark discussion.

For PostgreSQL to adopt something similar, a concrete proposal would need to address:

Relevance to PostgreSQL Architecture

The parser extensibility question is closely tied to the maturity of PostgreSQL's extension ecosystem. As that ecosystem grows, with projects such as Citus, TimescaleDB, pgvector, and Apache AGE (graph queries) all wanting custom syntax, the pressure to make the parser extensible increases. DuckDB's approach provides a concrete reference implementation that PostgreSQL developers can evaluate.