Server-side SNI Support in libpq: Deep Technical Analysis
The Core Problem
PostgreSQL's TLS implementation has historically supported a single server certificate/key pair, configured via ssl_cert_file and ssl_key_file GUCs in postgresql.conf. This is a significant limitation in modern deployments where:
- Multiple hostnames share a single PostgreSQL instance (via DNS aliases, load balancers, or connection poolers) and each may require distinct certificate chains — potentially issued by completely disjoint CAs. This is important in multi-tenant scenarios where tenants maintain their own PKI.
- Cross-host attack mitigation — without SNI enforcement, a client connecting to hostname A can receive the same certificate as a client connecting to hostname B, enabling certain classes of confusion attacks.
- Graceful PKI rotation — different hostnames may need to migrate between CAs on different schedules.
Client-side SNI was added years ago (the client sends the hostname in the TLS ClientHello extension), but the server has always ignored it. This patch makes the server actually act on the SNI extension by selecting a certificate chain, key, CA, and CRL based on the hostname the client requested.
Architectural Design Decisions
Configuration file: pg_hosts.conf
The patch introduces a new configuration file parsed using the same tokenizer machinery as pg_hba.conf and pg_ident.conf (tokenize_auth_file, TokenizedAuthLine, etc.). Each line binds a hostname to a set of TLS materials: certificate, key, CA, optional CRL, and optional ssl_passphrase_command. This reuses existing infrastructure (include directives, error reporting with line context) and keeps the admin mental model consistent.
Special pseudo-hostnames were introduced during the design:
- *: wildcard/default fallback when SNI was sent but didn't match
- /no_sni/: matches connections where the client sent no SNI extension at all
The choice of /no_sni/ (rather than ?, braces, etc.) was a usability compromise between "unambiguous to parsers" and "visually distinct from real DNS names." Jacob Champion noted that ? collides with well-known wildcard conventions in other servers.
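The lookup order implied by these pseudo-hostnames can be sketched as follows. This is a minimal illustration, not the patch's code: the HostEntry type, its field names, and select_entry are hypothetical, and returning None when nothing matches stands in for whatever rejection the server actually performs.

```python
from dataclasses import dataclass
from typing import Optional, Sequence

WILDCARD = "*"        # fallback when SNI was sent but matched no entry
NO_SNI = "/no_sni/"   # entry for clients that sent no SNI extension


@dataclass(frozen=True)
class HostEntry:
    # Hypothetical shape of one parsed pg_hosts.conf line (illustration only).
    hostname: str
    cert_file: str
    key_file: str


def select_entry(entries: Sequence[HostEntry], sni: Optional[str]) -> Optional[HostEntry]:
    """Pick the entry for a connection; sni is None when no SNI was sent."""
    specials = {e.hostname: e for e in entries if e.hostname in (WILDCARD, NO_SNI)}
    # Real hostnames are compared case-insensitively.
    named = {e.hostname.lower(): e for e in entries if e.hostname not in specials}

    if sni is None:
        return specials.get(NO_SNI)

    match = named.get(sni.lower())
    if match is not None:
        return match
    return specials.get(WILDCARD)
```

Keeping the pseudo-hostnames in a separate table means a client cannot select them by sending the literal string as its SNI value.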
The ssl_sni GUC and the long debate about configuration surface
The configuration model went through three significant iterations:
- v1: Three-valued ssl_snimode GUC (off/default/strict). In default, postgresql.conf values acted as fallback; in strict, SNI was mandatory.
- v2: "magic file exists" approach proposed by Jelte Fennema-Nio: if pg_hosts.conf exists, use it exclusively. Daniel and Jacob both disliked silently overriding a user's ssl_cert_file setting.
- Final: Boolean ssl_sni GUC combined with * and /no_sni/ keywords in pg_hosts.conf. This satisfied Heikki Linnakangas's requirement that "opening pg_hosts.conf in an editor shows you everything that affects this," while addressing Jacob's concern that admins shouldn't have SNI thrust upon them implicitly.
The final design preserves full backward compatibility: if ssl_sni=off (the default), behavior is unchanged from prior releases.
The SSL_CTX switching saga — the hardest technical problem
This was the dominant technical issue in the thread, spanning roughly 18 months. Two fundamentally different implementations were tried:
Approach A (original): multiple SSL_CTX objects, swap via SSL_set_SSL_CTX()
The server pre-builds one SSL_CTX per pg_hosts.conf entry at startup. During the TLS handshake, the servername callback (or ClientHello callback) calls SSL_set_SSL_CTX() to reparent the active SSL object to the matching context.
Jacob flagged this early (July 2024), citing OpenSSL issue #6109 where OpenSSL committer Matt Caswell describes SSL_set_SSL_CTX() as "fundamentally broken." The function copies the certificate chain but leaves many other settings (verify mode, verify callback, password callback, CA list sent to client for client-cert requests) tied to the original context — or inherited onto the SSL at connection creation and not updated.
The symptoms were insidious:
- Nondeterministic wrong-chain serving
- Servername callback firing twice (eventually traced to TLS 1.3 HelloRetryRequest triggered by ssl_groups, unrelated to this patch)
- Client certificate verification behavior depending on what the default (unused) context's CA was set to
- Tests that should have failed in one way failed in another
Approach B (final, March 2026): single reconfigurable SSL_CTX
After off-list collaboration between Daniel and Jacob, the design pivoted: there is one "main" SSL_CTX that gets reconfigured per-connection in the ClientHello callback by copying settings from a per-host staging SSL_CTX stored in the HostsLine struct. This sidesteps the SSL_set_SSL_CTX() minefield entirely — settings are applied to the SSL object (or the single CTX) via the individual setters that OpenSSL actually maintains correctly.
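The difference between the two approaches can be shown with a toy model. The sketch below does not use OpenSSL at all: Ctx and Conn are hypothetical stand-ins for SSL_CTX and SSL, reduced to three settings, and the swap behavior simply mirrors the description above (the certificate chain follows the new context, the rest stays as inherited at creation).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ctx:
    """Toy stand-in for an SSL_CTX, limited to the settings discussed above."""
    cert_chain: str
    verify_mode: str
    ca_list: str


class Conn:
    """Toy stand-in for an SSL object; settings are copied in at creation."""

    def __init__(self, ctx: Ctx) -> None:
        self.cert_chain = ctx.cert_chain
        self.verify_mode = ctx.verify_mode
        self.ca_list = ctx.ca_list

    def swap_ctx(self, new_ctx: Ctx) -> None:
        # Approach A in miniature: only the certificate chain follows the
        # new context; verify mode and CA list keep their inherited values.
        self.cert_chain = new_ctx.cert_chain

    def apply_host_settings(self, staged: Ctx) -> None:
        # Approach B in miniature: every setting is copied explicitly from
        # the per-host staging context.
        self.cert_chain = staged.cert_chain
        self.verify_mode = staged.verify_mode
        self.ca_list = staged.ca_list
```

In this model a connection created from a default context with verification disabled still has verification disabled after swap_ctx, even if the target host requires client certificates; apply_host_settings leaves no such stale state.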
ClientHello vs. servername callback
OpenSSL provides two hooks: SSL_CTX_set_tlsext_servername_callback (older) and SSL_CTX_set_client_hello_cb (newer). OpenSSL's own documentation is contradictory — the ClientHello docs say not to use the servername callback, but the servername docs say you need it even when using ClientHello. OpenSSL's own command-line tools use the servername callback. The patch eventually settled on the ClientHello callback, manually parsing the TLS server_name extension bytes and raising appropriate TLS alerts (decode_error vs missing_extension vs internal_error) — a nuance Zsolt Parragi correctly flagged per RFC 8446.
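To make the manual parsing step concrete, here is a rough sketch of decoding the server_name extension payload as laid out in RFC 6066, section 3 (a 2-byte list length, then entries of 1-byte name type, 2-byte length, and name bytes). It is not the patch's C code: parse_server_name and TlsAlert are hypothetical names, and only the malformed-input case (decode_error, alert 50) is modeled; how the patch assigns missing_extension and internal_error is not reproduced here.

```python
import struct
from typing import Optional

ALERT_DECODE_ERROR = 50  # TLS alert description "decode_error" (RFC 8446)
HOST_NAME_TYPE = 0       # NameType "host_name" (RFC 6066, section 3)


class TlsAlert(Exception):
    """Hypothetical exception carrying the alert code to send to the peer."""

    def __init__(self, code: int, reason: str) -> None:
        super().__init__(reason)
        self.code = code


def parse_server_name(ext: Optional[bytes]) -> Optional[str]:
    """Return the host_name from the extension payload, or None if absent."""
    if ext is None:
        return None  # the client sent no server_name extension
    if len(ext) < 2:
        raise TlsAlert(ALERT_DECODE_ERROR, "truncated ServerNameList length")

    (list_len,) = struct.unpack_from("!H", ext, 0)
    # The list must be non-empty and must fill the extension exactly.
    if list_len == 0 or list_len != len(ext) - 2:
        raise TlsAlert(ALERT_DECODE_ERROR, "bad ServerNameList length")

    pos = 2
    while pos < len(ext):
        if len(ext) - pos < 3:
            raise TlsAlert(ALERT_DECODE_ERROR, "truncated ServerName entry")
        name_type, name_len = struct.unpack_from("!BH", ext, pos)
        pos += 3
        if name_len == 0 or name_len > len(ext) - pos:
            raise TlsAlert(ALERT_DECODE_ERROR, "bad ServerName length")
        name = ext[pos:pos + name_len]
        pos += name_len
        if name_type == HOST_NAME_TYPE:
            try:
                return name.decode("ascii")
            except UnicodeDecodeError:
                raise TlsAlert(ALERT_DECODE_ERROR, "non-ASCII host_name")
    return None  # no host_name entry in the list
```

Returning None for an absent extension, rather than raising, leaves the caller free to route the connection to a /no_sni/ entry or to reject it.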
Platform and Dependency Issues
LibreSSL divergence
Daniel explicitly observed that LibreSSL is "falling further and further behind OpenSSL in its compatibility layer." Several APIs needed by the reconfigurable-CTX approach don't exist in LibreSSL. The patch adds meson/autoconf feature probes and disables ssl_sni on LibreSSL. Daniel floated splitting be-secure-openssl.c into a dedicated be-secure-libressl.c file to control the proliferating #ifdefs — an issue that will need to be addressed soon independent of this patch.
Windows / EXEC_BACKEND passphrase behavior
A pre-existing quirk exposed by the new tests: on Windows (and any EXEC_BACKEND build), every new backend re-runs the SSL passphrase command because the postmaster's decrypted key isn't inherited through CreateProcess. When ssl_passphrase_command_supports_reload=off, this was expected to block only on reload, but on Windows every connection triggers it. Daniel split this out as a separate thread and fix. The buildfarm failure on culicidae after commit was also due to EXEC_BACKEND: the SKIP predicate needed to check $exec_backend rather than $windows_os.
The longfin regression
Post-commit, Tom Lane's longfin animal (running OpenSSL 1.1.1a from 2018) reported SYSCALL error: EOF detected instead of the expected SSL handshake error. Daniel could reproduce this only on early OpenSSL 1.1.1 point releases; later 1.1.1 patch levels return the correct error. Tom upgraded longfin to OpenSSL 3.0.19, implicitly retiring buildfarm coverage of very old 1.1.1 point releases.
Notable Technical Sub-Issues
- Verify callback on the default context: Because of how OpenSSL handles client cert verification during SNI selection, the verify callback must be registered even when the default context has no CA. Daniel discovered this while debugging Jacob's failing tests.
- Hostname case-insensitivity: Per RFC 952/921/1035, matching is case-insensitive.
- List of hostnames per line: Added as a scope-controlled middle ground short of wildcards/regex. Cannot include * or /no_sni/ in a list.
- Duplicate detection: pg_hosts.conf rejects duplicate hostnames at load time.
- tls_init hook conflict: When ssl_sni=on, the extension-installable openssl_tls_init_hook is bypassed because it operates on a single context, which doesn't cleanly map to the multi-host world. A WARNING is logged once (not per-line) when both are configured.
- Typedef + header hygiene (post-commit cleanup by Tom Lane): HostsFileLoadResult wasn't captured by the buildfarm's typedef extractor because load_hosts() was declared to return int. Tom moved the typedef to a shared header and corrected the return type.
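The load-time rules for hostname lists and duplicates could look roughly like the sketch below. It assumes the file has already been tokenized into (line number, hostnames) pairs; validate_hosts and HostsFileError are hypothetical names, and treating duplicates case-insensitively is an assumption drawn from the case-insensitive matching rule rather than something the thread states.

```python
from typing import Dict, List, Sequence, Tuple

SPECIAL = {"*", "/no_sni/"}  # pseudo-hostnames, not allowed inside a list


class HostsFileError(ValueError):
    """Hypothetical error type for a rejected hosts file."""


def validate_hosts(lines: Sequence[Tuple[int, List[str]]]) -> Dict[str, int]:
    """Check the rules and map each normalized hostname to its line number."""
    seen: Dict[str, int] = {}
    for lineno, names in lines:
        # A pseudo-hostname must stand alone on its line.
        if len(names) > 1 and SPECIAL.intersection(names):
            raise HostsFileError(
                f"line {lineno}: * and /no_sni/ cannot appear in a hostname list"
            )
        for name in names:
            # Assumption: duplicates are detected ignoring case.
            key = name if name in SPECIAL else name.lower()
            if key in seen:
                raise HostsFileError(
                    f"line {lineno}: duplicate hostname {name!r} "
                    f"(first defined on line {seen[key]})"
                )
            seen[key] = lineno
    return seen
```

Rejecting the whole file on the first violation keeps the outcome independent of line order, which a "last entry wins" policy would not.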
Design Tensions Worth Noting
Jacob Champion pushed back repeatedly against designs that made SNI a default concern for every DBA, arguing PostgreSQL isn't like a web server where name-based hosting is a top-level concept. This shaped the final decision to gate everything behind the ssl_sni GUC, which defaults to off.
Heikki Linnakangas argued for local readability ("pg_hosts.conf should tell you the full story") which pushed the design toward * and /no_sni/ sentinels inside the file rather than mode-like GUC values changing interpretation.
Michael Banck raised a late bikeshed on the filename pg_hosts.conf being too generic, but this didn't gain traction.
Outcome
Committed by Daniel Gustafsson on 2026-03-18 as commit 4f433025f, with follow-up fixes for EXEC_BACKEND test skipping and typedef/header hygiene (Tom Lane, 2026-05-04). The feature ships as opt-in (ssl_sni=off default), OpenSSL-only (LibreSSL explicitly excluded until it catches up).