Feature: Use DNS SRV records for connecting

First seen: 2026-04-22 09:48:50+00:00 · Messages: 4 · Participants: 3

Latest Update

2026-05-14 · claude-opus-4-6

Feature: DNS SRV Record Discovery for libpq

The Core Problem

PostgreSQL's client library (libpq) currently requires clients to know the exact hostnames and ports of database servers at connection time. In environments with dynamic infrastructure — cloud deployments, Kubernetes clusters, or any HA topology — this creates a tight coupling between application configuration and infrastructure topology. When a primary fails over, or standbys are added/removed, every client's connection string must be updated.

DNS SRV records (RFC 2782) solve this at the DNS layer. A single well-known name like _postgresql._tcp.cluster.example.com can return multiple host/port/priority/weight tuples, enabling:

  1. Service discovery — clients learn about all cluster members from DNS without reconfiguration
  2. Priority-based failover — the primary can be advertised at priority 10, standbys at priority 20
  3. Weighted load distribution — multiple standbys at the same priority can share load via DNS weights
  4. Port flexibility — SRV records include port numbers, unlike A/AAAA records
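As a rough illustration of the tuples an SRV answer carries, the sketch below models records and groups them into priority tiers. The hostnames, ports, and the `SrvRecord` and `by_priority` names are invented for this example and do not come from the patch.

```python
from dataclasses import dataclass
from itertools import groupby


@dataclass(frozen=True)
class SrvRecord:
    priority: int  # lower value is tried first
    weight: int    # relative share among records of equal priority
    port: int
    target: str


# Hypothetical answer for _postgresql._tcp.cluster.example.com
RECORDS = [
    SrvRecord(20, 50, 5432, "standby1.example.com"),
    SrvRecord(10, 0, 5432, "primary.example.com"),
    SrvRecord(20, 50, 5433, "standby2.example.com"),
]


def by_priority(records):
    """Group records into tiers, lowest priority value first."""
    ordered = sorted(records, key=lambda r: r.priority)
    return [(prio, list(group))
            for prio, group in groupby(ordered, key=lambda r: r.priority)]
```

With these sample records, the first tier holds only the primary and the second tier holds both standbys, which is the shape the failover and load-distribution points above rely on.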

This is the same pattern MongoDB adopted with mongodb+srv://, and it has proven successful in that ecosystem. The proposal aims to bring this capability not just to libpq but across the entire PostgreSQL driver ecosystem (pgx, pgjdbc, npgsql).

Architectural Design and Implementation

DNS Resolution Layer

The implementation adds approximately 200 lines of DNS resolver code with platform-specific backends.

This addresses Tom Lane's concern from the 2019 thread about writing a custom DNS client. The implementation delegates entirely to OS-provided resolver APIs.

Connection Parameter Design

A new srvhost connection parameter (environment variable PGSRVHOST) triggers SRV resolution:

srvhost=cluster.example.com dbname=mydb target_session_attrs=read-write

This queries _postgresql._tcp.cluster.example.com and populates the internal host list. Crucially, the SRV resolution happens before connhost[] is built, injecting the resolved hosts into the existing multi-host connection machinery. This means target_session_attrs, load_balance_hosts, and failover logic all work on the expanded host list without any modifications to PQconnectPoll.
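A minimal sketch of that expansion, assuming the SRV answer has already been fetched as (priority, weight, port, target) tuples. The function names are hypothetical, and the patch itself fills connhost[] in C rather than building a string. The output uses libpq's existing comma-separated multi-host syntax to show what the expanded list is equivalent to.

```python
def srv_query_name(srvhost, service="postgresql", proto="tcp"):
    """Build the RFC 2782 owner name for a given srvhost value."""
    return f"_{service}._{proto}.{srvhost.rstrip('.')}"


def to_multihost(records):
    """Render (priority, weight, port, target) tuples as host=/port= keywords."""
    # Order by priority only; weights are ignored in this simplified sketch.
    ordered = sorted(records, key=lambda r: r[0])
    hosts = ",".join(r[3].rstrip(".") for r in ordered)
    ports = ",".join(str(r[2]) for r in ordered)
    return f"host={hosts} port={ports}"
```

For example, `srv_query_name("cluster.example.com")` yields `_postgresql._tcp.cluster.example.com`, and a two-record answer becomes a `host=a,b port=x,y` pair that the existing multi-host logic already understands.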

srvhost is mutually exclusive with host and hostaddr — you either use SRV discovery or explicit hosts, not both.
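The rule can be pictured as a small validation step. This is only a sketch: the function name and the error wording are assumptions, not text from the patch.

```python
def check_srvhost_params(params):
    """Return an error message, or None if the combination is acceptable."""
    if not params.get("srvhost"):
        return None  # no SRV discovery requested, nothing to check
    for key in ("host", "hostaddr"):
        if params.get(key):
            # Hypothetical wording; the real message may differ.
            return f'cannot specify both "srvhost" and "{key}"'
    return None
```

The sketch checks only host and hostaddr, mirroring the rule as stated here.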

Blocking Resolution Concern

Andres Freund previously raised concerns about blocking DNS resolution. The patch author explicitly acknowledges that res_query() is blocking, but argues this is no worse than the existing getaddrinfo() call used for regular hostname resolution. Async DNS (e.g., via c-ares or a thread pool) is a separate, larger architectural problem that shouldn't gate this feature.
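To make the thread-pool idea concrete, here is a generic sketch of moving a blocking lookup onto a worker thread and bounding how long the caller waits. It is not how libpq works today and is not part of the patch. The `resolve_with_deadline` name and the injected `resolver` callable are assumptions for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout

_pool = ThreadPoolExecutor(max_workers=2)


def resolve_with_deadline(resolver, name, timeout):
    """Run a blocking resolver on a worker thread; stop waiting after timeout seconds."""
    future = _pool.submit(resolver, name)
    try:
        return future.result(timeout=timeout)
    except FutureTimeout:
        # The worker keeps running in the background; only the caller stops waiting.
        return None
```

Even this small version shows why the problem is larger than it looks: the timed-out lookup still occupies a worker, so cancellation and pool sizing need real design work.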

The URI Scheme Debate

The most significant design discussion centers on whether and how to support SRV discovery in URI connection strings. Three options were evaluated:

Option 1: Keyword-only (srvhost=)

No URI scheme. Users must use the keyword=value connection string format. This has the smallest API surface and zero compatibility risk for any driver ecosystem, but sacrifices the convenience of URI-based configuration.

Option 2: postgresql+srv:// / postgres+srv://

Follows the MongoDB precedent (mongodb+srv://). The + character is legal in URI scheme names per RFC 3986 §3.1. However, Go's net/url package (as of Go 1.26) applies strict modern parsing rules to any scheme other than postgres and postgresql — which received special exemptions due to PostgreSQL's established comma-delimited multi-host syntax. While a single SRV name in the authority would likely parse fine, this creates a fragile dependency on parser behavior in the Go driver ecosystem.
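For comparison, the sketch below shows what a driver would have to do to accept such a URI and reduce it to keyword parameters. It uses Python's standard `urllib.parse`, which accepts `+` in scheme names. The `uri_to_keywords` helper and the mapping onto `srvhost` are assumptions for illustration, not behavior of any existing driver.

```python
from urllib.parse import urlsplit

SRV_SCHEMES = {"postgresql+srv", "postgres+srv"}


def uri_to_keywords(uri):
    """Map a hypothetical +srv URI onto keyword parameters; None if not an SRV URI."""
    parts = urlsplit(uri)
    if parts.scheme not in SRV_SCHEMES:
        return None
    keywords = {"srvhost": parts.hostname}
    if parts.username:
        keywords["user"] = parts.username
    dbname = parts.path.lstrip("/")
    if dbname:
        keywords["dbname"] = dbname
    return keywords
```

Each driver would need its own equivalent of this mapping, in its own language and on top of its own URL parser.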

Option 3: Alternative scheme (pgsrv://)

Avoids the + character entirely but invents a new namespace with no clear advantage over Option 2.

The patch author's preference shifted to Option 1 (keyword-only) after feedback from Jack Christensen (pgx maintainer), reasoning that every driver in the ecosystem would need to replicate URI scheme parsing, and the srvhost= parameter alone is unambiguous.

Jacob Champion (EDB/committer) provided important architectural perspective.

Notably, neither postgresql nor mongodb+srv is in the IANA scheme registry, and mongodb+srv was never registered despite wide adoption. The redis and rediss schemes are registered as provisional.

Implementation Issues Identified

Zsolt Parragi (Percona) performed initial code review and identified several concrete issues:

  1. Memory leak: srvhost is not freed in freePGconn(), which would leak the duplicated string for every connection object.

  2. Mutual exclusivity incomplete: The validation checks srvhost against host and hostaddr but does not address port. If a user specifies both srvhost and port, the behavior is undefined — should the explicit port override SRV-provided ports? Should it be rejected? This needs explicit semantics.

  3. RFC 2782 compliance deficiency: The implementation sorts results deterministically by weight, but RFC 2782 specifies a weighted random selection algorithm. Records with the same priority should be selected randomly with probability proportional to their weight. Additionally, weight=0 entries need special handling: when records with nonzero weights exist at the same priority, a weight=0 entry should have only a very small chance of being selected. The RFC provides a detailed selection algorithm that the patch does not implement.

The RFC 2782 compliance issue is architecturally significant because deterministic sorting defeats the purpose of weighted load balancing. If two standbys at priority 20 have weights 50/50, the current implementation would always try them in the same order rather than distributing connections randomly between them.
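The selection procedure described in RFC 2782 can be sketched as follows. This is an illustrative reading of the RFC, not code from the patch. The tuple layout (priority, weight, port, target) and the `rfc2782_order` name are assumptions made for the example.

```python
import random


def rfc2782_order(records, rng=random):
    """Order (priority, weight, port, target) tuples using RFC 2782 weighted selection."""
    ordered = []
    for prio in sorted({r[0] for r in records}):
        # Weight-0 records go first, so they are picked only when the draw is 0.
        pool = sorted((r for r in records if r[0] == prio), key=lambda r: r[1] != 0)
        while pool:
            total = sum(r[1] for r in pool)
            draw = rng.randint(0, total)  # inclusive on both ends, as the RFC states
            running = 0
            for i, rec in enumerate(pool):
                running += rec[1]
                if running >= draw:
                    ordered.append(pool.pop(i))
                    break
    return ordered
```

Called repeatedly on two standbys at priority 20 with weights 50/50, this returns them in varying order, with each standby coming first in roughly half of the calls. Lower priority values always come first regardless of the draw.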

Cross-Ecosystem Implications

This proposal is notable for its cross-ecosystem scope. The author is simultaneously working on equivalent patches for other drivers in the PostgreSQL ecosystem.

Achieving consistency across drivers is important: if libpq supports postgresql+srv:// but pgx cannot easily parse it, the ecosystem fragments. This practical consideration strongly favors the keyword-only approach, since all drivers already support keyword=value parameters.

Design Tradeoffs Summary

Decision | Tradeoff
srvhost= keyword only vs. URI scheme | Simplicity and cross-driver compatibility vs. user convenience
Blocking res_query() | Parity with existing getaddrinfo() behavior vs. future async goals
Mutual exclusivity with host/hostaddr | Clean semantics vs. potential use cases for SRV + explicit fallback
Deterministic vs. random weight selection | Simpler implementation vs. RFC 2782 compliance and proper load distribution
libresolv dependency | Existing dependency chain (Kerberos/LDAP) vs. environments without libresolv