Set Notice Receiver Before libpq Connection Startup
Core Problem
PostgreSQL's libpqsrv infrastructure (used by extensions like dblink and postgres_fdw to make outbound libpq connections from within a backend) has a timing bug in how it installs the notice receiver callback. The libpqsrv_notice_receiver — which routes remote NOTICE/WARNING messages through ereport() for proper logging — is only installed after libpqsrv_connect() completes. This means any notices generated during the connection establishment phase (e.g., from login event triggers on the remote server) bypass the structured logging path and are instead dumped raw to stderr.
This is a regression in the user experience of the recently committed feature "Log remote NOTICE, WARNING, and similar messages using ereport()" — that feature's contract is that all remote notices should appear as proper PostgreSQL log messages, but startup-time notices violate this contract.
Why This Matters Architecturally
The libpqsrv layer exists as a shared abstraction for all extensions that open libpq connections from within a backend. It manages:
- External file descriptor accounting (
AcquireExternalFD/ReleaseExternalFD) - Connection lifecycle (async connect with cancellation support)
- Notice/warning routing to the server log
If the notice receiver isn't active during connection startup, there's a window where messages can leak to stderr in an unstructured format, breaking log parsing, monitoring, and the principle of least surprise for DBAs. Login event triggers are the most obvious reproduction case, but any server-side logic that runs during authentication/session setup could trigger this.
Proposed Solutions
V1: Dedicated Wrapper Functions (Rejected)
The initial patch added libpqsrv_connect_with_notice_receiver() and libpqsrv_connect_params_with_notice_receiver() — new entry points that install the notice receiver before calling the internal connection logic.
Drawback: This approach is narrow and ad-hoc. It only solves the notice receiver case and would require yet another wrapper if other per-connection setup needed to happen between PQconnectStart() and connection completion.
V2/V3: Two-Phase Connection API (Accepted Direction)
Fujii proposed splitting libpqsrv_connect() into two explicit phases:
-
libpqsrv_connect_start()/libpqsrv_connect_params_start()— callslibpqsrv_connect_prepare()(which doesAcquireExternalFD()) and thenPQconnectStart()/PQconnectStartParams(). Returns thePGconn*in a started-but-not-completed state. -
libpqsrv_connect_complete()— renamed from the formerlibpqsrv_connect_internal(), this drives the async connection to completion (polling withWaitLatchOrSocket, handling interrupts, etc.).
Between these two calls, the caller can invoke PQsetNoticeReceiver() or any other per-connection configuration. This is a fundamentally more extensible design.
Key Implementation Details (V2/V3)
-
Resource cleanup contract:
libpqsrv_connect_complete()must be called even ifPQconnectStart()returns NULL, becauselibpqsrv_connect_prepare()already calledAcquireExternalFD()and the FD slot must be released. This is documented in the header comment. -
postgres_fdw integration: The
connection.cchange introduces astart_connvariable to keep the existingPG_TRY/PG_CATCHerror handling correct —connis only assigned afterlibpqsrv_connect_complete()succeeds, preventing double-free or use-after-free in the catch path. -
Style cleanup (V3): Removed unnecessary local
connvariables in the_startfunctions, directly returning thePQconnectStart*()result per reviewer feedback.
Design Tradeoffs
| Aspect | Wrapper Approach (V1) | Two-Phase Approach (V2/V3) |
|---|---|---|
| API surface | Larger (new functions per variant) | Smaller (composable primitives) |
| Future extensibility | Poor (new wrapper per feature) | Good (open hook point) |
| Caller complexity | Simple (one call) | Slightly more (two calls + setup) |
| Backward compat | Existing callers unchanged | Existing callers unchanged (old functions still exist) |
The two-phase approach was unanimously preferred because it follows the libpq design philosophy (cf. PQconnectStart/PQconnectPoll being the async API) and provides a general extension point rather than a special-case solution.
Remaining Considerations
- The existing
libpqsrv_connect()convenience functions presumably remain as wrappers for callers that don't need the hook point, preserving backward compatibility. - The
ReleaseExternalFD()contract whenconnis NULL requires careful documentation — callers who use the two-phase API must understand they cannot skip the complete phase.