Analysis: Improving Visibility of Temporary Table Usage via pg_stat_database
The Core Problem
PostgreSQL's cumulative statistics system exposes rich observability for regular relations and for transient spill files (work_mem overflow sorts, hash spills) but has a blind spot in between: session-local temporary tables created with CREATE TEMP TABLE. The proposal from Mohamed Ali (a new/occasional contributor, based on the gmail address and lack of prior committer history) targets this gap.
The asymmetry is architecturally real:
pg_stat_database.temp_files/temp_bytescount BufFile-backed spill files produced by executor nodes (tuplesort, tuplestore, hashjoin batches). These are tracked viapgstat_report_tempfile()when the files are closed.pg_stat_statementsrecordslocal_blks_{hit,read,dirtied,written}per normalized query, sourced from theBufferUsageaccumulators that the executor updates whenever a local buffer (i.e.temp_buffers-backed) is touched. Local buffers are the dedicated buffer pool used forRELPERSISTENCE_TEMPrelations — they bypass shared_buffers entirely and are managed insrc/backend/storage/buffer/localbuf.c.- Between these two mechanisms, there is no database-scoped counter for temp-table activity. A DBA looking at
pg_stat_databasecannot answer: "Is this database hammering its local buffer pool? Istemp_bufferssized correctly? How many temp tables are being created per unit time?"
Why This Matters Architecturally
Temporary tables are a distinct storage citizen. Because they are session-private and exempt from WAL (RelationNeedsWAL() returns false for RELPERSISTENCE_TEMP), they use a separate buffer manager (local buffers), separate smgr behavior for file naming (t<backendid>_<relfilenode>), and separate cleanup at backend exit. When a workload overflows temp_buffers (default 8MB), local buffers are evicted to backend-private temp files, creating I/O that is invisible to shared_buffers hit-ratio analysis.
The example output in the proposal demonstrates precisely this effect: with temp_buffers = 800kB, the local-buffer hit ratio collapses to 24%, and local_blks_written spikes to 3216 — diagnostic information that today requires pg_stat_statements aggregation or EXPLAIN (BUFFERS) on individual queries.
There is also a secondary pathology the author highlights: using pg_class scans as a workaround is self-defeating because heavy temp-table workloads bloat pg_class itself (each CREATE TEMP TABLE inserts a pg_class row; even with ON COMMIT DROP this churns the catalog). The performance-degradation feedback loop this creates is a legitimate motivation.
The Proposed Patch
Five new columns are added to pg_stat_database:
temp_tables— a counter incremented presumably in the CREATE TABLE path whenrelpersistence == RELPERSISTENCE_TEMP. 2–5.local_blks_{hit,read,dirtied,written}— database-level aggregation of the same counterspg_stat_statementsalready collects per-query.
Mechanically, the local_blks_* additions are straightforward: the PgStat_StatDBEntry struct gains four fields, and pgstat_report_stat() flushes them from pgStatLocalBlocks-style accumulators (the same source BufferUsage uses). This is a near-mirror of the existing blks_read/blks_hit plumbing for shared buffers in pgstat_database.c.
The temp_tables counter is more interesting from a design standpoint — it raises questions the initial post does not address:
- Counting semantics: Is
CREATE TEMP TABLE foo ON COMMIT DROPin a loop counted as N tables or 1? The natural implementation (increment onheap_create_with_catalogfor temp relations) gives N, which matches the stated motivation. - What about
CREATE TEMP TABLE ... AS,SELECT INTO TEMP, and implicit temp relations created forWITHmaterialization or cursors? The patch presumably hooks a low enough level to catch all, but this should be verified. - Partitioned temp tables: should each partition count?
- Why not also
temp_tables_droppedor a gauge of currently-live temp tables? The proposed counter is monotonic-create-only, which means you cannot distinguish a workload that creates-and-drops 1000 temp tables per second from one that creates 1000 and keeps them.
Likely Review Pushback
Based on typical pgsql-hackers review patterns for statistics additions, several concerns will almost certainly be raised:
-
Catalog bloat of pg_stat_database itself. Adding five columns to a universally-queried system view has a non-trivial cost; reviewers (Andres Freund, Michael Paquier historically gatekeep this area) tend to push for consolidation or for placing new counters in a more specialized view such as a new
pg_stat_local_buffersor extendingpg_stat_io(introduced in PG16 by Melanie Plageman), which already has the right dimensional model (object/context/io_op) to absorb local buffer activity. In fact,pg_stat_ioalready tracksio_context = 'normal'vs'vacuum'vs'bulkread'and was specifically designed to be the landing place for granular I/O accounting — much of thelocal_blks_*proposal duplicates what pg_stat_io could expose with acontext = 'local'or via the existing BackendType dimension. -
Redundancy with pg_stat_statements. Committers will ask why database-level aggregation cannot be done with
SELECT sum(local_blks_read) FROM pg_stat_statementsjoined againstpg_database. The counter-argument — that pg_stat_statements is an extension, has a finitepg_stat_statements.max, and normalizes away DDL — is defensible but must be made explicitly. -
The
temp_tablescounter specifically. DDL-count statistics are not currently tracked in pg_stat_database for any object type (notables_created,indexes_created, etc.). Adding one for temp tables sets a precedent and invites "why not also permanent tables / indexes / etc.?" A likely counter-proposal is to expose this through the existingpg_stat_xact_*or a new view, or to add it topg_stat_ioas a separateopdimension. -
Shared statistics infrastructure. Since PG15 (Andres Freund's shared-memory stats rewrite), adding fields to
PgStat_StatDBEntryrequires careful attention to the pgstats kind registration, fixed vs variable kinds, and the serialization format for stats file persistence across restarts. The post does not mention whether the patch handles this.
Strength of the Motivation
The observability gap is real and the diagnostic query shown is genuinely useful. The hit_ratio computation for local buffers is exactly the kind of tuning signal DBAs need and cannot easily get today without per-query EXPLAIN or pg_stat_statements post-processing. The 800kB vs 10MB temp_buffers A/B comparison is a compelling concrete demonstration.
However, the shape of the solution is likely to evolve substantially in review. The strongest path forward is probably:
- Route
local_blks_*throughpg_stat_io(where it arguably belongs architecturally). - Either drop
temp_tablesor justify it as part of a broader DDL-counting proposal. - Provide tap tests (the author supplies a SQL script, but
src/test/regressorsrc/test/modulesTAP coverage is the project standard).
Thread State
This is a v1 posting with no responses yet in the provided corpus. The technical merit of exposing local buffer activity at database granularity is defensible; the specific API surface will almost certainly be renegotiated before any commit.