Truncate logs by max_log_size

First seen: 2024-09-26 18:30:08+00:00 · Messages: 39 · Participants: 10

Latest Update

2026-05-18 · claude-opus-4-6

Technical Analysis: Truncate Logged Statements by Maximum Length

Core Problem

PostgreSQL's logging infrastructure has no built-in mechanism to limit the size of individual logged statements. When queries contain very large payloads (multi-megabyte BLOBs encoded as literals, massive IN-lists, or enormous COPY data), the resulting log entries can consume disproportionate disk space and degrade log analysis tooling. This is more than a disk management issue: it is a log abuse vector, because a single malformed or intentionally oversized query can flood the logs and bury meaningful diagnostic information.

The existing mitigations are indirect:

Proposed Solution

The patch introduces a new GUC parameter log_statement_max_length (originally named max_log_size, renamed after review feedback) that truncates statements written to the server log at a configurable byte limit. The key design decisions evolved over ~20 months of development:

GUC Design

Implementation Architecture

The core implementation adds a truncate_query_log() function in elog.c that:

  1. Checks if log_statement_max_length >= 0 and the query exceeds that length
  2. Uses pg_mbcliplen() to ensure truncation respects multi-byte character boundaries (important for UTF-8 correctness)
  3. Returns a palloc'd truncated copy, or NULL if no truncation is needed
  4. Leaves the caller responsible for pfree'ing the returned copy
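
The four steps above can be sketched outside the backend as follows. This is a minimal illustration, not the patch's code: malloc/free stand in for palloc/pfree, and utf8_cliplen() is a hypothetical, UTF-8-only substitute for pg_mbcliplen(), which in PostgreSQL handles every server encoding. The variable log_statement_max_length is modeled as a plain int, where a negative value is assumed to disable truncation.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for the GUC; a negative value is assumed to disable truncation. */
static int log_statement_max_length = -1;

/*
 * Hypothetical UTF-8-only substitute for pg_mbcliplen(): return the largest
 * length <= limit that does not end inside a multi-byte sequence.
 * Assumes s is valid UTF-8.
 */
static size_t
utf8_cliplen(const char *s, size_t len, size_t limit)
{
    size_t clip;

    if (len <= limit)
        return len;

    clip = limit;
    /* s[clip] is the first excluded byte; back up while it is 10xxxxxx. */
    while (clip > 0 && ((unsigned char) s[clip] & 0xC0) == 0x80)
        clip--;

    assert(clip <= limit);
    return clip;
}

/*
 * Return a newly allocated truncated copy of query, or NULL when no
 * truncation is needed. The caller frees the result.
 */
static char *
truncate_query_log(const char *query)
{
    size_t len;
    size_t clip;
    char  *copy;

    if (query == NULL || log_statement_max_length < 0)
        return NULL;

    len = strlen(query);
    if (len <= (size_t) log_statement_max_length)
        return NULL;

    clip = utf8_cliplen(query, len, (size_t) log_statement_max_length);

    copy = malloc(clip + 1);
    if (copy == NULL)
        return NULL;
    memcpy(copy, query, clip);
    copy[clip] = '\0';
    return copy;
}
```

With the limit set to 2, the 4-byte input "a", 0xC3, 0xA9, "b" (that is, "aéb") is clipped to "a" rather than to "a" plus a dangling lead byte, because the cut would otherwise land inside the two-byte "é".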

Truncation is applied in four code paths:

Critical Fix: Empty STATEMENT Bug

Fujii identified an important bug: when log_statement logged a truncated query and the same query then raised an error, the STATEMENT line in the error report was empty. This occurred because the original patch modified debug_query_string globally. The fix separates truncation from the error-reporting path: truncation applies only to proactive logging (log_statement, log_min_duration_statement), while error STATEMENT lines remain untruncated.
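
The separation can be illustrated with a small standalone sketch. It does not reproduce the patch or the original failure; it only shows the shape of the fix, in which the logging path clips its own output while the shared query string is left untouched for the error path. The two format_* helpers are hypothetical names, debug_query_string is a plain stand-in for the backend global, and the clipping here is byte-based for brevity (the patch clips on character boundaries).

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Stand-in for the backend global; never modified by the logging path. */
static const char *debug_query_string = NULL;

/* Stand-in for the GUC; the value 8 is arbitrary. */
static int log_statement_max_length = 8;

/* Proactive logging: clip only the text written to this log line. */
static void
format_log_statement(char *buf, size_t bufsize)
{
    int len;

    assert(debug_query_string != NULL);
    len = (int) strlen(debug_query_string);
    if (log_statement_max_length >= 0 && len > log_statement_max_length)
        len = log_statement_max_length;

    /* %.*s prints at most len bytes, so no temporary copy is needed. */
    snprintf(buf, bufsize, "LOG:  statement: %.*s", len, debug_query_string);
}

/* Error reporting: always uses the full, unmodified query text. */
static void
format_error_statement(char *buf, size_t bufsize)
{
    assert(debug_query_string != NULL);
    snprintf(buf, bufsize, "STATEMENT:  %s", debug_query_string);
}
```

Because neither helper writes through debug_query_string, logging a truncated statement cannot affect what a later error report prints.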

Performance Consideration

Review identified a key optimization. Early patch versions called truncate_query_log() unconditionally in exec_simple_query(), performing strlen() on every query regardless of whether logging would occur. The final version calls truncation lazily, only inside the check_log_statement() true-branch or check_log_duration() case 2, which avoids the overhead in the common case where queries are not logged.
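
The difference between the eager and lazy call sites can be sketched as below. This is an assumption-laden illustration rather than backend code: log_all_statements is a hypothetical flag standing in for the check_log_statement() result, strlen_calls is instrumentation added only to make the cost visible, and the simplified truncate_query_log() clips on bytes and uses malloc instead of palloc.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

static int  log_statement_max_length = 16;  /* arbitrary value for the sketch */
static bool log_all_statements = false;     /* stand-in for check_log_statement() */
static int  strlen_calls = 0;               /* instrumentation only */

/* Simplified byte-based truncation; returns NULL when nothing is clipped. */
static char *
truncate_query_log(const char *query)
{
    size_t len;
    char  *copy;

    assert(query != NULL);
    strlen_calls++;
    len = strlen(query);
    if (log_statement_max_length < 0 || len <= (size_t) log_statement_max_length)
        return NULL;

    copy = malloc((size_t) log_statement_max_length + 1);
    if (copy == NULL)
        return NULL;
    memcpy(copy, query, (size_t) log_statement_max_length);
    copy[log_statement_max_length] = '\0';
    return copy;
}

/* Early shape: pays for strlen() even when nothing will be logged. */
static void
exec_query_eager(const char *query)
{
    char *truncated = truncate_query_log(query);

    if (log_all_statements)
        fprintf(stderr, "LOG:  statement: %s\n", truncated ? truncated : query);
    free(truncated);            /* free(NULL) is a no-op */
}

/* Final shape: truncation happens only inside the logging branch. */
static void
exec_query_lazy(const char *query)
{
    if (log_all_statements)
    {
        char *truncated = truncate_query_log(query);

        fprintf(stderr, "LOG:  statement: %s\n", truncated ? truncated : query);
        free(truncated);
    }
}
```

With logging disabled, exec_query_eager() still increments strlen_calls once per query, while exec_query_lazy() leaves it at zero.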

Key Technical Tradeoffs

  1. Truncation on error statements: The patch deliberately does NOT truncate statements in error STATEMENT lines. Maxym Kharchenko asked whether a companion log_statement_max_length_on_error (analogous to log_parameter_max_length_on_error) is needed. The consensus is that this belongs in a follow-up patch, since error statements serve debugging purposes where full context is valuable.

  2. Multi-byte safety: The patch uses pg_mbcliplen() rather than raw byte truncation, ensuring truncated output never splits a multi-byte character. The size limit is expressed in bytes, not characters; this was a deliberate choice and is stated in the documentation.

  3. Replication commands: Fujii noted that log_replication_commands also logs statements but the patch doesn't cover this path. This remains unaddressed.

  4. CONTEXT line queries: Queries logged in CONTEXT (e.g., from PL/pgSQL DO blocks) are not truncated. This is noted but not addressed.

  5. Security model: The feature is positioned as log-size management, not security/privacy control (unlike log_parameter_max_length which protects sensitive parameter values).

Relationship to CVE-2026-2006

A brief security review was performed regarding CVE-2026-2006 (details not elaborated in thread). The conclusion was that the patch's use of pg_mbcliplen() is correct and not vulnerable to the encoding-related issues that CVE addressed.

Current Status

The patch reached "LGTM" status from Fujii Masao (a committer) as of v11, with the intent to commit once the PostgreSQL 20 development cycle opens. It has been through extensive iterative review addressing compilation warnings, C90 compliance, documentation clarity, test placement (moved from pg_ctl tests to src/test/modules/test_misc/), and performance optimization.