CREATE INDEX CONCURRENTLY on partitioned index

First seen: 2020-10-31 06:31:17+00:00 · Messages: 41 · Participants: 9

Latest Update

2026-06-01 · claude-opus-4-6

CREATE INDEX CONCURRENTLY on Partitioned Tables: Deep Technical Analysis

The Core Problem

PostgreSQL's CREATE INDEX CONCURRENTLY (CIC) has never supported partitioned tables. Issuing CIC on a partitioned table fails with an error, so users must build the index by hand: create the parent index with CREATE INDEX ... ON ONLY (which leaves it invalid), run CIC on each partition individually, and attach each resulting index to the parent with ALTER INDEX ... ATTACH PARTITION. For tables with hundreds or thousands of partitions, this is operationally painful and error-prone.
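The manual workaround can be scripted. The following is a minimal sketch, not part of the patch: it only generates the SQL text for a single-level hierarchy. The function name, the child index naming scheme, and the table names in the usage note are hypothetical, and identifiers are assumed to need no quoting.

```python
def manual_cic_statements(parent, partitions, column, index_name):
    """Return the SQL for the manual workaround, in execution order."""
    stmts = [
        # ON ONLY creates the parent index without recursing into the
        # partitions; it stays invalid until every partition has an
        # attached index.
        f"CREATE INDEX {index_name} ON ONLY {parent} ({column});"
    ]
    for part in partitions:
        # Hypothetical naming scheme for the per-partition index.
        child_index = f"{part}_{column}_idx"
        stmts.append(
            f"CREATE INDEX CONCURRENTLY {child_index} ON {part} ({column});"
        )
        stmts.append(
            f"ALTER INDEX {index_name} ATTACH PARTITION {child_index};"
        )
    return stmts
```

Calling manual_cic_statements("measurement", ["m_2020", "m_2021"], "logdate", "measurement_logdate_idx") yields five statements. Each CREATE INDEX CONCURRENTLY has to run outside a transaction block, so the output cannot simply be wrapped in BEGIN ... COMMIT.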

The fundamental challenge is architectural: CIC's multi-phase protocol (create catalog entry → build index → wait for old transactions → validate → mark valid) was designed for single relations. Extending it to partitioned tables requires orchestrating this protocol across an entire partition hierarchy while maintaining the non-blocking property that makes CIC valuable.

Architectural Approach and Evolution

Phase 1: Reindex-Based Strategy (2020-2022)

The original approach by Justin Pryzby leveraged existing REINDEX CONCURRENTLY infrastructure:

  1. Create catalog entries for all partitions with indisvalid=false
  2. Reindex them concurrently using ReindexRelationConcurrently(), which already handles the multi-phase CIC protocol

This was elegant in code reuse but had significant drawbacks, among them the _ccnew index-naming problem inherited from the REINDEX CONCURRENTLY machinery.

Phase 2: Native CIC Implementation (December 2022 onwards)

Pryzby refactored to extract DefineIndexConcurrentInternal() — the concurrent portion of DefineIndex() — into a separate function. The new approach:

  1. Create catalog entries for all partition indexes (non-concurrently, within a single transaction)
  2. Loop over each leaf partition, calling DefineIndexConcurrentInternal() to build each index concurrently
  3. Mark intermediate partitioned indexes as valid once all their children succeed
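The three steps above can be modelled in a few lines. This is a toy sketch under stated assumptions, not the patch's C code: IndexEntry and create_index_concurrently are hypothetical names, the hierarchy has a single level, and build_one is a stand-in for the per-partition concurrent build that returns True or False instead of raising an error.

```python
from dataclasses import dataclass


@dataclass
class IndexEntry:
    relation: str         # table or partition the index belongs to
    is_leaf: bool         # False for the partitioned (parent) index
    valid: bool = False   # mirrors pg_index.indisvalid


def create_index_concurrently(parent, leaves, build_one):
    # Step 1: catalog entries for every index, all initially invalid.
    entries = [IndexEntry(parent, is_leaf=False)]
    entries += [IndexEntry(leaf, is_leaf=True) for leaf in leaves]

    # Step 2: build each leaf index in turn.
    for entry in entries:
        if entry.is_leaf:
            entry.valid = build_one(entry.relation)

    # Step 3: the partitioned index becomes valid only if every leaf did.
    leaf_entries = [e for e in entries if e.is_leaf]
    entries[0].valid = all(e.valid for e in leaf_entries)
    return entries
```

If build_one fails for one partition, that leaf and the parent both stay invalid while the leaves built successfully are usable.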

This eliminated the _ccnew naming problem and produced a smaller, more maintainable patch.

Key Technical Challenges

Locking Protocol

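CIC's defining characteristic is that it takes ShareUpdateExclusiveLock on the table instead of the ShareLock used by a plain CREATE INDEX, so concurrent INSERT, UPDATE, and DELETE are not blocked. Extending this to partitioned tables raised several locking questions.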
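The difference between the two lock modes can be checked against PostgreSQL's documented table-level lock conflict matrix. The sketch below encodes only the three rows relevant here; the conflicts helper is a hypothetical name introduced for illustration.

```python
# Subset of PostgreSQL's table-level lock conflict matrix: each key
# maps a held lock mode to the requested modes it conflicts with.
CONFLICTS = {
    # Taken by INSERT, UPDATE, and DELETE.
    "RowExclusiveLock": {
        "ShareLock", "ShareRowExclusiveLock",
        "ExclusiveLock", "AccessExclusiveLock",
    },
    # Taken by CREATE INDEX CONCURRENTLY.
    "ShareUpdateExclusiveLock": {
        "ShareUpdateExclusiveLock", "ShareLock", "ShareRowExclusiveLock",
        "ExclusiveLock", "AccessExclusiveLock",
    },
    # Taken by plain CREATE INDEX.
    "ShareLock": {
        "RowExclusiveLock", "ShareUpdateExclusiveLock",
        "ShareRowExclusiveLock", "ExclusiveLock", "AccessExclusiveLock",
    },
}


def conflicts(held, requested):
    """True if a session requesting `requested` must wait for `held`."""
    return requested in CONFLICTS[held]
```

Here conflicts("ShareLock", "RowExclusiveLock") is true while conflicts("ShareUpdateExclusiveLock", "RowExclusiveLock") is false, which is why CIC lets writes proceed. ShareUpdateExclusiveLock does conflict with itself, so two concurrent builds on the same table still serialize.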

Handling Concurrent Partition Changes

A critical correctness issue: what happens if a partition is dropped or detached while CIC is running? The evolution of solutions:

  1. Early versions: No protection — resulted in "cache lookup failed for index" errors
  2. Lock-all approach: Lock all partitions upfront — caused long lock times for later partitions
  3. Skip-if-dropped approach (final): Try to lock each partition when processing it; if it's been dropped, skip it gracefully. This mirrors REINDEX CONCURRENTLY's strategy.
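The skip-if-dropped strategy amounts to locking lazily and tolerating a failed lookup. A minimal sketch follows, assuming hypothetical callbacks: try_lock returns False when the partition no longer exists, and build_one performs the per-partition build. Neither name comes from the patch.

```python
def build_partition_indexes(partition_oids, try_lock, build_one):
    """Build indexes partition by partition, skipping dropped ones."""
    built, skipped = [], []
    for oid in partition_oids:
        # Lock each partition only when its turn comes, not upfront,
        # so later partitions are not held locked during earlier builds.
        if not try_lock(oid):
            skipped.append(oid)  # dropped or detached in the meantime
            continue
        build_one(oid)
        built.append(oid)
    return built, skipped
```

With partitions [1, 2, 3] and partition 2 dropped mid-run, the function builds 1 and 3 and reports 2 as skipped instead of failing with a cache lookup error.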

Intermediate Partitioned Indexes

A subtle bug: in a multi-level partition hierarchy (e.g., partitioned table → sub-partitioned table → leaf), after successfully building all leaf indexes, the intermediate partitioned table indexes must also be marked as valid. This required tracking which OIDs are partitioned indexes vs. leaf indexes.
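For a multi-level hierarchy, validity has to propagate bottom-up. The sketch below is one way to model that, not the patch's implementation: mark_valid_bottom_up is a hypothetical name, children maps each partitioned index to its child indexes, and any index absent from children is treated as a leaf.

```python
def mark_valid_bottom_up(root, children, built_leaves):
    """Return the set of indexes that end up valid."""
    valid = set(built_leaves)

    def visit(idx):
        if idx not in children:  # leaf index: valid only if it was built
            return idx in valid
        # Visit every child (no short-circuit) so sibling subtrees are
        # still marked valid even when one child is not.
        results = [visit(child) for child in children[idx]]
        if all(results):
            valid.add(idx)       # partitioned index: all children valid
        return idx in valid

    visit(root)
    return valid
```

In a root → mid → leaf hierarchy where one leaf directly under the root fails, the intermediate index still becomes valid because its own leaves succeeded, while the root stays invalid.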

Progress Reporting

pg_stat_progress_create_index was designed for single-relation operations, which made reporting progress across an entire partition hierarchy a challenge of its own.

Snapshot Management Bug (2025)

An assertion failure occurred on partitioned tables without leaf partitions — CIC would attempt to pop an active snapshot that was never pushed. This paralleled a fix in REINDEX (c426f7c2b36a) for event triggers and required guarding the snapshot pop with a check.
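The shape of this bug and its guard can be shown with a toy snapshot stack. This is only a model: SnapshotStack and run are hypothetical names, and the assumption that a snapshot is pushed solely while a leaf is being processed is a simplification of the real control flow. In the backend, the corresponding check and pop are ActiveSnapshotSet() and PopActiveSnapshot().

```python
class SnapshotStack:
    """Toy stand-in for the backend's active-snapshot stack."""

    def __init__(self):
        self._stack = []

    def push(self, snapshot):
        self._stack.append(snapshot)

    def is_set(self):
        return bool(self._stack)

    def pop(self):
        # Models the assertion that fires when nothing was pushed.
        if not self._stack:
            raise RuntimeError("no active snapshot to pop")
        return self._stack.pop()


def run(leaves, guarded=True):
    stack = SnapshotStack()
    for _leaf in leaves:
        if not stack.is_set():
            stack.push("snapshot")  # pushed only if a leaf is processed
        # ... the per-leaf build would happen here ...
    # The fix: pop only when a snapshot is actually active.
    if not guarded or stack.is_set():
        stack.pop()
    return stack.is_set()
```

With guarded=False and an empty leaf list, run raises, reproducing the failure mode; with the guard in place the same call returns cleanly.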

childStmt Concurrent Property Loss (2026)

The most recent fix addresses a regression where childStmt (the IndexStmt passed to child partition index creation) lost its concurrent property during processing, causing the index to be built non-concurrently despite the user's request.

Failure Semantics

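The patch preserves CIC's existing failure semantics. As with CIC on a plain table, a command that fails partway can leave invalid indexes (indisvalid=false) behind, which must then be dropped or rebuilt.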

Code Organization

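The final patch is structured around DefineIndexConcurrentInternal(), the concurrent portion factored out of DefineIndex(), which is invoked once per leaf partition.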

Current Status

As of the latest message (May 2026), the patch is being maintained by Alexander Pyhalov, with the most recent fix addressing the childStmt concurrent property loss. The patch has never attracted sustained committer attention despite being functionally complete for several years, which has been a recurring frustration expressed by the authors.