Heads Up: cirrus-ci is shutting down June 1st

First seen: 2026-04-09 20:55:14+00:00 · Messages: 18 · Participants: 14

Latest Update

2026-05-20 · claude-opus-4-6

Technical Analysis: Cirrus CI Shutdown and PostgreSQL CI Infrastructure Migration

The Core Problem

Cirrus CI, the continuous integration platform that PostgreSQL's community development infrastructure depends on, has announced that it will shut down on June 1, 2026. This creates an urgent infrastructure problem for two critical development workflows:

  1. cfbot — the automated system that runs CI on every patch submitted to the commitfest, providing green/red status indicators visible to reviewers and authors
  2. Personal repository CI — developers (especially committers) running CI on their own GitHub forks before pushing commits or submitting patches

The resource consumption numbers Andres provided give a sense of the scale: ~1,464 core-hours/day across all CI jobs, ~396 of which are Windows (expensive due to licensing), plus macOS on self-hosted runners. This is not a trivial workload to relocate.
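To put those figures in proportion, the arithmetic can be sketched as follows. This is only a back-of-the-envelope calculation using the two numbers quoted above; the function names are illustrative, and the "average busy cores" figure assumes the load is spread evenly over 24 hours, which real CI traffic is not.

```python
# Approximate daily figures quoted in the thread.
TOTAL_CORE_HOURS_PER_DAY = 1464
WINDOWS_CORE_HOURS_PER_DAY = 396


def windows_share(total: float, windows: float) -> float:
    """Fraction of daily core-hours spent on Windows jobs."""
    return windows / total


def average_busy_cores(core_hours_per_day: float) -> float:
    """Cores that would be busy around the clock to deliver this load."""
    return core_hours_per_day / 24


if __name__ == "__main__":
    share = windows_share(TOTAL_CORE_HOURS_PER_DAY, WINDOWS_CORE_HOURS_PER_DAY)
    print(f"Windows share: {share:.1%}")  # about 27%
    print(f"Average busy cores: {average_busy_cores(TOTAL_CORE_HOURS_PER_DAY):.0f}")  # 61
```

Because CI load is bursty, any replacement would need peak capacity well above the 61-core average.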

Why This Matters Architecturally

PostgreSQL's cross-platform correctness guarantees are central to its value proposition. The project supports Linux, macOS, Windows, FreeBSD, NetBSD, and OpenBSD, each with distinct system call semantics, compiler behaviors, and filesystem characteristics. CI that covers these platforms catches breakage that is specific to one of them.

The loss of multi-platform CI would represent a significant regression in development velocity and code quality. Features like Andres's Asynchronous I/O (AIO) work — which requires testing across io_uring, kqueue, and Windows IOCP — would be particularly impacted.

Proposed Solutions and Design Tradeoffs

Short-term: GitHub Actions Migration (Consensus Path)

The thread converges on GitHub Actions as the only viable short-term solution given the roughly two-week window. Jelte Fennema-Nio produced a working GitHub Actions workflow (with AI assistance from Claude Code) that achieves green builds across all previously supported platforms, including the BSDs via cross-platform-actions/action, which uses nested QEMU virtualization.

Key technical challenges identified:

Long-term: Self-hosted Open Source CI

Multiple participants (Peter Eisentraut, Alexander Korotkov, Thomas Munro) advocate for eventually running self-hosted open source CI to achieve "capitalism-proof" infrastructure. Specific proposals include:

  1. Woodpecker CI (David Wheeler) — Go-based, Forgejo-integrated, has local mode
  2. QEMU-based universal image infrastructure (Thomas Munro) — Publishing standardized qemu images at ci.postgresql.org/images/qemu/ that work in multiple contexts: local development, public cloud VMs, GitHub Actions, and cfbot's own infrastructure
  3. Sponsored cloud + open source CI (Alexander Korotkov) — Self-host open source CI software on cheap cloud with sponsorship

Thomas Munro's QEMU image proposal is the most architecturally comprehensive: it decouples the "what to test" (images) from "where to test" (CI platform), making the project resilient to any single provider's shutdown. The same images would serve local development, personal CI, and cfbot.
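One way to picture the decoupling: if an image is just a disk file, then every context (laptop, cloud VM, CI runner) boots it with nearly the same command, varying only resources and acceleration. The sketch below is an assumption for illustration, not anything proposed in the thread; the VmImage type, the file path, and the default sizes are hypothetical. It only builds the command line and does not start QEMU.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VmImage:
    name: str            # hypothetical image name, e.g. "freebsd"
    disk_path: str       # local path to a downloaded qcow2 image (hypothetical)
    arch: str = "x86_64"


def qemu_command(image: VmImage, cpus: int = 4, memory_mb: int = 4096,
                 use_kvm: bool = False) -> list[str]:
    """Build (but do not run) a QEMU command line for the given image."""
    cmd = [
        f"qemu-system-{image.arch}",
        "-smp", str(cpus),
        "-m", str(memory_mb),
        "-drive", f"file={image.disk_path},format=qcow2,if=virtio",
        "-snapshot",     # discard disk writes so the image stays pristine
        "-nographic",    # serial console only, suitable for CI logs
    ]
    if use_kvm:
        cmd.append("-enable-kvm")  # only where hardware virtualization is available
    return cmd


if __name__ == "__main__":
    image = VmImage(name="freebsd", disk_path="images/freebsd.qcow2")
    print(" ".join(qemu_command(image, use_kvm=True)))
```

The CI platform then only has to supply a machine that can run QEMU, which is what makes a provider shutdown less disruptive.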

Resource Optimization

Robert Haas raises an important efficiency concern: cfbot ran 14 complete CI cycles on a 6-line patch whose thread had only 4 messages. This suggests that heuristics for deciding when a re-run is worthwhile could reduce load.
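As a purely hypothetical illustration of such a heuristic (the thread does not specify one, and this is not how cfbot is known to work), a scheduler could skip a run when the patch content is byte-identical to what was last tested, and make re-runs triggered by a new base commit optional. All names below are invented for the sketch.

```python
import hashlib
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class LastRun:
    patch_digest: str   # SHA-256 of the patch set that was tested
    base_commit: str    # commit the patch set was applied to


def patch_digest(patch_bytes: bytes) -> str:
    """Stable fingerprint of a patch set's content."""
    return hashlib.sha256(patch_bytes).hexdigest()


def should_rerun(patch_bytes: bytes, base_commit: str,
                 last: Optional[LastRun],
                 rerun_on_new_base: bool = False) -> bool:
    """Decide whether a full CI cycle is worth running again."""
    if last is None:
        return True                      # never tested before
    if patch_digest(patch_bytes) != last.patch_digest:
        return True                      # the patch itself changed
    # Same patch: re-run only if asked to track base-branch movement.
    return rerun_on_new_base and base_commit != last.base_commit
```

With rerun_on_new_base left off, an unchanged patch is tested once, which trades early detection of bitrot against base-branch changes for a lower load.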

Key Design Decisions and Disagreements

Self-hosted vs. Proprietary

Position | Advocates | Argument
Proprietary (GH Actions) is fine | Jelte Fennema-Nio | Self-hosted can be abandoned too; GitHub will outlive underfunded OSS CI
Self-hosted is critical | Bruce Momjian, Peter Eisentraut | Proprietary becomes expensive/obsolete; GitHub already tried per-minute fees for self-hosted runners
Both (layered approach) | Thomas Munro | Build capitalism-proof base layer, use commercial services as convenience layer

BSD Platform Support

Position | Advocates | Argument
BSDs less important | Jelte Fennema-Nio | Signal-to-flakiness ratio too low; BSDs rarely catch issues Linux+macOS miss
BSDs important | Bilal Yavuz | OpenBSD catches unique issues; flakiness is due to building images from scratch, not inherent

Who Should Pay

The Emergency Timeline

The thread spans April 9 to May 18, 2026 — but real action only happens in the final two weeks. Jelte's May 18 message is essentially a fire alarm: "In less than two weeks we won't have a working CI anymore." The patch he produces is explicitly described as AI-generated with cursory review, reflecting the urgency. Bilal's immediate response offering to take over and merge his own parallel work suggests the community recognized the deadline pressure.

Technical Details of the GitHub Actions Implementation

From what's described, the workflow:

Bilal's alternative approach introduces helper scripts (install-deps.sh, configure.sh, build.sh, test.sh) that abstract CI-provider-specific details — a design that supports future migration to yet another platform.
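The value of that layering is that any CI provider only needs to call the same four scripts in order. A minimal sketch of such a driver is shown below. The script names come from the description above, but the directory layout, the run_pipeline function, and the injectable runner are assumptions made for illustration and are not taken from the patch.

```python
import subprocess
from pathlib import Path
from typing import Callable, Sequence

# Script names as described in the thread; their location is an assumption.
STEPS = ("install-deps.sh", "configure.sh", "build.sh", "test.sh")


def run_script(path: Path) -> int:
    """Run one helper script with sh and return its exit status."""
    return subprocess.run(["sh", str(path)]).returncode


def run_pipeline(script_dir: Path,
                 run: Callable[[Path], int] = run_script,
                 steps: Sequence[str] = STEPS) -> list[str]:
    """Run the steps in order, stopping at the first failure."""
    completed = []
    for name in steps:
        if run(script_dir / name) != 0:
            raise RuntimeError(f"{name} failed")
        completed.append(name)
    return completed
```

A provider-specific workflow file would then shrink to a thin wrapper around these steps, and the same driver could be run locally to reproduce a CI failure.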

Unresolved Issues

  1. Public log access: No solution identified; may require a separate log hosting service
  2. macOS runners: Self-hosted Mac fleet management unclear under new regime
  3. cfbot integration: Thomas Munro's domain, not addressed in patches yet
  4. Image pipeline: pg-vm-images currently targets GCP; needs QEMU/container output paths
  5. Cost model: Roughly 1,464 core-hours per day on GitHub Actions would be expensive without donated/sponsored runners