pg_dump Dependency Loop for PROPERTY GRAPH Objects
Core Problem
PostgreSQL's SQL/PGQ PROPERTY GRAPH feature (a construct defined in the SQL:2023 standard) allows defining graph views over relational vertex and edge tables. Like foreign keys, property graph element definitions require the referenced columns on each vertex/edge table to be backed by a unique constraint (either PRIMARY KEY or UNIQUE). This establishes a catalog-level dependency from the property graph to the unique constraint object.
This dependency interacts poorly with pg_dump's section model. pg_dump partitions dumped objects into three sections:
- PRE-DATA: schema objects (tables, types, functions, views)
- DATA: table contents (COPY statements)
- POST-DATA: constraints, indexes, triggers, rules — anything whose creation is expensive or order-sensitive once data is loaded
Unique constraints, together with the indexes that back them (both created via ALTER TABLE ... ADD CONSTRAINT), are emitted in POST-DATA for good reason: loading the data first and building the indexes afterwards is dramatically faster than maintaining indexes during COPY.
The collision is:
- CREATE PROPERTY GRAPH is a schema object → naturally belongs in PRE-DATA.
- It depends on the PRIMARY KEY constraint of its vertex/edge tables → which lives in POST-DATA.
- pg_dump detects a cycle involving the synthetic PRE-DATA/POST-DATA boundary objects that enforce section ordering, prints "could not resolve dependency loop", and makes an arbitrary cut.
- The arbitrary cut places CREATE PROPERTY GRAPH in PRE-DATA, before the constraint exists.
- On restore: ERROR: there is no unique constraint matching the key for element ...
- Because pg_restore tolerates individual object failures, the property graph is silently missing from the restored database: a data-fidelity bug, not merely an ergonomic one.
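The cycle described in the list above can be reproduced on a toy graph. The following is a minimal sketch, not pg_dump code: the node names, the DepGraph type, and the direct edge between the two boundary objects are simplifications introduced here for illustration. It shows that the four ordering requirements form a loop, and that dropping the single "boundary must follow the property graph" edge removes it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical node ids for a four-object toy graph (not pg_dump types). */
enum { PROPGRAPH, PK_CONSTRAINT, PRE_DATA_BOUNDARY, POST_DATA_BOUNDARY, N_OBJS };

/* deps[a][b] == true means "a must be dumped after b". */
typedef struct { bool deps[N_OBJS][N_OBJS]; } DepGraph;

/* Build the ordering requirements described in the list above. */
static void build_collision_graph(DepGraph *g)
{
    assert(g != NULL);
    for (int a = 0; a < N_OBJS; a++)
        for (int b = 0; b < N_OBJS; b++)
            g->deps[a][b] = false;

    /* PRE-DATA objects must precede the pre-data boundary. */
    g->deps[PRE_DATA_BOUNDARY][PROPGRAPH] = true;
    /* Simplification: order the two boundaries with one direct edge. */
    g->deps[POST_DATA_BOUNDARY][PRE_DATA_BOUNDARY] = true;
    /* POST-DATA objects must follow the post-data boundary. */
    g->deps[PK_CONSTRAINT][POST_DATA_BOUNDARY] = true;
    /* Catalog dependency: the property graph needs the unique constraint. */
    g->deps[PROPGRAPH][PK_CONSTRAINT] = true;
}

/* Depth-first search; state 0 = unvisited, 1 = on the stack, 2 = finished. */
static bool visit(const DepGraph *g, int node, int state[N_OBJS])
{
    state[node] = 1;
    for (int next = 0; next < N_OBJS; next++)
    {
        if (!g->deps[node][next])
            continue;
        if (state[next] == 1)
            return true;        /* back edge: a dependency loop exists */
        if (state[next] == 0 && visit(g, next, state))
            return true;
    }
    state[node] = 2;
    return false;
}

/* True if the ordering requirements cannot all be satisfied. */
static bool has_dependency_loop(const DepGraph *g)
{
    int state[N_OBJS] = {0};

    assert(g != NULL);
    for (int node = 0; node < N_OBJS; node++)
        if (state[node] == 0 && visit(g, node, state))
            return true;
    return false;
}
```

Calling build_collision_graph() and then has_dependency_loop() reports a loop; clearing deps[PRE_DATA_BOUNDARY][PROPGRAPH] makes the graph sortable again, which is the cut that a deliberate repair would choose.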
Prior Art: Materialized Views
This exact class of problem was previously solved for materialized views. A matview can reference (via its query) tables whose indexes/constraints are in POST-DATA, and a matview's REFRESH/population logically belongs after data loads anyway. The solution, implemented in pg_dump/pg_dump_sort.c, is repairMatViewBoundaryMultiLoop(): when a multi-object loop crosses the PRE-DATA boundary and includes a matview, pg_dump severs the boundary→matview edge and sets TableInfo.postponed_def = true. Downstream, dumpTableSchema() inspects this flag and emits the CREATE statement into the POST-DATA section instead.
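The matview repair can be sketched in a few lines. This is a simplified model under stated assumptions: ToyObject, remove_dependency, repair_matview_boundary_loop, and definition_section are hypothetical stand-ins for pg_dump's DumpableObject/TableInfo structures and helpers, and only the postponed_def flag and RELKIND_MATVIEW are taken from the description above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define RELKIND_MATVIEW 'm'     /* relkind letter used for materialized views */
#define MAX_DEPS 8

typedef enum { OBJ_TABLE, OBJ_PRE_DATA_BOUNDARY } ObjKind;
typedef enum { SECTION_PRE_DATA, SECTION_POST_DATA } Section;

/* Simplified stand-in for pg_dump's DumpableObject/TableInfo pair. */
typedef struct
{
    int     dumpId;
    ObjKind kind;
    int     deps[MAX_DEPS];     /* dumpIds this object must follow */
    int     nDeps;
    char    relkind;            /* meaningful only for OBJ_TABLE */
    bool    postponed_def;
} ToyObject;

/* Drop every dependency of obj on refId, compacting the array in place. */
static void remove_dependency(ToyObject *obj, int refId)
{
    int kept = 0;

    assert(obj != NULL);
    for (int i = 0; i < obj->nDeps; i++)
        if (obj->deps[i] != refId)
            obj->deps[kept++] = obj->deps[i];
    obj->nDeps = kept;
}

/* Sever boundary -> next; if next is a matview, postpone its definition. */
static void repair_matview_boundary_loop(ToyObject *boundary, ToyObject *next)
{
    assert(boundary != NULL && next != NULL);
    remove_dependency(boundary, next->dumpId);
    if (next->kind == OBJ_TABLE && next->relkind == RELKIND_MATVIEW)
        next->postponed_def = true;
}

/* Section into which the object's CREATE statement would be emitted. */
static Section definition_section(const ToyObject *tbl)
{
    assert(tbl != NULL);
    return tbl->postponed_def ? SECTION_POST_DATA : SECTION_PRE_DATA;
}
```

The point of the design is that the repair step only records a decision (the flag), while the code that emits DDL makes the section choice later from that flag.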
Proposed Fix
Satya's patch generalizes the matview machinery to cover property graphs:
- Rename repairMatViewBoundaryMultiLoop → repairPostponableBoundaryMultiLoop, reflecting its broadened scope.
- Extend the predicate that selects the "postponable" node in the cycle to recognize property graph relkinds in addition to RELKIND_MATVIEW.
- Reuse the existing TableInfo.postponed_def flag; no new plumbing in dumpTableSchema() is required, since the section-assignment logic already routes postponed definitions to POST-DATA.
- Add a regression test that creates a property graph over a table with a PRIMARY KEY, dumps, restores, and verifies the property graph survives.
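The broadened predicate might look roughly like the sketch below. It is an illustration only, not the patch: ToyTable, ToyBoundary, relkind_is_postponable, and repair_postponable_boundary_loop are hypothetical names, and the RELKIND_PROPGRAPH name and its 'g' value are assumptions made for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define RELKIND_RELATION  'r'
#define RELKIND_MATVIEW   'm'
#define RELKIND_PROPGRAPH 'g'   /* ASSUMPTION: name and value are illustrative */

/* Simplified stand-in for the table-like object that follows the boundary. */
typedef struct
{
    int  dumpId;
    bool is_table;          /* stands in for an objType == DO_TABLE check */
    char relkind;
    bool postponed_def;
} ToyTable;

/* Simplified stand-in for the pre-data boundary object. */
typedef struct
{
    int deps[8];            /* dumpIds the boundary must follow */
    int nDeps;
} ToyBoundary;

/* The only part that changes: more than one relkind may be postponed. */
static bool relkind_is_postponable(char relkind)
{
    return relkind == RELKIND_MATVIEW || relkind == RELKIND_PROPGRAPH;
}

/* Sever boundary -> next, then postpone next if its relkind allows it. */
static void repair_postponable_boundary_loop(ToyBoundary *boundary, ToyTable *next)
{
    int kept = 0;

    assert(boundary != NULL && next != NULL);
    for (int i = 0; i < boundary->nDeps; i++)
        if (boundary->deps[i] != next->dumpId)
            boundary->deps[kept++] = boundary->deps[i];
    boundary->nDeps = kept;

    if (next->is_table && relkind_is_postponable(next->relkind))
        next->postponed_def = true;
}
```

Because the flag is reused, nothing downstream of the repair step needs to know which relkind triggered the postponement.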
Why this is the right shape of fix
- Minimal surface area: it leans on an existing, battle-tested mechanism rather than inventing a new section or dependency class.
- Semantically correct: a property graph is a pure metadata object (no data of its own), so deferring its creation to POST-DATA has no performance or correctness downside; it only needs its referenced constraints to exist.
- Symmetric with matviews: both are "derived" relation-like objects whose definitions legitimately depend on POST-DATA artifacts.
Potential concerns worth probing in review
- Naming: repairPostponableBoundaryMultiLoop is accurate but slightly awkward. Reviewers may suggest alternatives, or prefer a dispatch table mapping relkinds to "postponable" status.
- Single-object loops: alongside repairMatViewBoundaryMultiLoop, the existing code has logic for matview self-loops / simple loops. The patch must ensure property graphs are handled in all loop topologies pg_dump's sort can discover, not just multi-object loops. If there is an analogous repairMatViewBoundary (simple-loop) path, it needs parallel treatment.
- Extension of relkind check: property graphs may be represented as a new relkind (e.g., RELKIND_PROPGRAPH) or as a distinct catalog entirely. The patch's correctness hinges on the dump-side classification being consistent with how dumpTableSchema() emits the DDL.
- Dependency on non-PK unique constraints: the fix should cover any UNIQUE constraint, not only PRIMARY KEY, since SQL/PGQ permits either.
- Cross-version dumps: property graphs are a very new feature; the fix only matters for server versions that support them, so a version guard in pg_dump_sort.c is likely unnecessary, but the test should be placed appropriately.
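Two of the concerns above, the dispatch table and coverage of different loop shapes, can be illustrated together. The sketch below is hypothetical throughout: LoopMember, postponable_relkinds, is_postponable, and find_edge_to_cut are names invented here, and 'g' as the property graph relkind is an assumption. It scans a detected loop of any length for a postponable table and a pre-data boundary, and returns the index of the member that follows the boundary, wrapping around at the end of the array.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum
{
    OBJ_TABLE,
    OBJ_CONSTRAINT,
    OBJ_PRE_DATA_BOUNDARY,
    OBJ_POST_DATA_BOUNDARY
} ObjKind;

/* Simplified stand-in for one object in a detected dependency loop. */
typedef struct
{
    ObjKind kind;
    char    relkind;        /* meaningful only for OBJ_TABLE */
} LoopMember;

/* Hypothetical dispatch table: relkinds whose definitions may be postponed. */
static const char postponable_relkinds[] = {
    'm',                    /* materialized view */
    'g',                    /* property graph (ASSUMED relkind letter) */
};

static bool is_postponable(const LoopMember *m)
{
    assert(m != NULL);
    if (m->kind != OBJ_TABLE)
        return false;
    for (size_t i = 0; i < sizeof(postponable_relkinds); i++)
        if (postponable_relkinds[i] == m->relkind)
            return true;
    return false;
}

/*
 * Return the index of the member that follows the pre-data boundary in the
 * loop (wrapping around), or -1 if the loop lacks either a postponable table
 * or a pre-data boundary. The caller would cut boundary -> loop[result].
 */
static int find_edge_to_cut(const LoopMember *loop, int nLoop)
{
    bool found = false;

    assert(loop != NULL || nLoop == 0);
    for (int i = 0; i < nLoop; i++)
        if (is_postponable(&loop[i]))
            found = true;
    if (!found)
        return -1;

    for (int j = 0; j < nLoop; j++)
        if (loop[j].kind == OBJ_PRE_DATA_BOUNDARY)
            return (j + 1) % nLoop;
    return -1;
}
```

The scan does not depend on where the loop happens to start, which is the property a reviewer would want to confirm for whatever loop shapes the real sort can report.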
Architectural Significance
This bug is a concrete instance of a recurring tension in pg_dump: catalog dependencies form a graph that does not align cleanly with the coarse three-section model. Every time a new object class is added that can legitimately span the PRE-DATA/POST-DATA boundary (matviews historically, now property graphs, and conceivably further SQL/PGQ-related objects or generated-column features that reference indexes), the dependency-repair machinery needs extension. The patch's rename is a tacit acknowledgement that this is now a pattern rather than a matview-specific hack, and future postponable object types should plug into the same mechanism.
Status
As of the initial post, this is a single-message submission from Satya Narlapuram with a patch and test. No reviews, committer feedback, or CF entry discussion are visible in the provided thread. The fix is small, well-motivated, and follows established precedent, so it is likely to be accepted with at most cosmetic adjustments — pending verification that all loop topologies are covered and that the property-graph relkind detection is robust.