pgfence 0.4: Trace Mode, Verified Lock Analysis Against Real Postgres
pgfence can now execute migrations against a real Postgres instance and verify lock predictions against observed behavior. It spins up a disposable Docker container and traces statements one by one.
pgfence has always told you what lock mode each DDL statement should take. Now it can compare that prediction against what Postgres actually does for traced statements.
pgfence trace spins up a disposable Postgres container, executes your migration statements one by one, queries pg_locks and the system catalog after each statement, and compares what actually happened against what pgfence’s static analysis predicted.
pgfence trace migrations/*.sql
pgfence - Trace Report (PostgreSQL 17, Docker)
migrations/add-verified.sql [HIGH]
# Statement Lock Mode Blocks Risk Verified Duration
1 ALTER TABLE users ADD COLUMN ACCESS EXCLUSIVE R + W HIGH Confirmed 2ms
email_verified BOOLEAN NOT NULL
2 CREATE INDEX idx ON users(email) SHARE W MEDIUM Confirmed 1ms
Trace-Only Findings:
! Table rewrite detected on "users" (relfilenode changed)
=== Coverage ===
Analyzed: 2 statements | Verified: 2/2 | Mismatches: 0 | Trace-only: 1
Docker: postgres:17-alpine | Container lifetime: 4.2s
Each traced statement gets a verification status:
- Confirmed: pgfence’s prediction matched real Postgres behavior
- Mismatch: pgfence was wrong, the trace shows what actually happened
- Trace-only: something real Postgres did that static analysis can’t predict (table rewrites, implicit index creation)
- Static-only: policy checks like missing
lock_timeoutthat don’t involve locks
Why this matters
Static analysis is fast but inherently limited. It works from a lookup table: “this DDL statement takes this lock mode.” That lookup table is accurate for 99% of cases, but Postgres has version-specific behavior, implicit side effects, and edge cases that no static rule set can fully cover.
Trace mode removes the “trust me” from pgfence’s output. You can verify lock predictions against the PostgreSQL version you run in production.
How it works
- pgfence runs its normal static analysis first, including the ORM extractors and safe rewrite metadata
- It pulls
postgres:{version}-alpineand starts a container on a random localhost port - Most statements execute inside a
BEGIN/COMMITblock so locks are held when we snapshotpg_locks.CONCURRENTLYstatements use the observer path described below. - After each statement, pgfence diffs the catalog:
pg_class.relfilenode(table rewrites),pg_attribute(column changes),pg_constraint(constraint validation state),pg_index(index validity) - Static predictions are matched against trace observations
- The container is deleted
The container runs on 127.0.0.1 only, uses a random password that’s never written to disk, and is cleaned up on normal exits and handled signals. pgfence never asks for database credentials.
CONCURRENTLY statements
This is one place pgfence differs from Eugene’s transaction-based trace approach.
Eugene runs everything in a transaction and rolls back. That means CREATE INDEX CONCURRENTLY fails, because Postgres rejects CONCURRENTLY inside transactions. Eugene’s solution: skip tracing entirely for CONCURRENTLY statements.
pgfence takes a different approach. Since the container is disposable, there’s no need for transactions. For CONCURRENTLY statements, pgfence opens a second connection that polls pg_locks while the main connection executes the statement. The observer records the lock modes it sees during execution, but because it polls, very short-lived spikes can still be missed.
Custom Docker images
If your migrations use extensions like PostGIS or pgvector, the base postgres:*-alpine image won’t have them. Use --docker-image to specify a custom image:
pgfence trace --docker-image postgis/postgis:17 migrations/*.sql
pgfence trace --docker-image ankane/pgvector:v0.7.0 migrations/*.sql
Version targeting
Test against the Postgres version you run in production:
pgfence trace --pg-version 14 migrations/*.sql
pgfence trace --pg-version 12 migrations/*.sql
This catches version-specific behavior: ADD COLUMN ... DEFAULT is instant on PG11+ but requires a table rewrite on PG10. ALTER TYPE ADD VALUE takes different lock modes on PG12+ vs earlier versions.
CI mode
Trace mode supports the same --ci flag as analyze. Mismatches also fail CI, because a mismatch means pgfence’s static analysis is wrong for that statement:
pgfence trace --ci --max-risk medium migrations/*.sql
Exit code 1 if: any check exceeds your risk threshold, any policy error, any mismatch, or any execution error. Execution errors are new in 0.4.0: if a statement fails during tracing, CI catches it instead of silently passing.
Security hardening
Trace mode handles Docker containers and database connections, so we took security seriously:
- The container binds to
127.0.0.1only (not0.0.0.0) - Passwords are 20-char random base64url, never written to disk, never logged
sanitizeError()strips credentials from bothpostgres://andpostgresql://URLs before any stderr output- Plugin loading rejects paths outside the project directory (prevents code execution via malicious config files)
- Container cleanup is best-effort in
finallyblocks, so a Docker failure can’t suppress the original error RESET ALLruns between migration files to prevent session state leakage
Trust contract improvements
We ran a thorough audit of trace-merge and found two critical false-negative vectors:
-
Trace-only findings (locks that static analysis missed) were hard-coded to
LOWrisk, regardless of the actual lock observed. An ACCESS EXCLUSIVE lock caught only by trace would pass--max-risk=medium. Now risk is derived from the observed lock mode. -
Mismatch findings (where trace observed a different lock than predicted) kept the static risk level. If static said SHARE but trace observed ACCESS EXCLUSIVE, CI still saw MEDIUM risk. Now risk is upgraded to
max(static, traced).
Both of these could have allowed dangerous migrations through CI. They’re fixed in the initial release.
What’s next
Trace mode is the foundation for pgfence’s correctness guarantee. In future versions, we’ll use it to:
- Automatically regression-test every static rule against every supported Postgres version
- Detect table rewrites that no static rule covers
- Validate safe rewrite recipes by executing them and verifying the result
If you find a mismatch, please open an issue. Every mismatch report makes pgfence’s static analysis more accurate for everyone.
npm install -D @flvmnt/pgfence@latest
pgfence trace migrations/*.sql