pgfence 0.4: Trace Mode, Verified Lock Analysis Against Real Postgres

pgfence can now execute migrations against a real Postgres instance and verify lock predictions against observed behavior. It spins up a disposable Docker container and traces statements one by one.

pgfence has always told you what lock mode each DDL statement should take. Now it can compare that prediction against what Postgres actually does for traced statements.

pgfence trace spins up a disposable Postgres container, executes your migration statements one by one, queries pg_locks and the system catalog after each statement, and compares what actually happened against what pgfence’s static analysis predicted.

pgfence trace migrations/*.sql
pgfence - Trace Report (PostgreSQL 17, Docker)

  migrations/add-verified.sql  [HIGH]

  #  Statement                          Lock Mode         Blocks  Risk   Verified    Duration
  1  ALTER TABLE users ADD COLUMN       ACCESS EXCLUSIVE  R + W   HIGH   Confirmed   2ms
     email_verified BOOLEAN NOT NULL
  2  CREATE INDEX idx ON users(email)   SHARE             W       MEDIUM Confirmed   1ms

  Trace-Only Findings:
  ! Table rewrite detected on "users" (relfilenode changed)

  === Coverage ===
  Analyzed: 2 statements | Verified: 2/2 | Mismatches: 0 | Trace-only: 1
  Docker: postgres:17-alpine | Container lifetime: 4.2s

Each traced statement gets a verification status:

  • Confirmed: pgfence’s prediction matched real Postgres behavior
  • Mismatch: pgfence was wrong, the trace shows what actually happened
  • Trace-only: something real Postgres did that static analysis can’t predict (table rewrites, implicit index creation)
  • Static-only: policy checks like missing lock_timeout that don’t involve locks

Why this matters

Static analysis is fast but inherently limited. It works from a lookup table: “this DDL statement takes this lock mode.” That lookup table is accurate for 99% of cases, but Postgres has version-specific behavior, implicit side effects, and edge cases that no static rule set can fully cover.

Trace mode removes the “trust me” from pgfence’s output. You can verify lock predictions against the PostgreSQL version you run in production.

How it works

  1. pgfence runs its normal static analysis first, including the ORM extractors and safe rewrite metadata
  2. It pulls postgres:{version}-alpine and starts a container on a random localhost port
  3. Most statements execute inside a BEGIN/COMMIT block so locks are held when we snapshot pg_locks. CONCURRENTLY statements use the observer path described below.
  4. After each statement, pgfence diffs the catalog: pg_class.relfilenode (table rewrites), pg_attribute (column changes), pg_constraint (constraint validation state), pg_index (index validity)
  5. Static predictions are matched against trace observations
  6. The container is deleted

The container runs on 127.0.0.1 only, uses a random password that’s never written to disk, and is cleaned up on normal exits and handled signals. pgfence never asks for database credentials.

CONCURRENTLY statements

This is one place pgfence differs from Eugene’s transaction-based trace approach.

Eugene runs everything in a transaction and rolls back. That means CREATE INDEX CONCURRENTLY fails, because Postgres rejects CONCURRENTLY inside transactions. Eugene’s solution: skip tracing entirely for CONCURRENTLY statements.

pgfence takes a different approach. Since the container is disposable, there’s no need for transactions. For CONCURRENTLY statements, pgfence opens a second connection that polls pg_locks while the main connection executes the statement. The observer records the lock modes it sees during execution, but because it polls, very short-lived spikes can still be missed.

Custom Docker images

If your migrations use extensions like PostGIS or pgvector, the base postgres:*-alpine image won’t have them. Use --docker-image to specify a custom image:

pgfence trace --docker-image postgis/postgis:17 migrations/*.sql
pgfence trace --docker-image ankane/pgvector:v0.7.0 migrations/*.sql

Version targeting

Test against the Postgres version you run in production:

pgfence trace --pg-version 14 migrations/*.sql
pgfence trace --pg-version 12 migrations/*.sql

This catches version-specific behavior: ADD COLUMN ... DEFAULT is instant on PG11+ but requires a table rewrite on PG10. ALTER TYPE ADD VALUE takes different lock modes on PG12+ vs earlier versions.

CI mode

Trace mode supports the same --ci flag as analyze. Mismatches also fail CI, because a mismatch means pgfence’s static analysis is wrong for that statement:

pgfence trace --ci --max-risk medium migrations/*.sql

Exit code 1 if: any check exceeds your risk threshold, any policy error, any mismatch, or any execution error. Execution errors are new in 0.4.0: if a statement fails during tracing, CI catches it instead of silently passing.

Security hardening

Trace mode handles Docker containers and database connections, so we took security seriously:

  • The container binds to 127.0.0.1 only (not 0.0.0.0)
  • Passwords are 20-char random base64url, never written to disk, never logged
  • sanitizeError() strips credentials from both postgres:// and postgresql:// URLs before any stderr output
  • Plugin loading rejects paths outside the project directory (prevents code execution via malicious config files)
  • Container cleanup is best-effort in finally blocks, so a Docker failure can’t suppress the original error
  • RESET ALL runs between migration files to prevent session state leakage

Trust contract improvements

We ran a thorough audit of trace-merge and found two critical false-negative vectors:

  1. Trace-only findings (locks that static analysis missed) were hard-coded to LOW risk, regardless of the actual lock observed. An ACCESS EXCLUSIVE lock caught only by trace would pass --max-risk=medium. Now risk is derived from the observed lock mode.

  2. Mismatch findings (where trace observed a different lock than predicted) kept the static risk level. If static said SHARE but trace observed ACCESS EXCLUSIVE, CI still saw MEDIUM risk. Now risk is upgraded to max(static, traced).

Both of these could have allowed dangerous migrations through CI. They’re fixed in the initial release.

What’s next

Trace mode is the foundation for pgfence’s correctness guarantee. In future versions, we’ll use it to:

  • Automatically regression-test every static rule against every supported Postgres version
  • Detect table rewrites that no static rule covers
  • Validate safe rewrite recipes by executing them and verifying the result

If you find a mismatch, please open an issue. Every mismatch report makes pgfence’s static analysis more accurate for everyone.

npm install -D @flvmnt/pgfence@latest
pgfence trace migrations/*.sql
← All posts