False Negatives: The Silent Killer of Migration Safety Tools

Your migration linter says everything is safe. It's wrong. Here's why false negatives are more dangerous than false positives, and what we do about it.

Migration safety tools have two failure modes:

  • False positive: flags something safe as dangerous. Annoying. You investigate, realize it’s fine, and move on. Maybe you add an ignore comment.
  • False negative: says something dangerous is safe. You merge. You deploy. Your API goes down.

Many migration linters optimize hard for reducing false positives; far less attention goes to the operational cost of false negatives. That’s backwards.

The asymmetry

A false positive costs you five minutes of investigation. A false negative costs you an incident.

When we built pgfence, we wrote a trust contract into the project’s design document:

pgfence’s fatal failure mode is not false positives. It is false negatives (silently passing dangerous migrations) or implied false safety (not surfacing what we couldn’t analyze).

This shapes every design decision. When we’re unsure about a pattern, we flag it. When we can’t analyze something (dynamic SQL, computed identifiers), we emit warnings and explicit coverage information instead of silently skipping it.

Where false negatives hide

We recently audited our own trace-merge logic (the code that combines static analysis with runtime verification) and found two critical false-negative vectors that would have shipped in our 0.4.0 release:

1. Hard-coded risk levels for trace-only findings

When trace mode catches a lock that static analysis missed, we create a “trace-only” finding. The original code set every trace-only finding to LOW risk:

risk: RiskLevel.LOW as RiskLevel,

If trace observed an ACCESS EXCLUSIVE lock on a user table that our static rules didn’t cover, it would be reported as LOW risk. CI with --max-risk=medium would pass. The most dangerous lock mode in Postgres, the one that blocks all reads and writes, passing your safety gate as “low risk.”

The fix: derive risk from the observed lock mode using the same severity mapping as static rules. ACCESS EXCLUSIVE maps to CRITICAL. SHARE maps to MEDIUM. The lock mode IS the risk.
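The shared severity table can be sketched as a single lookup keyed by lock mode. The names below (`LockMode`, `RiskLevel`, `LOCK_SEVERITY`) are illustrative, not pgfence’s actual API; only the ACCESS EXCLUSIVE → CRITICAL and SHARE → MEDIUM mappings come from the text above, and the other levels are plausible assumptions:

```typescript
// Illustrative sketch: one severity table shared by static rules and
// trace findings, so a trace-only finding can never default to LOW.
type LockMode =
  | 'ACCESS SHARE'
  | 'ROW SHARE'
  | 'ROW EXCLUSIVE'
  | 'SHARE UPDATE EXCLUSIVE'
  | 'SHARE'
  | 'SHARE ROW EXCLUSIVE'
  | 'EXCLUSIVE'
  | 'ACCESS EXCLUSIVE';

type RiskLevel = 'LOW' | 'MEDIUM' | 'HIGH' | 'CRITICAL';

// ACCESS EXCLUSIVE → CRITICAL and SHARE → MEDIUM are from the post;
// the remaining entries are assumed for the sketch.
const LOCK_SEVERITY: Record<LockMode, RiskLevel> = {
  'ACCESS SHARE': 'LOW',
  'ROW SHARE': 'LOW',
  'ROW EXCLUSIVE': 'LOW',
  'SHARE UPDATE EXCLUSIVE': 'MEDIUM',
  'SHARE': 'MEDIUM',
  'SHARE ROW EXCLUSIVE': 'HIGH',
  'EXCLUSIVE': 'HIGH',
  'ACCESS EXCLUSIVE': 'CRITICAL',
};

function riskFromLockMode(mode: LockMode): RiskLevel {
  return LOCK_SEVERITY[mode];
}
```

With a table like this, a trace-only finding inherits its risk from the observed lock mode instead of a hard-coded constant.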

2. Mismatch risk never upgraded

When static analysis predicts one lock mode but trace observes a different one, we report a “mismatch.” The original code kept the static risk level unchanged:

results.push({
  ...check,  // <-- includes the static risk
  verification: 'mismatch',
  tracedLockMode: strongest,
});

If static analysis said SHARE (MEDIUM risk) but trace observed ACCESS EXCLUSIVE, the risk field still said MEDIUM. CI enforcement reads the risk field. The migration would pass.

The fix: risk = max(static_risk, trace_derived_risk). Never downgrade. If trace says the lock is worse than predicted, believe the trace.
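The never-downgrade rule is just a max over an ordered risk scale. A minimal sketch, with hypothetical names:

```typescript
// Illustrative sketch of the never-downgrade rule.
type RiskLevel = 'LOW' | 'MEDIUM' | 'HIGH' | 'CRITICAL';

// Ordered from least to most severe.
const RISK_ORDER: RiskLevel[] = ['LOW', 'MEDIUM', 'HIGH', 'CRITICAL'];

// Take the worse of the static prediction and the trace-derived risk.
// Trace can only upgrade risk, never downgrade it.
function mergeRisk(staticRisk: RiskLevel, traceRisk: RiskLevel): RiskLevel {
  return RISK_ORDER.indexOf(traceRisk) > RISK_ORDER.indexOf(staticRisk)
    ? traceRisk
    : staticRisk;
}
```

In the bug above, static SHARE (MEDIUM) combined with an observed ACCESS EXCLUSIVE (CRITICAL) would now merge to CRITICAL, and CI enforcement would see the real risk.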

Why static analysis alone isn’t enough

These bugs existed because static analysis and trace mode are fundamentally different verification layers:

Static analysis maps known patterns to known lock modes. It’s fast, requires no database, and covers the common cases well. But it works from a lookup table. The lookup table can be wrong (version-specific behavior), incomplete (patterns we haven’t added yet), or inapplicable (implicit side effects that don’t appear in the DDL syntax).

Trace mode executes the actual statement against the actual Postgres version and observes the actual locks. It’s the ground truth. When trace and static disagree, trace is right.

The danger is in the merge layer. When you combine two sources of truth, the combination logic must never throw away information from the stronger source. That’s exactly what our bugs did: the merge logic deferred to static analysis even when trace had better data.

The coverage guarantee

The public pgfence reporters include coverage information so you can see what was and was not statically analyzable:

Analyzed 8 SQL statements. 2 dynamic statements not analyzable. Coverage: 75%.

If this line is missing, it’s a bug. We’d rather tell you “we couldn’t verify this” than silently skip it.
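For illustration, a coverage line like the one above could be derived from two counts. This is a sketch, not pgfence’s reporter code:

```typescript
// Hypothetical sketch: derive the coverage summary from statement counts.
function coverageLine(total: number, dynamic: number): string {
  // Coverage = share of statements that were statically analyzable.
  const pct = Math.round(((total - dynamic) / total) * 100);
  return `Analyzed ${total} SQL statements. ${dynamic} dynamic statements not analyzable. Coverage: ${pct}%.`;
}
```

The point is not the arithmetic; it is that the unanalyzable count is carried through to the report instead of being dropped.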

This extends to trace mode:

Verified: 6/8 | Mismatches: 1 | Trace-only: 2 | Static-only: 1

You can see exactly how much of the analysis was verified against real Postgres behavior. A report with 0 verified checks is a signal to inspect what trace could and could not verify, not proof that everything is safe.

What you can do

If you’re evaluating migration safety tools, ask these questions:

  1. What happens when the tool can’t analyze a statement? If it silently skips, you have a false-negative vector.
  2. Does the CI mode check for errors, not just warnings? If a statement fails during analysis and the tool exits 0, you have a false-negative vector.
  3. Can you verify the tool’s predictions? If there’s no way to compare predictions against reality, you’re trusting the lookup table forever.
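Questions 1 and 2 translate into a concrete exit-code policy. A sketch of what such a CI gate could look like, with hypothetical types and flags (not pgfence’s implementation): analysis errors, and optionally unanalyzable statements, fail the build instead of exiting 0:

```typescript
// Hypothetical sketch of a CI gate that closes the false-negative vectors
// described above. All names and the Report shape are illustrative.
type RiskLevel = 'LOW' | 'MEDIUM' | 'HIGH' | 'CRITICAL';

interface Report {
  findings: { risk: RiskLevel }[];
  errors: string[];     // statements that failed during analysis
  unanalyzable: number; // e.g. dynamic SQL the tool could not inspect
}

function exitCode(report: Report, maxRisk: RiskLevel, strict: boolean): number {
  const order: RiskLevel[] = ['LOW', 'MEDIUM', 'HIGH', 'CRITICAL'];
  // An analysis error that exits 0 is a false-negative vector (question 2).
  if (report.errors.length > 0) return 1;
  // Silently skipped statements are a false-negative vector (question 1);
  // strict mode refuses to pass them.
  if (strict && report.unanalyzable > 0) return 1;
  const overLimit = report.findings.some(
    (f) => order.indexOf(f.risk) > order.indexOf(maxRisk),
  );
  return overLimit ? 1 : 0;
}
```

The key design choice is that “couldn’t analyze” and “analysis crashed” are distinct from “safe”, and only the last one may exit 0.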

pgfence’s trace mode exists to answer question 3. If you find a mismatch, open an issue. Every mismatch makes the static analysis more accurate for everyone.

npm install -D @flvmnt/pgfence
pgfence trace migrations/*.sql