REPLICA IDENTITY FULL is the silent CDC killer

One line of DDL, no lock contention, no rewrite, no warning from any safety guide. Three weeks later your WAL volume has doubled and Debezium is melting. Here is what REPLICA IDENTITY FULL actually costs and how to avoid it.

A platform engineer is paged because the CDC pipeline is falling behind. Debezium’s connector has been at 40 minutes of lag for two hours. WAL volume on the publisher has doubled this month for no obvious reason. Nothing in the deploy log looks dramatic.

Three weeks ago, somebody ran this:

ALTER TABLE events REPLICA IDENTITY FULL;

The migration passed code review. It passed every Postgres linter on the market. It took an ACCESS EXCLUSIVE lock for less than a millisecond, updated a pg_class row, and committed. No table rewrite. No backfill. No warning.

It is also the most expensive DDL statement in the average CDC stack, and it never appears in any safety guide.

What someone reaching for this is usually trying to fix

A CDC consumer (Debezium, pglogical, AWS DMS, Materialize, Striim) wants to replicate UPDATE and DELETE events from a table. The connector starts up, reads the table’s schema, and fails or warns: “no replica identity defined for table.” The table has no primary key. Maybe it never did. Maybe it is a legacy event log, an audit trail, or a denormalized join table that was added before anyone thought about replication.

The connector documentation, or a Stack Overflow answer, says the fix is to set a replica identity. The first option that comes up is FULL. It works immediately. The connector starts streaming. The page closes.

Three weeks later the WAL volume problem shows up.

What REPLICA IDENTITY actually controls

Logical replication needs to identify which row was updated or deleted on the publisher so the subscriber can apply the same change. REPLICA IDENTITY is the setting that controls what gets written to the WAL for that purpose. Four modes:

ModeWhat gets written for UPDATE/DELETE
DEFAULTThe old values of the primary key columns. Requires a PK.
USING INDEX idxThe old values of the columns in a unique, non-null, non-deferrable index.
FULLThe old values of every column in the row.
NOTHINGNo old-row information. UPDATE and DELETE cannot be logically replicated.

DEFAULT and USING INDEX both write a small, bounded number of columns. On a wide table they cost a few dozen bytes per change. FULL writes the entire old row image, every time.

The blast radius

From the Postgres docs at https://www.postgresql.org/docs/current/sql-altertable.html:

FULL records the old values of all columns in the row. This generates a lot of WAL traffic, especially if many columns are updated.

The official documentation is direct, but it is buried inside the ALTER TABLE reference page next to a dozen other clauses. The cost in practice is worse than “a lot of WAL traffic” suggests.

Consider a typical events table: 50 columns, three jsonb payload fields, a couple of text columns, the usual mix of timestamps and bigints. A single row is rarely smaller than 2KB once the jsonb is in there, often 8 to 30KB.

With REPLICA IDENTITY DEFAULT, an UPDATE that touches one column writes maybe 100 bytes of identity information to the WAL plus the changed column. With REPLICA IDENTITY FULL, the same UPDATE writes the entire 8KB old row image, then the changed column, then the new column. That is 10x to 100x more WAL per change. A DELETE that previously cost 100 bytes now costs the full row size.

This amplification applies to every UPDATE and every DELETE on the table, forever, until the setting is reverted. There is no rewrite, no migration, no follow-up event that you can grep the deploy log for. It is just a slow ramp in WAL volume that shows up first as a billing surprise on managed Postgres, then as replication lag, then as a disk-full incident.

What breaks downstream

The failure modes are all variants of the same problem: the WAL is now too big for the consumers downstream of it.

Debezium connector backpressure. Debezium’s logical decoding plugin reads the WAL, decodes it into change events, and serializes them to Kafka. When the WAL grows 10x, the decoder spends 10x more time per change. The Kafka producer batches start filling. The connector’s offset commits fall behind. Lag metrics climb. The connector eventually trips its own watermarks and stops.

pglogical replicas falling behind. A logical replica streams every WAL record relevant to its subscription. If a single table’s WAL volume triples, the replica’s apply worker cannot keep up on the same hardware. The lag compounds: the replica falls further behind, the publisher retains more WAL to serve it, and disk fills on the publisher.

Disk fills up on the publisher. This is the catastrophic failure mode. Postgres retains WAL segments until every logical replication slot has consumed them. If a slot is behind because of REPLICA IDENTITY FULL amplification, the publisher’s WAL directory grows without bound. When the volume fills, the database refuses new writes. Production goes read-only with a could not write to file: No space left on device error.

The hot-spot effect during backfill. Any large UPDATE (a backfill, a soft-delete cascade, a schema migration that touches a column) now writes the full old row image for every row touched. A backfill that used to fit inside a maintenance window now generates 50GB of WAL instead of 5GB. The replica falls hours behind. The hot path stalls because the WAL writer is saturated.

Each of these has shown up in production incident reports tagged “WAL growth” or “CDC lag.” The root cause is rarely traced back to the REPLICA IDENTITY FULL statement that set it off, because the statement itself ran weeks earlier without any complaint.

The actual fix

Add a primary key first. Even a synthetic one is fine:

ALTER TABLE events ADD COLUMN id bigint GENERATED ALWAYS AS IDENTITY;
-- backfill any rows that existed before the column was added (instant on PG11+ for constant defaults)
ALTER TABLE events ADD CONSTRAINT events_pkey PRIMARY KEY (id);
ALTER TABLE events REPLICA IDENTITY DEFAULT;

Now REPLICA IDENTITY DEFAULT uses the PK, the WAL contains 8 bytes of identity per change, and CDC works the way it was designed to work.

If a PK is genuinely impossible (you do not control the schema, or the table is partitioned in a way that resists a single PK), use REPLICA IDENTITY USING INDEX with any unique, non-null, non-deferrable index. The cost is the size of the index columns, not the row:

CREATE UNIQUE INDEX CONCURRENTLY events_event_uuid_idx ON events (event_uuid) WHERE event_uuid IS NOT NULL;
ALTER TABLE events REPLICA IDENTITY USING INDEX events_event_uuid_idx;

Both approaches generate minimal WAL. Neither has the amplification problem.

Why this is so often the wrong reach

Ops teams under pressure to “make CDC work” reach for FULL because it works on the first try. The DDL is one line. The CDC connector starts streaming immediately. The Slack thread closes. The cost is deferred and shows up as a database-wide WAL volume problem, not as a table-level fault, so it never gets attributed to the statement that caused it.

By the time the WAL dashboard catches it, three other tables have been changed the same way. The runbook entry has propagated. New microservices are coming online with the same pattern. The fix is now a multi-table migration with PK backfills, not a one-line revert.

The other reason it is the wrong reach is that the safety guides do not flag it. Every major linter catches CREATE INDEX without CONCURRENTLY. None of them flagged REPLICA IDENTITY FULL before pgfence v0.6. The DDL is brief, the lock is brief, the static analysis sees nothing wrong. The damage is a runtime cost that compounds invisibly.

How pgfence catches this

pgfence v0.6 ships a replica-identity-full rule that flags this exact pattern as HIGH risk with a PK-first recipe. The rule fires only on REPLICA IDENTITY FULL. REPLICA IDENTITY DEFAULT, REPLICA IDENTITY USING INDEX, and REPLICA IDENTITY NOTHING are not flagged. The output includes the WAL amplification reasoning and the two safe alternatives.

The recipe in the report is the same as the section above: add a synthetic PK, switch to DEFAULT. The fallback to USING INDEX is offered for the cases where a PK is not feasible.

Try it

For a single-statement check without setting up files:

pgfence explain "ALTER TABLE events REPLICA IDENTITY FULL"

For a full migration pass:

npm install -D @flvmnt/pgfence@0.6
npx @flvmnt/pgfence analyze migrations/*.sql

If your team operates a CDC pipeline, run pgfence over the last six months of merged migrations. The REPLICA IDENTITY FULL statements are usually clustered around the first weeks of a CDC rollout, when somebody hit the no-replica-identity error and reached for the fix that worked.

GitHub · Docs · npm

← All posts