Upgrading PostgreSQL in production is not fun. You have real data, real users, and a version of Postgres that’s probably several major releases behind. The obvious approach (dump, upgrade, restore) works fine on a 10GB database on a Saturday night. It doesn’t work when you have TBs of data, applications that can’t afford hours of downtime, and a team watching the monitoring dashboard at 2am.
Logical replication solves this. I’ve run it in production, and it works well, but there are a few practical considerations that are easier to appreciate once you’ve been through the process yourself.
Why Logical Replication
You keep your source cluster running (old Postgres version, production traffic, and all), bring up a destination cluster with the new version, and replicate all data changes while both run in parallel. When the destination has caught up, you cut over. Downtime is seconds to minutes, not hours.
Source is your existing cluster (say, Postgres 14). Destination is your new cluster running Postgres 17.
The thing most guides skim over is that logical replication does not copy schema, it only copies data. PostgreSQL replicates INSERT, UPDATE, and DELETE operations, but not your table definitions, indexes, roles, or anything structural. An example would be if logical replication is setup then truncate operation on a single source table would also be replicated, but table1, table2, and table3, etc. won’t be replicated. You have to handle that yourself before replication starts. Miss this, and your subscription will either fail silently or throw confusing errors that don’t point you back to the actual problem.
Step 1: The Primary Key Check
Logical replication requires every replicated table to have a replica identity to replicate UPDATE and DELETE operations. By default, that’s the primary key. Tables without one will still replicate INSERTs, but any UPDATE or DELETE will error on the publisher.
Replica Identity
A published table must have a replica identity configured so that UPDATE and DELETE operations can be replicated, because the replica identity is what allows the correct row to be located on the subscriber. By default, the replica identity is the primary key. A suitable unique index can also be used, subject to certain requirements. If no suitable key exists, the replica identity can be set to FULL, which uses the entire row as the key.
Replica identity FULL should be treated as a fallback to be used only when no other option is available, because without a suitable index on the subscriber, the row search can be very inefficient.
If a table without a replica identity is added to a publication that replicates UPDATE or DELETE operations, those operations will fail on the publisher. INSERT operations proceed regardless of the replica identity.
Identifying Tables and Setting Replica Identity
Identify all tables that do not have a primary key.
ubuntu@<target servername># psql -d <DATABASE_NAME> -h <SOURCE_IP_ADDRESS>
SELECT
schemaname,
relname AS tablename,
pg_size_pretty(pg_total_relation_size(relid)) AS total_size,
pg_size_pretty(pg_relation_size(relid)) AS table_size_only
FROM
pg_stat_user_tables
WHERE
relid NOT IN (
SELECT indrelid
FROM pg_index
WHERE indisprimary = TRUE
)
ORDER BY
pg_total_relation_size(relid) DESC; -- largest table first
Identify tables that have no primary key but do have a unique key.
ubuntu@<target servername># psql -d <DATABASE_NAME> -h <SOURCE_IP_ADDRESS>
SELECT
s.schemaname,
s.relname AS tablename,
COUNT(CASE WHEN i.indisunique = TRUE THEN 1 END) AS count_of_unique_indexes,
c.relreplident
FROM
pg_stat_user_tables s
LEFT JOIN pg_class c ON s.relid = c.oid
LEFT JOIN pg_index i ON c.oid = i.indrelid
WHERE
NOT EXISTS (
SELECT 1 FROM pg_index i2
WHERE i2.indrelid = s.relid AND i2.indisprimary
)
GROUP BY
s.schemaname, s.relname, c.relreplident
HAVING
COUNT(CASE WHEN i.indisunique = TRUE THEN 1 END) > 0;
For a table that has a qualifying unique index, review the available indexes to choose one.
ubuntu@<target servername># psql -d <DATABASE_NAME> -h <SOURCE_IP_ADDRESS>
SELECT
t.relname AS table_name,
i.relname AS index_name,
pg_get_indexdef(i.oid) AS index_definition
FROM pg_class t
JOIN pg_index ix ON t.oid = ix.indrelid
JOIN pg_class i ON ix.indexrelid = i.oid
WHERE
t.relkind = 'r'
AND ix.indisunique = 't'
AND t.relname = '<tablename>';
Where a unique index qualifies, set the replica identity to that index.
ubuntu@<target servername># psql -d <DATABASE_NAME> -h <SOURCE_IP_ADDRESS>
SELECT relreplident FROM pg_class WHERE relname = '<TABLE_NAME>'; -- check current setting
ALTER TABLE <TABLE_NAME> REPLICA IDENTITY NOTHING; -- temporarily set to NOTHING
SELECT relreplident FROM pg_class WHERE relname = '<TABLE_NAME>'; -- verify the change
ALTER TABLE <TABLE_NAME> REPLICA IDENTITY USING INDEX <IDX_NAME>; -- set to the unique index
SELECT relreplident FROM pg_class WHERE relname = '<TABLE_NAME>'; -- verify the new setting
If no unique index qualifies, or there is no unique index at all, fall back to replica identity FULL.
ubuntu@<target servername># psql -d <DATABASE_NAME> -h <SOURCE_IP_ADDRESS>
SELECT relreplident FROM pg_class WHERE relname = '<TABLE_NAME>'; -- check current setting
ALTER TABLE <TABLE_NAME> REPLICA IDENTITY FULL;
SELECT relreplident FROM pg_class WHERE relname = '<TABLE_NAME>'; -- verify the new setting
Setting Replica Identity in Bulk
When many tables require a replica identity FULL, the approach below is more efficient. First, identify all tables that have neither a primary key nor a unique key.
ubuntu@<target servername># psql -d <DATABASE_NAME> -h <SOURCE_IP_ADDRESS>
SELECT
s.schemaname,
s.relname AS tablename,
COUNT(CASE WHEN i.indisunique = TRUE THEN 1 END) AS count_of_unique_indexes,
c.relreplident
FROM pg_stat_user_tables s
LEFT JOIN pg_class c ON s.relid = c.oid
LEFT JOIN pg_index i ON c.oid = i.indrelid
WHERE
NOT EXISTS (
SELECT 1 FROM pg_index i2
WHERE i2.indrelid = s.relid AND i2.indisprimary
)
GROUP BY s.schemaname, s.relname, c.relreplident
HAVING COUNT(CASE WHEN i.indisunique = TRUE THEN 1 END) = 0;
Then generate the ALTER TABLE statements automatically.
ubuntu@<target servername># psql -d <DATABASE_NAME> -h <SOURCE_IP_ADDRESS>
SELECT
'ALTER TABLE ' || s.schemaname || '.' || s.relname ||
' REPLICA IDENTITY FULL;' AS alter_statement
FROM pg_stat_user_tables s
LEFT JOIN pg_class c ON s.relid = c.oid
LEFT JOIN pg_index i ON c.oid = i.indrelid
WHERE
NOT EXISTS (
SELECT 1 FROM pg_index i2
WHERE i2.indrelid = s.relid AND i2.indisprimary
)
GROUP BY s.schemaname, s.relname, c.relreplident
HAVING COUNT(CASE WHEN i.indisunique = TRUE THEN 1 END) = 0;
Review and run the generated statements, then validate the result using the queries shown earlier.
Step 2: Export Globals From Source
Globals are things like roles, passwords, and tablespace definitions. They live outside individual databases, so pg_dump won’t capture them. Use pg_dumpall with the -g flag:
pg_dumpall -U postgres -h <source_ip> -g > globals.sqlStep 3: Copy Globals To The Destination
scp globals.sql <dest_ip>:/tmp/Step 4: Apply Globals On The Destination
psql -U <your_user> -d postgres -h localhost -f globals.sqlStep 5: Export The Schema From The Source
Logical replication will not create your tables on the destination. You have to do it manually. Export schema only (no data) with the -s flag:
pg_dump -U postgres -h localhost -d your_database -s -C > your_database.sqlStep 6: Create A Publication On The Source
A publication is PostgreSQL saying it’s willing to share this database. Run it on the source; nothing moves yet.
CREATE PUBLICATION your_database_pub FOR ALL TABLES;Step 7: Create The Replication Role And Apply The Schema
Create the replication role on the destination:
CREATE ROLE replicator_user WITH REPLICATION LOGIN PASSWORD 'your_secure_password';Step 8: Grant Access To All The Schemas
Grant the replication user read access to all schemas on the source:
GRANT SELECT ON ALL TABLES IN SCHEMA <SCHEMA_NAME> TO replicator_user;Then apply the schema exported in Step 4:
psql -U <your_user> -d postgres -h localhost -f /tmp/your_database.sqlStep 9: Create The Subscription On The Destination
This is where replication actually begins. Run it on the destination:
CREATE SUBSCRIPTION your_database_sub
CONNECTION 'host=<source_ip> port=5432 dbname=your_database user=replicator_user'
PUBLICATION your_database_pub;
PostgreSQL copies the full initial snapshot first, then switches to streaming and picks up changes as they happen. You can watch the lag from the source:
SELECT
client_addr,
state,
sent_lsn,
write_lsn,
flush_lsn,
replay_lsn,
(sent_lsn - replay_lsn) AS replication_lag_bytes
FROM pg_stat_replication;
When replication_lag_bytes sits at or near zero, you’re ready.
What logical replication doesn’t handle
Three things catch people off guard.
- DDL is not replicated. Run ALTER TABLE on the source after replication starts and the destination won’t see it. You have to apply schema changes on both sides manually, which gets annoying fast if your team runs migrations frequently.
- Sequences don’t replicate. When you cut over, sequences on the destination will be behind where the source left off. Advance them before accepting writes or you’ll hit duplicate key errors on the first insert. This one is easy to forget and painful to debug.
Note: While working in prod systems, it was necessary to disable statement_timeout for the replication user. Run the following on the source side.
ALTER ROLE replicator SET statement_timeout = 0;
None of that is a reason to avoid logical replication. For a zero-downtime PostgreSQL version upgrade, it’s still the cleanest option I’ve used. You just need to go in knowing what you’re managing yourself.

