← Writing

Zero-Downtime PostgreSQL Schema Migrations at Scale

The Lock Problem

Most ALTER TABLE operations acquire an AccessExclusiveLock:

ALTER TABLE users ADD COLUMN phone VARCHAR(20);

This locks the table for writes and reads. At 10 million rows on a busy service, the lock can hold for minutes, causing cascading timeouts.

Techniques for Zero-Downtime Migrations

1. Add Columns (Safe by Default)

Adding a column with a DEFAULT value is safe in PostgreSQL 11+:

ALTER TABLE users ADD COLUMN created_at TIMESTAMPTZ DEFAULT NOW() NOT NULL;

PostgreSQL stores the default value metadata without rewriting every row. Reads and writes proceed normally.

2. Add NOT NULL Constraint (Two-Phase)

-- Phase 1: Add constraint as NOT VALID (non-blocking)
ALTER TABLE users ADD CONSTRAINT users_email_not_null CHECK (email IS NOT NULL) NOT VALID;

-- Phase 2: Validate in background (during low-traffic window)
ALTER TABLE users VALIDATE CONSTRAINT users_email_not_null;

Phase 1 returns immediately. Phase 2 scans the table but doesn't lock writes.

3. Rename Columns (Swap Strategy)

To rename a column, don't use ALTER TABLE RENAME:

-- Step 1: Add new column
ALTER TABLE users ADD COLUMN username_new VARCHAR(255);

-- Step 2: Populate (in batches, if large table)
UPDATE users SET username_new = username WHERE username_new IS NULL LIMIT 10000;

-- Step 3: Add trigger to keep in sync
CREATE TRIGGER username_sync BEFORE INSERT OR UPDATE ON users
FOR EACH ROW EXECUTE FUNCTION sync_username_columns();

-- Step 4: Update application to use username_new (deploy code)

-- Step 5: Drop old column (after confident application uses new column)
ALTER TABLE users DROP COLUMN username;
ALTER TABLE users RENAME COLUMN username_new TO username;

4. Rebuild Indexes Concurrently

-- Doesn't lock writes
REINDEX INDEX CONCURRENTLY users_email_idx;

5. Blue-Green Table Swap

For large schema changes:

-- 1. Create new table with desired schema
CREATE TABLE users_new (LIKE users INCLUDING ALL);
ALTER TABLE users_new ADD COLUMN phone VARCHAR(20);

-- 2. Copy data in background
INSERT INTO users_new SELECT * FROM users;

-- 3. Sync ongoing changes with trigger
CREATE TRIGGER sync_to_new AFTER INSERT OR UPDATE ON users
FOR EACH ROW EXECUTE FUNCTION insert_or_update_users_new();

-- 4. Swap (brief lock, milliseconds)
ALTER TABLE users RENAME TO users_old;
ALTER TABLE users_new RENAME TO users;
DROP TABLE users_old;

Checklist

  • Test migration on production-scale data locally
  • Have a rollback plan
  • Monitor lock times during migration: SELECT * FROM pg_locks;
  • Set lock_timeout to fail fast if lock holds too long
  • Deploy application code BEFORE schema changes (forward-compatibility)
  • Run migrations during low-traffic window if possible
  • Verify application still works on new schema