Testing Backup and Restore Procedures for Self-Hosted Supabase

Learn how to validate your self-hosted Supabase backups actually work before disaster strikes. A practical guide to DR testing.


A backup that can't be restored is not a backup. It's a false sense of security. According to community discussions, many self-hosted Supabase teams discover their backups are corrupted, incomplete, or otherwise unusable only when disaster strikes. Don't be that team.

If you've followed our backup and restore guide, you have backups running. But have you actually tested them? This guide walks through practical procedures for validating your self-hosted Supabase backups work before you need them.

Why Backup Testing Gets Skipped

Let's be honest about why this happens. Testing restores is:

  • Time-consuming: Spinning up a test environment takes effort
  • Resource-intensive: You need infrastructure to restore into
  • Seemingly unnecessary: "The backup job completed successfully" feels like enough
  • Easy to postpone: There's always something more urgent

But here's the reality: backup failures are often silent. Your cron job reports success, your S3 bucket fills with files, and everything looks fine—until you actually try to restore. Common issues that only surface during restoration:

  1. Corrupted dumps: pg_dump completed but produced invalid SQL
  2. Missing storage files: Database backed up, but storage sync failed
  3. Permission mismatches: Roles exist in the dump but can't be created on restore
  4. Version incompatibilities: Postgres 17 dump won't restore cleanly to Postgres 15
  5. Incomplete data: Backup captured mid-transaction, leaving orphaned references
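
Several of these failure modes can be caught cheaply before you ever attempt a full restore. Here's a minimal pre-flight sketch; the paths are hypothetical, so point them at your own backup layout:

```shell
#!/bin/bash
# Cheap pre-flight checks on backup artifacts. Paths are hypothetical --
# adjust them to your backup directory and naming scheme.
DUMP="/backups/db-latest.dump"     # custom-format pg_dump output (-Fc)
SQL_GZ="/backups/db-latest.sql.gz" # plain-format dump, gzipped

# A custom-format dump carries an internal table of contents;
# pg_restore --list fails fast on truncated or corrupt files
# without needing a database to restore into.
if [ -f "$DUMP" ]; then
  pg_restore --list "$DUMP" > /dev/null \
    && echo "dump TOC readable" \
    || echo "dump appears corrupted" >&2
fi

# For gzipped plain-SQL dumps, at least verify the compression envelope.
if [ -f "$SQL_GZ" ]; then
  gzip -t "$SQL_GZ" && echo "gzip envelope intact"
fi
```

Note that a readable TOC or an intact gzip envelope only proves the file is structurally sound; it says nothing about whether the data inside is complete. That's exactly why full restore tests still matter.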

Setting Up a Test Environment

Before you can test restores, you need somewhere to restore to. The goal is an isolated environment that mirrors production closely enough to validate your backup integrity.

Option 1: Local Docker Environment

The simplest approach for smaller datasets. Clone your Supabase docker-compose setup:

# Create a separate directory for testing
mkdir ~/supabase-restore-test
cd ~/supabase-restore-test

# Copy your production docker-compose.yml
cp /path/to/production/docker-compose.yml .

# Modify ports to avoid conflicts
# Change 8000 -> 8100, 5432 -> 5433, etc.
sed -i 's/8000:8000/8100:8000/g' docker-compose.yml
sed -i 's/5432:5432/5433:5432/g' docker-compose.yml

# Start with fresh volumes
docker compose up -d

Pros: Free, fast to spin up, no cloud costs

Cons: Limited by local resources, won't catch network-related issues

Option 2: Staging VPS

For production-critical systems, maintain a dedicated staging server. Many VPS providers offer hourly billing, making this cost-effective:

# On your staging server
docker compose -p supabase-staging up -d

Keep the staging environment configuration identical to production—same Postgres version, same storage backend, same environment variables (except credentials).
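
Configuration drift between the two environments tends to creep in silently. One quick way to catch it is to diff the variable names (not values, so differing credentials stay out of the output). The file names here are hypothetical; substitute your actual `.env` paths:

```shell
#!/bin/bash
# Drift check between production and staging configuration.
# Compares variable *names* only, so credential values don't appear.
diff \
  <(cut -d= -f1 production.env | sort) \
  <(cut -d= -f1 staging.env | sort)
```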

Option 3: Ephemeral Cloud Instances

For comprehensive DR testing, spin up temporary infrastructure:

# Using Terraform or Pulumi, create a replica of production
terraform apply -var="environment=dr-test"

# Run your restore
# Validate
# Tear down
terraform destroy

This approach catches infrastructure-level issues but requires more setup.

The Restore Testing Checklist

A successful restore test validates multiple layers. Work through this checklist methodically:

1. Database Schema Integrity

First, verify the schema restored correctly:

-- Connect to the restored database first (from the shell):
--   psql -h localhost -p 5433 -U postgres -d postgres

-- Check all tables exist
SELECT table_name FROM information_schema.tables 
WHERE table_schema = 'public';

-- Verify auth schema
SELECT table_name FROM information_schema.tables 
WHERE table_schema = 'auth';

-- Check for missing foreign key relationships
SELECT conname, conrelid::regclass AS table_name
FROM pg_constraint 
WHERE contype = 'f' AND NOT convalidated;

Schema validation should produce no errors. If you see missing tables or invalid constraints, your backup is incomplete.
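
Eyeballing table lists doesn't scale, so it helps to script the comparison. A sketch, assuming the port layout from the test setup above (5433 for the restored copy, 5432 for production) and default credentials:

```shell
#!/bin/bash
# Compare table inventories between production and the restored copy.
# Host, ports, and credentials are assumptions -- adjust as needed.
list_tables() {  # $1 = port
  psql -h localhost -p "$1" -U postgres -d postgres -Atc \
    "SELECT table_schema || '.' || table_name
       FROM information_schema.tables
      WHERE table_schema IN ('public', 'auth')
      ORDER BY 1"
}

list_tables 5432 > /tmp/tables-prod.txt
list_tables 5433 > /tmp/tables-restored.txt

# Lines unique to the first file = tables missing after restore.
comm -23 /tmp/tables-prod.txt /tmp/tables-restored.txt
```

Empty output means every production table made it across; any line printed is a table your backup dropped.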

2. Row Level Security Policies

RLS policies are critical for Supabase security. Missing or broken policies can either lock users out or expose data:

-- List all RLS policies
SELECT schemaname, tablename, policyname, permissive, roles, cmd, qual
FROM pg_policies
WHERE schemaname = 'public';

-- Compare count with production
-- This number should match exactly
SELECT COUNT(*) FROM pg_policies WHERE schemaname = 'public';

For a more comprehensive check, keep a baseline of your RLS policies and compare after restore.
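
One way to sketch that baseline-and-compare approach (connection details are assumptions; 5433 is the restored instance from the setup above):

```shell
#!/bin/bash
# Keep a committed baseline of RLS policies and diff it after each restore.
dump_policies() {  # $1 = port, $2 = output file
  psql -h localhost -p "$1" -U postgres -d postgres -Atc \
    "SELECT schemaname, tablename, policyname, cmd, roles
       FROM pg_policies
      WHERE schemaname = 'public'
      ORDER BY tablename, policyname" > "$2"
}

# Taken from production whenever policies change:
dump_policies 5432 rls-baseline.txt
# After each restore test:
dump_policies 5433 rls-restored.txt

diff rls-baseline.txt rls-restored.txt && echo "RLS policies match baseline"
```

Checking the baseline file into version control also gives you an audit trail of policy changes for free.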

3. Data Integrity Verification

Spot-check actual data, not just structure:

-- Compare record counts with production
SELECT 'users' as table_name, COUNT(*) FROM auth.users
UNION ALL
SELECT 'your_main_table', COUNT(*) FROM public.your_main_table;

-- Check for orphaned records
SELECT id FROM public.orders 
WHERE user_id NOT IN (SELECT id FROM auth.users);

-- Verify recent data was captured
SELECT MAX(created_at) FROM public.your_main_table;

The last query is particularly important. If your most recent record is from last week but you ran a backup yesterday, something failed.

4. Storage File Validation

This is where many restore tests fail. Database metadata exists, but files are missing:

# List files in restored storage
mc ls local/supabase-storage/stub/ --recursive | wc -l

# Compare with production
mc ls production/supabase-storage/stub/ --recursive | wc -l

# Check a specific known file
mc stat local/supabase-storage/stub/avatars/user-123.jpg

Better yet, test through the application:

// In your app, try to fetch a known file
const { data, error } = await supabase.storage
  .from('avatars')
  .download('user-123.jpg');

if (error) {
  console.error('Storage restore failed:', error);
}

5. Auth System Validation

Test that authentication actually works:

# Try to sign in with a test user
curl -X POST 'http://localhost:8100/auth/v1/token?grant_type=password' \
  -H "apikey: your-anon-key" \
  -H "Content-Type: application/json" \
  -d '{"email":"[email protected]","password":"testpassword"}'

If auth fails, check that the auth.users table and auth.identities table both restored correctly.
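
For scripted drills, that curl call can be turned into a pass/fail assertion. A sketch, reusing the port and test user from above (both are assumptions about your setup):

```shell
#!/bin/bash
# Sign in against the restored stack and assert a token came back.
RESP=$(curl -s -X POST 'http://localhost:8100/auth/v1/token?grant_type=password' \
  -H "apikey: your-anon-key" \
  -H "Content-Type: application/json" \
  -d '{"email":"[email protected]","password":"testpassword"}')

# A successful response contains an access_token; anything else is a failure.
if echo "$RESP" | grep -q '"access_token"'; then
  echo "auth restore OK"
else
  echo "auth restore FAILED: $RESP" >&2
fi
```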

6. Realtime and Edge Functions

For complete validation, test these components too:

// Test Realtime subscription
const channel = supabase.channel('test')
  .on('postgres_changes', { event: '*', schema: 'public' }, payload => {
    console.log('Realtime working:', payload);
  })
  .subscribe();

// Insert a test record
await supabase.from('test_table').insert({ test: true });

Automating Your DR Drills

Manual testing is important for learning, but ongoing validation should be automated. Here's a practical approach:

Scheduled Monthly Tests

Create a script that runs your full restore validation:

#!/bin/bash
# dr-test.sh

set -e

TIMESTAMP=$(date +%Y%m%d)
LOG_FILE="/var/log/dr-tests/$TIMESTAMP.log"
mkdir -p "$(dirname "$LOG_FILE")"

echo "Starting DR test at $(date)" >> "$LOG_FILE"

# 1. Spin up test environment
docker compose -f docker-compose.test.yml up -d

# 2. Wait for Postgres to accept connections (a fixed sleep is fragile)
until docker compose -f docker-compose.test.yml exec -T db pg_isready -U postgres; do
  sleep 5
done

# 3. Restore latest backup
./restore-backup.sh --target test --latest

# 4. Run validation suite
# 5. Capture results -- with `set -e`, test the command directly;
#    a separate `$?` check would never be reached on failure
if ./validate-restore.sh >> "$LOG_FILE" 2>&1; then
  echo "DR test PASSED" >> "$LOG_FILE"
else
  echo "DR test FAILED" >> "$LOG_FILE"
  # Send alert
  curl -X POST "$SLACK_WEBHOOK" -d '{"text":"DR test failed!"}'
fi

# 6. Cleanup
docker compose -f docker-compose.test.yml down -v

Schedule this with cron:

# Run the DR test at 3 AM on the first Sunday of each month.
# Cron ORs day-of-month with day-of-week, so `0 3 1-7 * 0` would fire on
# days 1-7 of every month AND on every Sunday; guard on the date instead:
0 3 * * 0 [ "$(date +\%d)" -le 7 ] && /opt/scripts/dr-test.sh

Tracking RTO and RPO

Every DR test should measure two critical metrics:

Recovery Point Objective (RPO): How much data could you lose? Measure the gap between your most recent backup and the latest data in production.

-- On production
SELECT MAX(created_at) AS latest_data FROM public.audit_log;

-- Compare with backup timestamp
-- RPO = latest_data - backup_timestamp
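
That subtraction is easy to script. A sketch using GNU `date` (Linux); where the two timestamps come from is up to you, e.g. the SQL query above for `latest_data` and your backup job's log or filename for `backup_started`:

```shell
#!/bin/bash
# Compute RPO in seconds from two ISO-8601 timestamps (example values).
latest_data="2024-06-01T14:55:00Z"
backup_started="2024-06-01T03:00:00Z"

rpo=$(( $(date -d "$latest_data" +%s) - $(date -d "$backup_started" +%s) ))
echo "RPO: $rpo seconds ($(( rpo / 3600 )) hours)"
# -> RPO: 42900 seconds (11 hours)
```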

Recovery Time Objective (RTO): How long does restoration take? Time your restore process end-to-end:

START=$(date +%s)
./restore-backup.sh --target test --latest
END=$(date +%s)
RTO=$((END - START))
echo "RTO: $RTO seconds"

Document these numbers. If your business requires 15-minute RPO and 1-hour RTO, but your tests show 24-hour RPO and 4-hour RTO, you have a gap to address.

Common Restore Failures and Fixes

Through years of helping teams with self-hosted Supabase, we've seen these issues repeatedly:

"Role supabase_admin does not exist"

The dump contains ownership statements that reference Supabase's internal roles:

-- Before restore, create the role
CREATE ROLE supabase_admin WITH LOGIN;
-- Or edit the dump to remove/replace OWNER TO statements

"Database postgres is being accessed by other users"

You can't drop a database with active connections:

-- Terminate existing connections
SELECT pg_terminate_backend(pid) 
FROM pg_stat_activity 
WHERE datname = 'postgres' AND pid <> pg_backend_pid();

Storage Files Return 404

The TENANT_ID in your storage configuration doesn't match what's in the backup:

# Check what tenant ID your backup used
mc ls local/supabase-storage/
# If it shows 'stub/', ensure TENANT_ID=stub in your environment

Auth Tokens Invalid After Restore

JWT secrets must match between backup and restore environment. Copy your JWT_SECRET and ANON_KEY from production.

When to Use Supascale Instead

Testing and validating backups is essential, but it's also operational overhead. For teams managing multiple Supabase instances, the cumulative time spent on DR testing adds up.

Supascale automates this entire process. When you configure automated backups, Supascale:

  • Captures database and storage together as a consistent snapshot
  • Stores backups in your S3-compatible storage
  • Enables one-click restore to any point in time
  • Handles the permission issues and role management automatically

The backup itself is only half the equation—knowing it works is the other half. With Supascale's one-click restore, you can validate recovery in minutes rather than hours.

Building a DR Testing Culture

The goal isn't just to test once and forget. Build ongoing validation into your operations:

  1. Monthly full restore tests: Complete end-to-end validation
  2. Weekly backup integrity checks: Verify backup files exist and are correctly sized
  3. Quarterly DR drills: Include your team, practice communication and coordination
  4. Post-incident reviews: After any production issue, validate your backup captured the state
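
The weekly integrity check (item 2) is the easiest of these to automate. A minimal sketch, assuming Linux and a local backup directory of `.dump` files; the path and size threshold are assumptions to tune for your data:

```shell
#!/bin/bash
# Weekly check: the newest backup must exist, be recent, and not be
# suspiciously small (a tiny dump often means a failed pg_dump).
BACKUP_DIR="/backups"
MIN_BYTES=1000000   # alert if the newest dump is under ~1 MB

newest=$(ls -t "$BACKUP_DIR"/*.dump 2>/dev/null | head -n 1)

if [ -z "$newest" ]; then
  echo "no backups found in $BACKUP_DIR" >&2
elif [ -n "$(find "$newest" -mtime +7)" ]; then
  echo "newest backup is older than 7 days: $newest" >&2
elif [ "$(stat -c %s "$newest")" -lt "$MIN_BYTES" ]; then
  echo "newest backup is suspiciously small: $newest" >&2
else
  echo "backup integrity check passed: $newest"
fi
```

Wire the failure branches to the same alerting channel as your DR test script so silent backup failures surface within a week, not at restore time.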

Document everything. Your restore runbook should be detailed enough that any team member can execute it under pressure.

Conclusion

The best time to discover your backups don't work is during a scheduled test, not during a 3 AM production outage. Regular restore testing transforms your disaster recovery plan from a theoretical document into a proven capability.

Start with manual testing to understand the process, then automate for ongoing validation. Track your RTO and RPO metrics to ensure they meet business requirements. And if managing all of this feels like a full-time job—well, that's exactly why tools like Supascale exist.

Your data is too important to leave to untested backups. Schedule your first DR test this week.

