Self-hosted Supabase gives you unlimited database growth—but that's a double-edged sword. Without Supabase Cloud's managed limits nudging you toward cost-conscious decisions, databases can balloon quickly. Logs accumulate, old sessions pile up, and analytics tables grow unchecked. Before you know it, backups take hours and queries slow to a crawl.
Unlike managed services, Supabase won't automatically delete your data after a certain time. You need to implement retention policies yourself. This guide covers three approaches to managing database growth in self-hosted Supabase: simple scheduled deletions with pg_cron, efficient partitioning with pg_partman, and storage object lifecycle management.
Why Data Retention Matters for Self-Hosted Instances
On Supabase Cloud, database size directly impacts your bill—there's a built-in incentive to keep things lean. Self-hosted deployments have different economics: storage is cheap, but operational costs add up.
Large databases create several problems:
- Slower backups: A 500GB database takes significantly longer to dump and restore than a 50GB one, and the procedures in our backup and restore guide become harder to execute reliably with oversized databases.
- Degraded query performance: Indexes grow, table scans take longer, and even well-optimized queries suffer
- Higher resource requirements: More RAM for caching, more CPU for vacuum operations, more I/O during maintenance windows
- Compliance risks: Retaining user data longer than necessary can violate GDPR, CCPA, or industry regulations
The solution isn't to never store data—it's to define how long data should live and automate its cleanup.
Defining Your Retention Requirements
Before writing any code, map out what data you have and how long you need it:
| Data Type | Typical Retention | Notes |
|---|---|---|
| User sessions | 30-90 days | Balance UX (staying logged in) with security |
| Application logs | 7-30 days | Keep longer if needed for debugging patterns |
| Analytics events | 90-365 days | Consider aggregating before archiving |
| Audit trails | 1-7 years | Often legally mandated |
| Realtime presence | 24 hours | Ephemeral by nature |
| Auth tokens | Until expiry | Clean up expired tokens promptly |
Your requirements will vary based on your application, industry, and compliance needs. The key is being intentional rather than keeping everything forever "just in case."
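One way to make those decisions explicit is a small policy registry inside the database itself. The sketch below is our own convention, not a Supabase feature; the table and column names are illustrative. It records each table's retention period in one queryable place:

```sql
-- Hypothetical policy registry: the table name and columns are our own convention
CREATE TABLE IF NOT EXISTS public.retention_policies (
    table_name       TEXT PRIMARY KEY,
    retention_period INTERVAL NOT NULL,
    rationale        TEXT,
    updated_at       TIMESTAMPTZ NOT NULL DEFAULT NOW()
);

INSERT INTO public.retention_policies (table_name, retention_period, rationale) VALUES
    ('auth.sessions',           '30 days', 'Balance UX with security'),
    ('public.app_logs',         '7 days',  'Debugging window'),
    ('public.analytics_events', '90 days', 'Product analytics');
```

Cleanup jobs can read their cutoffs from this table, so changing a policy becomes a single UPDATE rather than an edit to every scheduled job.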
Approach 1: Scheduled Deletions with pg_cron
For most self-hosted Supabase deployments, pg_cron provides the simplest path to automated data cleanup. It's a PostgreSQL extension that runs cron jobs inside the database—no external schedulers required.
Enabling pg_cron in Self-Hosted Supabase
If you followed our guide on scheduling cron jobs for self-hosted Supabase, you'll already have pg_cron configured. If not, you'll need to add it to your PostgreSQL configuration.
First, enable the extension:
CREATE EXTENSION IF NOT EXISTS pg_cron;
Then add pg_cron to shared_preload_libraries in your postgresql.conf (requires a restart):
shared_preload_libraries = 'pg_cron'
cron.database_name = 'postgres'
Creating Retention Jobs
Here's a practical example for cleaning up old sessions:
-- Delete sessions older than 30 days, runs daily at 3 AM
SELECT cron.schedule(
'cleanup-old-sessions',
'0 3 * * *',
$$DELETE FROM auth.sessions WHERE created_at < NOW() - INTERVAL '30 days'$$
);
For application logs:
-- Delete logs older than 7 days, runs every 6 hours
SELECT cron.schedule(
'cleanup-app-logs',
'0 */6 * * *',
$$DELETE FROM public.app_logs WHERE created_at < NOW() - INTERVAL '7 days'$$
);
For analytics events with batch limits (prevents long-running transactions):
-- Delete up to 10,000 old events per run
SELECT cron.schedule(
'cleanup-analytics',
'*/15 * * * *',
$$DELETE FROM public.analytics_events
WHERE id IN (
SELECT id FROM public.analytics_events
WHERE created_at < NOW() - INTERVAL '90 days'
LIMIT 10000
)$$
);
Monitoring Your Jobs
Check if jobs are running successfully:
SELECT jobid, jobname, schedule, command,
(SELECT max(start_time) FROM cron.job_run_details WHERE jobid = cron.job.jobid) as last_run,
(SELECT status FROM cron.job_run_details WHERE jobid = cron.job.jobid ORDER BY start_time DESC LIMIT 1) as last_status
FROM cron.job;
Review recent job execution details:
SELECT jobid, jobname, start_time, end_time,
status, return_message
FROM cron.job_run_details
ORDER BY start_time DESC
LIMIT 20;
Jobs that fail silently let backlogs build up, and once enough old data accumulates, clearing it becomes an expensive operation in its own right. Monitor cron.job_run_details regularly.
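As a starting point for that monitoring, a query like the following surfaces jobs whose most recent run did not succeed, using the cron.job and cron.job_run_details tables that pg_cron ships with:

```sql
-- Jobs whose latest recorded run did not succeed
SELECT j.jobname, d.start_time, d.status, d.return_message
FROM cron.job j
JOIN LATERAL (
    SELECT status, start_time, return_message
    FROM cron.job_run_details
    WHERE jobid = j.jobid
    ORDER BY start_time DESC
    LIMIT 1
) d ON true
WHERE d.status <> 'succeeded';
```

An empty result means every job's last run completed; anything else deserves a look before backlogs build up.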
Performance Considerations
For tables with millions of rows, bulk DELETEs create problems:
- Long-running transactions block other operations
- Large amounts of dead tuples trigger expensive autovacuum
- Write amplification impacts disk I/O
As a rough reference point, a 16 CPU/64GB machine can handle batch deletes every 5 minutes without issue, with individual delete queries typically completing in under 35ms. Your numbers will depend on your specific workload and table structure.
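One way to keep individual transactions short on large tables is a procedure that deletes in batches and commits between them (PostgreSQL 11+ allows COMMIT inside procedures). The procedure name, batch size, and schedule below are illustrative, a sketch rather than a prescription:

```sql
-- Hypothetical batched-delete procedure; commits between batches
CREATE OR REPLACE PROCEDURE purge_old_analytics(cutoff INTERVAL, batch_size INT DEFAULT 10000)
LANGUAGE plpgsql AS $$
DECLARE
    deleted INT;
BEGIN
    LOOP
        DELETE FROM public.analytics_events
        WHERE id IN (
            SELECT id FROM public.analytics_events
            WHERE created_at < NOW() - cutoff
            LIMIT batch_size
        );
        GET DIAGNOSTICS deleted = ROW_COUNT;
        COMMIT;  -- release locks and let autovacuum keep pace between batches
        EXIT WHEN deleted = 0;
    END LOOP;
END;
$$;

-- Schedule it with pg_cron
SELECT cron.schedule('purge-analytics', '0 2 * * *', $$CALL purge_old_analytics('90 days')$$);
```

Each batch is its own transaction, so a long backlog gets cleared without ever holding a multi-minute lock.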
For high-volume tables, consider Approach 2 instead.
Approach 2: Partition Management with pg_partman
When DELETE statements can't keep up with your data growth, partitioning offers a more efficient alternative. Instead of deleting rows, you drop entire partitions—a nearly instant operation regardless of partition size.
pg_partman automates partition creation and maintenance. It removes the need to manually create and detach partitions for tables partitioned by time or an integer series.
When to Use Partitioning
Partitioning makes sense when:
- Tables grow by millions of rows per day
- Retention periods are well-defined (30 days, 90 days, etc.)
- Query patterns naturally align with time ranges
- DELETE operations are causing performance issues
Partitioning adds complexity. For tables under 10 million rows with predictable growth, pg_cron deletions are simpler to maintain.
Setting Up pg_partman
Enable the extension:
CREATE EXTENSION IF NOT EXISTS pg_partman;
Let's partition an analytics events table by month:
-- Create the parent table with partitioning
CREATE TABLE public.analytics_events (
id UUID DEFAULT gen_random_uuid(),
event_type TEXT NOT NULL,
payload JSONB,
user_id UUID REFERENCES auth.users(id),
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
) PARTITION BY RANGE (created_at);
-- Configure pg_partman to manage partitions
SELECT partman.create_parent(
p_parent_table := 'public.analytics_events',
p_control := 'created_at',
p_type := 'native',  -- omit this parameter on pg_partman 5.x, where native partitioning is the only mode
p_interval := 'monthly',
p_premake := 3 -- Create 3 months ahead
);
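After create_parent runs, you can confirm the premade partitions exist. With pg_partman 4.x and later, show_partitions returns the schema and table name of each managed partition:

```sql
-- List the partitions pg_partman is managing for this parent
SELECT partition_schemaname, partition_tablename
FROM partman.show_partitions('public.analytics_events');
```

With p_premake := 3 you should see the current month plus three future months listed.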
Configuring Retention
Set up automatic partition archival after 6 months:
UPDATE partman.part_config
SET retention = '6 months',
retention_keep_table = false -- Actually drop old partitions
WHERE parent_table = 'public.analytics_events';
If you want to archive partitions to a separate schema before dropping:
UPDATE partman.part_config
SET retention = '6 months',
retention_keep_table = true,
retention_schema = 'archive'
WHERE parent_table = 'public.analytics_events';
Scheduling Maintenance
pg_partman requires regular maintenance runs. Schedule this with pg_cron:
-- Run maintenance daily at 4 AM
SELECT cron.schedule(
'partman-maintenance',
'0 4 * * *',
$$CALL partman.run_maintenance_proc()$$
);
The maintenance procedure creates new partitions as needed and drops or archives old ones based on your retention settings.
Migrating Existing Tables to Partitioned Tables
Converting an existing large table to a partitioned one requires planning. Supabase provides guidance on this process: you essentially create a new partitioned table, migrate data in batches, and swap the tables.
This is an advanced operation that can cause downtime if not handled carefully. Test thoroughly in a staging environment first.
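At a very high level, the create-migrate-swap sequence looks like the sketch below. This is a simplification that ignores foreign keys, triggers, and writes arriving during the copy, and the table names and date range are illustrative:

```sql
-- 1. Create the new partitioned table alongside the old one
--    (unique constraints on a partitioned table must include the partition key,
--     so copy defaults only and recreate constraints deliberately)
CREATE TABLE public.app_logs_new (LIKE public.app_logs INCLUDING DEFAULTS)
PARTITION BY RANGE (created_at);

-- 2. Configure pg_partman, then backfill one time slice per batch
-- SELECT partman.create_parent('public.app_logs_new', 'created_at', ...);
INSERT INTO public.app_logs_new
SELECT * FROM public.app_logs
WHERE created_at >= '2024-01-01' AND created_at < '2024-02-01';

-- 3. Swap names in one short transaction during a low-traffic window
BEGIN;
ALTER TABLE public.app_logs RENAME TO app_logs_old;
ALTER TABLE public.app_logs_new RENAME TO app_logs;
COMMIT;
```

Keep app_logs_old around until you have verified row counts and application behavior against the new table.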
Approach 3: Storage Object Lifecycle Management
Database retention only covers half the picture. Supabase Storage objects—uploads, images, documents—accumulate separately and won't be cleaned up by database retention policies.
The Challenge
When uploading files to S3-compatible storage, you'd typically use lifecycle policies to expire objects automatically. Supabase Storage doesn't expose this API directly, requiring workarounds for self-hosted deployments.
Using Metadata for Expiration
One community approach uses storage object metadata to track expiration:
-- After uploading a file via the Storage API, tag its object row with an expiration
UPDATE storage.objects
SET metadata = COALESCE(metadata, '{}'::jsonb)
    || jsonb_build_object('expires_at', (NOW() + INTERVAL '7 days')::text)
WHERE bucket_id = 'user-uploads'
  AND name = 'temp-file.pdf';
Then create a cleanup function:
CREATE OR REPLACE FUNCTION cleanup_expired_storage()
RETURNS void AS $$
DECLARE
expired_object RECORD;
BEGIN
FOR expired_object IN
SELECT id, name, bucket_id
FROM storage.objects
WHERE (metadata->>'expires_at')::timestamptz < NOW()
LOOP
-- Deleting the row removes only the database record; the underlying file
-- can be left orphaned in the backing store, so prefer deleting through
-- the Storage API where possible
DELETE FROM storage.objects WHERE id = expired_object.id;
END LOOP;
END;
$$ LANGUAGE plpgsql SECURITY DEFINER;
Schedule with pg_cron:
SELECT cron.schedule(
'cleanup-expired-storage',
'0 5 * * *',
$$SELECT cleanup_expired_storage()$$
);
Direct S3 Lifecycle Rules
If your self-hosted Supabase uses MinIO or another S3-compatible backend for storage, you can configure lifecycle rules at the storage layer:
# Using mc (MinIO client)
mc ilm rule add myminio/supabase-storage \
--prefix "temp/" \
--expiry-days 7
This handles cleanup at the storage level, independent of Supabase's metadata. Check our guide on storage backup for self-hosted Supabase for more on storage architecture.
Archival Strategies for Compliance
Some data can't simply be deleted—audit logs, financial records, and user activity may need long-term retention for compliance. Archiving moves this data out of your active database while keeping it accessible when needed.
Simple Archive Schema
Create an archive schema and move old data there:
CREATE SCHEMA IF NOT EXISTS archive;

-- One-time setup: archive table with the same structure as the source
CREATE TABLE IF NOT EXISTS archive.audit_logs (LIKE public.audit_logs INCLUDING ALL);

-- Archive old audit logs
INSERT INTO archive.audit_logs
SELECT * FROM public.audit_logs
WHERE created_at < NOW() - INTERVAL '1 year';

-- Then delete from active table
DELETE FROM public.audit_logs
WHERE created_at < NOW() - INTERVAL '1 year';
Wrap this in a function and schedule with pg_cron for automated archival.
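The archive-then-delete steps can also be collapsed into a single atomic statement using a data-modifying CTE, which removes any window between the copy and the delete:

```sql
-- Move rows older than 1 year in one statement
WITH moved AS (
    DELETE FROM public.audit_logs
    WHERE created_at < NOW() - INTERVAL '1 year'
    RETURNING *
)
INSERT INTO archive.audit_logs
SELECT * FROM moved;
```

Because the DELETE and INSERT run in the same statement, a row is either archived and removed, or untouched; there is no state where it has been deleted but not archived.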
External Archival
For data that must be retained but rarely accessed, consider:
- Exporting to cold storage (S3 Glacier, Backblaze B2)
- Dumping to separate archive database
- Converting to Parquet files for analytical queries
A rotation strategy might look like:
| Age | Storage Tier | Access Pattern |
|---|---|---|
| 0-7 days | Primary database | Real-time |
| 7-30 days | Hot archive | Minutes |
| 30-365 days | Cold storage (S3) | Hours |
| 1+ years | Deep archive (Glacier) | Days |
This balances cost with recovery needs across different data ages.
Putting It Together: A Complete Retention Strategy
Here's how these pieces fit together in practice:
- Audit your data: Identify tables, growth rates, and retention requirements
- Choose your approach: pg_cron for small-medium tables, pg_partman for high-volume
- Implement incrementally: Start with one table, monitor, then expand
- Set up monitoring: Track job success rates and table sizes over time
- Document policies: Make retention periods explicit for compliance
For most self-hosted Supabase deployments, a combination works best:
- pg_cron deletions for auth sessions, tokens, and app logs
- pg_partman for high-volume analytics or time-series data
- Storage lifecycle rules for temporary file uploads
- Archive schema for compliance-required historical data
Simplifying Data Management with Supascale
Implementing retention policies manually works, but it's another operational task to maintain. You're writing SQL, scheduling jobs, monitoring execution, and updating policies as requirements change.
Supascale can help reduce this operational burden for self-hosted Supabase. While you'll still need to define your retention requirements, having automated backups with configurable schedules means you're not losing data while you implement cleanup policies. One-click restore capability provides a safety net as you test retention jobs.
Check out our documentation on backup storage to see how automated S3 backups complement your data lifecycle management.
Summary
Data retention isn't glamorous, but it's essential for healthy self-hosted Supabase deployments:
- Use pg_cron for straightforward scheduled deletions on smaller tables
- Use pg_partman when table sizes or growth rates make DELETE operations impractical
- Don't forget storage—implement expiration for uploaded files separately
- Archive strategically—move compliance data rather than keeping it in active tables
- Monitor continuously—silent failures lead to backlogs that are hard to clear
The unlimited growth of self-hosted databases is a feature, not a bug. But like any feature, it requires intentional management to work in your favor.
