If you're self-hosting Supabase, you already know the appeal: complete control, predictable costs, and no vendor lock-in. But what happens when your users need to search through thousands of records? Do you spin up Elasticsearch or Algolia, adding more infrastructure complexity to your stack?
You don't have to. PostgreSQL's native full text search is surprisingly capable, and it comes baked into your self-hosted Supabase instance. This guide walks you through implementing full text search from scratch, covering everything from basic queries to production-ready indexed searches.
Why Use PostgreSQL Full Text Search
Before diving into implementation, let's address the elephant in the room: when does Postgres full text search make sense versus dedicated search services?
Choose PostgreSQL full text search when:
- Your dataset is under 10 million rows
- You need real-time consistency between data and search results
- You want to minimize infrastructure complexity
- Your search requirements are straightforward (no ML-powered relevance ranking)
- You're optimizing for cost efficiency
Consider external search services when:
- You need typo tolerance and fuzzy matching at scale
- Your search requires ML-based ranking or personalization
- You're indexing unstructured data from multiple sources
- Sub-10ms response times are critical
For most self-hosted Supabase deployments, the built-in search handles 80% of use cases without adding operational burden.
Understanding Text Search Fundamentals
PostgreSQL full text search works by converting text into tsvector (a searchable document format) and queries into tsquery (a search pattern format). The @@ operator checks if a document matches a query.
Here's a minimal example:
SELECT 'The quick brown fox' @@ to_tsquery('quick & fox');
-- Returns: true
The magic happens in how Postgres processes text:
- Tokenization - Breaking text into individual words
- Normalization - Converting to lowercase, stemming (running → run)
- Stop word removal - Filtering common words (the, a, is)
- Lexeme creation - Building the searchable representation
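You can watch this pipeline in action by inspecting a tsvector directly. In the output below, the stop words are dropped, "foxes" and "jumping" are stemmed, and each lexeme keeps its original word position:

```sql
SELECT to_tsvector('english', 'The quick foxes were jumping');
-- 'fox':3 'jump':5 'quick':2
```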
Setting Up Full Text Search in Self-Hosted Supabase
Let's build a practical search system for a products table. This example works identically whether you're using Docker Compose or another deployment method.
Step 1: Create Your Table with a Search Column
CREATE TABLE products (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
name TEXT NOT NULL,
description TEXT,
category TEXT,
price DECIMAL(10, 2),
created_at TIMESTAMPTZ DEFAULT NOW(),
-- Generated column for search
fts TSVECTOR GENERATED ALWAYS AS (
setweight(to_tsvector('english', coalesce(name, '')), 'A') ||
setweight(to_tsvector('english', coalesce(description, '')), 'B') ||
setweight(to_tsvector('english', coalesce(category, '')), 'C')
) STORED
);
The fts column is a generated column that automatically updates whenever name, description, or category changes. The setweight function assigns importance levels (A highest, D lowest), so matches in the product name rank higher than description matches.
Step 2: Create a GIN Index
Without an index, Postgres scans every row. A GIN (Generalized Inverted Index) makes searches nearly instant:
CREATE INDEX idx_products_fts ON products USING GIN (fts);
This is critical for performance. On a 100,000 row table, unindexed search might take 200ms while indexed search takes under 5ms.
Step 3: Create a Search Function
For cleaner API access, wrap the search logic in a database function:
CREATE OR REPLACE FUNCTION search_products(search_query TEXT)
RETURNS TABLE (
id UUID,
name TEXT,
description TEXT,
category TEXT,
price DECIMAL,
rank REAL
)
LANGUAGE plpgsql
AS $$
BEGIN
RETURN QUERY
SELECT
p.id,
p.name,
p.description,
p.category,
p.price,
ts_rank(p.fts, websearch_to_tsquery('english', search_query)) AS rank
FROM products p
WHERE p.fts @@ websearch_to_tsquery('english', search_query)
ORDER BY rank DESC
LIMIT 50;
END;
$$;
The websearch_to_tsquery function parses queries like users actually type them:
- wireless headphones → match both terms
- "wireless headphones" → match exact phrase
- wireless OR bluetooth → match either term
- wireless -cheap → match wireless, exclude cheap
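You can inspect exactly how a user's input gets parsed by running websearch_to_tsquery on its own, which is handy when debugging unexpected matches:

```sql
-- OR becomes the | operator
SELECT websearch_to_tsquery('english', 'wireless OR bluetooth');
-- 'wireless' | 'bluetooth'

-- Quoted phrases become positional (<->) matches; - becomes negation
SELECT websearch_to_tsquery('english', '"wireless headphones" -cheap');
```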
Step 4: Call from Your Application
With Supabase's RPC support, calling this function is straightforward:
const { data, error } = await supabase
.rpc('search_products', { search_query: 'wireless headphones' });
Handling Multiple Languages
The default english configuration works for English text. For multilingual support, you have several options.
Option 1: Specify Language Per Column
fts TSVECTOR GENERATED ALWAYS AS (
to_tsvector('french', coalesce(name_fr, '')) ||
to_tsvector('german', coalesce(name_de, ''))
) STORED
Option 2: Use Simple Configuration for Language-Agnostic Search
to_tsvector('simple', coalesce(name, ''))
The simple configuration performs no stemming or stop word removal, making it work across languages at the cost of reduced accuracy.
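The trade-off is easy to see side by side: english stems and drops stop words, while simple only lowercases and tokenizes:

```sql
SELECT to_tsvector('english', 'the running shoes');
-- 'run':2 'shoe':3

SELECT to_tsvector('simple', 'the running shoes');
-- 'running':2 'shoes':3 'the':1
```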
Option 3: Install PGroonga for CJK Languages
For Chinese, Japanese, and Korean text, PostgreSQL's native search falls short. Self-hosted Supabase supports installing extensions:
CREATE EXTENSION IF NOT EXISTS pgroonga;
PGroonga is backed by the Groonga engine, whose tokenizers properly handle non-space-separated languages.
Combining Full Text Search with Filters
Real applications rarely use pure text search. Users filter by price range, category, or date. PostgreSQL handles this efficiently when you structure queries correctly:
CREATE OR REPLACE FUNCTION search_products_filtered(
search_query TEXT,
min_price DECIMAL DEFAULT 0,
max_price DECIMAL DEFAULT 999999,
category_filter TEXT DEFAULT NULL
)
RETURNS TABLE (
id UUID,
name TEXT,
description TEXT,
category TEXT,
price DECIMAL,
rank REAL
)
LANGUAGE plpgsql
AS $$
BEGIN
RETURN QUERY
SELECT
p.id,
p.name,
p.description,
p.category,
p.price,
ts_rank(p.fts, websearch_to_tsquery('english', search_query)) AS rank
FROM products p
WHERE
p.fts @@ websearch_to_tsquery('english', search_query)
AND p.price BETWEEN min_price AND max_price
AND (category_filter IS NULL OR p.category = category_filter)
ORDER BY rank DESC
LIMIT 50;
END;
$$;
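Calling the filtered function from supabase-js mirrors Step 4; parameters with defaults can simply be omitted from the payload. This is a sketch against the function defined above and assumes a configured supabase client:

```javascript
// Named keys map to the function's argument names
const { data, error } = await supabase
  .rpc('search_products_filtered', {
    search_query: 'wireless headphones',
    min_price: 20,
    max_price: 200,
    category_filter: 'Electronics',
  });
```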
For optimal performance with combined filters, create a partial index. Note that the planner only uses a partial index when the query's WHERE clause matches the index's condition:
CREATE INDEX idx_products_electronics_fts ON products USING GIN (fts) WHERE category = 'Electronics';
Implementing Search Suggestions
Autocomplete requires a different approach than full text search. For prefix and partial matching, use trigrams:
-- Enable the extension
CREATE EXTENSION IF NOT EXISTS pg_trgm;

-- Create a trigram index
CREATE INDEX idx_products_name_trgm ON products USING GIN (name gin_trgm_ops);

-- Query for suggestions
SELECT name
FROM products
WHERE name ILIKE '%wire%'
ORDER BY similarity(name, 'wire') DESC
LIMIT 10;
This handles partial matches and typos that standard full text search misses.
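Under the hood, pg_trgm scores candidates by shared three-character sequences. The similarity function returns a value between 0 and 1, and the % operator matches anything above the configurable similarity_threshold (0.3 by default), which is what gives you typo tolerance:

```sql
-- One-character typo still scores well above the default threshold
SELECT similarity('wireless', 'wireles');

-- The % operator applies pg_trgm.similarity_threshold automatically
SELECT 'wireless' % 'wireles';
```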
Integrating with Vector Search
For semantic search (finding conceptually similar content), combine full text search with pgvector. This hybrid approach delivers both keyword precision and semantic understanding:
-- Assuming you have an embedding column
SELECT
id,
name,
-- Combine text rank and vector similarity
(0.7 * ts_rank(fts, websearch_to_tsquery('english', $1))) +
(0.3 * (1 - (embedding <=> $2))) AS combined_score
FROM products
WHERE fts @@ websearch_to_tsquery('english', $1)
ORDER BY combined_score DESC
LIMIT 20;
This approach is particularly effective for product catalogs and content search where users might describe what they want rather than knowing exact terms.
Performance Optimization
As your data grows, monitor and optimize search performance.
Analyze Query Plans
EXPLAIN ANALYZE
SELECT * FROM products
WHERE fts @@ websearch_to_tsquery('english', 'wireless headphones');
Look for a "Bitmap Index Scan using idx_products_fts" (inside a Bitmap Heap Scan) in the output. If you see "Seq Scan", your index isn't being used.
Update Statistics
PostgreSQL uses table statistics for query planning. After bulk inserts or major changes:
ANALYZE products;
Monitor Index Size
GIN indexes can grow large. Check their size:
SELECT
pg_size_pretty(pg_relation_size('idx_products_fts')) as index_size,
pg_size_pretty(pg_table_size('products')) as table_size;
If the index exceeds 20% of table size, consider limiting indexed columns or using partial indexes.
For deeper performance insights, see our guide on PostgreSQL performance tuning.
Triggering Search Index Updates
When your search needs to include data from related tables, generated columns won't work, since they can only reference columns in the same row. Instead, declare fts as a plain TSVECTOR column and maintain it with a trigger:
CREATE OR REPLACE FUNCTION update_product_fts()
RETURNS TRIGGER AS $$
BEGIN
NEW.fts :=
setweight(to_tsvector('english', coalesce(NEW.name, '')), 'A') ||
setweight(to_tsvector('english', coalesce(NEW.description, '')), 'B') ||
setweight(to_tsvector('english', coalesce(
(SELECT string_agg(t.name, ' ')
FROM product_tags pt
JOIN tags t ON pt.tag_id = t.id
WHERE pt.product_id = NEW.id),
'')), 'C');
RETURN NEW;
END;
$$ LANGUAGE plpgsql;
CREATE TRIGGER trg_products_fts
BEFORE INSERT OR UPDATE ON products
FOR EACH ROW EXECUTE FUNCTION update_product_fts();
Common Pitfalls and Solutions
Empty Results for Valid Searches
The most common issue is language mismatch. If your data is in Spanish but you're using english configuration, stemming won't work correctly:
-- Check what tokens are being generated
SELECT to_tsvector('english', 'running');
-- Returns: 'run':1 (correct)
SELECT to_tsvector('english', 'corriendo');
-- Returns: 'corriendo':1 (not stemmed)
SELECT to_tsvector('spanish', 'corriendo');
-- Returns: 'corr':1 (correctly stemmed)
Slow Queries Despite Indexing
If queries are slow with a GIN index present, check if you're inadvertently bypassing it:
-- This bypasses the index (the expression no longer matches the indexed column)
WHERE coalesce(fts, '') @@ websearch_to_tsquery('english', 'wireless headphones')
-- This uses the index
WHERE fts @@ websearch_to_tsquery('english', 'wireless headphones')
Memory Issues on Large Indexes
GIN index maintenance happens in memory. For large imports, increase maintenance_work_mem temporarily:
SET maintenance_work_mem = '1GB';
-- Perform bulk operations here
RESET maintenance_work_mem;
Managing Search with Supascale
If you're using Supascale to manage your self-hosted Supabase deployment, the SQL editor provides a straightforward way to create and test these search functions. The one-time purchase model means you're not paying per-query for search like you would with external search services.
For production deployments, consider:
- Using automated backups to protect your search indexes
- Setting up monitoring to track query performance
- Configuring your reverse proxy to handle search endpoint traffic
Conclusion
PostgreSQL's full text search is a pragmatic choice for self-hosted Supabase deployments. It eliminates external dependencies, keeps search results perfectly synchronized with your data, and handles most use cases without specialized infrastructure.
Start with the generated column approach for simple implementations. As requirements grow, layer in trigram search for autocomplete and vector search for semantic queries. The composability of PostgreSQL means you're not locked into any single approach.
For teams evaluating search solutions, try the built-in approach first. You can always migrate to a dedicated service later if you outgrow it—but many teams find that PostgreSQL's capabilities exceed their expectations.
