pgvector at 10M+ rows · index choice, query patterns, real performance numbers
pgvector at 10M rows is not scary · if you pick the right index. HNSW vs IVFFlat, filter patterns, real numbers.
pgvector has a reputation for being 'toy-scale'. That reputation is outdated. We run production RAG on pgvector instances with 10M+ rows and p95 query latency under 80ms. The key is choosing the right index and writing filter-friendly queries. Here is how.
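To make the index choice concrete, here is a minimal sketch of the two options. It assumes a table named chunks with an embedding vector column queried by cosine distance (<=>), as in the query later in this article. The index names are hypothetical, and the parameter values are starting points taken from pgvector's documented defaults and sizing guidance, not tuned or measured settings.

```sql
-- Sketch only: assumes chunks(embedding vector(N)) and cosine distance (<=>).
-- In practice you would create one of these indexes, not both.

-- Option A: HNSW. Slower to build and uses more memory,
-- but usually gives a better speed/recall trade-off at query time.
CREATE INDEX chunks_embedding_hnsw_idx
    ON chunks USING hnsw (embedding vector_cosine_ops)
    WITH (m = 16, ef_construction = 64);  -- pgvector defaults, written out explicitly

-- Query-time knob: a larger ef_search raises recall and latency (default is 40).
SET hnsw.ef_search = 100;

-- Option B: IVFFlat. Faster to build and smaller, but recall depends on
-- the lists/probes settings, and the index should be built after the table has data.
CREATE INDEX chunks_embedding_ivfflat_idx
    ON chunks USING ivfflat (embedding vector_cosine_ops)
    WITH (lists = 3162);  -- about sqrt(10M rows), the suggested start above 1M rows

-- Query-time knob: more probes raises recall and latency (default is 1).
SET ivfflat.probes = 56;  -- about sqrt(lists)
```

The operator class has to match the distance operator used in the query: vector_cosine_ops pairs with <=>. An index built with a different operator class will not be used for that ORDER BY.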
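Most RAG queries filter by tenant_id, document_type, or recency before similarity. pgvector 0.5.0 added HNSW indexes, and 0.8.0 added iterative index scans, which keep scanning the index when a filter discards too many candidates. Naive queries still scan too much, though. Always apply the selective filter first.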
-- GOOD: tenant filter narrows first, vector search on small set
SELECT * FROM chunks
WHERE tenant_id = $1 AND created_at > now() - interval '30 days'
ORDER BY embedding <=> $2
LIMIT 10;
-- Index: btree on (tenant_id, created_at) + HNSW on embedding

If p95 creeps above 200ms, 95% of the time the index is not being used. Run EXPLAIN ANALYZE and confirm that the query hits the HNSW index rather than falling back to a sequential scan. The usual cause is a WHERE clause that prevents the planner from using the index.
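As a rough sketch of that check, the statements below create the two indexes named in the comment above and then inspect the plan. The index names, the tenant id 42, and the trick of borrowing a query vector from an existing row are all assumptions for illustration; in an application the vector would come from your embedding model as a bound parameter.

```sql
-- Hypothetical index names; the columns come from the query above.
CREATE INDEX IF NOT EXISTS chunks_tenant_created_idx
    ON chunks (tenant_id, created_at);

CREATE INDEX IF NOT EXISTS chunks_embedding_hnsw_idx
    ON chunks USING hnsw (embedding vector_cosine_ops);  -- matches the <=> operator

-- Inspect the actual plan and timing for a representative query.
-- tenant_id = 42 is a placeholder; the subquery only supplies a vector
-- of the right dimension so the statement runs without a hand-written literal.
EXPLAIN (ANALYZE, BUFFERS)
SELECT *
FROM chunks
WHERE tenant_id = 42
  AND created_at > now() - interval '30 days'
ORDER BY embedding <=> (SELECT embedding FROM chunks LIMIT 1)
LIMIT 10;
```

In the output, look for an Index Scan node that names one of the indexes above. A Seq Scan on chunks followed by a Sort is the pattern to worry about. Which index the planner picks depends on how selective the tenant and date filter is, so it is worth repeating the check with both a small and a large tenant.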

By
Founder, DField Solutions
I've shipped production products from fintech to creator-tooling · for startups and enterprises, from Budapest to San Francisco.