This post covers four critical pgvector configuration mistakes that silently degrade RAG pipeline performance in production.
- Without an HNSW index, every similarity query falls back to a sequential scan over the whole table, pushing query times from under 50 ms to multiple seconds at 500K+ vectors
- High-dimensional embeddings (e.g., 1,536-dim ada-002) cost ~6 KB per row at 4 bytes per dimension, ballooning to 6 GB+ at 1M documents; switching to a 384-dim model cuts vector storage by 75%
- The distance function (cosine vs. inner product vs. L2) directly changes retrieval results, so it must match how the embeddings were produced; inner product is preferred for normalized embeddings, where it ranks results the same way as cosine with less computation
- pgvector scales comfortably to ~5M vectors on a db.r6g.xlarge instance with HNSW, but degrades past 10M vectors under concurrent load
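
The storage and distance-function points above can be checked with a little arithmetic. The following is a minimal Python sketch, not the original article's code. It assumes pgvector's documented storage cost of 4 bytes per dimension plus 8 bytes per `vector` value, and it uses pgvector's distance operators (`<->`, `<#>`, `<=>`) and HNSW operator classes. The table name `items`, the column name `embedding`, and all helper function names are hypothetical.

```python
import math

# Assumption: pgvector documents a `vector` value as 4 bytes per dimension
# plus 8 bytes of overhead (row and index overhead are not counted here).
def vector_bytes(dims: int) -> int:
    return 4 * dims + 8

def storage_gb(dims: int, rows: int) -> float:
    """Raw vector storage in decimal gigabytes for `rows` embeddings."""
    return vector_bytes(dims) * rows / 1e9

# pgvector query operator and HNSW operator class for each metric.
PGVECTOR_METRICS = {
    "l2": ("<->", "vector_l2_ops"),
    "inner_product": ("<#>", "vector_ip_ops"),  # <#> returns the NEGATIVE inner product
    "cosine": ("<=>", "vector_cosine_ops"),
}

def hnsw_index_sql(table: str, column: str, metric: str) -> str:
    """Build a CREATE INDEX statement; identifiers are trusted constants here."""
    _, opclass = PGVECTOR_METRICS[metric]
    return f"CREATE INDEX ON {table} USING hnsw ({column} {opclass});"

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def normalize(v):
    n = math.sqrt(dot(v, v))
    return [x / n for x in v]

def neg_inner_product(a, b):
    return -dot(a, b)

def cosine_distance(a, b):
    return 1.0 - dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

def rank(query, docs, distance):
    """Indices of docs ordered from nearest to farthest under `distance`."""
    return sorted(range(len(docs)), key=lambda i: distance(query, docs[i]))

if __name__ == "__main__":
    print(f"1536-dim, 1M rows: {storage_gb(1536, 1_000_000):.2f} GB")
    print(f"384-dim, 1M rows:  {storage_gb(384, 1_000_000):.2f} GB")
    print(hnsw_index_sql("items", "embedding", "inner_product"))

    # A long vector at 45 degrees vs. a short, nearly aligned one.
    query = [1.0, 0.0]
    docs = [[10.0, 10.0], [1.0, 0.1]]
    print("raw, inner product:", rank(query, docs, neg_inner_product))
    print("raw, cosine:       ", rank(query, docs, cosine_distance))
    unit_docs = [normalize(d) for d in docs]
    print("normalized, inner: ", rank(query, unit_docs, neg_inner_product))
```

On unnormalized vectors the two metrics can disagree, because inner product rewards vector length; once the documents are normalized, inner product and cosine produce the same ordering. The index operator class and the query operator should use the same metric, otherwise the planner may not use the index for that query.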