Vector Databases Explained: What RAG Actually Needs

Every RAG and AI search application depends on finding semantically similar content fast. This guide explains what a vector database actually does, and when you don't need a separate one at all.

An embedding is a numerical representation (a vector of numbers) of text, an image, or other data, positioned in a vector space so that semantically similar items end up close together. A vector database stores embeddings and finds the ones nearest to a query embedding. That nearest-neighbour lookup is the core operation behind retrieval-augmented generation (RAG), semantic search, and recommendation systems built on top of language models.
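To make "nearest to a query embedding" concrete, here is a minimal sketch of exact (brute-force) nearest-neighbour search using cosine similarity. The three-dimensional vectors and document ids are made up for illustration; real embeddings come from a model and typically have hundreds or thousands of dimensions.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def nearest(query, stored, top_k=2):
    """Exact search: score every stored vector, return the ids of the best top_k."""
    scored = [(cosine_similarity(query, vec), doc_id) for doc_id, vec in stored.items()]
    scored.sort(reverse=True)
    return [doc_id for _, doc_id in scored[:top_k]]

# Toy "embeddings", invented for this example.
stored = {
    "cat": [0.9, 0.1, 0.0],
    "kitten": [0.8, 0.2, 0.1],
    "invoice": [0.0, 0.1, 0.9],
}
query = [1.0, 0.0, 0.0]
print(nearest(query, stored))  # ['cat', 'kitten']
```

Scoring every stored vector like this is fine for small collections. The cost grows linearly with the number of vectors, which is the problem that approximate indexes in vector databases are meant to address.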

Do you need a dedicated vector database at all?

If you're already running Postgres and your scale is moderate (roughly up to a few million vectors for most workloads), pgvector adds vector similarity search directly as a Postgres extension — avoiding the operational cost of introducing an entirely separate database. Dedicated vector databases earn their place at higher scale or when you need specialised indexing algorithms that a general-purpose database extension doesn't implement.

Use pgvector if: You're already running Postgres (for example on a managed host such as Supabase or Neon), your scale is moderate, and you want to avoid managing a second database for embeddings alone.
Use a dedicated vector database if: You're at meaningful scale, need specialised ANN (approximate nearest neighbour) indexing for speed, or query performance has become a measurable bottleneck.
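For a sense of what the pgvector route looks like, the sketch below builds the SQL for a similarity query. The `documents` table, its columns, and the three-dimensional `vector(3)` type are hypothetical choices for this example, and the `%s` placeholders assume a psycopg-style Postgres driver. The `vector` type and the `<=>` cosine-distance operator are part of pgvector itself.

```python
def to_pgvector_literal(embedding):
    """Format a list of numbers in pgvector's text form, e.g. '[0.1,0.2,0.3]'."""
    return "[" + ",".join(str(float(x)) for x in embedding) + "]"

# Hypothetical schema: one table holding text plus its embedding.
SETUP_SQL = """
CREATE EXTENSION IF NOT EXISTS vector;
CREATE TABLE IF NOT EXISTS documents (
    id bigserial PRIMARY KEY,
    body text,
    embedding vector(3)
);
"""

# <=> is pgvector's cosine-distance operator; smaller means more similar.
SEARCH_SQL = """
SELECT id, body
FROM documents
ORDER BY embedding <=> %s::vector
LIMIT %s;
"""

def build_search(query_embedding, top_k=5):
    """Return (sql, params) ready to pass to a driver's execute() call."""
    return SEARCH_SQL, (to_pgvector_literal(query_embedding), top_k)

sql, params = build_search([0.1, 0.2, 0.3], top_k=5)
print(params)  # ('[0.1,0.2,0.3]', 5)
```

Because the embeddings live in an ordinary table, the same query can join against or filter on your existing relational data, which is much of the appeal of staying in Postgres.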

Managed vs open-source self-hosted

Pinecone: The most widely adopted managed option for production RAG. It trades flexibility for operational simplicity and a proven reliability track record at scale.
Weaviate and Qdrant: Both are open source with strong hybrid search (combining vector similarity with keyword/metadata filtering). Self-host them for free or use their managed cloud tiers.
Chroma: Explicitly designed to be the simplest path from prototype to a working AI app demo, which makes it ideal before you've committed to a production architecture.
Milvus: Built for billion-scale production deployments at large enterprises. It brings real operational complexity in exchange for capacity most teams will never need.

What "hybrid search" means and why it matters

Vector search on its own often misses exact-match cases, such as a specific product SKU, a name, or a date, that keyword search catches easily. Hybrid search combines vector similarity with traditional keyword matching and metadata filtering in a single query, which is why Weaviate and Qdrant both invested heavily in it rather than offering vector search alone.
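One common way to merge a keyword ranking with a vector ranking is reciprocal rank fusion (RRF). The sketch below is a generic illustration of that technique, not a description of how any particular database implements hybrid search. The document ids and both result lists are invented, and `k=60` is a conventional default rather than a tuned value.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked id lists; each list contributes 1 / (k + rank) per id."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties are broken by id for a stable order.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))

# Made-up results for a query like "SKU-4471 replacement filter".
vector_hits = ["filter-guide", "filter-faq", "pump-manual", "sku-4471"]  # semantic neighbours
keyword_hits = ["sku-4471"]                                              # exact-term match

print(reciprocal_rank_fusion([vector_hits, keyword_hits]))
# ['sku-4471', 'filter-guide', 'filter-faq', 'pump-manual']
```

The exact-match document ranked last by vector similarity, but appearing in both lists lifts it to the top of the fused result.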

A typical RAG retrieval call (conceptual)

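# Conceptual only: vector_db and embed stand in for your database client and embedding model.
results = vector_db.query(
    vector=embed(user_question),  # embed the question with the same model used for the documents
    top_k=5,                      # return the five nearest matches
    filter={"category": "docs"},  # hybrid: metadata filter alongside vector similarity
)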
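To see the same call shape run end to end, here is a small self-contained sketch. `InMemoryVectorDB` and the word-counting `embed` function are hypothetical stand-ins written for this example; a real system would use a database client and an embedding model instead.

```python
import math

def embed(text):
    """Toy stand-in for an embedding model: counts of a few hand-picked words."""
    vocabulary = ["install", "price", "refund"]
    words = text.lower().split()
    return [float(words.count(term)) for term in vocabulary]

def cosine(a, b):
    """Cosine similarity, returning 0.0 when either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norms = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norms if norms else 0.0

class InMemoryVectorDB:
    """Hypothetical minimal store mirroring the query(vector, top_k, filter) shape."""

    def __init__(self):
        self.rows = []  # each row: (vector, metadata dict, text)

    def add(self, text, metadata):
        self.rows.append((embed(text), metadata, text))

    def query(self, vector, top_k=5, filter=None):
        # Keep only rows whose metadata matches every key in the filter.
        candidates = [
            row for row in self.rows
            if all(row[1].get(key) == value for key, value in (filter or {}).items())
        ]
        # Rank the remaining rows by similarity to the query vector.
        candidates.sort(key=lambda row: cosine(vector, row[0]), reverse=True)
        return [text for _, _, text in candidates[:top_k]]

vector_db = InMemoryVectorDB()
vector_db.add("How to install the CLI", {"category": "docs"})
vector_db.add("Refund policy and price changes", {"category": "docs"})
vector_db.add("We cut the install time in half", {"category": "blog"})

user_question = "how do I install it"
results = vector_db.query(vector=embed(user_question), top_k=5, filter={"category": "docs"})
print(results)  # ['How to install the CLI', 'Refund policy and price changes']
```

The blog post also mentions "install", but the metadata filter removes it before ranking, so only `docs` entries come back.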

Next step

Use the RadarTrek Vector Databases screener to compare query performance, hybrid search support, and price/value before adding one to your stack.
