Vector index · Morvion Glossary

A vector index is the data structure that makes semantic search fast at production scale. Comparing a query embedding against every stored vector one by one is exact and works at small scale, but the cost grows linearly with the record count and becomes the latency bottleneck somewhere past tens of thousands of records. A vector index uses approximate-nearest-neighbour algorithms to return the top-K most similar items, typically in single-digit milliseconds across millions of records, trading a small amount of recall for that speed.
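To make the baseline concrete, here is a minimal sketch of the one-by-one comparison that a vector index replaces. The record ids, the three-dimensional vectors, and the function names are illustrative assumptions; real embeddings have hundreds or thousands of dimensions.

```python
import heapq
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def brute_force_top_k(query, records, k):
    """Score every record against the query and keep the k best.

    `records` maps a record id to its embedding. Every query touches
    every record, so the cost grows linearly with the record count.
    """
    scored = ((cosine_similarity(query, vec), rid) for rid, vec in records.items())
    return [rid for _, rid in heapq.nlargest(k, scored)]


# Hypothetical toy data, for illustration only.
records = {
    "doc-a": [1.0, 0.0, 0.0],
    "doc-b": [0.9, 0.1, 0.0],
    "doc-c": [0.0, 1.0, 0.0],
    "doc-d": [0.0, 0.0, 1.0],
}
print(brute_force_top_k([1.0, 0.05, 0.0], records, k=2))  # ['doc-a', 'doc-b']
```

The result is exact, which is why a flat scan is a useful ground truth when measuring the recall of an approximate index.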

The common algorithms.

HNSW (Hierarchical Navigable Small World). The production default. A layered proximity graph with excellent recall and fast queries; the stored graph links make its memory footprint larger than IVF's. Implemented in FAISS, Weaviate, pgvector, and Qdrant.
IVF (Inverted File). Clusters embeddings into cells; queries scan only the cells nearest the query. Faster to build than HNSW, with lower recall unless more cells are probed. Useful at very large scale.
Flat / brute force. Scan every vector. Exact results (100% recall) and acceptable latency at small scale (under ~50k vectors), with no index-build step.
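The IVF idea can be sketched in a few lines. This is a toy illustration, not how FAISS or pgvector implement it: the class name `TinyIVF`, the `nprobe` parameter name, and the hand-picked centroids are assumptions (a real index learns its centroids, usually with k-means).

```python
import heapq


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


class TinyIVF:
    """Toy inverted-file index: vectors are bucketed by nearest centroid."""

    def __init__(self, centroids):
        # Assumption: centroids are supplied by the caller.
        self.centroids = centroids
        self.cells = [[] for _ in centroids]

    def _nearest_cells(self, vector, count):
        order = sorted(
            range(len(self.centroids)),
            key=lambda i: squared_distance(vector, self.centroids[i]),
        )
        return order[:count]

    def add(self, record_id, vector):
        cell = self._nearest_cells(vector, 1)[0]
        self.cells[cell].append((record_id, vector))

    def search(self, query, k, nprobe=1):
        """Scan only the `nprobe` cells closest to the query."""
        candidates = []
        for cell in self._nearest_cells(query, nprobe):
            candidates.extend(self.cells[cell])
        best = heapq.nsmallest(
            k, candidates, key=lambda item: squared_distance(query, item[1])
        )
        return [rid for rid, _ in best]


index = TinyIVF(centroids=[[0.0, 0.0], [10.0, 0.0]])
for rid, vec in [("a", [4.0, 0.0]), ("b", [6.0, 0.0]), ("c", [1.0, 0.0]), ("d", [9.0, 0.0])]:
    index.add(rid, vec)

query = [4.9, 0.0]
print(index.search(query, k=2, nprobe=1))  # ['a', 'c']: misses 'b', which sits in the other cell
print(index.search(query, k=2, nprobe=2))  # ['a', 'b']: probing both cells recovers the exact answer
```

The query sits near a cell boundary, so probing a single cell returns a worse neighbour than the flat scan would. Probing more cells restores recall at the cost of scanning more vectors, which is the trade-off IVF tuning revolves around.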

Where to host it.

Postgres + pgvector is the right default for most teams: the vector index lives in the same database as the rest of the application data, transactions are atomic, and operational complexity is minimal. Dedicated vector stores (Pinecone, Weaviate, Qdrant) earn their cost above a few million vectors, or when retrieval is the bottleneck and the team needs specialised index tuning.
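As a rough sketch of what the Postgres + pgvector default looks like, the helpers below build the SQL for a table, an HNSW index, and a top-K query. The table name `documents`, its columns, the 1536-dimension size, and the helper names are hypothetical; the statements use pgvector's `vector` type, `hnsw` index method, `vector_cosine_ops` operator class, and `<=>` cosine-distance operator. Nothing here connects to a database.

```python
def schema_statements(table="documents", dimensions=1536, m=16, ef_construction=64):
    """SQL to create a table with an embedding column and an HNSW index.

    Identifiers are interpolated directly, so pass trusted constants only.
    """
    return [
        "CREATE EXTENSION IF NOT EXISTS vector",
        f"CREATE TABLE IF NOT EXISTS {table} ("
        f"id bigserial PRIMARY KEY, body text NOT NULL, "
        f"embedding vector({dimensions}) NOT NULL)",
        f"CREATE INDEX IF NOT EXISTS {table}_embedding_hnsw ON {table} "
        f"USING hnsw (embedding vector_cosine_ops) "
        f"WITH (m = {m}, ef_construction = {ef_construction})",
    ]


def top_k_query(table="documents"):
    """Parameterised query: nearest rows by cosine distance."""
    return f"SELECT id, body FROM {table} ORDER BY embedding <=> %s::vector LIMIT %s"


def ef_search_statement(value):
    """Session-level query-time knob for pgvector's HNSW index."""
    return f"SET hnsw.ef_search = {int(value)}"


def to_pgvector_literal(vector):
    """Format a Python list as pgvector's text form, e.g. '[0.5,1.0]'."""
    return "[" + ",".join(str(float(x)) for x in vector) + "]"


for statement in schema_statements(dimensions=3):
    print(statement)
print(ef_search_statement(100))
print(top_k_query(), (to_pgvector_literal([0.5, 1.0, 0.0]), 5))
```

With a driver such as psycopg, the query and its parameters would be passed to `cursor.execute(...)`. Because the embeddings live in an ordinary table, inserts and updates to them commit or roll back together with the rest of the application data.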

Hybrid retrieval.

A vector index alone misses exact-match queries — names, codes, and rare terms get poor recall in pure semantic search. The production answer is hybrid retrieval: combine vector index results with BM25 keyword scores via reciprocal-rank fusion. The eval harness measures which mix wins for the specific query distribution and the team tunes accordingly. See RAG in production for the full architecture decisions.
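Reciprocal-rank fusion itself is small enough to sketch in full. Each document earns 1 / (k + rank) from every ranked list it appears in, and the sums decide the final order. The constant `k = 60` is a commonly used default rather than something this page specifies, and the document ids are made up for illustration.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document ids into one ranking.

    `rankings` is a list of lists, each ordered best-first. Only ranks
    are used, so vector distances and BM25 scores never need to be put
    on a common scale.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by id for a stable order.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


vector_hits = ["doc-2", "doc-7", "doc-5"]  # hypothetical vector index results
bm25_hits = ["doc-9", "doc-2", "doc-7"]    # hypothetical keyword results

print(reciprocal_rank_fusion([vector_hits, bm25_hits]))
# ['doc-2', 'doc-7', 'doc-9', 'doc-5']
```

Documents that both retrievers agree on rise to the top, while an exact-match hit that only the keyword side found (`doc-9` here) still makes the fused list.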

Frequently asked.

What is a vector index?
A vector index is the data structure that organises embedding vectors for approximate-nearest-neighbour lookups, returning the top-K most similar items in milliseconds across millions of records. It's the production infrastructure that makes semantic search and RAG fast.

Do I need a dedicated vector database?
Probably not at first. Postgres + pgvector is the right default for most teams: the vector index lives in the same database as application data, transactions are atomic, and operational complexity is minimal. Dedicated stores (Pinecone, Weaviate, Qdrant) earn their cost above a few million vectors or when retrieval is the bottleneck.

What's the difference between a vector index and a vector store?
A vector index is the data structure (HNSW, IVF, flat). A vector store is the managed service or database that hosts it. Postgres + pgvector is a vector store that offers HNSW and IVFFlat indexes. Pinecone is a vector store that hides its index implementation behind an API. For most teams, the choice between hosting models matters more than the choice of algorithm.

Should I tune the vector index for higher recall or lower latency?
Measure both against your actual query distribution. HNSW exposes two main recall knobs: efConstruction (the candidate-list size used while building the graph, which sets index quality) and efSearch (the candidate-list size used per query). Lower efSearch is faster but misses more relevant chunks; higher efSearch is slower but catches more. Use the eval harness to pick the point that fits your retrieval-quality and latency budget.
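One way an eval harness might make that choice is sketched below: compute recall@K against exact (flat-scan) results, then pick the fastest efSearch setting that clears both budgets. The function names, the dictionary keys, and every number in `measurements` are hypothetical placeholders, not benchmark results.

```python
def recall_at_k(retrieved, relevant, k):
    """Fraction of the relevant ids that appear in the top-k retrieved ids."""
    relevant = set(relevant)
    if not relevant:
        return 0.0
    hits = len(set(retrieved[:k]) & relevant)
    return hits / len(relevant)


def pick_ef_search(measurements, min_recall, max_p95_ms):
    """Return the fastest setting that meets both budgets, or None."""
    eligible = [
        m for m in measurements
        if m["recall"] >= min_recall and m["p95_ms"] <= max_p95_ms
    ]
    if not eligible:
        return None
    return min(eligible, key=lambda m: m["p95_ms"])["ef_search"]


# Placeholder numbers for illustration only; measure your own workload.
measurements = [
    {"ef_search": 20, "recall": 0.88, "p95_ms": 3.0},
    {"ef_search": 40, "recall": 0.95, "p95_ms": 5.0},
    {"ef_search": 100, "recall": 0.98, "p95_ms": 11.0},
]

print(recall_at_k(["a", "b", "x", "c"], ["a", "b", "c"], k=3))
print(pick_ef_search(measurements, min_recall=0.94, max_p95_ms=10.0))  # 40
```

If no setting satisfies both constraints, the function returns `None`, which signals that the budget itself needs revisiting, or that the index or hosting choice does.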

