A vector database stores text, images, or other data as high-dimensional embeddings and retrieves the most similar items by mathematical distance. It is the substrate underneath retrieval-augmented generation, semantic search, and recommendation systems — the part of an AI stack that lets the model find relevant context without keyword matching.

What a vector database stores.

Each row holds an embedding (typically a 384-, 768-, or 1536-dimensional float vector), the original text, and a handful of metadata fields. Queries are embedded with the same model, and the database returns the top-k rows closest to the query in vector space, usually measured by cosine similarity or dot product.
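
To make the row-and-query description concrete, here is a minimal sketch of top-k retrieval by cosine similarity in plain Python. The three-dimensional vectors, the `source` metadata field, and the function names are illustrative assumptions; a real system would get its embeddings from an embedding model and would let the database do the ranking.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query, rows, k=2):
    """Return the k rows most similar to the query, best match first."""
    scored = [(cosine_similarity(query, row["embedding"]), row) for row in rows]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [row for _, row in scored[:k]]

# Toy 3-dimensional embeddings (assumed values, not model output).
rows = [
    {"text": "cancellation policy", "embedding": [0.9, 0.1, 0.0], "source": "help-center"},
    {"text": "shipping rates", "embedding": [0.0, 0.2, 0.9], "source": "help-center"},
    {"text": "refund timeline", "embedding": [0.7, 0.6, 0.1], "source": "billing"},
]

# Stands in for the embedding of the query "how do I cancel".
query_embedding = [1.0, 0.2, 0.0]

results = top_k(query_embedding, rows, k=2)
print([row["text"] for row in results])
```

With these assumed vectors, the query lands closest to "cancellation policy", then "refund timeline". The brute-force scan is only for illustration; it compares the query against every row.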

When to use a vector database.

  • Retrieval for RAG. Grounding an LLM in your own documents, policies, transcripts, or product specs.
  • Semantic search. Search interfaces where keyword match alone misses obvious answers ("how do I cancel" should match "cancellation policy").
  • De-duplication and clustering. Finding near-duplicate records, similar leads, similar tickets.
  • Recommendation. "Items similar to this one" without building an explicit taxonomy.
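
As one illustration of the de-duplication case, the sketch below flags pairs of records whose embeddings are nearly parallel. The record IDs, the vectors, and the 0.95 threshold are assumptions chosen for the example; a sensible threshold depends on the embedding model and the data.

```python
import math
from itertools import combinations

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def near_duplicates(records, threshold=0.95):
    """Return ID pairs whose embeddings meet or exceed the similarity threshold."""
    pairs = []
    for (id_a, vec_a), (id_b, vec_b) in combinations(records.items(), 2):
        if cosine_similarity(vec_a, vec_b) >= threshold:
            pairs.append((id_a, id_b))
    return pairs

# Hypothetical ticket embeddings; the first two are almost identical.
records = {
    "ticket-1": [0.80, 0.60, 0.00],
    "ticket-2": [0.79, 0.61, 0.02],
    "ticket-3": [0.00, 0.10, 0.99],
}

duplicate_pairs = near_duplicates(records)
print(duplicate_pairs)
```

The all-pairs loop grows quadratically with the number of records, so it only suits small sets. At scale, the same question is usually answered by running a similarity query per record against an index.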

What a vector database is not.

It is not a replacement for a relational database. It is not where you put your customer records, transaction history, or source of truth. Vector databases live alongside Postgres, MySQL, or whatever the system of record already is, and they store embeddings derived from that data.

“Postgres is the source of truth. The vector store is the index for similarity.”

Morvion's defaults.

For most engagements, Morvion uses pgvector, the Postgres extension that adds vector columns and similarity operators. It keeps everything in one database, simplifies backup and access control, and scales comfortably to millions of embeddings. Dedicated vector services (Pinecone, Qdrant, Weaviate) earn their complexity at the tens-of-millions-of-vectors threshold, not before.
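
For a sense of what "vector columns and similarity operators" look like in practice, here is a hedged sketch of pgvector SQL held in Python strings. The table name `chunks`, its columns, and the three-dimensional column size are hypothetical, and the `%s` placeholders assume a psycopg-style driver. This is not a description of Morvion's actual schema. Nothing here connects to a database; it only prepares the statements and parameters.

```python
def to_pgvector_literal(vector):
    """Format a Python sequence as a pgvector text literal such as '[0.1,0.2,0.3]'."""
    return "[" + ",".join(str(float(x)) for x in vector) + "]"

# Hypothetical schema: one row per chunk, with the embedding next to its text.
CREATE_SQL = """
CREATE EXTENSION IF NOT EXISTS vector;
CREATE TABLE IF NOT EXISTS chunks (
    id bigserial PRIMARY KEY,
    content text NOT NULL,
    source text,
    embedding vector(3)  -- set this to the embedding model's dimension
);
"""

# <=> is pgvector's cosine-distance operator; smaller distance means more similar.
SEARCH_SQL = """
SELECT id, content, source
FROM chunks
ORDER BY embedding <=> %s::vector
LIMIT %s;
"""

# Parameters a driver would bind to SEARCH_SQL: the query embedding and k.
params = (to_pgvector_literal([1, 0.2, 0]), 5)
print(params)
```

pgvector also provides `<->` for Euclidean distance and `<#>` for negative inner product, so the choice of operator should match how the embedding model was trained to be compared.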

Frequently asked.

What is a vector database?
A vector database stores text and other data as high-dimensional embeddings and retrieves the most similar items by mathematical distance. It is the substrate underneath RAG, semantic search, and recommendation systems.
Do I need a vector database to use AI?
No. You only need one when your AI workflow has to retrieve from a large corpus of your own data. Direct prompts to an LLM, function calling, and well-scoped agentic workflows can all work without retrieval.
What vector database does Morvion use by default?
pgvector — the Postgres extension that adds vector columns and similarity operators. It keeps everything in one database, simplifies backup and access control, and scales to millions of embeddings. Dedicated services (Pinecone, Qdrant, Weaviate) earn their complexity at the tens-of-millions-of-vectors threshold.
How does a vector database fit into a RAG pipeline?
It is the retrieval store. Documents are chunked, embedded, and inserted into the vector database. At query time, the user's question is embedded and the top-k most similar chunks are retrieved and passed to the LLM as context. Reranking, metadata filtering, and hybrid search with a keyword index sit on top of this basic flow.
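
The basic flow described above can be sketched end to end in a few lines of Python. Everything here is a stand-in: `chunk` splits on a fixed word count, `embed` is a toy word-count embedding over a small vocabulary (it only matches shared words, unlike a real embedding model), the store is an in-memory list rather than a database, and the final step only assembles a prompt instead of calling an LLM.

```python
import math

def chunk(text, max_words=8):
    """Split text into consecutive chunks of at most max_words words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]

def tokenize(text):
    """Lowercase words with surrounding punctuation stripped."""
    return [word.strip(".,?!").lower() for word in text.split()]

def embed(text, vocabulary):
    """Toy stand-in for an embedding model: word counts over a fixed vocabulary."""
    tokens = tokenize(text)
    return [float(tokens.count(term)) for term in vocabulary]

def cosine_similarity(a, b):
    """Cosine similarity, returning 0.0 when either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

documents = [
    "Customers can cancel a subscription from the billing page at any time.",
    "Standard shipping takes three to five business days within the country.",
]

# Steps 1-3: chunk, embed, and insert (an in-memory list stands in for the database).
chunks = [piece for doc in documents for piece in chunk(doc)]
vocabulary = sorted({token for piece in chunks for token in tokenize(piece)})
store = [{"text": piece, "embedding": embed(piece, vocabulary)} for piece in chunks]

# Step 4: embed the question and retrieve the top-k most similar chunks (k=1 here).
question = "How do I cancel my subscription?"
question_embedding = embed(question, vocabulary)
ranked = sorted(
    store,
    key=lambda row: cosine_similarity(question_embedding, row["embedding"]),
    reverse=True,
)
context = [row["text"] for row in ranked[:1]]

# Step 5: pass the retrieved chunks to the LLM as context (prompt assembly only).
prompt = "Answer using this context:\n" + "\n".join(context) + "\n\nQuestion: " + question
print(prompt)
```

Reranking, metadata filtering, and hybrid keyword search would slot in between steps 4 and 5, narrowing or reordering `ranked` before the prompt is built.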