Semantic search is the user-facing search capability that returns results matching the meaning of a query, not its literal tokens. A shopper types “cozy aperitivo spot,” and the system returns the wine bar with live music near the lake, even though none of those words appear in the listing.

The semantic-search stack.

  1. Query understanding. The query is normalized, expanded with synonyms or rephrased by an LLM, and tagged with intent and entities.
  2. Vector retrieval. The rewritten query is embedded and used to find the top-k most semantically similar records.
  3. Hybrid fusion. Vector results are combined with keyword results (BM25 is still the default keyword scorer) using reciprocal rank fusion or a learned mixer.
  4. Reranking. A second-stage model (cross-encoder or small LLM) scores the fused candidate list against the original query for relevance.
  5. Filtering and grouping. Metadata filters apply (tenant, language, date), and near-duplicates are grouped or suppressed.
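
Steps 2 and 3 above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not a production implementation: the three-dimensional vectors stand in for real embeddings, the keyword ranking is a hand-written stand-in for BM25 output, and the function names (cosine, vector_top_k, rrf_fuse) are hypothetical. The constant k=60 is a commonly used default for reciprocal rank fusion, not a requirement.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def vector_top_k(query_vec, doc_vecs, k):
    """Return the ids of the k documents most similar to query_vec."""
    ranked = sorted(
        doc_vecs,
        key=lambda doc_id: cosine(query_vec, doc_vecs[doc_id]),
        reverse=True,
    )
    return ranked[:k]

def rrf_fuse(rankings, k=60):
    """Reciprocal rank fusion: each list adds 1 / (k + rank) to a document's score."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken by id for a stable order.
    return sorted(scores, key=lambda d: (-scores[d], d))

# Toy embeddings (assumed values, for illustration only).
doc_vecs = {
    "wine_bar": [0.9, 0.1, 0.3],
    "sports_pub": [0.2, 0.9, 0.1],
    "tea_house": [0.7, 0.2, 0.6],
}
query_vec = [1.0, 0.0, 0.4]

vector_hits = vector_top_k(query_vec, doc_vecs, k=2)
keyword_hits = ["sports_pub", "wine_bar"]  # stand-in for a BM25 ranking
fused = rrf_fuse([vector_hits, keyword_hits])
print(fused)
```

A document that appears in both lists accumulates score from each, so it tends to rise above documents that only one retriever found. Because the fusion uses ranks rather than raw scores, the vector and keyword scorers do not need to be calibrated against each other.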

Why semantic search beats keyword search.

Operators describe their experience in their own words; customers ask in theirs. Keyword search makes the customer carry the burden of guessing the catalog's vocabulary. Semantic search puts that burden on the system, which is where it belongs.

Where semantic search struggles.

Exact identifiers (SKUs, account numbers, ISBNs) and rare named entities are where semantic search underperforms keyword search; hybrid retrieval mitigates this. Sensitive domains (legal, medical) require careful query understanding and explicit constraint handling so the system does not guess at intent.
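
One way a hybrid system might protect identifier lookups is to detect identifier-shaped queries and shift weight toward the keyword side. The sketch below is illustrative only: the regular expressions, the function names, and the weight values are all assumptions, and a real catalog would define its own identifier formats.

```python
import re

# Hypothetical patterns; real systems would match their own identifier formats.
ISBN13 = re.compile(r"^97[89]\d{10}$")
# Hyphen-separated alphanumeric groups containing at least one digit, e.g. AB-1234-X.
SKU_LIKE = re.compile(r"^(?=.*\d)[A-Z0-9]+(?:-[A-Z0-9]+)+$")

def looks_like_identifier(query: str) -> bool:
    """Heuristic check: does the whole query look like an ISBN-13 or a SKU?"""
    token = query.strip().upper()
    compact = token.replace("-", "").replace(" ", "")
    return bool(ISBN13.match(compact) or SKU_LIKE.match(token))

def choose_weights(query: str):
    """Return (keyword_weight, vector_weight) for a weighted fusion step.

    The numbers are placeholders, not tuned values.
    """
    if looks_like_identifier(query):
        return (0.9, 0.1)
    return (0.5, 0.5)

for q in ["978-0-306-40615-7", "ab-1234-x", "cozy aperitivo spot"]:
    print(q, looks_like_identifier(q), choose_weights(q))
```

A heuristic like this will misfire on some natural-language queries (for example, "top-10" matches the SKU pattern), so it is better treated as a signal for weighting than as a hard switch between retrievers.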

Frequently asked.

What is semantic search?
Semantic search is a search experience that returns results matching the meaning of a query rather than the literal words. It is built on query understanding, embedding-based vector retrieval, hybrid keyword fusion, and reranking, so users can find relevant results even when their words and the catalog's words differ.
How is semantic search different from keyword search?
Keyword search matches strings; semantic search matches meanings. Production systems usually run both in parallel and fuse the results, because each excels where the other fails: semantic on intent and synonymy, keyword on exact identifiers and rare names.
Does semantic search need an LLM at query time?
Not strictly. The embedding step does not require a chat model. But many modern semantic-search systems use an LLM for query rewriting, reranking, or both, where the per-query cost is justified by the accuracy lift.
Where does semantic search go wrong?
On exact identifiers (SKUs, account numbers), on rare named entities the embedding model has not seen, and on queries with hard constraints the system treats as soft preferences. Hybrid retrieval, structured metadata, and explicit filters address most of these.
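
The difference between a hard constraint and a soft preference can be made concrete with a small sketch. Here the constraints are applied as an explicit metadata filter before ranking, so a high relevance score can never pull in a record that violates them. The Candidate class, the field names, and the function names are hypothetical, and the scores are made up for illustration.

```python
from dataclasses import dataclass, field

@dataclass
class Candidate:
    doc_id: str
    score: float  # relevance score from retrieval or reranking
    metadata: dict = field(default_factory=dict)

def apply_hard_filters(candidates, required):
    """Keep only candidates whose metadata matches every required key and value."""
    return [
        c for c in candidates
        if all(c.metadata.get(key) == value for key, value in required.items())
    ]

def search_with_constraints(candidates, required, limit):
    """Filter first, then rank by score, so constraints are never traded for relevance."""
    allowed = apply_hard_filters(candidates, required)
    return sorted(allowed, key=lambda c: c.score, reverse=True)[:limit]

candidates = [
    Candidate("a", 0.9, {"language": "en", "tenant": "t1"}),
    Candidate("b", 0.8, {"language": "it", "tenant": "t1"}),
    Candidate("c", 0.7, {"language": "it", "tenant": "t2"}),
    Candidate("d", 0.6, {"language": "it", "tenant": "t1"}),
]
required = {"language": "it", "tenant": "t1"}

results = search_with_constraints(candidates, required, limit=2)
print([c.doc_id for c in results])
```

Candidate "a" has the highest score but is excluded because it fails the language constraint. A system that treated the constraint as a soft preference would have ranked it first.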