Retrieval

Language models (LMs) are powerful, but they inherently have limited context windows, imperfect factual recall, and no direct access to your private or frequently changing data.

Retrieval is the process of fetching relevant information from external knowledge sources so an LM can ground its answers—without retraining the model.

In a Retrieval-Augmented Generation (RAG) pipeline, your query may be transformed, used to retrieve candidate documents or chunks (for example, from vector, SQL, or graph stores), and then the most relevant results are passed to the LM. This improves factual accuracy, enables citations, and reduces hallucinations.
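The retrieval step of this pipeline can be sketched in plain Python. This is an illustrative toy, not this library's API: `score` and `retrieve` are hypothetical names, and the keyword-overlap scoring stands in for a real vector, SQL, or graph lookup.

```python
import re

def score(query: str, chunk: str) -> float:
    """Toy relevance score: fraction of query terms that appear in the chunk."""
    terms = set(re.findall(r"\w+", query.lower()))
    words = set(re.findall(r"\w+", chunk.lower()))
    return len(terms & words) / len(terms) if terms else 0.0

def retrieve(query: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k chunks most relevant to the query (the retrieval step)."""
    return sorted(chunks, key=lambda c: score(query, c), reverse=True)[:k]

chunks = [
    "Paris is the capital of France.",
    "The Eiffel Tower is in Paris.",
    "Mount Everest is the highest mountain on Earth.",
]
context = retrieve("What is the capital of France?", chunks, k=1)
# `context` would then be prepended to the LM prompt to ground its answer.
```

In a production pipeline, the scoring function is typically replaced by embedding similarity or a database query, but the shape of the step is the same: query in, top-k relevant chunks out.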

Our Retrieval components allow you to:

  1. Perform basic retrieval via specialized retrievers for SQL, Vector, and Graph databases.

  2. Tune the retrieval process by inferring parameters from the query.

  3. Enhance results by transforming queries (for example, rewriting or expanding them).

  4. Improve retrieved context quality by merging or deduplicating retrieved chunks.

  5. Upgrade the context relevance by reranking retrieved chunks.
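Steps 4 and 5 above can be illustrated with a small sketch. The names (`deduplicate`, `rerank`) and the term-overlap reranking are hypothetical stand-ins for this library's components, which would typically use embedding similarity or a cross-encoder instead.

```python
import re

def _tokens(text: str) -> set[str]:
    """Lowercased word set, ignoring punctuation."""
    return set(re.findall(r"\w+", text.lower()))

def deduplicate(chunks: list[str]) -> list[str]:
    """Drop near-duplicate chunks (step 4), keeping first occurrences in order."""
    seen, unique = set(), []
    for chunk in chunks:
        key = " ".join(sorted(_tokens(chunk)))
        if key not in seen:
            seen.add(key)
            unique.append(chunk)
    return unique

def rerank(query: str, chunks: list[str]) -> list[str]:
    """Reorder chunks by term overlap with the query (step 5)."""
    terms = _tokens(query)
    return sorted(chunks, key=lambda c: len(terms & _tokens(c)), reverse=True)

retrieved = [
    "RAG grounds answers in retrieved context.",
    "rag grounds answers in retrieved context.",  # near-duplicate, dropped
    "Vector stores index embeddings.",
]
ranked = rerank("retrieved context", deduplicate(retrieved))
```

Cleaning the candidate set this way before it reaches the LM keeps the prompt short and puts the most relevant chunks first, which is what steps 4 and 5 provide.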
