Reranker

gllm-retrieval | Involves EM | Tutorial: Reranker | API Reference

What's a Reranker?

A reranker is a component that reorders retrieved chunks based on their relevance to a query. After initial retrieval returns a set of candidate chunks, the reranker scores and sorts them to ensure the most relevant content appears first. This improves the quality of context provided to language models in RAG pipelines.

Rerankers are particularly useful when:

- Initial retrieval returns many candidates that need prioritization
- You want to combine results from multiple retrieval sources
- The retrieval method does not perfectly capture semantic relevance
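As a rough illustration of the multi-source case, the sketch below merges candidates from two retrieval sources, drops duplicates by id, and sorts the result by a relevance score. `Candidate`, `merge_candidates`, `rerank`, and `keyword_overlap` are hypothetical names for this example and are not part of gllm-retrieval; the word-overlap scorer is only a stand-in for a real relevance model.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """Stand-in for a retrieved chunk (hypothetical, not the SDK's Chunk)."""
    id: str
    content: str


def merge_candidates(*sources):
    """Merge candidate lists, keeping the first occurrence of each id."""
    seen = {}
    for source in sources:
        for candidate in source:
            seen.setdefault(candidate.id, candidate)
    return list(seen.values())


def keyword_overlap(query):
    """Build a toy scorer: number of query words found in the content."""
    query_words = set(query.lower().split())

    def score(candidate):
        return len(query_words & set(candidate.content.lower().split()))

    return score


def rerank(candidates, score):
    """Sort candidates so the highest-scoring ones come first."""
    return sorted(candidates, key=score, reverse=True)


vector_hits = [
    Candidate("1", "Python is a programming language"),
    Candidate("2", "Machine learning uses algorithms to learn from data"),
]
keyword_hits = [
    Candidate("2", "Machine learning uses algorithms to learn from data"),
    Candidate("3", "Deep learning is a subset of machine learning"),
]

merged = merge_candidates(vector_hits, keyword_hits)
ranked = rerank(merged, keyword_overlap("machine learning"))
ranked_ids = [candidate.id for candidate in ranked]
print(ranked_ids)
```

Deduplicating before scoring avoids paying for the same chunk twice, which matters when the scorer is a model call rather than a word count.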

Prerequisites

Complete all setup steps listed on the Prerequisites page before running this example.

Installation

pip install --extra-index-url https://oauth2accesstoken:$(gcloud auth print-access-token)@glsdk.gdplabs.id/gen-ai-internal/simple/ "gllm-retrieval"

Quickstart

Let's start with a basic example using SimilarityBasedReranker, which uses embedding similarity to rerank chunks:

import asyncio
from gllm_core.schema import Chunk
from gllm_inference.em_invoker import OpenAIEMInvoker
from gllm_inference.model import OpenAIEM
from gllm_retrieval.reranker import SimilarityBasedReranker

em_invoker = OpenAIEMInvoker(OpenAIEM.TEXT_EMBEDDING_3_SMALL)

# Create the reranker
reranker = SimilarityBasedReranker(embeddings=em_invoker)

# Sample chunks to rerank
chunks = [
    Chunk(id="1", content="Python is a programming language"),
    Chunk(id="2", content="Machine learning uses algorithms to learn from data"),
    Chunk(id="3", content="Deep learning is a subset of machine learning"),
]

# Rerank based on query relevance
query = "What is machine learning?"
reranked = asyncio.run(reranker.rerank(chunks, query))

for i, chunk in enumerate(reranked, 1):
    print(f"{i}. {chunk.content}")

Expected Output

The chunks are reordered with the most relevant content first:

1. Machine learning uses algorithms to learn from data
2. Deep learning is a subset of machine learning
3. Python is a programming language

Available Rerankers

The SDK provides multiple reranker implementations for different use cases:

| Reranker | Description | Best For |
| --- | --- | --- |
| SimilarityBasedReranker | Uses embedding similarity scores | General-purpose semantic reranking |
| TEIReranker | Uses Text Embedding Inference endpoint | High-performance, self-hosted deployments |
| FlagEmbeddingReranker | Uses FlagEmbedding models | Multilingual and specialized domains |
| CohereBedrockReranker | Uses AWS Bedrock Cohere service | Cloud-based, managed reranking |

Similarity-Based Reranking

The SimilarityBasedReranker calculates embedding similarity between the query and each chunk, then sorts by score:

from gllm_inference.em_invoker import OpenAIEMInvoker
from gllm_inference.model import OpenAIEM
from gllm_retrieval.reranker import SimilarityBasedReranker

em_invoker = OpenAIEMInvoker(OpenAIEM.TEXT_EMBEDDING_3_SMALL)
reranker = SimilarityBasedReranker(embeddings=em_invoker)

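# `await` must run inside an async function (or a notebook cell);
# chunks and query are defined as in the Quickstart
reranked = await reranker.rerank(chunks, query)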

Custom Similarity Functions: You can provide a custom similarity function that takes two embedding vectors and returns a float score. Higher scores indicate greater similarity.
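A minimal sketch of two functions with that shape (two embedding vectors in, one float out, higher meaning more similar) follows. The function names are illustrative, and the keyword argument used to pass such a function to SimilarityBasedReranker is not shown on this page, so check the API Reference before wiring one in.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 1.0 means same direction."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # A zero vector has no direction; treat it as unrelated
        return 0.0
    return dot / (norm_a * norm_b)


def negative_euclidean(a, b):
    """Negated distance, so that closer vectors get higher scores."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return -math.dist(a, b)
```

Distance-based measures need the sign flip shown in `negative_euclidean`, because a smaller distance means greater similarity while the reranker expects higher scores for better matches.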

TEI Reranking

The TEIReranker uses a reranker model hosted on Text Embedding Inference (TEI):

from gllm_retrieval.reranker import TEIReranker

reranker = TEIReranker(
    url="https://your-tei-endpoint.com/rerank",
    timeout=10,
    fallback_to_original=True,
)

reranked = await reranker.rerank(chunks, query)
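For orientation, a TEI rerank endpoint accepts a JSON body with a `query` and a list of `texts`, and answers with a list of `index`/`score` pairs. The sketch below shows how such a response could be mapped back onto the original items. `build_rerank_payload` and `apply_rerank_response` are hypothetical helpers, and this is not necessarily how TEIReranker handles the exchange internally.

```python
def build_rerank_payload(query, texts):
    """Build the JSON body for a POST to a TEI /rerank endpoint."""
    return {"query": query, "texts": list(texts)}


def apply_rerank_response(items, response):
    """Reorder items using [{"index": int, "score": float}, ...] entries."""
    ranked = sorted(response, key=lambda entry: entry["score"], reverse=True)
    return [items[entry["index"]] for entry in ranked]


texts = ["Python is a programming language", "Deep learning is a subset of machine learning"]
payload = build_rerank_payload("What is machine learning?", texts)

# Example response shape only; these scores are made up for illustration
sample_response = [{"index": 0, "score": 0.02}, {"index": 1, "score": 0.91}]
reordered = apply_rerank_response(texts, sample_response)
```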

Authentication: TEIReranker supports both basic auth (username/password) and bearer token (api_key) authentication methods.
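The two methods differ only in the `Authorization` header sent with each request. The sketch below builds that header using standard HTTP conventions. `build_auth_headers` is a hypothetical helper, and preferring the API key when both kinds of credentials are given is a choice made for this example, not documented TEIReranker behavior.

```python
import base64


def build_auth_headers(username=None, password=None, api_key=None):
    """Return an Authorization header for bearer or basic auth."""
    if api_key is not None:
        # Bearer token: the key is sent as-is
        return {"Authorization": f"Bearer {api_key}"}
    if username is not None and password is not None:
        # Basic auth: base64 of "username:password"
        raw = f"{username}:{password}".encode("utf-8")
        token = base64.b64encode(raw).decode("ascii")
        return {"Authorization": f"Basic {token}"}
    # No credentials supplied: send no Authorization header
    return {}
```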

FlagEmbedding Reranking

The FlagEmbeddingReranker uses FlagEmbedding models for reranking:

from gllm_retrieval.reranker import FlagEmbeddingReranker

reranker = FlagEmbeddingReranker(
    model_path="BAAI/bge-reranker-base",
    use_fp16=True,
)

reranked = await reranker.rerank(chunks, query)

Installation: FlagEmbeddingReranker requires the FlagEmbedding package. Install with:

pip install "gllm-retrieval[flag_embedding]"

Cohere Bedrock Reranking

The CohereBedrockReranker uses Cohere's reranker models hosted on AWS Bedrock:

from gllm_retrieval.reranker import CohereBedrockReranker

reranker = CohereBedrockReranker(
    model_name="cohere.rerank-v3-5:0",
    region_name="us-east-1",
    fallback_to_original=True,
)

reranked = await reranker.rerank(chunks, query)

Installation: CohereBedrockReranker requires the cohere package. Install with:

pip install "gllm-retrieval[cohere]"

AWS Credentials: Ensure your AWS credentials have Bedrock permissions. See AWS Bedrock documentation for supported models.
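Both the TEI and Cohere Bedrock examples pass `fallback_to_original=True`. This page does not define the flag, so the sketch below only illustrates the behavior its name suggests: if the remote reranking call fails, return the chunks in their original order instead of raising. `rerank_with_fallback` and `failing_rerank` are hypothetical, and the API Reference should be consulted for the actual semantics.

```python
import asyncio


async def rerank_with_fallback(rerank, chunks, query, fallback_to_original=True):
    """Call an async rerank function; on failure, optionally keep the input order."""
    try:
        return await rerank(chunks, query)
    except Exception:
        if fallback_to_original:
            # Degrade gracefully: retrieval order is better than no result
            return list(chunks)
        raise


async def failing_rerank(chunks, query):
    """Simulates an unreachable reranking service."""
    raise ConnectionError("reranking service unavailable")


result = asyncio.run(rerank_with_fallback(failing_rerank, ["a", "b"], "q"))
print(result)
```

Falling back keeps a pipeline responsive when the service is down, at the cost of silently weaker ordering, so logging the failure is usually worthwhile.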

Using Rerankers in Pipelines

A reranker can run as a pipeline step directly after a retriever, so the retrieved chunks are reordered before later steps use them:

from gllm_pipeline import Pipeline, step
from gllm_retrieval.reranker import SimilarityBasedReranker
from gllm_retrieval.vector_data_store import ChromaVectorDataStore
from gllm_retrieval.vector_retriever import BasicVectorRetriever
from gllm_inference.em_invoker import OpenAIEMInvoker
from gllm_inference.model import OpenAIEM

em_invoker = OpenAIEMInvoker(OpenAIEM.TEXT_EMBEDDING_3_SMALL)

# Initialize Chroma data store in-memory
vector_store = ChromaVectorDataStore(
    collection_name="documents",
    embedding=em_invoker
)

# Initialize the vector retriever
retriever = BasicVectorRetriever(vector_store)

# Create the reranker used by the second step
reranker = SimilarityBasedReranker(embeddings=em_invoker)

pipeline = Pipeline(
    steps=[
        # Retrieve candidates for the query and store them as "chunks"
        step(retriever, {"query": "query"}, "chunks"),
        # Reorder the candidates and store them as "reranked_chunks"
        step(reranker, {"chunks": "chunks", "query": "query"}, "reranked_chunks"),
    ]
)

# `await` must run inside an async function (or a notebook cell)
result = await pipeline.run(query="What is machine learning?")
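To make the data flow in the two steps concrete, the sketch below mimics it with plain asyncio: each step reads named values from a shared state, calls a component, and writes the result under an output key. `run_steps`, `fake_retrieve`, and `fake_rerank` are hypothetical stand-ins. The direction of the input mapping (argument name to state key) is an assumption, and none of this is gllm_pipeline's actual implementation.

```python
import asyncio


async def run_steps(steps, state):
    """Run (func, input_map, output_key) steps in order over a shared state."""
    for func, input_map, output_key in steps:
        # Assumed mapping direction: {argument name: state key}
        kwargs = {arg: state[key] for arg, key in input_map.items()}
        state[output_key] = await func(**kwargs)
    return state


async def fake_retrieve(query):
    """Stand-in retriever: returns fixed candidates regardless of the query."""
    return [
        "Python is a programming language",
        "Machine learning uses algorithms to learn from data",
    ]


async def fake_rerank(chunks, query):
    """Stand-in reranker: orders chunks by word overlap with the query."""
    query_words = set(query.lower().split())
    return sorted(
        chunks,
        key=lambda chunk: len(query_words & set(chunk.lower().split())),
        reverse=True,
    )


steps = [
    (fake_retrieve, {"query": "query"}, "chunks"),
    (fake_rerank, {"chunks": "chunks", "query": "query"}, "reranked_chunks"),
]
state = asyncio.run(run_steps(steps, {"query": "machine learning"}))
print(state["reranked_chunks"][0])
```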

API Reference

For detailed API documentation, see the Reranker API Reference.
