Query Transformer


Query transformation is the process of modifying a user's original query using LMs to generate one or more refined or expanded queries, aiming to improve the retrieval of relevant documents by enhancing the search scope and effectiveness.

Query transformation specifically attempts to address three problems when processing user queries for retrieval:

  1. Nuanced user intent: Users often don't write exactly what they mean. Because they perceive the LM as an "expert", they assume it shares their domain knowledge and omit context accordingly.

  2. Query-document mismatch: Retrieval from a vector store is typically based on semantic similarity. However, queries are short while documents are long, and the vocabulary of the target document may not appear in the query at all.

  3. Complexity: Some queries are not simple: they may be multi-part (containing multiple points to address) or multi-hop (requiring multiple reasoning steps).

Prerequisites

This example specifically requires completion of all setup steps listed on the Prerequisites page.


Installation

# you can use a Conda environment
pip install --extra-index-url https://oauth2accesstoken:$(gcloud auth print-access-token)@glsdk.gdplabs.id/gen-ai-internal/simple/ "gllm-retrieval"

Quickstart

We will use GPT-4.1 nano to initialize the one-to-one query transformation component, which transforms one query into another.

Step 1: Imports

Import the following:

import asyncio

from gllm_inference.builder import build_lm_request_processor
from gllm_retrieval.query_transformer.one_to_one_query_transformer import OneToOneQueryTransformer
from gllm_retrieval.query_transformer.query_transformer import BaseQueryTransformer
Step 2: Create an LM request processor

The LM request processor will assist us in filling in the prompt template and sending our request to the LM.

lmrp = build_lm_request_processor(
    model_id="openai/gpt-4.1-nano",
    credentials="<your-api-key>",     # or use the environment variable OPENAI_API_KEY
    system_template="You are a helpful assistant that rewrites queries for better retrieval. Rewrite the following query. Only output the transformed query.",
    user_template="Query: {query}",          # 'query' will be supplied below (string or dict with matching keys)
)
Step 3: Create a one-to-one query transformer

Once the LM request processor is created, you can construct the one-to-one query transformer and use it to transform a query.

transformer = OneToOneQueryTransformer(
    lm_request_processor=lmrp
)

single_input = "Find recent research on diffusion transformers."

result = asyncio.run(transformer.transform(single_input))
print(result[0])  # The result is a list containing a single transformed query string.

Extracting transformation result from structured output

You can control how LM outputs are converted into the final transformed queries by supplying the extract_func argument to the transformer.

Example 1: Extracting from JSON Output

The query transformers provide a convenience function, json_extractor, which returns an extractor function that reads the output from a specific key, assuming the LM output has already been parsed into a dictionary.
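To make this concrete, here is a minimal sketch of what a key-based extractor factory does. This is an illustration of the pattern, not the library's actual implementation, and the key name `transformed_query` is a hypothetical example:

```python
def json_extractor(key: str):
    """Sketch of a key-based extractor factory: returns a function that
    pulls a single field out of an already-parsed dictionary."""
    def extract(output: dict) -> list[str]:
        value = output[key]
        # Normalize to a list of strings, matching the transformer's output type.
        return [value] if isinstance(value, str) else list(value)
    return extract

# Hypothetical usage: pass the extractor as the transformer's extract_func, e.g.
# transformer = OneToOneQueryTransformer(
#     lm_request_processor=lmrp,
#     extract_func=json_extractor("transformed_query"),
# )
extract = json_extractor("transformed_query")
print(extract({"transformed_query": "recent papers on diffusion transformers"}))
```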

Example 2: Extracting from Structured Output

You can take advantage of our LM Invoker's capability to produce structured output. To do so, define a response_schema during the LM request processor building. It is recommended that you use a Pydantic BaseModel as the response_schema.
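As a sketch (assuming Pydantic is installed; the exact response_schema wiring inside build_lm_request_processor may differ from what the comment shows), you could define a schema and an attribute-based extractor like this:

```python
from pydantic import BaseModel


class TransformedQuery(BaseModel):
    """Hypothetical response schema for a one-to-one transformation."""
    transformed_query: str


# With structured output enabled, the extract_func would receive a parsed
# model instance, so extraction becomes simple attribute access.
def extract_from_schema(output: TransformedQuery) -> list[str]:
    return [output.transformed_query]


# The schema would be supplied when building the request processor, e.g.:
# lmrp = build_lm_request_processor(..., response_schema=TransformedQuery)

example = TransformedQuery(transformed_query="recent work on diffusion transformers")
print(extract_from_schema(example))
```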

Error handling

Sometimes query transformation does not go as planned, usually due to a malformed LM response. To handle this, the query transformer offers three error handling modes: KEEP, EMPTY, and RAISE.

  1. KEEP: Returns the original input, coerced into list[str].

  2. EMPTY: Returns an empty list.

  3. RAISE: Re-raises the exception.

The error handling mode can be set using the on_error constructor argument.
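The semantics of the three modes can be sketched with a small hypothetical helper (this illustrates the behavior described above, not the library's internal code; in practice you would pass the mode via the on_error constructor argument):

```python
def apply_error_mode(mode: str, original_query: str, exc: Exception) -> list[str]:
    """Illustrates what each error handling mode yields when transformation fails."""
    if mode == "KEEP":
        return [original_query]  # fall back to the untransformed input
    if mode == "EMPTY":
        return []                # signal that no usable transformation exists
    raise exc                    # RAISE: propagate the original exception


print(apply_error_mode("KEEP", "find diffusion papers", ValueError("bad JSON")))
# → ['find diffusion papers']
```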
