Catalog


Prerequisites

This example specifically requires completion of all setup steps listed on the Prerequisites page.

You should be familiar with these concepts and components:

Installation

# Linux/macOS: you can use a Conda environment
pip install --extra-index-url https://oauth2accesstoken:$(gcloud auth print-access-token)@glsdk.gdplabs.id/gen-ai-internal/simple/ gllm-inference

REM Windows Command Prompt: you can use a Conda environment
FOR /F "tokens=*" %T IN ('gcloud auth print-access-token') DO pip install --extra-index-url "https://oauth2accesstoken:%T@glsdk.gdplabs.id/gen-ai-internal/simple/"  gllm-inference

Prompt Builder Catalog

Prompt Builder Catalog enables you to load and manage multiple prompt builders from various data sources like Google Sheets, CSV files, or Python records. This allows you to centralize prompt management, making it easier to maintain, version, and share prompts across your applications.

For example, instead of hard-coding prompts in your code, you can store them in a Google Sheet with names like "summarize", "transform_query", and "draft_document", then load them all at once using the catalog.

Catalog Configuration

The PromptBuilderCatalog can be configured using a table (such as a CSV file or Google Sheet), or directly from a list of Python dictionaries (records). To function correctly, the table or records must include specific columns or keys:

Required Columns:

name: The unique identifier for the prompt builder
system: The system template (instructions for the AI)
user: The user template (how user input is formatted)

Optional Column:

kwargs: Advanced prompt builder configuration (JSON format), for example to enable Jinja templating or to define key_defaults

Important Notes:

At least one of system or user columns must be filled
Templates support variable placeholders using {variable_name} syntax
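
The rules above can be checked before the records are handed to the catalog. The following is a minimal sketch in plain Python, not part of gllm-inference: `placeholders` and `validate_prompt_records` are hypothetical helper names, and the placeholder extraction only applies to non-Jinja `{variable_name}` templates.

```python
from string import Formatter

REQUIRED_KEYS = ("name", "system", "user")


def placeholders(template):
    """Return the {variable_name} fields found in a non-Jinja template."""
    if not template:
        return set()
    return {field for _, field, _, _ in Formatter().parse(template) if field}


def validate_prompt_records(records):
    """Hypothetical pre-check mirroring the rules listed above."""
    seen = set()
    for record in records:
        missing = [key for key in REQUIRED_KEYS if key not in record]
        if missing:
            raise ValueError(f"record is missing keys: {missing}")
        if record["name"] in seen:
            raise ValueError(f"duplicate name: {record['name']}")
        seen.add(record["name"])
        # At least one of the two templates must be non-empty
        if not (record["system"] or record["user"]):
            raise ValueError(f"{record['name']}: fill in system or user")


records = [
    {"name": "summarize", "system": "Summarize:\n{context}", "user": ""},
    {"name": "transform_query", "system": "", "user": "Simplify: {query}"},
]
validate_prompt_records(records)
print(placeholders(records[0]["system"]))  # {'context'}
```

Listing the placeholders per template makes it easier to see which keyword arguments each prompt builder will expect when it is formatted.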

Loading Catalog

Option 1: From Google Sheets

Obtain Worksheet ID and Credentials

From your Google Sheets URL, you can obtain:

sheet_id: between /d/ and /edit
worksheet_id: the number after gid= (usually 0 for the first sheet)
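
Both values can also be pulled out of the URL programmatically. This is a small sketch using only the standard library; `parse_gsheets_url` is a hypothetical helper, and it assumes that the worksheet ID corresponds to the `gid` parameter of the URL, falling back to "0" when no `gid` is present.

```python
import re


def parse_gsheets_url(url):
    """Extract sheet_id and worksheet_id (gid) from a Google Sheets URL."""
    sheet_match = re.search(r"/d/([^/]+)", url)
    if sheet_match is None:
        raise ValueError("no /d/<sheet_id> segment found in URL")
    gid_match = re.search(r"[#?&]gid=(\d+)", url)
    # Assumption: fall back to "0", the usual ID of the first worksheet
    worksheet_id = gid_match.group(1) if gid_match else "0"
    return sheet_match.group(1), worksheet_id


# Placeholder URL for illustration only
url = "https://docs.google.com/spreadsheets/d/your_sheet_id/edit#gid=0"
sheet_id, worksheet_id = parse_gsheets_url(url)
print(sheet_id, worksheet_id)  # your_sheet_id 0
```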

Obtain Google Service Account JSON Credentials

Follow these steps:

Load with .from_gsheets() method

from gllm_inference.catalog.prompt_builder_catalog import PromptBuilderCatalog

# Method 1: Using client email and private key
catalog = PromptBuilderCatalog.from_gsheets(
    sheet_id="your_sheet_id",
    worksheet_id="0",
    client_email="your_service_account_email",
    private_key="your_private_key",
)

# Method 2: Using credential file
catalog = PromptBuilderCatalog.from_gsheets(
    sheet_id="your_sheet_id",
    worksheet_id="0",
    credential_file_path="path/to/credentials.json",  # contains client_email and private_key
)
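
If you prefer Method 1 but only have the credential file, the two fields can be read from it first. A minimal sketch, assuming a standard Google service account JSON file, which stores them under the `client_email` and `private_key` keys; `read_service_account` is a hypothetical helper, not part of gllm-inference.

```python
import json


def read_service_account(path):
    """Return (client_email, private_key) from a service account JSON file."""
    with open(path, encoding="utf-8") as f:
        info = json.load(f)
    missing = [key for key in ("client_email", "private_key") if key not in info]
    if missing:
        raise KeyError(f"credential file is missing: {missing}")
    return info["client_email"], info["private_key"]
```

The returned values can then be passed as the client_email and private_key arguments shown in Method 1.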

Option 2: From CSV File

Download/create CSV file

Prepare a CSV file that contains your prompt catalog definitions. You can download a template from: prompt_builder_catalog_template.csv
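
If you build the CSV programmatically instead of starting from the template, the standard csv module takes care of quoting multi-line templates. The sketch below assumes the column names from Catalog Configuration, assumes that kwargs is stored as a JSON string in a single cell, and assumes that an empty cell means no extra configuration; the template file may follow different conventions.

```python
import csv
import io
import json

FIELDNAMES = ["name", "system", "user", "kwargs"]

rows = [
    {
        "name": "summarize",
        "system": "Summarize the context.\n\nContext:\n{context}",
        "user": "",
        "kwargs": "",  # assumption: empty cell means no extra configuration
    },
    {
        "name": "draft_document",
        "system": "Draft a document.\n\nFormat:\n{format}",
        "user": "User instruction:\n{query}",
        # kwargs is stored as a JSON string inside a single cell
        "kwargs": json.dumps({"key_defaults": {"format": "I. Background\nII. Content"}}),
    },
]

# Swap the buffer for open("prompt_builder_catalog.csv", "w", newline="")
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=FIELDNAMES)
writer.writeheader()
writer.writerows(rows)
csv_text = buffer.getvalue()

# Round-trip check: multi-line templates survive because csv quotes them
parsed = list(csv.DictReader(io.StringIO(csv_text)))
print(parsed[1]["name"])  # draft_document
```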

Load with .from_csv() method

catalog = PromptBuilderCatalog.from_csv(csv_path="path/to/prompt_builder_catalog.csv")
prompt_builder = catalog.transform_query

Option 3: From JSON File

Download/create JSON file

Prepare a JSON file that contains your prompt catalog definitions. You can download a template from: prompt_builder_catalog_template.json
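
As a rough illustration of the expected shape, the sketch below assumes the JSON file holds a top-level array of record objects with the same keys as the table columns (an inference from the loading step, not a confirmed description of the template file). It also shows how Python values map to JSON: None becomes null and True becomes true.

```python
import json

records = [
    {"name": "transform_query", "system": "", "user": "Simplify:\n{query}", "kwargs": None},
    {
        "name": "transform_query_jinja",
        "system": "Transform {{ query }} like the examples.",
        "user": None,
        "kwargs": {"use_jinja": True, "jinja_env": "restricted"},
    },
]

json_text = json.dumps(records, indent=2)  # write this text to your .json file
restored = json.loads(json_text)
print(restored == records)  # True
```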

Load with json.load() and the .from_records() method

import json

# Load from JSON file
with open("path/to/prompt_builder_catalog.json") as f:
    records = json.load(f)

catalog = PromptBuilderCatalog.from_records(records=records)
prompt_builder = catalog.summarize

Option 4: From Python Records

Define the catalog

Provide the records as a list of dictionaries, for example:

records = [
    {
        "name": "summarize",
        "system": "You are an AI expert\nSummarize the following context.\n\nContext:\n```{context}```",
        "user": "",
        "kwargs": None
    },
    {
        "name": "transform_query",
        "system": "",
        "user": "Transform the following query into a simpler form.\n\nQuery:\n```{query}```",
        "kwargs": None
    },
    {
        "name": "draft_document",
        "system": "You are an AI expert.\nDraft a document following the provided format and context.\n\nFormat:\n```{format}```",
        "user": "User instruction:\n{query}",
        "kwargs": {
            "key_defaults": {
                "format": "I. Background\nII. Content\nIII. Conclusion"
            }
        }
    },
    {
        "name": "transform_query_jinja",
        "system": "You are a helpful AI assistant. Use the provided examples to infer the correct transformation style.\n\n{% for ex in examples -%}\nExample {{ loop.index }}:\nInput: {{ ex.input }}\nOutput: {{ ex.output }}\n\n{%- endfor %}\nNow transform the following query consistently with the examples.",
        "user": None,
        "kwargs": {
            "use_jinja": True,
            "jinja_env": "restricted"
        }
    },
]

Load with .from_records() method

catalog = PromptBuilderCatalog.from_records(records=records)

Using Catalog

Once loaded, you can access any prompt builder by its name:

# Use the prompt builders
summary_prompt = catalog.summarize.format(context="Some text to summarize")
query_prompt = catalog.transform_query.format(query="Complex query here")

# This uses the default format from key_defaults
document_prompt = catalog.draft_document.format(
    context="Background information",
    query="Write a summary report"
)

# Or override the default format
document_prompt = catalog.draft_document.format(
    format="Custom Format",
    context="Background information",
    query="Write a summary report"
)

# Using Jinja templates with dynamic data
jinja_prompt = catalog.transform_query_jinja.format(
    examples=[
        {"input": "What is AI?", "output": "Define AI"},
        {"input": "How does ML work?", "output": "Explain ML"}
    ],
    query="What are neural networks?"
)
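
To make the key_defaults behavior concrete, here is a plain-Python emulation of the idea: defaults are merged with the keyword arguments, and explicit arguments win. This is only a sketch of the concept under that assumption; `format_with_defaults` is a hypothetical function and not how the library is necessarily implemented.

```python
def format_with_defaults(template, key_defaults=None, **kwargs):
    """Fill a {variable_name} template; explicit kwargs override defaults."""
    values = {**(key_defaults or {}), **kwargs}
    return template.format(**values)


template = "Format:\n{format}\n\nUser instruction:\n{query}"
defaults = {"format": "I. Background\nII. Content\nIII. Conclusion"}

# Uses the default format
with_default = format_with_defaults(template, defaults, query="Write a report")

# Overrides the default format
overridden = format_with_defaults(
    template, defaults, format="Custom Format", query="Write a report"
)
print(overridden)
```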

LM Request Processor Catalog

LM Request Processor Catalog enables you to load and manage multiple LM request processors from various data sources like Google Sheets, CSV files, or Python records. This allows you to centralize the configuration of complete AI pipelines, including prompts, models, credentials, and output parsing in one place.

For example, instead of manually configuring each LM request processor with its model, credentials, and prompts, you can store all configurations in a Google Sheet and load them by name like "summarizer", "question_answerer", and "code_generator".

Think of LM Request Processor Catalog as a configuration management system for your AI pipelines, where each row defines a complete, ready-to-use AI processor.

Catalog Configuration

The LMRequestProcessorCatalog can be configured using a table (such as a CSV file or Google Sheet), or directly from a list of Python dictionaries (records). To function correctly, the table or records must include specific columns or keys:

Required Columns:

name: The unique identifier for the LM request processor
system_template: The system template for the prompt builder
user_template: The user template for the prompt builder
model_id: The model identifier for the LM invoker
credentials: Authentication credentials for the model
output_parser_type: Type of output parser to use

Optional Columns:

config: Additional configuration for the LM invoker (JSON format)
prompt_builder_kwargs: Advanced prompt builder configuration (JSON format) for Jinja templating and other features

Important Notes:

At least one of system_template or user_template must be filled
model_id supports environment variable substitution using ${ENV_VAR_KEY} syntax
credentials and config are optional but provide advanced functionality
prompt_builder_kwargs enables advanced features like Jinja templating and history formatting
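
The ${ENV_VAR_KEY} substitution can be pictured with the standard library's string.Template, which uses the same placeholder syntax. This is a sketch of the concept only: `resolve_env` and the `MODEL_NAME` variable are hypothetical, and the library's own substitution logic may behave differently (for example when a variable is unset).

```python
import os
from string import Template


def resolve_env(value, env=None):
    """Replace ${ENV_VAR_KEY} placeholders with environment values."""
    env = os.environ if env is None else env
    # substitute() raises KeyError when a referenced variable is unset
    return Template(value).substitute(env)


# MODEL_NAME is a hypothetical variable used only for this illustration
print(resolve_env("openai/${MODEL_NAME}", {"MODEL_NAME": "gpt-4.1-nano"}))
# openai/gpt-4.1-nano
```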

Loading Catalog

Option 1: From Google Sheets

Obtain Worksheet ID and Credentials

From your Google Sheets URL, you can obtain:

sheet_id: between /d/ and /edit
worksheet_id: the number after gid= (usually 0 for the first sheet)

Obtain Google Service Account JSON Credentials

Follow these steps:

Load with .from_gsheets() method

from gllm_inference.catalog.lm_request_processor_catalog import LMRequestProcessorCatalog

# Method 1: Using client email and private key
catalog = LMRequestProcessorCatalog.from_gsheets(
    sheet_id="your_sheet_id",
    worksheet_id="0",
    client_email="your_service_account_email",
    private_key="your_private_key",
)

# Method 2: Using credential file
catalog = LMRequestProcessorCatalog.from_gsheets(
    sheet_id="your_sheet_id",
    worksheet_id="0",
    credential_file_path="path/to/credentials.json"
)

Option 2: From CSV File

Download/create CSV file

Prepare a CSV file that contains your LM request processor catalog definitions. You can download a template from: lm_request_processor_catalog_template.csv
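
Because config and prompt_builder_kwargs hold JSON inside CSV cells, a quick pre-flight check can catch quoting mistakes before loading. The sketch below uses only the standard library; `read_catalog_rows` is a hypothetical helper, and treating an empty or absent cell as "not set" is an assumption.

```python
import csv
import io
import json

JSON_COLUMNS = ("config", "prompt_builder_kwargs")


def read_catalog_rows(csv_file):
    """Parse catalog rows, decoding JSON cells into Python objects."""
    rows = []
    for row in csv.DictReader(csv_file):
        for column in JSON_COLUMNS:
            cell = (row.get(column) or "").strip()
            # Assumption: an empty or absent cell means the option was not set
            row[column] = json.loads(cell) if cell else None
        rows.append(row)
    return rows


# Inside a quoted CSV cell, each double quote of the JSON is doubled
sample = io.StringIO(
    "name,model_id,credentials,config,output_parser_type\n"
    'router,openai/gpt-4.1-nano,OPENAI_API_KEY,'
    '"{""default_hyperparameters"": {""temperature"": 0.7}}",none\n'
)
rows = read_catalog_rows(sample)
print(rows[0]["config"]["default_hyperparameters"]["temperature"])  # 0.7
```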

Load with .from_csv() method

catalog = LMRequestProcessorCatalog.from_csv(csv_path="path/to/lm_request_processor_catalog.csv")

Option 3: From JSON File

Download/create JSON file

Prepare a JSON file that contains your LM request processor catalog definitions. You can download a template from: lm_request_processor_catalog_template.json

Load with json.load() and the .from_records() method

import json

# Load from JSON file
with open("path/to/lm_request_processor_catalog.json") as f:
    records = json.load(f)

catalog = LMRequestProcessorCatalog.from_records(records=records)

Option 4: From Python Records

Define the catalog

Provide the records as a list of dictionaries, for example:

records = [
    {
        "name": "router",
        "model_id": "openai/gpt-4.1-nano",
        "credentials": "OPENAI_API_KEY",
        "config": {
            "default_hyperparameters": {
                "temperature": 0.7,
                "max_output_tokens": 100
            }
        },
        "system_template": "You are an AI expert.\nYour job is to define which use case is the most suitable for the user query.\nUse case options:\n1. \"qa\": Question answering.\n2. \"sum\": Summarization.\n3. \"dd\": Document drafting.",
        "user_template": "Below is the user query:\n{query}",
        "prompt_builder_kwargs": {"use_jinja": False},
        "output_parser_type": "none"
    },
    {
        "name": "chat_with_history",
        "model_id": "openai/gpt-4.1-nano",
        "credentials": "OPENAI_API_KEY",
        "config": {
            "default_hyperparameters": {
                "temperature": 0.7,
                "max_output_tokens": 500
            }
        },
        "system_template": "You are a helpful AI assistant. Continue the conversation based on the chat history provided.",
        "user_template": "{{ history }}\n\n{{ message }}",
        "prompt_builder_kwargs": {
            "use_jinja": True,
            "jinja_env": "restricted",
            "history_formatter": {
                "prefix_user_message": "<user>",
                "suffix_user_message": "</user>",
                "prefix_assistant_message": "<assistant>",
                "suffix_assistant_message": "</assistant>"
            }
        },
        "output_parser_type": "none"
    }
]

Load using .from_records() method

catalog = LMRequestProcessorCatalog.from_records(records=records)
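
The history_formatter settings in the chat_with_history record suggest that each turn is wrapped in a role-specific prefix and suffix. The following emulation illustrates that idea in plain Python. It is an assumption about the behavior, not the library's implementation: `format_history`, the (role, content) tuple shape, and joining turns with newlines are all hypothetical.

```python
def format_history(history, formatter):
    """Wrap each (role, content) turn in the configured prefix and suffix."""
    lines = []
    for role, content in history:
        prefix = formatter[f"prefix_{role}_message"]
        suffix = formatter[f"suffix_{role}_message"]
        lines.append(f"{prefix}{content}{suffix}")
    return "\n".join(lines)


formatter = {
    "prefix_user_message": "<user>",
    "suffix_user_message": "</user>",
    "prefix_assistant_message": "<assistant>",
    "suffix_assistant_message": "</assistant>",
}
history = [("user", "Hi"), ("assistant", "Hello! How can I help?")]
print(format_history(history, formatter))
# <user>Hi</user>
# <assistant>Hello! How can I help?</assistant>
```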

Using Catalog

Once loaded, you can use the LM request processors directly:

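import asyncio


async def main():
    # Example: Router - determine the use case for a query
    router_result = await catalog.router.process(
        prompt_kwargs={
            "query": "What are the main benefits of renewable energy sources?"
        }
    )
    print("Router Result:", router_result)


# A bare top-level await only works in notebooks; scripts need an event loop
asyncio.run(main())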

Output

[Build 'OpenAILMInvoker'] Config:
  {
    'model_name': 'gpt-4.1-nano',
    'default_hyperparameters': { 'temperature': 0.7, 'max_output_tokens': 100 }
  }

Available LM Request Processors:
  - router → Model: gpt-4.1-nano

Running examples:
  [Invoke LM] POST /v1/responses → 200
  [LM Result]
    "The most suitable use case for this query is: 1. 'qa' (Question answering)."

Router Result:
  The most suitable use case selected: "qa" (Question answering)
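
Since process() is awaited, several processors can in principle be run concurrently with asyncio.gather. The sketch below uses a stand-in class so that it runs without credentials: `FakeProcessor` and `run_all` are hypothetical, and only the async process(prompt_kwargs=...) call shape is taken from the example above.

```python
import asyncio


class FakeProcessor:
    """Stand-in for a catalog entry; only mimics an async process() method."""

    def __init__(self, name):
        self.name = name

    async def process(self, prompt_kwargs):
        await asyncio.sleep(0)  # placeholder for the real model call
        return f"{self.name}: {prompt_kwargs['query']}"


async def run_all(processors, query):
    """Run every processor on the same query concurrently."""
    tasks = [p.process(prompt_kwargs={"query": query}) for p in processors]
    return await asyncio.gather(*tasks)


results = asyncio.run(run_all([FakeProcessor("router"), FakeProcessor("qa")], "Hi"))
print(results)  # ['router: Hi', 'qa: Hi']
```

With a real catalog, the stand-ins would be replaced by entries such as catalog.router; asyncio.gather returns the results in the same order as the processors were passed in.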


Last updated 24 days ago
