Structured Output
In many real-world applications, we don't just want natural language responses — we want structured data that our programs can parse and use directly.
The good news? You don't need to embed JSON schemas directly in your prompt. Instead, just define your expected output using either:
- A Pydantic BaseModel class (recommended)
- A JSON schema dictionary compatible with Pydantic's schema format
When structured output is enabled:
- The model will not stream.
- The output will be stored in the .structured_output attribute of the response.
- The output type depends on the input schema:
  - Pydantic BaseModel class → Pydantic BaseModel instance
  - JSON schema dict → Python dictionary
Example 1: Using a Pydantic BaseModel (Recommended)
You can define your expected response format as a Pydantic BaseModel class. This ensures strong type safety and makes the output easier to work with in Python.
import asyncio
from typing import List

from pydantic import BaseModel

from gllm_inference.lm_invoker import OpenAILMInvoker
from gllm_inference.prompt_builder import PromptBuilder

class Activity(BaseModel):
    type: str
    activity_location: str
    description: str

class ActivityList(BaseModel):
    location: str
    activities: List[Activity]

system_template = (
    "You are a helpful assistant who specializes in recommending activities.\n"
)
user_template = "{question}"

builder = PromptBuilder(system_template=system_template, user_template=user_template)
prompt = builder.format(question="I want to go to Tokyo, Japan. What should I do?")

# Instantiate the invoker before use (model name shown for illustration)
lm_invoker = OpenAILMInvoker(model_name="gpt-4o-mini")

response = asyncio.run(lm_invoker.invoke(prompt, output_schema=ActivityList))
print(f"Response: {response}")
Output:
LMOutput(
    structured_output=ActivityList(
        location="Tokyo, Japan",
        activities=[
            Activity(
                type="Cultural Experience",
                activity_location="Asakusa",
                description="Visit the iconic Senso-ji Temple."
            ),
            Activity(
                type="Shopping",
                activity_location="Shibuya",
                description="Experience the bustling Shibuya Crossing."
            ),
        ]
    )
)
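Because the output is a real Pydantic instance, its fields are typed attributes you can access directly. A minimal sketch, constructing an instance by hand to stand in for response.structured_output:

```python
from typing import List

from pydantic import BaseModel

class Activity(BaseModel):
    type: str
    activity_location: str
    description: str

class ActivityList(BaseModel):
    location: str
    activities: List[Activity]

# Stand-in for response.structured_output from the example above
result = ActivityList(
    location="Tokyo, Japan",
    activities=[
        Activity(
            type="Cultural Experience",
            activity_location="Asakusa",
            description="Visit the iconic Senso-ji Temple.",
        )
    ],
)

# Fields are typed attributes, not dictionary keys
print(result.location)                         # Tokyo, Japan
print(result.activities[0].activity_location)  # Asakusa
```

Invalid data (a missing field, a wrong type) would raise a validation error at construction time instead of surfacing later in your code.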
Example 2: Using a JSON Schema Dictionary
Alternatively, you can define the structure using a JSON schema dictionary. This is useful in situations where dynamic schema generation is required or when operating in environments that don’t use Pydantic.
import asyncio

from gllm_inference.lm_invoker import OpenAILMInvoker
from gllm_inference.prompt_builder import PromptBuilder

# Define the JSON schema
activity_schema = {
    "title": "ActivityList",
    "type": "object",
    "properties": {
        "location": {"type": "string"},
        "activities": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                    "type": {"type": "string"},
                    "activity_location": {"type": "string"},
                    "description": {"type": "string"},
                },
                "required": ["type", "activity_location", "description"],
                "additionalProperties": False,
            },
        },
    },
    "required": ["location", "activities"],
    "additionalProperties": False,
}

system_template = (
    "You are a helpful assistant who specializes in recommending activities.\n"
)
user_template = "{question}"

builder = PromptBuilder(system_template=system_template, user_template=user_template)
prompt = builder.format(question="I want to go to Tokyo, Japan. What should I do?")

# Instantiate the invoker before use (model name shown for illustration)
lm_invoker = OpenAILMInvoker(model_name="gpt-4o-mini")

response = asyncio.run(lm_invoker.invoke(prompt, output_schema=activity_schema))
print(f"Response: {response}")
Output:
LMOutput(
    structured_output={
        "location": "Tokyo, Japan",
        "activities": [
            {
                "type": "Cultural Experience",
                "activity_location": "Asakusa",
                "description": "Visit the iconic Senso-ji Temple."
            },
            {
                "type": "Shopping",
                "activity_location": "Shibuya",
                "description": "Experience the bustling Shibuya Crossing."
            }
        ]
    }
)
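With a JSON schema dict, structured_output is a plain Python dictionary, so you work with it through ordinary key lookups. A minimal sketch, using a literal dict in place of response.structured_output:

```python
# Stand-in for response.structured_output when a JSON schema dict is used
result = {
    "location": "Tokyo, Japan",
    "activities": [
        {
            "type": "Shopping",
            "activity_location": "Shibuya",
            "description": "Experience the bustling Shibuya Crossing.",
        }
    ],
}

# Plain dictionary access; no attribute-style lookup or extra validation here
for activity in result["activities"]:
    print(f"{activity['type']} at {activity['activity_location']}")
```

This is the trade-off versus Example 1: no import dependency on Pydantic, but also no typed attributes or automatic validation of the parsed output.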
Generate JSON Schema from a BaseModel
If you're using Pydantic and want to generate the JSON Schema automatically, you can convert your model like this:
schema_dict = ActivityList.model_json_schema()
This lets you dynamically generate a schema for environments where Pydantic models aren't accepted directly, while still defining your models with Pydantic's static typing.
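As a quick sanity check, the generated dictionary carries the model's title and required fields. A sketch reusing the Example 1 models (assumes Pydantic v2, where model_json_schema is available):

```python
from typing import List

from pydantic import BaseModel

class Activity(BaseModel):
    type: str
    activity_location: str
    description: str

class ActivityList(BaseModel):
    location: str
    activities: List[Activity]

# Convert the model to a plain JSON schema dictionary
schema_dict = ActivityList.model_json_schema()

print(schema_dict["title"])     # ActivityList
print(schema_dict["required"])  # ['location', 'activities']
```

The nested Activity model is emitted under the schema's $defs section and referenced from the activities array, so the single dictionary fully describes both models.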