Image Generation

gllm-inference | Tutorial: Image Generation | API Reference

Supported by: GoogleLMInvoker, OpenAILMInvoker, XAILMInvoker

What is Image Generation?

Image generation is a native tool that lets the language model generate an image from the provided query. When it is enabled, image results are stored in the outputs attribute of the LMOutput object and can be accessed via the attachments property.


The image generation tool can be enabled in several ways, as shown below. Because image generation can take a while, it is highly recommended to set a higher timeout via RetryConfig.

import asyncio
from gllm_core.retry import RetryConfig
from gllm_inference.lm_invoker import OpenAILMInvoker
from gllm_inference.model import OpenAILM
from gllm_inference.schema import NativeTool, NativeToolType

# Placeholder for custom options; left empty here, fill in as needed.
kwargs = {}

# The four options below are alternatives; pick one.
# Option 1: as string
image_generation_tool = "image_generation"
# Option 2: as enum
image_generation_tool = NativeToolType.IMAGE_GENERATION
# Option 3: as dictionary (useful for providing custom kwargs)
image_generation_tool = {"type": "image_generation", **kwargs}
# Option 4: as native tool object (useful for providing custom kwargs)
image_generation_tool = NativeTool.image_generation(**kwargs)

# Image generation can be slow, so allow a longer timeout.
retry_config = RetryConfig(timeout=60)
lm_invoker = OpenAILMInvoker(
    OpenAILM.GPT_5_NANO,
    tools=[image_generation_tool],
    retry_config=retry_config,
)

With that all set, let's generate a simple image!
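The call pattern can be sketched as follows. This is a minimal sketch, not the library's confirmed API: it assumes the invoker exposes an awaitable `invoke(query)` method (suggested by the `asyncio` import above) and that the result exposes `attachments`, as described earlier. `SupportsInvoke`, `generate_image`, `StubInvoker`, and `StubOutput` are hypothetical names used only so the example runs offline; in practice you would pass the `lm_invoker` configured above instead of the stub.

```python
import asyncio
from dataclasses import dataclass, field
from typing import Any, Protocol


class SupportsInvoke(Protocol):
    """Anything with an awaitable invoke(); the real invoker is assumed to fit."""

    async def invoke(self, query: str) -> Any: ...


async def generate_image(lm_invoker: SupportsInvoke, query: str) -> list[Any]:
    """Send the query and return whatever attachments the result carries."""
    result = await lm_invoker.invoke(query)
    # Per the description above, generated images are reachable via `attachments`.
    return list(result.attachments)


# Stand-ins so the sketch runs without network access (hypothetical, not library classes).
@dataclass
class StubOutput:
    attachments: list[bytes] = field(default_factory=list)


class StubInvoker:
    async def invoke(self, query: str) -> StubOutput:
        # Pretend the model produced one image for the query.
        return StubOutput(attachments=[b"fake-image-bytes"])


if __name__ == "__main__":
    images = asyncio.run(
        generate_image(StubInvoker(), "Create a picture of a red apple.")
    )
    print(f"Generated {len(images)} image(s)")
```

Keeping the awaitable call inside a small coroutine makes it easy to swap the stub for a real invoker without changing the surrounding code.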

Output:

Now let's save the generated image to a local path:
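One way to write image bytes to disk is sketched below using only the standard library. `ImageAttachment` and `save_attachment` are hypothetical stand-ins: the attribute names on the library's real attachment objects may differ, and the library may ship its own helper for this, so check the API Reference before relying on this shape.

```python
import tempfile
from dataclasses import dataclass
from pathlib import Path


@dataclass
class ImageAttachment:
    """Hypothetical stand-in for a generated attachment; real field names may differ."""

    data: bytes
    extension: str = "png"


def save_attachment(
    attachment: ImageAttachment, directory: str, stem: str = "generated"
) -> Path:
    """Write the attachment's raw bytes to <directory>/<stem>.<extension>."""
    target_dir = Path(directory)
    # Create the output directory (and any parents) if it does not exist yet.
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / f"{stem}.{attachment.extension}"
    target.write_bytes(attachment.data)
    return target


if __name__ == "__main__":
    # Demo with placeholder bytes in a temporary directory.
    with tempfile.TemporaryDirectory() as tmp:
        path = save_attachment(ImageAttachment(data=b"placeholder-bytes"), tmp)
        print(f"Saved {path.name} ({path.stat().st_size} bytes)")
```

Returning the final `Path` lets the caller log or open the file without rebuilding the name.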

Generated Image:
