Tool Calling

Tool calling lets a language model (like GPT) invoke external functions to help it solve a task. During a conversation the model can interact with your functions and APIs, enabling dynamic computation, data retrieval, and multi-step workflows.

For example, given the question “What is 15 + 25 then multiply by 2?”, the model doesn’t guess: it calls your add and multiply functions and uses their results to inform its final response.

Think of it as:

The LLM is smart at reading and reasoning, but when it needs to calculate or get external data, it picks up the phone and calls your “tool”.

How Tool Calling Works

  1. Define Tools: Create functions decorated with @tool that the model can call

  2. Initialize Invoker: Set up the LM Invoker with your tools

  3. Execute Loop: The model can call tools multiple times in a conversation

  4. Handle Results: Process tool outputs and continue the conversation

Let's break down each part of the implementation:

Single Tool Calling

Let's walk through how to set up and use tool calling step by step:

Step 1: Tool Definition

First, we define the tools that our AI can use. Each tool is a Python function decorated with @tool:

from langchain_core.tools import tool

@tool
def add(a: int, b: int) -> int:
    """Add two numbers."""
    return a + b

@tool
def subtract(a: int, b: int) -> int:
    """Subtract two numbers."""
    return a - b

@tool
def multiply(a: int, b: int) -> int:
    """Multiply two numbers."""
    return a * b

Key points about tool definition:

  1. @tool Decorator: This decorator automatically generates the tool schema that the model needs

  2. Type Hints: Include proper type hints, such as (a: int, b: int) -> int; these help the model understand the expected inputs and output

  3. Docstrings: The docstring becomes the tool description that the model uses to understand when to call the tool

  4. Function Name: The function name becomes the tool name the model will reference
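
You can sanity-check what the decorator generated by inspecting a tool directly; LangChain tools expose their name, description, and argument schema, and can also be invoked on their own:

# Quick check of what the @tool decorator produced
print(add.name)         # "add"
print(add.description)  # "Add two numbers."
print(add.args)         # the JSON schema for the arguments

# Tools can also be invoked directly, outside of any model call
print(add.invoke({"a": 15, "b": 25}))  # 40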

Step 2: Setting Up the LM Invoker

from gllm_inference.lm_invoker import OpenAILMInvoker

tools = [add, subtract, multiply]
lm_invoker = OpenAILMInvoker(model_name="gpt-4o-mini", tools=tools)

Key points:

  • Tools List: Pass all your tools as a list to the invoker

  • Model Selection: Use a model that supports tool calling (such as gpt-4o-mini used here, or other GPT-4-family models)

  • Automatic Registration: The invoker automatically registers the tools with the model

When the LM Invoker is invoked once, the model typically returns only tool calls, with no text response:

# Example first response
result = LMOutput(
    response='',  # Usually empty when the model decides to call tools
    tool_calls=[
        ToolCall(id='call_123', name='add', args={'a': 15, 'b': 25})
    ]
)

The multiply call only comes in a later turn, once the model has seen the result of the addition. You will need to create an execution loop to carry the conversation through to a final text response.

The Tool Calling Execution Loop

This is the core logic that handles the back-and-forth between the model and tools. Here's how it works step by step:

Step 1: Setup and Initialization

async def execute_tool_calling(lm_invoker, query, tools, prompt_builder):
    # Create a lookup dictionary for quick tool access
    tool_dict = {t.name: t for t in tools}
    
    # Format the initial prompt
    messages = prompt_builder.format(query=query)

  • Tool Dictionary: Creates a quick lookup mapping tool names to tool objects

  • Message Formatting: Uses PromptBuilder to create the initial conversation with system and user messages

  • This sets up the context for the model to understand what tools are available

Step 2: Main Execution Loop

# Main execution loop (max 5 iterations)
for _ in range(5):
    # Get response from the model
    result = await lm_invoker.invoke(messages)

  • Iteration Limit: Prevents infinite loops in case the model keeps calling tools

  • Model Invocation: Sends current conversation to the model

  • The model decides whether to respond directly or call tools

Step 3: Response Type Check

# Check if model wants to call tools
if isinstance(result, str) or not result.tool_calls:
    # No tool calls - return the final response
    return result if isinstance(result, str) else result.response

  • Decision Point: Determines if the conversation should end

  • If no tool calls are made, return the final response

  • This ends the conversation naturally when the model has all the information it needs

Step 4: Assistant Message Construction

# Model wants to call tools - prepare assistant message
assistant_content = [result.response] if result.response else []
assistant_content.extend(result.tool_calls)
messages.append(("assistant", assistant_content))

  • Message Building: Constructs the assistant's message including any text response and tool calls

  • Conversation History: Adds this to the conversation history

  • Even if result.response is empty (common on first call), the tool calls are still added

Step 5: Tool Execution

# Execute each tool call
for call in result.tool_calls:
    try:
        # Execute the tool asynchronously via its LangChain interface
        output = await tool_dict[call.name].ainvoke(call.args)
    except Exception as e:
        # Handle tool execution errors
        output = f"Error: {e}"

    # Add tool result back to conversation
    # (ToolResult is one of gllm_inference's schema types; the import path may vary by version)
    messages.append(("user", [ToolResult(id=call.id, output=str(output))]))

  • Tool Lookup: Uses the tool dictionary for fast retrieval

  • Error Handling: Catches and handles tool execution errors gracefully

  • Result Storage: Adds tool results back to conversation as "user" messages

  • The loop continues until the model provides a final response or max iterations are reached

Step 6: Loop Completion

return "Max iterations reached"

  • Safety Net: Returns this message if the maximum number of iterations is reached

  • Prevents infinite loops while still providing feedback

  • In practice, most tool calling scenarios complete within 2-3 iterations
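
Putting steps 1 through 6 together, a complete version of the loop looks like this (the ToolResult import path is an assumption; adjust it to wherever your gllm_inference version exposes its schema types):

from gllm_inference.schema import ToolResult  # assumed import path

async def execute_tool_calling(lm_invoker, query, tools, prompt_builder):
    # Create a lookup dictionary for quick tool access
    tool_dict = {t.name: t for t in tools}

    # Format the initial prompt
    messages = prompt_builder.format(query=query)

    # Main execution loop (max 5 iterations)
    for _ in range(5):
        result = await lm_invoker.invoke(messages)

        # No tool calls - return the final response
        if isinstance(result, str) or not result.tool_calls:
            return result if isinstance(result, str) else result.response

        # Record the assistant turn, including any text and the tool calls
        assistant_content = [result.response] if result.response else []
        assistant_content.extend(result.tool_calls)
        messages.append(("assistant", assistant_content))

        # Execute each tool call and feed the results back
        for call in result.tool_calls:
            try:
                output = await tool_dict[call.name].ainvoke(call.args)
            except Exception as e:
                output = f"Error: {e}"
            messages.append(("user", [ToolResult(id=call.id, output=str(output))]))

    return "Max iterations reached"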

Step 7: Setting Up PromptBuilder

from gllm_inference.prompt_builder import PromptBuilder

prompt_builder = PromptBuilder(
    system_template="You are a mathematical assistant. Always use tools for calculations.",
    user_template="Calculate: {query}"
)

Key Points:

  • System Message: Instructs the model to use tools for calculations

  • User Template: Formats the user's query consistently

  • Tool Guidance: The system message is crucial for encouraging tool usage
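
For our example query, calling prompt_builder.format(query=...) produces the messages that seed the conversation. The exact return shape depends on the library, but conceptually it contains a system and a user message:

messages = prompt_builder.format(query="What is 15 + 25 then multiply by 2?")
# Conceptually equivalent to:
#   system: "You are a mathematical assistant. Always use tools for calculations."
#   user:   "Calculate: What is 15 + 25 then multiply by 2?"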

Step 8: Run the Execution Loop

import asyncio
from dotenv import load_dotenv

load_dotenv()

# Run the tool calling example
query = "What is 15 + 25 then multiply by 2?"
result = asyncio.run(execute_tool_calling(lm_invoker, query, tools, prompt_builder))
print(f"Result: {result}")

Expected Flow for the Example Query

  1. User Query: "What is 15 + 25 then multiply by 2?"

  2. Model Analysis: Identifies need for addition and multiplication

  3. First Tool Call: add(15, 25) → Returns 40

  4. Second Tool Call: multiply(40, 2) → Returns 80

  5. Final Response: "The result is 80"
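
Tracing the message history across loop iterations makes this flow concrete (the IDs and exact wording are illustrative):

# Iteration 1: the model calls add
#   assistant -> ToolCall(id='call_1', name='add', args={'a': 15, 'b': 25})
#   user      -> ToolResult(id='call_1', output='40')
# Iteration 2: the model calls multiply with the returned sum
#   assistant -> ToolCall(id='call_2', name='multiply', args={'a': 40, 'b': 2})
#   user      -> ToolResult(id='call_2', output='80')
# Iteration 3: no more tool calls - the loop returns result.response
#   assistant -> "The result is 80"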

Common Tool Calling Use Cases

  1. Mathematical Calculations: Like our example

  2. API Calls: Fetching data from external services

  3. Database Queries: Reading/writing data

  4. File Operations: Reading files, processing documents

  5. Complex Workflows: Multi-step processes requiring multiple tools
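
The same @tool pattern covers all of these. As a sketch, a tool that wraps an external API might look like the following (the endpoint URL and response fields are hypothetical placeholders, not a real service):

import requests
from langchain_core.tools import tool

@tool
def get_weather(city: str) -> str:
    """Get the current weather for a city."""
    # Hypothetical endpoint - replace with a real weather API and your own parsing
    resp = requests.get("https://api.example.com/weather", params={"city": city}, timeout=10)
    resp.raise_for_status()
    data = resp.json()
    return f"{city}: {data.get('summary', 'no data')}"

Registering it works exactly as before: append it to the tools list passed to the LM Invoker, and the execution loop handles it like any other tool.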
