Tool Calling

Tool calling lets a language model (like GPT) invoke external functions to help it solve a task. During a conversation the model can interact with your functions and APIs, enabling dynamic computation, data retrieval, and multi-step workflows.

For example, given the question “What is 15 + 25 then multiply by 2?”, the model doesn’t guess: it calls your add and multiply functions and uses their results to inform its final response.

Think of it as:

The LLM is smart at reading and reasoning, but when it needs to calculate or get external data, it picks up the phone and calls your “tool”.

How Tool Calling Works

  1. Define Tools: Create functions decorated with @tool that the model can call

  2. Initialize Invoker: Set up the LM Invoker with your tools

  3. Execute Loop: The model can call tools multiple times in a conversation

  4. Handle Results: Process tool outputs and continue the conversation

Let's break down each part of the implementation:

Single Tool Calling

Let's walk through how to set up and use tool calling step by step:

Step 1: Tool Definition

First, we define the tools that our AI can use. Each tool is a Python function decorated with @tool:

from langchain_core.tools import tool

@tool
def add(a: int, b: int) -> int:
    """Add two numbers."""
    return a + b

@tool
def subtract(a: int, b: int) -> int:
    """Subtract two numbers."""
    return a - b

@tool
def multiply(a: int, b: int) -> int:
    """Multiply two numbers."""
    return a * b

Key points about tool definition:

  1. @tool Decorator: This decorator automatically generates the tool schema that the model needs

  2. Type Hints: Include proper type hints, such as (a: int, b: int) -> int; these help the model understand the expected inputs and output

  3. Docstrings: The docstring becomes the tool description that the model uses to understand when to call the tool

  4. Function Name: The function name becomes the tool name the model will reference
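
You can sanity-check what the decorator generated by inspecting a tool directly; LangChain tools expose their name, description, and argument schema, and can also be invoked on their own:

# Quick check of what the @tool decorator produced
print(add.name)         # "add"
print(add.description)  # "Add two numbers."
print(add.args)         # the JSON schema for the arguments

# Tools can also be invoked directly, outside of any model call
print(add.invoke({"a": 15, "b": 25}))  # 40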

Step 2: Setting Up the LM Invoker

from gllm_inference.lm_invoker import OpenAILMInvoker

tools = [add, subtract, multiply]
lm_invoker = OpenAILMInvoker(model_name="gpt-4o-mini", tools=tools)

Key points:

  • Tools List: Pass all your tools as a list to the invoker

  • Model Selection: Use a model that supports tool calling (such as gpt-4o-mini used here, or other GPT-4-family models)

  • Automatic Registration: The invoker automatically registers the tools with the model

When the LM Invoker is invoked once, the model typically returns only tool calls, with no text response:

# Example first response
result = LMOutput(
    response='',  # Usually empty when the model decides to call tools
    tool_calls=[
        ToolCall(id='call_123', name='add', args={'a': 15, 'b': 25})
    ]
)

The multiply call only comes in a later turn, once the model has seen the result of the addition. You will need to create an execution loop to carry the conversation through to a final text response.

The Tool Calling Execution Loop

This is the core logic that handles the back-and-forth between the model and tools. Here's how it works step by step:

Step 1: Setup and Initialization

async def execute_tool_calling(lm_invoker, query, tools, prompt_builder):
    # Create a lookup dictionary for quick tool access
    tool_dict = {t.name: t for t in tools}
    
    # Format the initial prompt
    messages = prompt_builder.format(query=query)

  • Tool Dictionary: Creates a quick lookup mapping tool names to tool objects

  • Message Formatting: Uses PromptBuilder to create the initial conversation with system and user messages

  • This sets up the context for the model to understand what tools are available

Step 2: Main Execution Loop

# Main execution loop (max 5 iterations)
for _ in range(5):
    # Get response from the model
    result = await lm_invoker.invoke(messages)

  • Iteration Limit: Prevents infinite loops in case the model keeps calling tools

  • Model Invocation: Sends current conversation to the model

  • The model decides whether to respond directly or call tools

Step 3: Response Type Check

# Check if model wants to call tools
if isinstance(result, str) or not result.tool_calls:
    # No tool calls - return the final response
    return result if isinstance(result, str) else result.response

  • Decision Point: Determines if the conversation should end

  • If no tool calls are made, return the final response

  • This ends the conversation naturally when the model has all the information it needs

Step 4: Assistant Message Construction

# Model wants to call tools - prepare assistant message
assistant_content = [result.response] if result.response else []
assistant_content.extend(result.tool_calls)
messages.append(("assistant", assistant_content))

  • Message Building: Constructs the assistant's message including any text response and tool calls

  • Conversation History: Adds this to the conversation history

  • Even if result.response is empty (common on first call), the tool calls are still added

Step 5: Tool Execution

# Execute each tool call
for call in result.tool_calls:
    try:
        # Execute the tool asynchronously via its LangChain interface
        output = await tool_dict[call.name].ainvoke(call.args)
    except Exception as e:
        # Handle tool execution errors
        output = f"Error: {e}"

    # Add tool result back to conversation
    # (ToolResult is one of gllm_inference's schema types; the import path may vary by version)
    messages.append(("user", [ToolResult(id=call.id, output=str(output))]))

  • Tool Lookup: Uses the tool dictionary for fast retrieval

  • Error Handling: Catches and handles tool execution errors gracefully

  • Result Storage: Adds tool results back to conversation as "user" messages

  • The loop continues until the model provides a final response or max iterations are reached

Step 6: Loop Completion

return "Max iterations reached"

  • Safety Net: Returns this message if the maximum number of iterations is reached

  • Prevents infinite loops while still providing feedback

  • In practice, most tool calling scenarios complete within 2-3 iterations
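
Putting steps 1 through 6 together, a complete version of the loop looks like this (the ToolResult import path is an assumption; adjust it to wherever your gllm_inference version exposes its schema types):

from gllm_inference.schema import ToolResult  # assumed import path

async def execute_tool_calling(lm_invoker, query, tools, prompt_builder):
    # Create a lookup dictionary for quick tool access
    tool_dict = {t.name: t for t in tools}

    # Format the initial prompt
    messages = prompt_builder.format(query=query)

    # Main execution loop (max 5 iterations)
    for _ in range(5):
        result = await lm_invoker.invoke(messages)

        # No tool calls - return the final response
        if isinstance(result, str) or not result.tool_calls:
            return result if isinstance(result, str) else result.response

        # Record the assistant turn, including any text and the tool calls
        assistant_content = [result.response] if result.response else []
        assistant_content.extend(result.tool_calls)
        messages.append(("assistant", assistant_content))

        # Execute each tool call and feed the results back
        for call in result.tool_calls:
            try:
                output = await tool_dict[call.name].ainvoke(call.args)
            except Exception as e:
                output = f"Error: {e}"
            messages.append(("user", [ToolResult(id=call.id, output=str(output))]))

    return "Max iterations reached"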

Step 7: Setting Up PromptBuilder

from gllm_inference.prompt_builder import PromptBuilder

prompt_builder = PromptBuilder(
    system_template="You are a mathematical assistant. Always use tools for calculations.",
    user_template="Calculate: {query}"
)

Key Points:

  • System Message: Instructs the model to use tools for calculations

  • User Template: Formats the user's query consistently

  • Tool Guidance: The system message is crucial for encouraging tool usage
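
For our example query, calling prompt_builder.format(query=...) produces the messages that seed the conversation. The exact return shape depends on the library, but conceptually it contains a system and a user message:

messages = prompt_builder.format(query="What is 15 + 25 then multiply by 2?")
# Conceptually equivalent to:
#   system: "You are a mathematical assistant. Always use tools for calculations."
#   user:   "Calculate: What is 15 + 25 then multiply by 2?"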

Step 8: Run the Execution Loop

import asyncio
from dotenv import load_dotenv

load_dotenv()

# Run the tool calling example
query = "What is 15 + 25 then multiply by 2?"
result = asyncio.run(execute_tool_calling(lm_invoker, query, tools, prompt_builder))
print(f"Result: {result}")

Expected Flow for the Example Query

  1. User Query: "What is 15 + 25 then multiply by 2?"

  2. Model Analysis: Identifies need for addition and multiplication

  3. First Tool Call: add(15, 25) → Returns 40

  4. Second Tool Call: multiply(40, 2) → Returns 80

  5. Final Response: "The result is 80"
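
Tracing the message history across loop iterations makes this flow concrete (the IDs and exact wording are illustrative):

# Iteration 1: the model calls add
#   assistant -> ToolCall(id='call_1', name='add', args={'a': 15, 'b': 25})
#   user      -> ToolResult(id='call_1', output='40')
# Iteration 2: the model calls multiply with the returned sum
#   assistant -> ToolCall(id='call_2', name='multiply', args={'a': 40, 'b': 2})
#   user      -> ToolResult(id='call_2', output='80')
# Iteration 3: no more tool calls - the loop returns result.response
#   assistant -> "The result is 80"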

Common Tool Calling Use Cases

  1. Mathematical Calculations: Like our example

  2. API Calls: Fetching data from external services

  3. Database Queries: Reading/writing data

  4. File Operations: Reading files, processing documents

  5. Complex Workflows: Multi-step processes requiring multiple tools
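
The same @tool pattern covers all of these. As a sketch, a tool that wraps an external API might look like the following (the endpoint URL and response fields are hypothetical placeholders, not a real service):

import requests
from langchain_core.tools import tool

@tool
def get_weather(city: str) -> str:
    """Get the current weather for a city."""
    # Hypothetical endpoint - replace with a real weather API and your own parsing
    resp = requests.get("https://api.example.com/weather", params={"city": city}, timeout=10)
    resp.raise_for_status()
    data = resp.json()
    return f"{city}: {data.get('summary', 'no data')}"

Registering it works exactly as before: append it to the tools list passed to the LM Invoker, and the execution loop handles it like any other tool.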
