
Programmatic Tool Calling

Enable AI agents to orchestrate multiple tool calls through code execution, reducing context pollution and improving efficiency for complex multi-step workflows. This guide shows how to use Programmatic Tool Calling (PTC) to let agents chain tool calls programmatically in a sandboxed environment.


PTC currently supports local runs via agent.run(local=True). Remote execution support is coming soon. The SDK automatically handles sandbox lifecycle management and cleanup after each run.

Overview

Programmatic Tool Calling (PTC) enables agents to orchestrate tools through Python code rather than through individual API round-trips. Instead of requesting tools one at a time with each result entering the agent's context, the agent writes code that calls multiple tools, processes their outputs programmatically, and controls what information enters its context window.

Traditional tool calling creates two fundamental problems as workflows become more complex:

  • Context pollution from intermediate results: When processing large datasets (10MB log files, database queries, API responses), all intermediate data enters the agent's context window, consuming massive token budgets and potentially pushing important information out of context.

  • Inference overhead and manual synthesis: Sequential tool orchestration requires multiple model inference passes. The agent must parse results, compare values, and synthesize conclusions in natural language, which is both slow and error-prone.

PTC solves these problems by letting agents express orchestration logic in Python code. Loops, conditionals, data transformations, and error handling become explicit in code rather than implicit in the agent's reasoning.

Key Features

  • Code-Based Orchestration: Agents write Python code to chain multiple tool calls

  • Context Window Protection: Intermediate results stay in the sandbox, only final outputs reach the agent

  • Parallel Execution: Run multiple tool calls concurrently using asyncio.gather

  • Tool Integration: Works seamlessly with MCP tools

  • Automatic Cleanup: Sandbox resources are released after each run

  • Sandboxed Execution: Code runs in a secure E2B environment

Installation

PTC requires the local runner dependencies and an E2B API key for sandbox execution.
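For example, a minimal setup sketch (the extras name matches the install command in Troubleshooting below; the placeholder key is yours to supply):

```python
# Install the local runner extras first (run in your shell):
#   pip install "glaip-sdk[local]"

# Then provide the E2B API key, either exported in your shell or set in code:
import os

os.environ.setdefault("E2B_API_KEY", "<your-e2b-api-key>")
```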


Quick Start

Basic PTC Setup
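A minimal end-to-end sketch. The import path and the constructor arguments other than ptc are assumptions for illustration; check the SDK reference for the exact names:

```python
from glaip_sdk import Agent, PTC  # assumed import path

agent = Agent(
    name="expense-analyst",                        # hypothetical agent name
    instruction="Answer questions about team travel expenses.",
    tools=[...],                                   # your MCP tools go here
    ptc=PTC(enabled=True, sandbox_timeout=120.0),  # enable Programmatic Tool Calling
)

# PTC currently supports local runs; the SDK manages the sandbox lifecycle and cleanup.
result = agent.run("Which team members exceeded their Q3 travel budget?", local=True)
print(result)
```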

Example: Processing Team Expenses

Consider a common task: "Which team members exceeded their Q3 travel budget?"

With three MCP tools:

  • get_team_members(department) - Returns team member list with IDs

  • get_expenses(user_id, quarter) - Returns expense line items

  • get_budget_by_level(level) - Returns budget limits

Without PTC (traditional approach):

  • Fetch 20 team members → 20 tool calls for expenses

  • Each returns 50-100 line items (2,000+ expenses total)

  • All data enters agent context (200KB+)

  • Agent manually sums expenses, compares against budgets

  • Multiple inference passes required

With PTC:

The agent writes orchestration code that runs in the sandbox:
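An illustrative sketch of what the agent might write. It assumes the MCP tools are exposed as awaitable functions inside the sandbox and that response fields such as id, level, amount, and travel_limit exist; the real binding and schemas may differ:

```python
import asyncio

async def main():
    members = await get_team_members("engineering")          # e.g. 20 people

    # Fetch every member's Q3 expenses concurrently; raw line items stay in the sandbox.
    all_expenses = await asyncio.gather(
        *[get_expenses(m["id"], "Q3") for m in members]
    )

    over_budget = []
    for member, items in zip(members, all_expenses):
        budget = await get_budget_by_level(member["level"])
        total = sum(item["amount"] for item in items)
        if total > budget["travel_limit"]:
            over_budget.append(
                f"{member['name']}: ${total:,.0f} (limit ${budget['travel_limit']:,.0f})"
            )

    # Only this summary enters the agent's context.
    print("\n".join(over_budget) or "No one exceeded their Q3 travel budget.")

asyncio.run(main())
```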

Results:

  • Agent context receives only the final result (2-3 people who exceeded budgets)

  • Data entering the agent's context drops from ~200KB to ~1KB

  • Multiple inference passes reduced to code execution

  • Parallel execution reduces latency

How PTC Works

1. Agent Writes Orchestration Code

When PTC is enabled and the agent needs to orchestrate multiple tools, it uses the execute_ptc_code tool (automatically available) to generate Python code:
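A condensed sketch of what that generated code might look like for the expense task above (the sandbox tool binding is an assumption; the aggregation details are elided):

```python
import asyncio

async def main():
    members = await get_team_members("engineering")
    expenses = await asyncio.gather(*[get_expenses(m["id"], "Q3") for m in members])
    summary = "..."  # totals compared against get_budget_by_level(...) results (elided)
    print(summary)   # only this printed output reaches the agent

asyncio.run(main())
```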

2. Code Executes in E2B Sandbox

The code runs in a secure E2B sandbox environment. When the code calls tools, the sandbox pauses and requests tool execution from the API.

3. Tool Results Stay in Sandbox

Tool results are returned to the sandbox environment and processed by the Python code—they do not enter the agent's context window.

4. Final Output Returns to Agent

Only the code's final output (via print() or return value) is sent back to the agent's context:
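For the expense example, the agent might receive only a short summary such as (values are hypothetical):

```
Alice Chen: $4,820 (limit $4,000)
Raj Patel: $5,150 (limit $4,000)
Mia Torres: $4,430 (limit $4,000)
```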

The agent sees only the summary, not the thousands of intermediate expense records.

Configuration

PTC Class
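A configuration sketch covering the options documented below (the import path is an assumption):

```python
from glaip_sdk import PTC  # assumed import path

ptc = PTC(
    enabled=True,              # required; when False, all other fields are ignored
    sandbox_timeout=180.0,     # seconds; default is 120.0
    prompt={
        "mode": "auto",        # "auto" | "minimal" | "index" | "full"
        "auto_threshold": 10,  # switch to "minimal" above this tool count
        "include_example": True,
    },
)
```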

Configuration Options:

  • enabled (bool, required): Must be True to activate PTC. When False, all other fields are ignored.

  • sandbox_timeout (float, optional): Maximum execution time for sandbox code in seconds. Default: 120.0

  • prompt (dict, optional): Customize the PTC prompt configuration

    • mode: "auto" (default), "minimal", "index", or "full"

    • auto_threshold: Tool count threshold for auto mode (default: 10)

    • include_example: Whether to include example code in the prompt

Prompt Modes:

  • "auto" (default): Automatically selects "minimal" if tools > auto_threshold (10), otherwise "full"

  • "minimal": Shows only package list with discovery helper (tools.ptc_helper)

  • "index": Shows tool names grouped by package with discovery helper

  • "full": Shows complete tool signatures with descriptions

Agent Integration

PTC is configured via the ptc parameter on the Agent:
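A minimal sketch, with the remaining constructor arguments elided:

```python
agent = Agent(
    ...,                        # name, instruction, MCP tools, etc.
    ptc=PTC(enabled=True),      # PTC settings cannot be overridden at run time
)

result = agent.run("Summarize this week's error logs.", local=True)
```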

Constraints and Limitations

  • PTC currently runs locally only, via agent.run(local=True); remote execution support is coming soon.

  • An E2B API key (E2B_API_KEY) must be set for sandbox execution.

  • PTC can be configured only through the Agent's ptc parameter; runtime overrides are not supported.

  • Sandbox code execution is bounded by sandbox_timeout (default: 120 seconds).

When to Use PTC vs Other Techniques

Use PTC When:

Processing large datasets with minimal relevant output:

  • Example: Processing 10MB log file to extract 3 error patterns

  • Without PTC: 10MB enters agent context

  • With PTC: Only error summary (~1KB) enters context

Running multi-step workflows with 3+ dependent tool calls:

  • Example: Fetch data → Filter → Aggregate → Compare → Report

  • Benefit: Reduces round-trips and keeps intermediate data out of context

Parallel operations across many items:

  • Example: Check health of 50 endpoints, aggregate results

  • Benefit: Runs checks concurrently, only returns summary

Data transformation before agent sees results:

  • Example: Fetch raw DB records → Normalize → Deduplicate → Format

  • Benefit: Agent sees clean final output, not raw records

Filtering or aggregating tool outputs:

  • Example: Fetch 1000 records → Filter by criteria → Return 10 matches

  • Without PTC: All 1000 records enter context

  • With PTC: Only 10 filtered results enter context

Use Traditional Tool Calling When:

Making simple single-tool invocations:

  • Example: Get current weather for a city

  • Reason: PTC overhead not justified for single lookup

Agent needs to reason about intermediate results:

  • Example: "Analyze this error message and decide next debugging step"

  • Reason: Agent should see the error to make informed decisions

Working with small, relevant datasets:

  • Example: Fetch user profile (~100 bytes)

  • Reason: All data is relevant, no filtering needed

Quick lookups with small responses:

  • Example: Dictionary lookup, simple API call

  • Reason: PTC adds unnecessary execution overhead

Exploratory workflows where agent needs full context:

  • Example: "Review these 3 documents and compare themes"

  • Reason: Agent needs to see all content to reason effectively

Real-World Performance: Tested Demo Scenario

To demonstrate PTC's real-world impact, we tested a common multi-tool workflow: fetching content from Google Drive and sending it via email. This scenario represents a typical use case where an agent needs to chain multiple tool calls together.

Demo Scenario

Task: Get markdown content from a Google Drive file and send it via email.

Tools Used:

  • google_drive_get_markdown_content - Retrieves file content from Google Drive

  • google_mail_send_email - Sends email with the content

Test Setup:

  • Same task executed with PTC enabled and disabled

  • Measured: execution time, token usage, tool calls

  • Model: GPT-5.2

  • Environment: Local execution with MCP tools

Execution Flow Comparison


Detailed Metrics

| Metric | Without PTC | With PTC | Improvement |
| --- | --- | --- | --- |
| Total Time | 148.85s | 30.22s | 79.7% ↓ |
| Input Tokens | 36,780 | 5,461 | 85.1% ↓ |
| Output Tokens | 10,250 | 396 | 96.1% ↓ |
| Total Tokens | 47,030 | 5,857 | 87.5% ↓ |
| Tool Calls | 2 | 1 | 50% ↓ |

Why PTC Made a Difference

Without PTC:

  1. Agent makes first inference to understand task

  2. Calls google_drive_get_markdown_content → Large content enters context

  3. Second inference with full content in context (high token cost)

  4. Calls google_mail_send_email with content

  5. Third inference to synthesize response

  6. Multiple round-trips and context pollution

With PTC:

  1. Agent generates orchestration code via execute_ptc_code

  2. Code executes in sandbox:

    • Fetches Drive content (stays in sandbox)

    • Sends email with content (stays in sandbox)

    • Returns only status/confirmation

  3. Agent processes minimal output for final response

  4. Reduced context pollution, no intermediate data

Key Takeaways

  • 5x faster execution: PTC reduced total time from 149s to 30s

  • 8x fewer tokens: Token usage dropped from 47K to 5.8K tokens

  • Eliminated context pollution: Large Drive content never entered agent context

  • Reduced API costs: 87.5% reduction in tokens = proportional cost savings

  • Simpler orchestration: Single code execution handles the entire workflow


Usage Patterns


The code examples below show what the agent might generate when using PTC. You configure the agent and give it a task—the agent writes the orchestration code automatically.

Pattern 1: Parallel Data Collection

Fetch data from multiple sources concurrently:

Example agent-generated code:
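A sketch of such code; the tool name check_endpoint_health and its response fields are hypothetical, and the sandbox tool binding may differ:

```python
import asyncio

async def main():
    endpoints = ["auth", "billing", "search", "inventory"]

    # Run all health checks concurrently instead of one round-trip per endpoint.
    results = await asyncio.gather(
        *[check_endpoint_health(name) for name in endpoints]
    )

    down = [name for name, r in zip(endpoints, results) if r["status"] != "ok"]
    print(f"{len(endpoints) - len(down)}/{len(endpoints)} healthy; down: {down or 'none'}")

asyncio.run(main())
```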

Pattern 2: Filter and Aggregate

Process large datasets and return only relevant summaries:

Example agent-generated code:
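A sketch with a hypothetical fetch_records tool and field names:

```python
import asyncio

async def main():
    # The full record set never enters the agent's context.
    records = await fetch_records(source="orders", limit=1000)

    # Keep only high-value failed orders and return a compact summary.
    matches = [r for r in records if r["status"] == "failed" and r["amount"] > 500]
    total = sum(r["amount"] for r in matches)

    print(f"{len(matches)} failed high-value orders, totalling ${total:,.2f}")
    for r in matches[:10]:
        print(f"- {r['id']}: ${r['amount']:,.2f}")

asyncio.run(main())
```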

Pattern 3: Multi-Step Data Pipeline

Chain multiple operations without context pollution:

Example agent-generated code:
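A sketch with hypothetical fetch_db_records and format_report tools:

```python
import asyncio

async def main():
    raw = await fetch_db_records(table="contacts")

    # Normalize and deduplicate before anything reaches the agent.
    normalized = [{**r, "email": r["email"].strip().lower()} for r in raw]
    deduped = list({r["email"]: r for r in normalized}.values())

    report = await format_report(deduped)
    print(report[:2000])  # cap what enters the agent's context

asyncio.run(main())
```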

Error Handling

Missing E2B API Key

Solution: Set your E2B API key before running:
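For example, from Python (or export E2B_API_KEY in your shell):

```python
import os

# Must be set before agent.run(local=True); without it, sandbox startup fails.
os.environ["E2B_API_KEY"] = "<your-e2b-api-key>"
```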

Sandbox Timeout

Solution: Increase sandbox_timeout for long-running operations:
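For example (assuming the PTC import shown earlier):

```python
# Allow up to five minutes of sandbox execution (the default is 120 seconds).
ptc = PTC(enabled=True, sandbox_timeout=300.0)
```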

Runtime Override Attempts

Solution: Configure PTC only via Agent.ptc parameter.

Best Practices

Specifying Tool Response Formats

For tools with nested or complex response structures, include the format in your agent instruction to help the agent generate correct code:
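A hedged illustration; the tool name and response fields are hypothetical:

```python
agent = Agent(
    ...,
    instruction=(
        "When calling get_expenses, the response is a list of objects with "
        "'id', 'amount' (float, USD), 'category', and 'date' (ISO 8601). "
        "Sum 'amount' per user before comparing against budget limits."
    ),
    ptc=PTC(enabled=True),
)
```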

Performance Optimization

  1. Use parallel execution for independent operations (asyncio.gather rather than sequential awaits)

  2. Filter data early to minimize downstream processing

  3. Return only essential data to the agent (all three points are illustrated in the sketch below)
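A single sketch touching all three points; the fetch_sales tool and its fields are hypothetical:

```python
import asyncio

async def main():
    regions = ["us", "eu", "apac"]

    # 1. Parallel execution for independent fetches.
    datasets = await asyncio.gather(*[fetch_sales(region) for region in regions])

    # 2. Filter early so later steps handle less data.
    recent = [row for data in datasets for row in data if row["year"] == 2024]

    # 3. Return only the essential aggregate, not the raw rows.
    total = sum(row["revenue"] for row in recent)
    print(f"2024 revenue across {len(regions)} regions: ${total:,.0f}")

asyncio.run(main())
```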

Code Quality in PTC

  1. Handle errors gracefully so one failing tool call does not abort the entire run

  2. Use structured output formats (for example, JSON) so the agent can parse results reliably

  3. Set reasonable timeouts for individual steps (see the sketch below)
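An illustrative sketch of agent-generated code applying these points (check_service is a hypothetical tool):

```python
import asyncio
import json

async def main():
    services = ["auth", "billing", "search"]
    results = {}

    for name in services:
        try:
            # A per-call timeout keeps one slow service from consuming the sandbox budget.
            status = await asyncio.wait_for(check_service(name), timeout=10)
            results[name] = status["state"]
        except asyncio.TimeoutError:
            results[name] = "timeout"
        except Exception as exc:
            results[name] = f"error: {exc}"

    # Structured JSON output is easy for the agent to read back.
    print(json.dumps(results))

asyncio.run(main())
```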

Security Considerations

  1. Validate tool outputs before processing them

  2. Avoid exposing sensitive data (tokens, credentials, raw PII) in the code's printed output

  3. Use sandbox_timeout to prevent runaway code (the first two points are sketched below)
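A sketch of the first two points in agent-generated code (fetch_users is hypothetical); the third is enforced by sandbox_timeout on the PTC object:

```python
import asyncio

async def main():
    users = await fetch_users(team="platform")

    # 1. Validate the shape of the tool output before trusting it.
    if not isinstance(users, list):
        print("Unexpected response from fetch_users; aborting.")
        return

    # 2. Print only non-sensitive fields; never echo tokens, credentials, or raw PII.
    roles = sorted({u.get("role", "unknown") for u in users})
    print(f"{len(users)} users found; roles: {', '.join(roles)}")

asyncio.run(main())
```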

Troubleshooting

Common Issues

"PTC module not found"

  • Install local dependencies: pip install glaip-sdk[local]

"E2B_API_KEY not set"

"execute_ptc_code tool not available"

  • Ensure PTC is enabled: ptc=PTC(enabled=True)

  • Verify you're running locally: agent.run(local=True)

  • Check that tools are configured

"Sandbox execution failed"

  • Check E2B service status

  • Verify network connectivity

  • Review code for syntax errors in agent-generated code

"Code timeout exceeded"

  • Increase sandbox_timeout in PTC config

  • Optimize code for faster execution

  • Consider splitting into smaller operations

Debugging PTC Execution

Enable detailed logging:
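Assuming the SDK uses Python's standard logging module (the logger name below is a guess; adjust to whatever the SDK actually uses):

```python
import logging

logging.basicConfig(level=logging.DEBUG)
logging.getLogger("glaip_sdk").setLevel(logging.DEBUG)  # assumed logger name
```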

Then check the sandbox output in the logs to review the agent-generated code and its printed result.

API Reference

Core Classes

PTC - Configuration object for Programmatic Tool Calling.

Properties:

  • enabled: Boolean flag to activate/deactivate PTC

  • sandbox_timeout: Maximum execution time for sandbox code

  • prompt: Dictionary with mode ("auto" | "minimal" | "index" | "full"), auto_threshold (int), and include_example (bool)

Automatic Tool Registration

When PTC is enabled, the execute_ptc_code tool is automatically registered and available to the agent. This tool allows the agent to execute Python code in the E2B sandbox with access to all configured tools.
