Tongyi Deep Research

Audience: Developers

Tongyi Deep Research Provider

Overview

Tongyi Deep Research is an iterative deep-research agent built on a multi-turn ReAct (Reasoning and Acting) paradigm. It is designed to emulate the sophisticated cognitive workflow of human experts by breaking complex research tasks into discrete rounds of reasoning, tool use, and synthesis.

What is Tongyi Deep Research?

Tongyi Deep Research is based on the DeepResearch framework from Alibaba-NLP/DeepResearch. It implements an iterative research approach in which the agent:

  1. Decomposes Problems: Breaks down complex questions into manageable sub-problems

  2. Iterates Through Rounds: Conducts multiple rounds of research, each building on previous findings

  3. Uses Tools Strategically: Leverages web search and page fetching tools to gather information

  4. Synthesizes Results: Combines findings from multiple sources into coherent answers

  5. Provides Evidence: Returns evidence-backed responses with proper citations

Monitored Sources

GL Open DeepResearch uses the open-source Tongyi Deep Research implementation. The following are the main references for the upstream project and related assets:

  • Official DeepResearch framework repository (source code)

  • Model card and weights on Hugging Face

  • Official blog post introducing Tongyi Deep Research

How It Works

Research Process

The Tongyi agent follows an iterative ReAct pattern: it reasons about the next step, acts by calling a tool, observes the result, and repeats until it can produce a final answer.

Key Components

  1. Multi-Turn ReAct Agent: The core agent that orchestrates the research process

  2. Tool System: Custom tools for web search and page fetching

  3. Smart Search Integration: Uses Smart Search SDK for enhanced web search capabilities

  4. LLM Integration: Communicates with LLM APIs for reasoning and synthesis

Research Flow

  1. Initialization: Agent receives query and initializes with system prompt and tools

  2. Planning: LLM generates research plan and identifies needed tools

  3. Execution: Agent calls tools (search, fetch pages) to gather information

  4. Synthesis: LLM analyzes gathered information and synthesizes findings

  5. Iteration: Process repeats until sufficient information is gathered

  6. Final Answer: Agent provides final answer wrapped in <answer></answer> tags
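The research flow above can be sketched as a loop. This is a minimal illustration, not the actual DeepResearch implementation: the function names, message format, and `<tool_call>` convention are assumptions; only the `<answer></answer>` wrapping comes from this document.

```python
import re

def react_loop(query, call_llm, call_tool, max_llm_calls=10):
    """Hypothetical sketch of the iterative ReAct research flow."""
    messages = [{"role": "user", "content": query}]
    for _ in range(max_llm_calls):
        reply = call_llm(messages)  # planning + reasoning, may request a tool
        messages.append({"role": "assistant", "content": reply})
        answer = re.search(r"<answer>(.*?)</answer>", reply, re.DOTALL)
        if answer:  # synthesis complete: final answer wrapped in <answer> tags
            return answer.group(1).strip()
        tool = re.search(r"<tool_call>(.*?)</tool_call>", reply, re.DOTALL)
        if tool:  # execute the requested tool and feed the observation back
            observation = call_tool(tool.group(1))
            messages.append({"role": "user",
                             "content": f"<tool_response>{observation}</tool_response>"})
    return None  # call budget exhausted without a final answer
```

Each pass through the loop is one research round; the conversation history carries earlier findings forward, which is what lets later rounds build on previous ones.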

Integration

Integration Approach

We forked the original Tongyi DeepResearch repository from Alibaba-NLP/DeepResearch and integrated it into our project. The original repository code is located in open_source/tongyi/, while our custom modifications are organized in the gl_deep_research/packages/tongyi/ package folder.

The integration follows the Adapter pattern:

  • TongyiAdapter implements the OrchestratorAdapter protocol

  • Adapter bridges Tongyi multi-turn ReAct agent to the orchestrator

  • Profile-based configuration determines provider selection

  • Streaming support via adapter-specific postprocessors

This approach allows us to:

  • Maintain the original DeepResearch codebase in open_source/tongyi/

  • Contribute our custom changes in packages/tongyi/ without modifying the original code

  • Import and extend the original MultiTurnReactAgent from the forked repository

  • Keep our customizations separate and maintainable

  • Integrate seamlessly with the orchestrator system

Adapter Layer

The Tongyi provider is integrated through TongyiAdapter (research/adapter/tongyi_adapter.py):

Initialization Process

  1. Environment Validation: Checks for required environment variables (MODEL_NAME, LLM_BASE_URL, LLM_API_KEY)

  2. Adapter Creation: Creates TongyiAdapter instance

  3. Orchestrator Registration: Adapter registered with OrchestratorFactory

  4. Adapter Ready: Adapter ready to accept research requests via orchestrator

Note: The agent (ContribMultiTurnReactAgent) is created per-request, not during adapter initialization. This ensures:

  • Profile-specific configuration is applied

  • Each request gets a fresh agent instance

  • Better isolation between concurrent requests

Request Flow

  1. Request received via task or taskgroup API (POST /v1/tasks or POST /v1/taskgroup)

  2. Request authenticated via API key (account or master key)

  3. Profile loaded from database based on profile parameter

  4. Orchestrator creates adapter instance via OrchestratorFactory

  5. LLM configuration built from environment variables and profile params

  6. Agent initialized with system prompt, LLM config, and tools

  7. Agent executes arun() with query

  8. Agent performs iterative research rounds

  9. Result is formatted as DeepResearchResult

  10. Response returned through orchestrator to router

Customizations and Changes

Package Location

The Tongyi integration code is located in gl_deep_research/packages/tongyi/, which contains our customizations of the original DeepResearch framework. The original repository code from Alibaba-NLP/DeepResearch is maintained in open_source/tongyi/, and our package imports and extends components from there.

Key Changes Made

1. ContribMultiTurnReactAgent (agent.py)

Purpose: Modified version of the original MultiTurnReactAgent from DeepResearch repository.

Changes:

  • Custom LLM Integration: Modified call_server() to work with custom LLM APIs (not just OpenAI)

  • Token Counting: Added count_tokens() method that calls tokenization endpoint

  • Error Handling: Enhanced retry logic with exponential backoff

  • Async Support: Full async/await support for tool calls

  • Path Management: Automatic path setup for importing from open_source/tongyi/inference

Key Methods:

  • call_server(): Custom LLM API calling with retry logic

  • count_tokens(): Token counting via tokenization endpoint

  • arun(): Async execution of research workflow

  • custom_call_tool(): Async tool execution

2. ContribTool Base Class (tool.py)

Purpose: Bridge between blocking and async tool interfaces.

Changes:

  • Async-First Design: Tools implement acall() for async execution

  • Blocking Bridge: The call() method bridges to the async implementation using asyncio.run()

  • Event Loop Handling: Proper handling when already in an event loop

Implementation:
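A minimal sketch of the bridge, assuming the method names described above; the event-loop handling details (thread fallback when a loop is already running) are an assumption about how "proper handling" is achieved:

```python
import asyncio

class ContribTool:
    """Async-first tool base class sketch: acall() does the real work,
    call() is a blocking bridge for synchronous callers."""

    async def acall(self, params: str) -> str:
        raise NotImplementedError  # subclasses implement the async tool body

    def call(self, params: str) -> str:
        try:
            asyncio.get_running_loop()
        except RuntimeError:
            # No event loop running: asyncio.run() can drive the coroutine.
            return asyncio.run(self.acall(params))
        # Already inside a running loop: run the coroutine on a worker thread
        # so the nested asyncio.run() gets its own event loop.
        from concurrent.futures import ThreadPoolExecutor
        with ThreadPoolExecutor(max_workers=1) as pool:
            return pool.submit(asyncio.run, self.acall(params)).result()
```

Calling `asyncio.run()` directly from inside a running loop raises `RuntimeError`, which is why the running-loop case is routed through a separate thread.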

3. Smart Search Tools (tool_smart_search.py)

Purpose: Integration with Smart Search SDK for enhanced web search capabilities.

Tools Implemented:

  1. WebSearch: Performs web searches with multiple result types

    • Supports snippets, keypoints, and summary result types

    • Batch query support

    • Configurable result size

  2. WebSearchMap: Maps website structure

    • Discovers URL structure

    • Supports pagination and filtering

    • Subdomain inclusion options

  3. FetchWebPage: Fetches web page content

    • Returns raw HTML or cleaned text

    • Uses Smart Search SDK for reliable fetching

  4. WebPageSnippets: Extracts relevant snippets from pages

    • Query-based snippet extraction

    • Paragraph or sentence style options

  5. WebPageKeypoints: Extracts key points from pages

    • Focused topic extraction

    • Summarized highlights

Key Features:

  • All tools use Smart Search SDK's WebSearchClient

  • Async authentication and API calls

  • Formatted output for agent consumption

  • Error handling with informative messages
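As one illustration, here is a hedged sketch of how a tool like WebSearch might wrap the SDK client. The `client.search()` signature and result shape are assumptions about the Smart Search SDK, and in the real package the tool subclasses ContribTool:

```python
class WebSearch:
    """Hypothetical sketch of the WebSearch tool (real version subclasses
    ContribTool and uses the Smart Search SDK's WebSearchClient)."""
    name = "web_search"

    def __init__(self, client):
        self.client = client  # WebSearchClient instance (or a stub in tests)

    async def acall(self, query: str, count: int = 5) -> str:
        results = await self.client.search(query, count=count)  # assumed call
        # Format hits for agent consumption: one line per result.
        return "\n".join(f"- {r['title']}: {r['snippet']}" for r in results)
```

Injecting the client through the constructor keeps the tool testable without live Smart Search credentials.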

4. System Prompt (prompt.py)

Purpose: Custom system prompt optimized for deep research tasks.

Changes:

  • Research-Focused: Prompt emphasizes thorough, multi-source investigation

  • Tool Descriptions: Includes detailed tool function signatures

  • Answer Format: Specifies <answer></answer> tag requirements

  • Date Context: Includes current date for temporal awareness

5. Constants (constant.py)

Purpose: Centralized configuration for Smart Search SDK.

Changes:

  • Environment variable loading for Smart Search credentials

  • Base URL and token configuration

Integration Points

Adapter Integration

The TongyiAdapter integrates all components. The agent is created per-request in the run() method:
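A sketch of the per-request pattern, with a stand-in agent class; the constructor arguments and arun() signature are assumptions based on the surrounding text:

```python
class ContribMultiTurnReactAgent:
    """Stand-in for the real agent; signatures here are assumptions."""
    def __init__(self, llm, function_list, system_message):
        self.llm = llm
    async def arun(self, query: str) -> str:
        return f"<answer>stub for: {query}</answer>"

class TongyiAdapter:
    def __init__(self, llm_config, tools, system_prompt):
        self.llm_config = llm_config
        self.tools = tools
        self.system_prompt = system_prompt

    async def run(self, query: str) -> str:
        # A fresh agent per request: profile configuration is applied at
        # request time and no state is shared between concurrent requests.
        agent = ContribMultiTurnReactAgent(
            llm=self.llm_config,
            function_list=self.tools,
            system_message=self.system_prompt,
        )
        return await agent.arun(query)
```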

Path Management

The agent automatically sets up import paths:
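A sketch of the path setup; the open_source/tongyi/inference layout comes from this document, while how the repository root is located is an assumption:

```python
import sys
from pathlib import Path

def add_tongyi_inference_path(repo_root: Path) -> None:
    """Prepend the forked DeepResearch inference directory to sys.path so
    that its modules (e.g. react_agent) become importable."""
    inference_dir = repo_root / "open_source" / "tongyi" / "inference"
    if str(inference_dir) not in sys.path:
        sys.path.insert(0, str(inference_dir))

add_tongyi_inference_path(Path.cwd())
```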

This allows importing react_agent from the original DeepResearch repository.

Configuration

Required Environment Variables
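The variable names below are the ones named in this document; the values are placeholders. Smart Search additionally needs base-URL and token credentials, whose exact variable names are defined in constant.py and not listed here:

```shell
# LLM backend configuration (names from this document, values illustrative)
export MODEL_NAME="tongyi-deep-research"
export LLM_BASE_URL="https://llm.example.com/v1"
export LLM_API_KEY="<your-api-key>"
```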

LLM Configuration

The adapter builds LLM configuration from environment variables and profile parameters:

Usage

Basic Usage

Use the task or taskgroup API with a profile. The profile determines the provider and configuration:
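A hedged request sketch: the endpoint and the query/profile fields come from this document, while the host, authentication header shape, and profile name are placeholders:

```shell
curl -X POST "https://api.example.com/v1/taskgroup" \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"query": "What are the latest advances in solid-state batteries?", "profile": "tongyi-default"}'
```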

Use the returned taskgroup_id and tasks to stream (GET /v1/taskgroup/{id}/stream) or poll for status and result (GET /v1/tasks/{id}). See Quick Start Guide.

Request Format

For taskgroup: query and profile (form or JSON per API contract). For a single task: same fields via POST /v1/tasks.

Note: Profile-specific options (like max_depth, timeout_seconds, etc.) are configured in the profile itself, not in the request. See Research Profiles for more information.

Response Format

Technical Details

Execution Limits

  • Max Execution Time: 150 minutes (9000 seconds)

  • Max Context Tokens: 110,000 tokens

  • Max LLM Calls: Configurable via MAX_LLM_CALL_PER_RUN

  • API Timeout: 10 minutes per LLM call

Error Handling

  • Retry Logic: Exponential backoff with max 30 seconds between retries

  • Token Limit Handling: Automatic context trimming when limit reached

  • Tool Errors: Graceful error messages returned to agent

  • Timeout Handling: Research marked as failed if timeout exceeded
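The retry policy above can be sketched as follows; the function and parameter names are illustrative, but the behavior (exponential backoff capped at 30 seconds between attempts) matches the description:

```python
import time

def call_with_retry(fn, max_retries=5, base_delay=1.0, max_delay=30.0):
    """Retry fn() with exponential backoff, capping the delay at max_delay."""
    for attempt in range(max_retries):
        try:
            return fn()
        except Exception:
            if attempt == max_retries - 1:
                raise  # retry budget exhausted: surface the last error
            delay = min(base_delay * (2 ** attempt), max_delay)
            time.sleep(delay)
```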

Performance Considerations

  • Async Operations: All tool calls are async for better concurrency

  • Context Management: Automatic token counting and context trimming

  • Result Caching: Tool results cached in conversation context

  • Iterative Refinement: Multiple rounds allow for deeper research
