Tongyi Deep Research
Audience: Developers
Tongyi Deep Research Provider
Overview
Tongyi Deep Research is an iterative deep-research agent built on a multi-turn ReAct (Reasoning and Acting) paradigm. It is designed to emulate the cognitive workflow of a human expert by breaking complex research tasks into discrete rounds of reasoning, tool use, and synthesis.
What is Tongyi Deep Research?
Tongyi Deep Research is based on the DeepResearch framework from Alibaba-NLP/DeepResearch. It implements an iterative research approach where the agent:
Decomposes Problems: Breaks down complex questions into manageable sub-problems
Iterates Through Rounds: Conducts multiple rounds of research, each building on previous findings
Uses Tools Strategically: Leverages web search and page fetching tools to gather information
Synthesizes Results: Combines findings from multiple sources into coherent answers
Provides Evidence: Returns evidence-backed responses with proper citations
Monitored Sources
GL Open DeepResearch uses the open-source Tongyi Deep Research implementation. The following are the main references for the upstream project and related assets:
Official DeepResearch framework repository (source code)
Model card and weights on Hugging Face
Official blog post introducing Tongyi Deep Research
How It Works
Research Process
The Tongyi agent follows an iterative ReAct pattern:
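In outline, each round interleaves reasoning, an optional tool call, and an observation until the model emits its final answer. The sketch below is only illustrative; the loop structure, the tag names other than <answer></answer>, and the helper names (run_react_loop, llm_generate, call_tool, MAX_ROUNDS) are assumptions rather than the upstream implementation.

```python
# Illustrative sketch of a multi-turn ReAct loop; names and tag conventions
# other than <answer></answer> are assumptions, not the upstream implementation.
import re

MAX_ROUNDS = 20  # hypothetical cap on research rounds

def run_react_loop(query: str, llm_generate, call_tool) -> str:
    """Reason -> act -> observe, round by round, until an <answer> block appears."""
    messages = [{"role": "user", "content": query}]
    for _ in range(MAX_ROUNDS):
        reply = llm_generate(messages)                     # reasoning step (may request a tool)
        messages.append({"role": "assistant", "content": reply})
        answer = re.search(r"<answer>(.*?)</answer>", reply, re.DOTALL)
        if answer:                                         # synthesis complete
            return answer.group(1).strip()
        tool_call = re.search(r"<tool_call>(.*?)</tool_call>", reply, re.DOTALL)
        if tool_call:                                      # act: execute the requested tool
            observation = call_tool(tool_call.group(1))
            messages.append({"role": "user",
                             "content": f"<tool_response>{observation}</tool_response>"})
    return "No answer produced within the round limit."
```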
Key Components
Multi-Turn ReAct Agent: The core agent that orchestrates the research process
Tool System: Custom tools for web search and page fetching
Smart Search Integration: Uses Smart Search SDK for enhanced web search capabilities
LLM Integration: Communicates with LLM APIs for reasoning and synthesis
Research Flow
Initialization: Agent receives query and initializes with system prompt and tools
Planning: LLM generates research plan and identifies needed tools
Execution: Agent calls tools (search, fetch pages) to gather information
Synthesis: LLM analyzes gathered information and synthesizes findings
Iteration: Process repeats until sufficient information is gathered
Final Answer: Agent provides final answer wrapped in <answer></answer> tags
Integration
Integration Approach
We forked the original Tongyi DeepResearch repository from Alibaba-NLP/DeepResearch and integrated it into our project. The original repository code is located in open_source/tongyi/, while our custom modifications are organized in the gl_deep_research/packages/tongyi/ package folder.
The integration follows the Adapter pattern:
TongyiAdapter implements the OrchestratorAdapter protocol
Adapter bridges Tongyi multi-turn ReAct agent to the orchestrator
Profile-based configuration determines provider selection
Streaming support via adapter-specific postprocessors
This approach allows us to:
Maintain the original DeepResearch codebase in open_source/tongyi/
Contribute our custom changes in packages/tongyi/ without modifying the original code
Import and extend the original MultiTurnReactAgent from the forked repository
Keep our customizations separate and maintainable
Integrate seamlessly with the orchestrator system
Adapter Layer
The Tongyi provider is integrated through TongyiAdapter (research/adapter/tongyi_adapter.py):
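The adapter source is not reproduced here; the following sketch only illustrates the shape described in this section, with an assumed OrchestratorAdapter protocol and assumed method signatures.

```python
# Hedged sketch of the adapter shape; the protocol definition and method
# signatures shown here are assumptions, not the tongyi_adapter.py source.
import os
from typing import Any, Protocol

class OrchestratorAdapter(Protocol):
    async def run(self, query: str, profile: dict[str, Any]) -> dict[str, Any]: ...

class TongyiAdapter:
    """Bridges the Tongyi multi-turn ReAct agent to the orchestrator."""

    REQUIRED_ENV = ("MODEL_NAME", "LLM_BASE_URL", "LLM_API_KEY")

    def __init__(self) -> None:
        # Environment validation happens once, at adapter creation.
        missing = [v for v in self.REQUIRED_ENV if not os.getenv(v)]
        if missing:
            raise RuntimeError(f"Missing required environment variables: {missing}")

    async def run(self, query: str, profile: dict[str, Any]) -> dict[str, Any]:
        # The agent itself is created per request; see Integration Points below.
        raise NotImplementedError
```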
Initialization Process
Environment Validation: Checks for required environment variables (MODEL_NAME, LLM_BASE_URL, LLM_API_KEY)
Adapter Creation: Creates TongyiAdapter instance
Orchestrator Registration: Adapter registered with OrchestratorFactory
Adapter Ready: Adapter ready to accept research requests via orchestrator
Note: The agent (ContribMultiTurnReactAgent) is created per-request, not during adapter initialization. This ensures:
Profile-specific configuration is applied
Each request gets a fresh agent instance
Better isolation between concurrent requests
Request Flow
Request received via task or taskgroup API (POST /v1/tasks or POST /v1/taskgroup)
Request authenticated via API key (account or master key)
Profile loaded from database based on profile parameter
Orchestrator creates adapter instance via OrchestratorFactory
LLM configuration built from environment variables and profile params
Agent initialized with system prompt, LLM config, and tools
Agent executes arun() with query
Agent performs iterative research rounds
Result is formatted as DeepResearchResult
Response returned through orchestrator to router
Customizations and Changes
Package Location
The Tongyi integration code is located in gl_deep_research/packages/tongyi/, which contains our customizations of the original DeepResearch framework. The original repository code from Alibaba-NLP/DeepResearch is maintained in open_source/tongyi/, and our package imports and extends components from there.
Key Changes Made
1. ContribMultiTurnReactAgent (agent.py)
Purpose: Modified version of the original MultiTurnReactAgent from the DeepResearch repository.
Changes:
Custom LLM Integration: Modified call_server() to work with custom LLM APIs (not just OpenAI)
Token Counting: Added count_tokens() method that calls the tokenization endpoint
Error Handling: Enhanced retry logic with exponential backoff
Async Support: Full async/await support for tool calls
Path Management: Automatic path setup for importing from open_source/tongyi/inference
Key Methods:
call_server(): Custom LLM API calling with retry logic
count_tokens(): Token counting via tokenization endpoint
arun(): Async execution of research workflow
custom_call_tool(): Async tool execution
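As an illustration of the retry behavior, here is a minimal sketch of a call_server()-style helper. The endpoint path, payload fields, and httpx usage are assumptions rather than the actual agent code.

```python
# Hedged sketch of call_server()-style retry logic (assumed payload/endpoint shape).
import asyncio
import os
import httpx

MAX_RETRIES = 5
MAX_BACKOFF_SECONDS = 30  # the integration caps backoff at 30 seconds between retries

async def call_server(messages: list[dict]) -> str:
    """Call an OpenAI-compatible chat endpoint with exponential backoff."""
    url = f"{os.environ['LLM_BASE_URL']}/chat/completions"
    headers = {"Authorization": f"Bearer {os.environ['LLM_API_KEY']}"}
    payload = {"model": os.environ["MODEL_NAME"], "messages": messages}

    for attempt in range(MAX_RETRIES):
        try:
            async with httpx.AsyncClient(timeout=600) as client:  # 10-minute per-call timeout
                response = await client.post(url, json=payload, headers=headers)
                response.raise_for_status()
                return response.json()["choices"][0]["message"]["content"]
        except (httpx.HTTPError, KeyError):
            if attempt == MAX_RETRIES - 1:
                raise
            await asyncio.sleep(min(2 ** attempt, MAX_BACKOFF_SECONDS))
    raise RuntimeError("unreachable")
```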
2. ContribTool Base Class (tool.py)
Purpose: Bridge between blocking and async tool interfaces.
Changes:
Async-First Design: Tools implement acall() for async execution
Blocking Bridge: call() method bridges to async using asyncio.run()
Event Loop Handling: Proper handling when already in an event loop
Implementation:
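The original implementation block is not reproduced here; the following is a hedged reconstruction of the bridging idea described above, with assumed class and method bodies.

```python
# Hedged reconstruction of the async/blocking bridge described above.
import asyncio
import concurrent.futures

class ContribTool:
    """Async-first tool base class with a blocking call() bridge."""

    async def acall(self, params: dict) -> str:
        # Subclasses implement the real tool logic here.
        raise NotImplementedError

    def call(self, params: dict) -> str:
        # Bridge for callers that expect a blocking interface.
        try:
            asyncio.get_running_loop()
        except RuntimeError:
            # No running loop: safe to drive the coroutine with asyncio.run().
            return asyncio.run(self.acall(params))
        # Already inside an event loop: run the coroutine on a separate thread
        # so we neither block nor re-enter the current loop.
        with concurrent.futures.ThreadPoolExecutor(max_workers=1) as pool:
            return pool.submit(asyncio.run, self.acall(params)).result()
```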
3. Smart Search Tools (tool_smart_search.py)
Purpose: Integration with Smart Search SDK for enhanced web search capabilities.
Tools Implemented:
WebSearch: Performs web searches with multiple result types
Supports snippets, keypoints, and summary result types
Batch query support
Configurable result size
WebSearchMap: Maps website structure
Discovers URL structure
Supports pagination and filtering
Subdomain inclusion options
FetchWebPage: Fetches web page content
Returns raw HTML or cleaned text
Uses Smart Search SDK for reliable fetching
WebPageSnippets: Extracts relevant snippets from pages
Query-based snippet extraction
Paragraph or sentence style options
WebPageKeypoints: Extracts key points from pages
Focused topic extraction
Summarized highlights
Key Features:
All tools use Smart Search SDK's WebSearchClient
Async authentication and API calls
Formatted output for agent consumption
Error handling with informative messages
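As a hedged illustration of how one of these tools could wrap the SDK, the sketch below assumes a WebSearchClient.search() coroutine and a simple result shape; the SDK import path, method signature, response fields, and environment variable names are not verified against the actual Smart Search SDK.

```python
# Hedged sketch of a WebSearch-style tool; the smart_search_sdk import path,
# WebSearchClient.search() signature, and result fields are assumptions.
import os

from smart_search_sdk import WebSearchClient  # assumed SDK entry point

class WebSearch:  # in the real package this extends the ContribTool base class
    name = "web_search"
    description = "Search the web and return snippets, keypoints, or a summary."

    async def acall(self, params: dict) -> str:
        client = WebSearchClient(
            base_url=os.environ["SMART_SEARCH_BASE_URL"],  # assumed variable names
            token=os.environ["SMART_SEARCH_TOKEN"],
        )
        try:
            results = await client.search(
                queries=params.get("queries", []),              # batch query support
                result_type=params.get("result_type", "snippets"),
                size=params.get("size", 10),                    # configurable result size
            )
        except Exception as exc:  # return an informative error message to the agent
            return f"web_search failed: {exc}"
        # Format output for agent consumption: one block per result.
        return "\n\n".join(f"{r['title']} ({r['url']})\n{r['content']}" for r in results)
```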
4. System Prompt (prompt.py)
Purpose: Custom system prompt optimized for deep research tasks.
Changes:
Research-Focused: Prompt emphasizes thorough, multi-source investigation
Tool Descriptions: Includes detailed tool function signatures
Answer Format: Specifies <answer></answer> tag requirements
Date Context: Includes current date for temporal awareness
5. Constants (constant.py)
Purpose: Centralized configuration for Smart Search SDK.
Changes:
Environment variable loading for Smart Search credentials
Base URL and token configuration
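A hedged sketch of what this centralization might look like; the exact variable names are assumptions.

```python
# Hedged sketch of constant.py's role; the variable names are assumptions.
import os

SMART_SEARCH_BASE_URL = os.getenv("SMART_SEARCH_BASE_URL", "")
SMART_SEARCH_TOKEN = os.getenv("SMART_SEARCH_TOKEN", "")
```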
Integration Points
Adapter Integration
The TongyiAdapter integrates all components. The agent is created per-request in the run() method:
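A hedged sketch of that per-request creation is shown below; the import paths and the ContribMultiTurnReactAgent constructor arguments are assumptions about the package layout.

```python
# Hedged sketch of per-request agent creation in TongyiAdapter.run();
# import paths and constructor arguments are assumptions about the package layout.
from gl_deep_research.packages.tongyi.agent import ContribMultiTurnReactAgent
from gl_deep_research.packages.tongyi.prompt import build_system_prompt          # hypothetical helper
from gl_deep_research.packages.tongyi.tool_smart_search import FetchWebPage, WebSearch

class TongyiAdapter:
    async def run(self, query: str, profile: dict) -> dict:
        llm_config = self._build_llm_config(profile)   # env vars + profile params (see Configuration)
        tools = [WebSearch(), FetchWebPage()]          # Smart Search tools
        agent = ContribMultiTurnReactAgent(            # fresh agent per request
            llm=llm_config,
            system_message=build_system_prompt(tools),
            function_list=tools,
        )
        answer = await agent.arun(query)
        return {"answer": answer}                      # formatted as DeepResearchResult downstream

    def _build_llm_config(self, profile: dict) -> dict:
        ...  # merge MODEL_NAME / LLM_BASE_URL / LLM_API_KEY with profile params
```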
Path Management
The agent automatically sets up import paths:
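The original snippet is not reproduced here; it does something along these lines (the directory depth and exact code are assumptions).

```python
# Hedged sketch: make the forked repository importable before importing react_agent.
import sys
from pathlib import Path

# Repository layout assumed: <repo_root>/open_source/tongyi/inference contains react_agent.
_INFERENCE_DIR = Path(__file__).resolve().parents[3] / "open_source" / "tongyi" / "inference"
if str(_INFERENCE_DIR) not in sys.path:
    sys.path.insert(0, str(_INFERENCE_DIR))

from react_agent import MultiTurnReactAgent  # noqa: E402  (import after path setup)
```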
This allows importing react_agent from the original DeepResearch repository.
Configuration
Required Environment Variables
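At minimum, the adapter requires MODEL_NAME, LLM_BASE_URL, and LLM_API_KEY (validated during adapter initialization). The Smart Search tools additionally read the Smart Search base URL and token described under Constants, and MAX_LLM_CALL_PER_RUN caps the number of LLM calls per run.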
LLM Configuration
The adapter builds LLM configuration from environment variables and profile parameters:
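A hedged sketch of that merge follows; the key names in the resulting dictionary and the default values are assumptions.

```python
# Hedged sketch: environment variables supply connection details,
# profile parameters override generation settings. Key names are assumptions.
import os

def build_llm_config(profile_params: dict) -> dict:
    config = {
        "model": os.environ["MODEL_NAME"],
        "base_url": os.environ["LLM_BASE_URL"],
        "api_key": os.environ["LLM_API_KEY"],
        # Illustrative defaults; profile parameters take precedence.
        "temperature": 0.6,
        "max_tokens": 8192,
    }
    config.update(profile_params or {})
    return config
```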
Usage
Basic Usage
Use the task or taskgroup API with a profile. The profile determines the provider and configuration:
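For example, a taskgroup request might look like the following sketch using the requests library; the deployment URL, authorization header, and profile name are placeholders rather than documented values.

```python
# Hedged usage sketch; the base URL, header name, and profile value are assumptions.
import requests

BASE_URL = "http://localhost:8000"  # assumed deployment URL

response = requests.post(
    f"{BASE_URL}/v1/taskgroup",
    headers={"Authorization": "Bearer <account-or-master-key>"},  # assumed header format
    json={
        "query": "What are the latest developments in small language models?",
        "profile": "tongyi-deep-research",  # hypothetical profile name
    },
    timeout=30,
)
response.raise_for_status()
taskgroup = response.json()
print(taskgroup["taskgroup_id"], taskgroup["tasks"])  # use these to stream or poll
```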
Use the returned taskgroup_id and tasks to stream (GET /v1/taskgroup/{id}/stream) or poll for status and result (GET /v1/tasks/{id}). See Quick Start Guide.
Request Format
A taskgroup request takes query and profile (as form data or JSON, per the API contract). A single task takes the same fields via POST /v1/tasks.
Note: Profile-specific options (like max_depth, timeout_seconds, etc.) are configured in the profile itself, not in the request. See Research Profiles for more information.
Response Format
Technical Details
Execution Limits
Max Execution Time: 150 minutes (9000 seconds)
Max Context Tokens: 110,000 tokens
Max LLM Calls: Configurable via MAX_LLM_CALL_PER_RUN
API Timeout: 10 minutes per LLM call
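For orientation, these limits could be expressed as configuration constants along these lines; the names and the MAX_LLM_CALL_PER_RUN default are assumptions.

```python
# Hedged sketch only; constant names and the MAX_LLM_CALL_PER_RUN default are assumptions.
import os

MAX_EXECUTION_SECONDS = 9000                                           # 150 minutes per run
MAX_CONTEXT_TOKENS = 110_000                                           # context budget
MAX_LLM_CALL_PER_RUN = int(os.getenv("MAX_LLM_CALL_PER_RUN", "100"))   # configurable cap (default assumed)
LLM_CALL_TIMEOUT_SECONDS = 600                                         # 10 minutes per LLM call
```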
Error Handling
Retry Logic: Exponential backoff with max 30 seconds between retries
Token Limit Handling: Automatic context trimming when limit reached
Tool Errors: Graceful error messages returned to agent
Timeout Handling: Research marked as failed if timeout exceeded
Performance Considerations
Async Operations: All tool calls are async for better concurrency
Context Management: Automatic token counting and context trimming
Result Caching: Tool results cached in conversation context
Iterative Refinement: Multiple rounds allow for deeper research