Tongyi Deep Research
Audience: Developers
Tongyi Deep Research Provider
Overview
Tongyi Deep Research is an iterative deep-research agent built on a multi-turn ReAct (Reasoning and Acting) paradigm. It is designed to emulate the cognitive workflow of a human expert by breaking complex research tasks into discrete rounds of reasoning, tool use, and synthesis.
What is Tongyi Deep Research?
Tongyi Deep Research is based on the DeepResearch framework from Alibaba-NLP/DeepResearch. It implements an iterative research approach where the agent:
Decomposes Problems: Breaks down complex questions into manageable sub-problems
Iterates Through Rounds: Conducts multiple rounds of research, each building on previous findings
Uses Tools Strategically: Leverages web search and page fetching tools to gather information
Synthesizes Results: Combines findings from multiple sources into coherent answers
Provides Evidence: Returns evidence-backed responses with proper citations
Monitored Sources
GL Open DeepResearch uses the open-source Tongyi Deep Research implementation. The following are the main references for the upstream project and related assets:
Official DeepResearch framework repository (source code)
Model card and weights on Hugging Face
Official blog post introducing Tongyi Deep Research
How It Works
Research Process
The Tongyi agent follows an iterative ReAct pattern:
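In outline, each round interleaves reasoning, an optional tool call, and an observation until the model emits its final answer. The sketch below is only illustrative; the loop structure, the tag names other than <answer></answer>, and the helper names (run_react_loop, llm_generate, call_tool, MAX_ROUNDS) are assumptions rather than the upstream implementation.

```python
# Illustrative sketch of a multi-turn ReAct loop; names and tag conventions
# other than <answer></answer> are assumptions, not the upstream implementation.
import re

MAX_ROUNDS = 20  # hypothetical cap on research rounds

def run_react_loop(query: str, llm_generate, call_tool) -> str:
    """Reason -> act -> observe, round by round, until an <answer> block appears."""
    messages = [{"role": "user", "content": query}]
    for _ in range(MAX_ROUNDS):
        reply = llm_generate(messages)                     # reasoning step (may request a tool)
        messages.append({"role": "assistant", "content": reply})
        answer = re.search(r"<answer>(.*?)</answer>", reply, re.DOTALL)
        if answer:                                         # synthesis complete
            return answer.group(1).strip()
        tool_call = re.search(r"<tool_call>(.*?)</tool_call>", reply, re.DOTALL)
        if tool_call:                                      # act: execute the requested tool
            observation = call_tool(tool_call.group(1))
            messages.append({"role": "user",
                             "content": f"<tool_response>{observation}</tool_response>"})
    return "No answer produced within the round limit."
```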
Key Components
Multi-Turn ReAct Agent: The core agent that orchestrates the research process
Tool System: Custom tools for web search and page fetching
Smart Search Integration: Uses Smart Search SDK for enhanced web search capabilities
LLM Integration: Communicates with LLM APIs for reasoning and synthesis
Research Flow
Initialization: Agent receives query and initializes with system prompt and tools
Planning: LLM generates research plan and identifies needed tools
Execution: Agent calls tools (search, fetch pages) to gather information
Synthesis: LLM analyzes gathered information and synthesizes findings
Iteration: Process repeats until sufficient information is gathered
Final Answer: Agent provides final answer wrapped in <answer></answer> tags
Integration
Integration Approach
We forked the original Tongyi DeepResearch repository from Alibaba-NLP/DeepResearch and integrated it into our project. The original repository code is located in open_source/tongyi/, while our custom modifications are organized in the gl_deep_research/packages/tongyi/ package folder.
The integration follows the Adapter pattern:
TongyiAdapter implements the OrchestratorAdapter protocol
Adapter bridges Tongyi multi-turn ReAct agent to the orchestrator
Profile-based configuration determines provider selection
Streaming support via adapter-specific postprocessors
This approach allows us to:
Maintain the original DeepResearch codebase in open_source/tongyi/
Contribute our custom changes in packages/tongyi/ without modifying the original code
Import and extend the original MultiTurnReactAgent from the forked repository
Keep our customizations separate and maintainable
Integrate seamlessly with the orchestrator system
Adapter Layer
The Tongyi provider is integrated through TongyiAdapter (research/adapter/tongyi_adapter.py):
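The adapter source is not reproduced here; the following sketch only illustrates the shape described in this section, with an assumed OrchestratorAdapter protocol and assumed method signatures.

```python
# Hedged sketch of the adapter shape; the protocol definition and method
# signatures shown here are assumptions, not the tongyi_adapter.py source.
import os
from typing import Any, Protocol

class OrchestratorAdapter(Protocol):
    async def run(self, query: str, profile: dict[str, Any]) -> dict[str, Any]: ...

class TongyiAdapter:
    """Bridges the Tongyi multi-turn ReAct agent to the orchestrator."""

    REQUIRED_ENV = ("MODEL_NAME", "LLM_BASE_URL", "LLM_API_KEY")

    def __init__(self) -> None:
        # Environment validation happens once, at adapter creation.
        missing = [v for v in self.REQUIRED_ENV if not os.getenv(v)]
        if missing:
            raise RuntimeError(f"Missing required environment variables: {missing}")

    async def run(self, query: str, profile: dict[str, Any]) -> dict[str, Any]:
        # The agent itself is created per request; see Integration Points below.
        raise NotImplementedError
```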
Initialization Process
Environment Validation: Checks for required environment variables (MODEL_NAME, LLM_BASE_URL, LLM_API_KEY)
Adapter Creation: Creates TongyiAdapter instance
Orchestrator Registration: Adapter registered with OrchestratorFactory
Adapter Ready: Adapter ready to accept research requests via orchestrator
Note: The agent (ContribMultiTurnReactAgent) is created per-request, not during adapter initialization. This ensures:
Profile-specific configuration is applied
Each request gets a fresh agent instance
Better isolation between concurrent requests
Request Flow
Request received via task or taskgroup API (POST /v1/tasks or POST /v1/taskgroup)
Request authenticated via API key (account or master key)
Profile loaded from database based on profile parameter
Orchestrator creates adapter instance via OrchestratorFactory
LLM configuration built from environment variables and profile params
Agent initialized with system prompt, LLM config, and tools
Agent executes arun() with query
Agent performs iterative research rounds
Result is formatted as DeepResearchResult
Response returned through orchestrator to router
Customizations and Changes
Package Location
The Tongyi integration code is located in gl_deep_research/packages/tongyi/, which contains our customizations of the original DeepResearch framework. The original repository code from Alibaba-NLP/DeepResearch is maintained in open_source/tongyi/, and our package imports and extends components from there.
Key Changes Made
1. ContribMultiTurnReactAgent (agent.py)
Purpose: Modified version of the original MultiTurnReactAgent from the DeepResearch repository.
Changes:
Custom LLM Integration: Modified call_server() to work with custom LLM APIs (not just OpenAI)
Token Counting: Added count_tokens() method that calls the tokenization endpoint
Error Handling: Enhanced retry logic with exponential backoff
Async Support: Full async/await support for tool calls
Path Management: Automatic path setup for importing from open_source/tongyi/inference
Key Methods:
call_server(): Custom LLM API calling with retry logic
count_tokens(): Token counting via tokenization endpoint
arun(): Async execution of research workflow
custom_call_tool(): Async tool execution
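As an illustration of the retry behavior, here is a minimal sketch of a call_server()-style helper. The endpoint path, payload fields, and httpx usage are assumptions rather than the actual agent code.

```python
# Hedged sketch of call_server()-style retry logic (assumed payload/endpoint shape).
import asyncio
import os
import httpx

MAX_RETRIES = 5
MAX_BACKOFF_SECONDS = 30  # the integration caps backoff at 30 seconds between retries

async def call_server(messages: list[dict]) -> str:
    """Call an OpenAI-compatible chat endpoint with exponential backoff."""
    url = f"{os.environ['LLM_BASE_URL']}/chat/completions"
    headers = {"Authorization": f"Bearer {os.environ['LLM_API_KEY']}"}
    payload = {"model": os.environ["MODEL_NAME"], "messages": messages}

    for attempt in range(MAX_RETRIES):
        try:
            async with httpx.AsyncClient(timeout=600) as client:  # 10-minute per-call timeout
                response = await client.post(url, json=payload, headers=headers)
                response.raise_for_status()
                return response.json()["choices"][0]["message"]["content"]
        except (httpx.HTTPError, KeyError):
            if attempt == MAX_RETRIES - 1:
                raise
            await asyncio.sleep(min(2 ** attempt, MAX_BACKOFF_SECONDS))
    raise RuntimeError("unreachable")
```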
2. ContribTool Base Class (tool.py)
Purpose: Bridge between blocking and async tool interfaces.
Changes:
Async-First Design: Tools implement acall() for async execution
Blocking Bridge: call() method bridges to async using asyncio.run()
Event Loop Handling: Proper handling when already in an event loop
Implementation:
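The original implementation block is not reproduced here; the following is a hedged reconstruction of the bridging idea described above, with assumed class and method bodies.

```python
# Hedged reconstruction of the async/blocking bridge described above.
import asyncio
import concurrent.futures

class ContribTool:
    """Async-first tool base class with a blocking call() bridge."""

    async def acall(self, params: dict) -> str:
        # Subclasses implement the real tool logic here.
        raise NotImplementedError

    def call(self, params: dict) -> str:
        # Bridge for callers that expect a blocking interface.
        try:
            asyncio.get_running_loop()
        except RuntimeError:
            # No running loop: safe to drive the coroutine with asyncio.run().
            return asyncio.run(self.acall(params))
        # Already inside an event loop: run the coroutine on a separate thread
        # so we neither block nor re-enter the current loop.
        with concurrent.futures.ThreadPoolExecutor(max_workers=1) as pool:
            return pool.submit(asyncio.run, self.acall(params)).result()
```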
3. Smart Search Tools (tool_smart_search.py)
Purpose: Integration with Smart Search SDK for enhanced web search capabilities.
Tools Implemented:
WebSearch: Performs web searches with multiple result types
Supports snippets, keypoints, and summary result types
Batch query support
Configurable result size
WebSearchMap: Maps website structure
Discovers URL structure
Supports pagination and filtering
Subdomain inclusion options
FetchWebPage: Fetches web page content
Returns raw HTML or cleaned text
Uses Smart Search SDK for reliable fetching
WebPageSnippets: Extracts relevant snippets from pages
Query-based snippet extraction
Paragraph or sentence style options
WebPageKeypoints: Extracts key points from pages
Focused topic extraction
Summarized highlights
Key Features:
All tools use Smart Search SDK's WebSearchClient
Async authentication and API calls
Formatted output for agent consumption
Error handling with informative messages
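As a hedged illustration of how one of these tools could wrap the SDK, the sketch below assumes a WebSearchClient.search() coroutine and a simple result shape; the SDK import path, method signature, response fields, and environment variable names are not verified against the actual Smart Search SDK.

```python
# Hedged sketch of a WebSearch-style tool; the smart_search_sdk import path,
# WebSearchClient.search() signature, and result fields are assumptions.
import os

from smart_search_sdk import WebSearchClient  # assumed SDK entry point

class WebSearch:  # in the real package this extends the ContribTool base class
    name = "web_search"
    description = "Search the web and return snippets, keypoints, or a summary."

    async def acall(self, params: dict) -> str:
        client = WebSearchClient(
            base_url=os.environ["SMART_SEARCH_BASE_URL"],  # assumed variable names
            token=os.environ["SMART_SEARCH_TOKEN"],
        )
        try:
            results = await client.search(
                queries=params.get("queries", []),              # batch query support
                result_type=params.get("result_type", "snippets"),
                size=params.get("size", 10),                    # configurable result size
            )
        except Exception as exc:  # return an informative error message to the agent
            return f"web_search failed: {exc}"
        # Format output for agent consumption: one block per result.
        return "\n\n".join(f"{r['title']} ({r['url']})\n{r['content']}" for r in results)
```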
4. System Prompt (prompt.py)
Purpose: Custom system prompt optimized for deep research tasks.
Changes:
Research-Focused: Prompt emphasizes thorough, multi-source investigation
Tool Descriptions: Includes detailed tool function signatures
Answer Format: Specifies <answer></answer> tag requirements
Date Context: Includes current date for temporal awareness
5. Constants (constant.py)
Purpose: Centralized configuration for Smart Search SDK.
Changes:
Environment variable loading for Smart Search credentials
Base URL and token configuration
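A hedged sketch of what this centralization might look like; the exact variable names are assumptions.

```python
# Hedged sketch of constant.py's role; the variable names are assumptions.
import os

SMART_SEARCH_BASE_URL = os.getenv("SMART_SEARCH_BASE_URL", "")
SMART_SEARCH_TOKEN = os.getenv("SMART_SEARCH_TOKEN", "")
```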
Integration Points
Adapter Integration
The TongyiAdapter integrates all components. The agent is created per-request in the run() method:
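A hedged sketch of that per-request creation is shown below; the import paths and the ContribMultiTurnReactAgent constructor arguments are assumptions about the package layout.

```python
# Hedged sketch of per-request agent creation in TongyiAdapter.run();
# import paths and constructor arguments are assumptions about the package layout.
from gl_deep_research.packages.tongyi.agent import ContribMultiTurnReactAgent
from gl_deep_research.packages.tongyi.prompt import build_system_prompt          # hypothetical helper
from gl_deep_research.packages.tongyi.tool_smart_search import FetchWebPage, WebSearch

class TongyiAdapter:
    async def run(self, query: str, profile: dict) -> dict:
        llm_config = self._build_llm_config(profile)   # env vars + profile params (see Configuration)
        tools = [WebSearch(), FetchWebPage()]          # Smart Search tools
        agent = ContribMultiTurnReactAgent(            # fresh agent per request
            llm=llm_config,
            system_message=build_system_prompt(tools),
            function_list=tools,
        )
        answer = await agent.arun(query)
        return {"answer": answer}                      # formatted as DeepResearchResult downstream

    def _build_llm_config(self, profile: dict) -> dict:
        ...  # merge MODEL_NAME / LLM_BASE_URL / LLM_API_KEY with profile params
```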
Path Management
The agent automatically sets up import paths:
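The original snippet is not reproduced here; it does something along these lines (the directory depth and exact code are assumptions).

```python
# Hedged sketch: make the forked repository importable before importing react_agent.
import sys
from pathlib import Path

# Repository layout assumed: <repo_root>/open_source/tongyi/inference contains react_agent.
_INFERENCE_DIR = Path(__file__).resolve().parents[3] / "open_source" / "tongyi" / "inference"
if str(_INFERENCE_DIR) not in sys.path:
    sys.path.insert(0, str(_INFERENCE_DIR))

from react_agent import MultiTurnReactAgent  # noqa: E402  (import after path setup)
```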
This allows importing react_agent from the original DeepResearch repository.
Configuration
Required Environment Variables
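At minimum, the adapter requires MODEL_NAME, LLM_BASE_URL, and LLM_API_KEY (validated during adapter initialization). The Smart Search tools additionally read the Smart Search base URL and token described under Constants, and MAX_LLM_CALL_PER_RUN caps the number of LLM calls per run.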
LLM Configuration
The adapter builds LLM configuration from environment variables and profile parameters:
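A hedged sketch of that merge follows; the key names in the resulting dictionary and the default values are assumptions.

```python
# Hedged sketch: environment variables supply connection details,
# profile parameters override generation settings. Key names are assumptions.
import os

def build_llm_config(profile_params: dict) -> dict:
    config = {
        "model": os.environ["MODEL_NAME"],
        "base_url": os.environ["LLM_BASE_URL"],
        "api_key": os.environ["LLM_API_KEY"],
        # Illustrative defaults; profile parameters take precedence.
        "temperature": 0.6,
        "max_tokens": 8192,
    }
    config.update(profile_params or {})
    return config
```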
Usage
Basic Usage
Use the task or taskgroup API with a profile. The profile determines the provider and configuration:
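For example, a taskgroup request might look like the following sketch using the requests library; the deployment URL, authorization header, and profile name are placeholders rather than documented values.

```python
# Hedged usage sketch; the base URL, header name, and profile value are assumptions.
import requests

BASE_URL = "http://localhost:8000"  # assumed deployment URL

response = requests.post(
    f"{BASE_URL}/v1/taskgroup",
    headers={"Authorization": "Bearer <account-or-master-key>"},  # assumed header format
    json={
        "query": "What are the latest developments in small language models?",
        "profile": "tongyi-deep-research",  # hypothetical profile name
    },
    timeout=30,
)
response.raise_for_status()
taskgroup = response.json()
print(taskgroup["taskgroup_id"], taskgroup["tasks"])  # use these to stream or poll
```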
Use the returned taskgroup_id and tasks to stream (GET /v1/taskgroup/{id}/stream) or poll for status and result (GET /v1/tasks/{id}). See Quick Start Guide.
Request Format
A taskgroup request takes query and profile (as form data or JSON, per the API contract). A single task takes the same fields via POST /v1/tasks.
Note: Profile-specific options (like max_depth, timeout_seconds, etc.) are configured in the profile itself, not in the request. See Research Profiles for more information.
Response Format
Technical Details
Execution Limits
Max Execution Time: 150 minutes (9000 seconds)
Max Context Tokens: 110,000 tokens
Max LLM Calls: Configurable via MAX_LLM_CALL_PER_RUN
API Timeout: 10 minutes per LLM call
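For orientation, these limits could be expressed as configuration constants along these lines; the names and the MAX_LLM_CALL_PER_RUN default are assumptions.

```python
# Hedged sketch only; constant names and the MAX_LLM_CALL_PER_RUN default are assumptions.
import os

MAX_EXECUTION_SECONDS = 9000                                           # 150 minutes per run
MAX_CONTEXT_TOKENS = 110_000                                           # context budget
MAX_LLM_CALL_PER_RUN = int(os.getenv("MAX_LLM_CALL_PER_RUN", "100"))   # configurable cap (default assumed)
LLM_CALL_TIMEOUT_SECONDS = 600                                         # 10 minutes per LLM call
```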
Error Handling
Retry Logic: Exponential backoff with max 30 seconds between retries
Token Limit Handling: Automatic context trimming when limit reached
Tool Errors: Graceful error messages returned to agent
Timeout Handling: Research marked as failed if timeout exceeded
Performance Considerations
Async Operations: All tool calls are async for better concurrency
Context Management: Automatic token counting and context trimming
Result Caching: Tool results cached in conversation context
Iterative Refinement: Multiple rounds allow for deeper research