Core GL Open DeepResearch design principles
Audience: Developers
This section describes the core design principles of GL Open DeepResearch: pipelines, modularity, composability, and the patterns that support them. For a list of core components and their roles, see Core components. Understanding these principles helps when extending the system, adding providers or tools, and reasoning about request and event flow.
Pipelines
The application is structured around clear pipelines: well-defined flows from input to output, with consistent stages and boundaries. Pipelines make behavior predictable and make it easier to add observability, error handling, and new capabilities at specific points.
Task and taskgroup pipeline
Clients create research via the task or taskgroup API. The orchestration path from API to provider runs inside the worker:
Router — Validates the request and authenticates the client (API key).
Factory — Resolves the profile, creates the adapter for the profile’s provider, and optionally builds the tool list from the profile. Produces a request-scoped orchestrator (one adapter + one profile per request).
Orchestrator — Holds the adapter and profile, handles timing and errors, and calls the adapter’s run().
Adapter — Executes the underlying research engine (e.g. Tongyi, GPT-Researcher) with the given query, profile, tools, and optional event emitter.
This same Router → Factory → Orchestrator → Adapter path also underlies asynchronous execution (see Asynchronous task pipeline below).
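As a minimal sketch of the orchestrator stage, assuming the DeepResearchOrchestrator and conduct_research names used in this document; the timing and error-handling details below are placeholders rather than the actual implementation:

```python
# Minimal sketch of the orchestrator stage: hold adapter + profile, handle timing
# and errors, and delegate to adapter.run(). Everything beyond the names
# DeepResearchOrchestrator, conduct_research, and run() is an assumption.
import time


class DeepResearchOrchestrator:
    def __init__(self, adapter, profile, tools=None):
        self.adapter = adapter      # provider-specific adapter (e.g. Tongyi, GPT-Researcher)
        self.profile = profile      # resolved Profile for this request
        self.tools = tools or []    # optional tool list built by the Factory

    async def conduct_research(self, query, event_emitter=None):
        started = time.monotonic()
        try:
            # Provider-agnostic: the orchestrator only calls the adapter protocol.
            return await self.adapter.run(
                query=query,
                profile=self.profile,
                tools=self.tools,
                event_emitter=event_emitter,
            )
        finally:
            _elapsed = time.monotonic() - started  # placeholder timing/metrics hook
```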
Asynchronous task pipeline
For async execution, the pipeline extends to a background worker and storage:
Router — Accepts task creation; TaskService validates input, persists the task, and enqueues a Celery task.
Celery worker — Picks up the task and calls TaskService.execute_research with the task ID, query, and profile.
TaskService — Updates status, sets up streaming (event capture to Redis), creates an orchestrator via the same Factory, runs conduct_research, then updates status, stores the completion event, and runs webhooks.
So the same “Router → Factory → Orchestrator → Adapter” pipeline runs inside the worker, with the addition of task lifecycle, Redis-backed streaming, and webhooks.
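A rough sketch of this hand-off, assuming Celery's shared_task decorator and the TaskService.execute_research entry point named above; the status values, the repository and streaming helper names, and the asyncio bridge are assumptions, and build_task_service() is a hypothetical stand-in for the real dependency wiring:

```python
# Sketch of the async hand-off: a Celery task delegates to TaskService.execute_research,
# which reuses the same Factory -> Orchestrator -> Adapter pipeline. Collaborator names
# (task repository, streaming handler methods, webhooks) and the status values are
# assumptions; build_task_service() is a hypothetical stand-in for container wiring.
import asyncio
from celery import shared_task


def build_task_service() -> "TaskService":
    # Placeholder: in practice the service and its collaborators come from the container.
    raise NotImplementedError


@shared_task
def run_research_task(task_id: str, query: str, profile_name: str):
    build_task_service().execute_research(task_id, query, profile_name)


class TaskService:
    def __init__(self, tasks, streaming, factory, webhooks):
        self.tasks = tasks          # task repository (assumed interface)
        self.streaming = streaming  # TaskStreamingHandler (assumed interface)
        self.factory = factory      # OrchestratorFactory
        self.webhooks = webhooks    # webhook runner (assumed interface)

    def execute_research(self, task_id, query, profile_name):
        self.tasks.set_status(task_id, "RUNNING")
        emitter = self.streaming.capture_to_redis(f"task_stream:{task_id}")  # event capture
        orchestrator = self.factory.create_for_profile(profile_name)         # same Factory as the API path
        try:
            result = asyncio.run(orchestrator.conduct_research(query, event_emitter=emitter))
            self.tasks.set_status(task_id, "SUCCESS")
            return result
        except Exception:
            self.tasks.set_status(task_id, "FAILURE")
            raise
        finally:
            self.streaming.store_completion_event(task_id)  # completion marker for the stream
            self.webhooks.run_for(task_id)                  # fire configured webhooks
```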
Streaming event pipeline
Streaming is implemented as a capture–store–retrieve pipeline:
Capture — During research execution, a StreamEventHandler and EventEmitter capture adapter events; a postprocessor (from the adapter) can transform them.
Store — Events are appended to Redis lists (e.g. task_stream:{task_id}, taskgroup_stream:{taskgroup_id}) with a TTL.
Retrieve — Clients consume events via SSE. The streaming handler exposes a generic get_redis_stream(redis_key, ...) that polls a Redis list and yields SSE-formatted events; get_task_stream and group stream logic use this same primitive.
Task-level and taskgroup-level streams share the same storage and retrieval pattern, so behavior and extensions (e.g. completion detection, timeouts) stay consistent.
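A minimal sketch of the retrieve stage, assuming redis.asyncio and JSON-encoded events; the polling interval, the SSE framing, and the default completion check are illustrative assumptions, not the actual implementation:

```python
# Sketch of the generic "poll a Redis list, yield SSE" primitive. Assumes redis.asyncio
# and JSON-encoded events; the poll interval, SSE framing, and default completion check
# are illustrative assumptions.
import asyncio
import json

import redis.asyncio as redis


async def get_redis_stream(redis_key: str, *, poll_interval: float = 0.5,
                           is_complete=lambda event: event.get("type") == "complete"):
    client = redis.Redis()
    next_index = 0                                      # next list offset to read
    try:
        while True:
            # Read any events appended to the list since the last poll.
            for raw in await client.lrange(redis_key, next_index, -1):
                event = json.loads(raw)
                next_index += 1
                yield f"data: {json.dumps(event)}\n\n"  # SSE wire format
                if is_complete(event):                  # stop condition, e.g. completion event
                    return
            await asyncio.sleep(poll_interval)
    finally:
        await client.close()
```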
Research flow inside adapters
Within an adapter, the research engine often implements its own pipeline. For example, Tongyi Deep Research:
Decomposes the question into sub-problems.
Iterates over rounds: reason → act (tools) → observe.
Uses tools (e.g. web search, page fetch) to gather information.
Synthesizes and returns an evidence-backed answer.
The orchestrator does not dictate these steps; it only invokes adapter.run(). Pipeline design inside each adapter is provider-specific.
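Purely as an illustration of such a loop (not the actual Tongyi pipeline), a provider-internal reason, act, observe cycle could be structured like this; the decomposition, the tool interface (tool.run), and the stop condition are all placeholders:

```python
# Illustrative reason -> act -> observe loop inside a hypothetical adapter. This is NOT
# the actual Tongyi pipeline; every helper and stop condition here is a placeholder.
async def run_research_loop(query: str, tools: list, max_rounds: int = 3) -> dict:
    sub_problems = [query]                      # placeholder for real question decomposition
    evidence = []
    for round_index in range(max_rounds):
        # Reason: pick what to investigate next (placeholder heuristic).
        next_query = sub_problems[round_index % len(sub_problems)]
        # Act: call each tool (e.g. web search, page fetch) with that query.
        for tool in tools:
            evidence.append(await tool.run(next_query))
        # Observe: a real engine inspects the new evidence and decides whether to stop.
        if len(evidence) >= len(sub_problems):
            break
    # Synthesize: a real engine produces an evidence-backed answer here.
    return {"query": query, "evidence": evidence}
```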
Modularity
The codebase is split into modules with clear responsibilities and stable boundaries. This keeps changes local and makes testing and replacement of parts easier.
Layered structure
Router layer — HTTP, validation, authentication; no business logic.
Service layer — Task, TaskGroup, Profile, Account: orchestration of use cases and delegation to repositories and the orchestrator.
Orchestrator layer — Single responsibility: run the right adapter with the right profile and tools.
Adapter layer — Provider-specific implementations behind a common protocol.
Repository layer — Data access; services depend on abstractions (e.g. base repository interfaces), not concrete DB implementations.
Infrastructure — Redis, Celery, streaming handler; used by services but not tied to a single domain.
Routers depend on services; services depend on orchestrator, repositories, and streaming/cache; the orchestrator depends only on the adapter protocol and domain models.
Protocol-based adapters
Adapters are coupled through structural typing rather than inheritance: they implement the OrchestratorAdapter protocol (name, provider_type, description, streaming_postprocessor, run). As a result:
New providers can be added without changing orchestrator or router code.
The orchestrator stays provider-agnostic and only calls the protocol methods.
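A sketch of what the protocol might look like with typing.Protocol; the member names come from this document, while their types and the exact run() signature are assumptions:

```python
# Sketch of the OrchestratorAdapter protocol using typing.Protocol (structural typing).
# Member names are from this document; types and run()'s exact signature are assumptions.
from typing import Any, Callable, Optional, Protocol


class OrchestratorAdapter(Protocol):
    name: str
    provider_type: str
    description: str
    streaming_postprocessor: Optional[Callable[[dict], dict]]  # assumed: event -> event transform

    async def run(self, query: str, profile: Any,
                  tools: Optional[list] = None,
                  event_emitter: Any = None) -> Any:
        """Execute the underlying research engine for one request."""
        ...
```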
Registries as extension points
OrchestratorRegistry — Registers adapter factories by provider key; the factory creates an adapter instance (e.g. from a class path). Used at startup and by the Factory when creating the orchestrator.
ToolRegistry — Registers tool factories by name; the factory creates a tool instance. Profiles refer to tools by name; the Factory uses create_many(tool_names) to build the list passed to the orchestrator.
Adding a provider = register an adapter. Adding a tool = register a tool factory. No change to the core request or task pipeline.
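As an illustration of the registry idea (not the project's actual registry code), a minimal key-to-factory registry with the create_many helper mentioned above might look like this; the example keys "my_provider" and "my_search" are hypothetical:

```python
# Minimal, illustrative key-to-factory registry (not the project's actual classes):
# factories are registered by key and instantiated on demand, which is the
# extension-point behavior described above. The example keys are hypothetical.
from typing import Callable, Dict, List


class Registry:
    def __init__(self) -> None:
        self._factories: Dict[str, Callable[[], object]] = {}

    def register(self, key: str, factory: Callable[[], object]) -> None:
        self._factories[key] = factory          # add or replace the factory for this key

    def create(self, key: str) -> object:
        return self._factories[key]()           # build a fresh instance per request

    def create_many(self, keys: List[str]) -> List[object]:
        return [self.create(k) for k in keys]   # e.g. a profile's tool name list


# Usage sketch: adding a provider or a tool is a registration call, nothing more.
orchestrator_registry = Registry()
tool_registry = Registry()
orchestrator_registry.register("my_provider", lambda: object())  # placeholder adapter factory
tool_registry.register("my_search", lambda: object())            # placeholder tool factory
tools = tool_registry.create_many(["my_search"])
```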
Streaming and task execution
TaskStreamingHandler — Encapsulates “capture events and store in Redis” and “read from a Redis key and stream as SSE.” TaskService and TaskGroupService use it but do not implement Redis or SSE details.
get_redis_stream — Generic stream reader for any Redis list key; task stream and taskgroup stream both use it with different keys and stop conditions (e.g. completion event, “all tasks done”).
This keeps streaming logic in one place and reusable for tasks and taskgroups.
Composability
The system is designed so that small, well-defined pieces are combined to form requests, tasks, and streams. Composability is visible in profiles, the factory, taskgroups, and tools.
Profile as composition unit
A Profile combines:
provider — Which adapter to use (e.g. Tongyi, GPTR).
params — Provider- and use-case-specific options (e.g. llm_model, max_depth, tools).
The same orchestrator and factory work with any profile; the profile determines adapter and tools. New behavior can be added by new profiles or new params without changing the core pipeline.
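For example, a profile might look roughly like the following; the profile name, model, depth, and tool names are assumed values, not shipped defaults:

```python
# Illustrative profile; the concrete values here are assumptions, not shipped defaults.
example_profile = {
    "name": "tongyi-default",                   # hypothetical profile name
    "provider": "tongyi",                       # selects the adapter via OrchestratorRegistry
    "params": {
        "llm_model": "qwen-max",                # provider-specific option (assumed value)
        "max_depth": 3,                         # use-case-specific option (assumed value)
        "tools": ["web_search", "page_fetch"],  # resolved by ToolRegistry.create_many
    },
}
```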
Factory as composer
OrchestratorFactory.create(request) composes the runtime for a single request:
Resolves Profile from the request (by name or default).
Creates the Adapter via OrchestratorRegistry from the profile’s provider.
Optionally creates Tools via ToolRegistry from profile.params (e.g. params["tools"]).
Builds one DeepResearchOrchestrator(adapter, profile, tools).
So: one request → one profile → one adapter + one optional tool list → one orchestrator. The factory is the only place that ties profile, adapter, and tools together for a request.
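A compressed sketch of that composition, reusing the DeepResearchOrchestrator sketched earlier; apart from create_many, the registry and profile-resolution call names (get, get_default, request.profile_name) are assumptions:

```python
# Sketch of OrchestratorFactory.create(request). Apart from create_many and
# DeepResearchOrchestrator, the helper and attribute names are assumptions.
class OrchestratorFactory:
    def __init__(self, profile_registry, orchestrator_registry, tool_registry):
        self.profiles = profile_registry
        self.adapters = orchestrator_registry
        self.tools = tool_registry

    def create(self, request):
        # 1. Resolve the Profile by name, or fall back to the default profile.
        profile = self.profiles.get(request.profile_name) or self.profiles.get_default()
        # 2. Create the adapter registered for this profile's provider.
        adapter = self.adapters.create(profile.provider)
        # 3. Optionally build the tool list named in the profile's params.
        tool_names = profile.params.get("tools", [])
        tools = self.tools.create_many(tool_names) if tool_names else None
        # 4. One request -> one orchestrator tying profile, adapter, and tools together.
        return DeepResearchOrchestrator(adapter, profile, tools)
```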
Taskgroups as composition of tasks
A TaskGroup is a batch of research tasks that share configuration:
Shared: profile, webhook (and thus provider and tools).
Per task: query (from the group’s query list).
TaskGroupService creates one task per query via TaskService, associating each task with the same taskgroup. Group status is derived from member task statuses (e.g. SUCCESS when all succeed, PARTIAL_FAILURE when some fail). So “taskgroup” is a composition of many tasks plus shared config and derived status, not a new kind of execution engine.
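A small sketch of how the derived status might be computed; SUCCESS and PARTIAL_FAILURE come from this document, while the other status names are assumptions:

```python
# Illustrative derivation of a taskgroup's status from member task statuses. SUCCESS
# and PARTIAL_FAILURE are from this document; the other status names are assumptions.
from typing import List


def derive_group_status(task_statuses: List[str]) -> str:
    if task_statuses and all(s == "SUCCESS" for s in task_statuses):
        return "SUCCESS"                 # every member task succeeded
    if any(s in ("PENDING", "RUNNING") for s in task_statuses):
        return "RUNNING"                 # still in progress (assumed status)
    if any(s == "SUCCESS" for s in task_statuses):
        return "PARTIAL_FAILURE"         # some succeeded, some failed
    return "FAILURE"                     # nothing succeeded (assumed status)
```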
Group stream as composition of streams
get_group_stream composes multiple streams into one SSE response:
One asyncio task per task stream (each using the same get_redis_stream-style retrieval from task_stream:{task_id}).
One stream from taskgroup_stream:{taskgroup_id} for group-level events (e.g. status changes).
A shared queue and a completion signal (e.g. when all task streams are done) so the taskgroup stream can stop and the client receives a single, ordered stream.
So the group stream is “all task streams + taskgroup stream,” composed with a clear stop condition and ordering.
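A sketch of that composition with asyncio, reusing the get_redis_stream generator sketched earlier; the shared queue, the sentinel handling, and the exact completion check are assumptions:

```python
# Sketch of composing task streams and the taskgroup stream into one SSE response,
# reusing the get_redis_stream generator sketched earlier. The queue, sentinel handling,
# and the exact completion check are assumptions.
import asyncio
from typing import List


async def get_group_stream(taskgroup_id: str, task_ids: List[str]):
    queue: asyncio.Queue = asyncio.Queue()

    async def pump(redis_key: str, counts_toward_completion: bool):
        # Forward every SSE event from one Redis-backed stream into the shared queue.
        async for sse_event in get_redis_stream(redis_key):
            await queue.put(("event", sse_event))
        await queue.put(("done", counts_toward_completion))

    pumps = [asyncio.create_task(pump(f"task_stream:{tid}", True)) for tid in task_ids]
    pumps.append(asyncio.create_task(pump(f"taskgroup_stream:{taskgroup_id}", False)))

    finished_tasks = 0
    try:
        # Stop once every task stream has completed ("all tasks done").
        while finished_tasks < len(task_ids):
            kind, payload = await queue.get()
            if kind == "done":
                finished_tasks += int(payload)
                continue
            yield payload                     # single, ordered SSE stream to the client
    finally:
        for p in pumps:
            p.cancel()                        # also stops the group-level pump
```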
Tools as composed capabilities
Tools are registered by name; a profile’s params["tools"] is a list of names. The Factory calls ToolRegistry.create_many(tool_names) and passes the list to the orchestrator, which passes it to the adapter. The adapter (e.g. Tongyi) uses the tools during research. So:
Composition: A profile composes a set of tools by name.
Reuse: The same tool can appear in many profiles.
Extensibility: New tools are registered and then referenced in profiles; no change to the factory or orchestrator logic.
Related patterns
These patterns support the pipelines, modularity, and composability described above.
Factory — OrchestratorFactory: builds the request-scoped orchestrator (profile + adapter + tools); isolates creation and keeps routers simple.
Registry — OrchestratorRegistry, ToolRegistry, ProfileRegistry: pluggable providers, tools, and profiles; add new ones without changing callers.
Protocol (structural typing) — OrchestratorAdapter: loose coupling to adapters; any type that implements the protocol can be used.
Repository — Profile, Task, TaskGroup, Account repositories: abstract data access; services depend on interfaces, not DB or Redis.
Dependency injection — FastAPI Depends(): services and handlers receive dependencies (e.g. TaskService, streaming handler) from the container.
Single pipeline per request — Router → Factory → Orchestrator → Adapter: one clear path per research run; the same path for stream, task, and taskgroup.
Generic stream primitive — get_redis_stream(redis_key, ...): one implementation for “read list from Redis and stream as SSE”; reused for task and taskgroup streams.
Event-driven streaming — Capture → Redis list → get_redis_stream → SSE: decouples the producer (worker) from the consumer (API); supports multiple clients and retries.
Summary
Pipelines: Request, task, and streaming flows are well-defined and staged; the same “orchestrator + adapter” pipeline runs for streaming and async tasks.
Modularity: Layers, protocol-based adapters, and registries keep responsibilities clear and extensions local (new adapters, tools, profiles).
Composability: Profiles combine provider and params; the factory composes profile, adapter, and tools per request; taskgroups compose tasks; group stream composes task and taskgroup streams; tools are composed by name in profiles.
Together, these principles keep the system predictable, testable, and easy to extend with new providers, tools, and execution modes (e.g. stream, task, taskgroup) without duplicating core logic.