GLChat Pipeline

This section explains how GLChat processes a user message end-to-end — from the moment it arrives to the moment a response is streamed back.

Overview

When a user sends a message, it does not go directly to an AI model. It passes through a series of stages that handle input processing, safety, context enrichment, execution, and persistence. Understanding this flow helps you know exactly where things happen and why.
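The stage order can be sketched as a simple chain of handlers. This is a minimal illustration of the flow, not GLChat's actual API — the stage names, the `ctx` dict, and the handler signatures are all assumptions made for the example.

```python
# Each stage takes and returns a context dict; here they only record
# themselves in a trace so the ordering is visible.

def guardrails(ctx):
    ctx["trace"].append("guardrails")
    return ctx

def preprocessing(ctx):
    ctx["trace"].append("preprocessing")
    return ctx

def routing(ctx):
    ctx["trace"].append("routing")
    return ctx

def execute(ctx):
    # pipeline or agent, depending on the routing decision
    ctx["trace"].append("execute")
    return ctx

def postprocess(ctx):
    ctx["trace"].append("postprocess")
    return ctx

def handle_message(text):
    """Run a message through every stage in order and return the trace."""
    ctx = {"message": text, "trace": []}
    for stage in (guardrails, preprocessing, routing, execute, postprocess):
        ctx = stage(ctx)
    return ctx["trace"]
```

In the real system, any stage can short-circuit the chain (a guardrail block or a cache hit returns a response immediately), which the per-stage sketches below illustrate.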

Full Execution Flow

Stage Summary

Guardrails

Guardrails runs a safety check on the raw user input. If the input violates configured policies, a blocked response is returned immediately. If guardrails are not enabled or the input is safe, the message continues to preprocessing.
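The early-return behavior might look like the following sketch. The `BLOCKED_TERMS` set and the "return a message or `None`" convention are illustrative assumptions, not the real policy engine.

```python
BLOCKED_TERMS = {"attack-howto"}  # stand-in for configured policies

def check_guardrails(text, enabled=True):
    """Return a blocked response to send immediately, or None to continue."""
    if not enabled:
        return None  # guardrails disabled: message passes through
    if any(term in text.lower() for term in BLOCKED_TERMS):
        return "Sorry, this request violates the configured content policy."
    return None  # input is safe: continue to preprocessing
```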

Preprocessing

Preprocessing enriches the raw message with full conversation context — rewriting it as a standalone query, retrieving long-term memories, and checking the response cache. If a cache hit is found, the cached response is returned immediately without running the pipeline.
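A sketch of the cache short-circuit, assuming a simple dict cache keyed by the rewritten query; `rewrite_as_standalone`, `RESPONSE_CACHE`, and the return shape are hypothetical stand-ins for the real interfaces.

```python
RESPONSE_CACHE = {}  # standalone query -> previously generated response

def rewrite_as_standalone(message, history):
    # Naive stand-in for context-aware query rewriting: fold in the
    # previous turn so the query makes sense on its own.
    return f"{history[-1]} | {message}" if history else message

def preprocess(message, history, memories):
    standalone = rewrite_as_standalone(message, history)
    cached = RESPONSE_CACHE.get(standalone)
    if cached is not None:
        # Cache hit: return immediately without running the pipeline.
        return {"response": cached, "from_cache": True}
    return {"query": standalone, "memories": memories, "from_cache": False}
```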

Routing

After preprocessing, the system applies the routing decision recorded when the message first arrived: the message is directed to a Pipeline (e.g. Standard RAG) or an Agent, depending on the user's configuration.
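Applying that decision can be as simple as a lookup on the user configuration. The `mode` key and the target names here are assumptions for illustration.

```python
def route(config):
    """Direct the message to an agent or a pipeline per user configuration."""
    if config.get("mode") == "agent":
        return "agent"
    # Standard RAG is the primary pipeline, so it serves as the default.
    return "standard_rag"
```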

Pipeline or Agent

  • Pipeline — a structured, deterministic flow that retrieves context and generates a response. Standard RAG is the primary pipeline.

  • Agent — a flexible, multi-step executor that can call tools, reason, and loop before producing a final answer.

Both paths produce a response that is streamed to the user in real time.
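Both paths can be modeled as generators so the caller streams chunks as they are produced. The retrieval step and tool calls below are stubs; the function names and loop shape are assumptions, not GLChat's implementation.

```python
def run_rag_pipeline(query):
    """Deterministic flow: retrieve context, then stream a response."""
    context = f"[retrieved context for: {query}]"        # retrieval step
    for chunk in ("Based on ", context, ", here is the answer."):
        yield chunk                                      # streamed generation

def run_agent(query, max_steps=3):
    """Multi-step executor: call tools and loop before finalizing."""
    for step in range(max_steps):
        tool_result = f"tool-output-{step}"              # stubbed tool call
        if step == max_steps - 1:                        # agent decides to stop
            yield f"Final answer using {tool_result}"
```

Either generator can be consumed incrementally by the transport layer, which is what makes real-time streaming possible on both paths.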

Postprocessing

After the pipeline or agent finishes, the system saves the interaction — chat history and memory — asynchronously so it does not delay the response stream.
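One way to keep persistence off the response path is to hand it to a background thread once the stream has finished. This is a sketch under assumed names — `save_interaction` stands in for the chat-history and memory writes.

```python
import threading

SAVED = []  # illustrative stand-in for the history/memory store

def save_interaction(user_msg, response):
    SAVED.append((user_msg, response))  # pretend write to history + memory

def respond(user_msg, chunks):
    streamed = "".join(chunks)  # the response stream completes first
    worker = threading.Thread(target=save_interaction,
                              args=(user_msg, streamed))
    worker.start()  # persistence runs in the background
    worker.join()   # joined here only to keep the example deterministic
    return streamed
```

In a real service the join would be omitted (or the write queued), so the caller never waits on storage.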
