Standard RAG
What is the Standard RAG Pipeline?
When a user sends a message in GLChat, it does not go directly to an AI model. Instead, it passes through a structured pipeline designed to:
Understand the user's intent in the context of the full conversation
Search the relevant knowledge sources
Generate a grounded, accurate response
Save the interaction for future context
The pipeline is divided into four stages: Preprocessing, Retrieval, Generation, and Postprocessing.
Pipeline Overview
The diagram below shows the complete flow from the moment a user sends a query to the moment a response is streamed back.
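In code form, the end-to-end flow can be sketched roughly as follows. All function names and data shapes here are illustrative stubs, not GLChat's actual API; they only mirror the four stages described on this page.

```python
def preprocess(query, session):
    """Stage 1: assemble chat history and memory, check the cache (stubbed)."""
    return {"history": session.get("history", []), "query": query}

def retrieve(query, context):
    """Stage 2: search the configured knowledge sources (stubbed)."""
    return [f"chunk relevant to: {query}"]

def generate(query, chunks):
    """Stage 3: produce a grounded answer from the retrieved chunks (stubbed)."""
    return f"Answer to '{query}' grounded in {len(chunks)} chunk(s)"

def postprocess(session, query, response):
    """Stage 4: persist the turn so future queries have context (stubbed)."""
    session.setdefault("history", []).append((query, response))

def run_pipeline(query, session):
    context = preprocess(query, session)
    chunks = retrieve(query, context)
    response = generate(query, chunks)
    postprocess(session, query, response)
    return response
```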
Stage 1: Preprocessing
Goal: Prepare the user's query with full conversation context.
When a user sends a message, the system does not immediately search for information. First, it builds up context:
Retrieve Chat History
Loads the full conversation history including previous messages and any prior attachments
Retrieve Memory
If memory is enabled, fetches relevant long-term memories from previous sessions for this user
Cache Check
Checks whether a semantically similar query was recently answered and cached
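The cache check in particular can be pictured as a semantic lookup: compare the new query's embedding against embeddings of recently answered queries and reuse the answer above a similarity threshold. This is a minimal sketch under that assumption; the real cache implementation is not specified here.

```python
import math

def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def check_cache(query_vec, cache, threshold=0.9):
    """Return the cached answer for a semantically similar past query, if any.

    cache: list of (embedding, answer) pairs from recently answered queries.
    """
    for cached_vec, answer in cache:
        if cosine(query_vec, cached_vec) >= threshold:
            return answer
    return None
```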
Stage 2: Retrieval
Goal: Find the most relevant information from the configured knowledge sources.
The retrieval stage searches for context that the AI will use to generate its answer.
Query Transformation
Before searching, the query may be split or reformulated to improve recall — especially for complex multi-part questions.
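The shape of this step can be sketched as a function from one query to several sub-queries. A production system would typically use an LLM to do the reformulation; splitting on " and " below is only a stand-in to show the interface.

```python
def transform_query(query: str) -> list[str]:
    """Split a multi-part question into sub-queries to improve recall.

    Splitting on " and " is a toy heuristic; real systems use an LLM here.
    """
    parts = [p.strip().rstrip("?") + "?" for p in query.split(" and ") if p.strip()]
    return parts or [query]
```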
Search Types
Depending on the chatbot configuration, one of the following strategies is used:
Normal (Hybrid / Vector Search)
Searches an internal knowledge base using a combination of keyword matching (BM25) and semantic vector similarity. Best for document-grounded answers
Smart Search
Searches external web sources or integrated connectors (e.g. Google, Calendar).
SQL Search
Queries structured data by generating and executing SQL
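The Normal (Hybrid) strategy above blends two scores per document: a keyword score and a semantic vector score. This sketch shows that blend; a simple token-overlap score stands in for BM25, and the weighting parameter `alpha` is an assumption for illustration.

```python
import math

def keyword_score(query: str, doc: str) -> float:
    """Token-overlap score standing in for BM25 in this sketch."""
    q, d = set(query.lower().split()), set(doc.lower().split())
    return len(q & d) / len(q) if q else 0.0

def vector_score(qv, dv) -> float:
    """Cosine similarity standing in for semantic vector search."""
    dot = sum(a * b for a, b in zip(qv, dv))
    norm = math.sqrt(sum(a * a for a in qv)) * math.sqrt(sum(b * b for b in dv))
    return dot / norm if norm else 0.0

def hybrid_search(query, qv, corpus, alpha=0.5, top_k=3):
    """corpus: list of (text, embedding). Blend keyword and vector scores."""
    scored = [
        (alpha * keyword_score(query, text) + (1 - alpha) * vector_score(qv, emb),
         text)
        for text, emb in corpus
    ]
    return [text for _, text in sorted(scored, reverse=True)[:top_k]]
```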
Post-Search Processing
After results are retrieved, they go through three steps:
Reranker
Reorders retrieved chunks by relevance to the query using a reranking model
Context Enricher
Adds extra metadata to the chunks
Repacker
Formats the final ranked chunks into clean context strings ready for the LLM
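The three post-search steps compose naturally: rerank, then enrich, then repack. The sketch below shows that composition with illustrative data shapes; a real reranker would call a cross-encoder model rather than count overlapping tokens.

```python
def rerank(query: str, chunks: list[dict]) -> list[dict]:
    """Reorder chunks by relevance; token overlap stands in for a reranking model."""
    q = set(query.lower().split())
    return sorted(chunks,
                  key=lambda c: len(q & set(c["text"].lower().split())),
                  reverse=True)

def enrich(chunks: list[dict]) -> list[dict]:
    """Attach extra metadata to each chunk (here, just its rank)."""
    return [dict(c, rank=i + 1) for i, c in enumerate(chunks)]

def repack(chunks: list[dict]) -> str:
    """Format the ranked chunks into one clean context string for the LLM."""
    return "\n\n".join(f"[{c['rank']}] {c['text']}" for c in chunks)
```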
Stage 3: Generation
Goal: Generate an accurate, grounded response using the retrieved context.
Generate Response
Sends the full context and query to the LLM. The response is streamed to the user in real time as it is produced
Reference Formatter
Identifies which specific source chunks the response was based on and attaches them as references.
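The two generation-stage steps can be sketched as a token stream plus an attribution pass. Naive substring matching stands in for real source attribution here; everything below is illustrative, not GLChat's implementation.

```python
def stream_response(token_iter):
    """Relay tokens to the user as the LLM produces them."""
    for token in token_iter:
        yield token

def format_references(response: str, chunks: list[dict]) -> list[dict]:
    """Attach the source chunks the response drew on (naive substring match)."""
    return [c for c in chunks if c["text"].lower() in response.lower()]
```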
Stage 4: Postprocessing
Goal: Persist the interaction so it can be used in future conversations.
Save Chat History
Stores the full turn (user query, assistant response, references, events) in the database
Save Memory
Stores a summary of this interaction so it can be retrieved in future sessions
Postprocessing happens in the background after the response has already started streaming — it does not block the user from seeing their answer.
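One common way to keep persistence off the response path is to run it on a separate thread, as sketched below. This is an assumption about mechanism (the page only says postprocessing runs in the background), and the store is a plain dict for illustration.

```python
import threading

def save_chat_history(store: dict, turn: dict) -> None:
    """Persist the full turn (query, response, references, events)."""
    store.setdefault("history", []).append(turn)

def save_memory(store: dict, turn: dict) -> None:
    """Persist a summary of the turn for retrieval in future sessions."""
    store.setdefault("memory", []).append(f"summary of: {turn['query']}")

def postprocess_in_background(store: dict, turn: dict) -> threading.Thread:
    """Run both save steps off the response path so streaming is not blocked."""
    def work():
        save_chat_history(store, turn)
        save_memory(store, turn)
    thread = threading.Thread(target=work)
    thread.start()
    return thread  # tests can join(); production fire-and-forgets
```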
Agentic Pipelines
Agentic pipelines enable dynamic, AI-driven decision-making within your pipeline workflows. Unlike standard pipelines that follow a fixed sequence of steps, agentic pipelines use AI agents to reason and decide actions at runtime.
What is an Agentic Pipeline?
An agentic pipeline combines the reliability of pipelines with the flexibility of AI agents:
Pipeline
Predictable, auditable, repeatable steps
"Always search the database, then summarize"
Agent
LLM reasons and decides actions dynamically at runtime
"Figure out if the user needs a search or a calculation, and do it"
Agentic pipelines allow you to:
Build workflows that adapt to different scenarios
Make intelligent decisions based on user input
Combine multiple tools and capabilities dynamically
Handle complex, multi-step reasoning tasks
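The pipeline/agent contrast above boils down to who picks the next action. A toy routing function stands in for the LLM's runtime reasoning in this sketch; tool names and the dispatch logic are hypothetical.

```python
def route(query: str) -> str:
    """Toy stand-in for the agent's runtime decision: pick a tool by intent."""
    if any(ch.isdigit() for ch in query) and any(op in query for op in "+-*/"):
        return "calculator"
    return "search"

def run_agentic_step(query: str) -> str:
    """Dispatch to whichever tool the agent chose for this query."""
    tool = route(query)
    if tool == "calculator":
        return str(eval(query))  # demo only: never eval untrusted input
    return f"search results for: {query}"
```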
Default Agentic Pipeline — Agentic RAG
GLChat includes a built-in agentic pipeline called Agentic RAG that can be enabled directly within the Standard RAG pipeline.
Availability:
Only available in the Standard RAG pipeline
Currently works for web search — uses an AI agent to intelligently search the web
Will be expanded to support additional search capabilities in the future
How to enable:
Open your chatbot's Preset Config in the Admin Dashboard
Find the enable_agentic_smart_search configuration option
Set it to true
Once enabled, the Standard RAG pipeline will use an AI agent to decide how to search the web when processing queries, providing more dynamic and context-aware results.
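If the Preset Config is edited as raw JSON, the option would look roughly like this. Only the key name comes from this page; the surrounding structure is an assumption.

```json
{
  "enable_agentic_smart_search": true
}
```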
Custom Agentic Pipelines
If you need custom behavior, tools, or reasoning logic beyond what Agentic RAG provides, you can build your own agentic pipeline.
Integration patterns:
Pipeline-as-a-Tool
Expose a robust pipeline as a tool that an agent can call
Agent-as-a-Step
Embed an agent as a standardized step within a pipeline
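The two integration patterns can be sketched as thin adapters around a minimal pipeline. All class and function names below are illustrative, not the GL SDK's actual API.

```python
class Pipeline:
    """Minimal pipeline: a fixed sequence of steps (callables on a dict)."""
    def __init__(self, steps):
        self.steps = steps

    def run(self, data: dict) -> dict:
        for step in self.steps:
            data = step(data)
        return data

def pipeline_as_a_tool(pipeline: Pipeline) -> dict:
    """Pipeline-as-a-Tool: expose a whole pipeline as a callable tool for an agent."""
    return {"name": "run_pipeline", "call": pipeline.run}

def agent_as_a_step(decide):
    """Agent-as-a-Step: wrap an agent's decision function as a pipeline step."""
    def step(data: dict) -> dict:
        data["decision"] = decide(data)
        return data
    return step
```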
Key building blocks:
Agent Orchestrator
Manages agent execution and strategy resolution
Agent Strategies
Execution strategies: Native, AIP, Remote A2A
Tools
Functions that agents can call to perform actions
Agent Configuration
Settings that define agent behavior and capabilities
For a step-by-step guide, refer to:
GL AIP (GL AI Platform) Quick Start Guide — primary resource for building custom agents with GL AI Platform
GL SDK Pipelines and Agents — patterns for integrating pipelines and agents, including Pipeline-as-a-Tool and Agent-as-a-Step
When to Use Agentic vs Standard Pipelines
Use an agentic pipeline when:
You need dynamic decision-making based on user input
The workflow depends on complex reasoning
You want to combine multiple tools or capabilities intelligently
The sequence of actions should be determined at runtime
Use a standard pipeline when:
You need predictable, repeatable workflows
The sequence of steps is fixed and well-defined
You want full control over execution flow
Auditability and reproducibility are critical