Standard RAG

What is the Standard RAG Pipeline?

When a user sends a message in GLChat, it does not go directly to an AI model. Instead, it passes through a structured pipeline designed to:

  • Understand the user's intent in the context of the full conversation

  • Search the relevant knowledge sources

  • Generate a grounded, accurate response

  • Save the interaction for future context

The pipeline is divided into four stages: Preprocessing, Retrieval, Generation, and Postprocessing.

Pipeline Overview

The diagram below shows the complete flow from the moment a user sends a query to the moment a response is streamed back.


Stage 1: Preprocessing

Goal: Prepare the user's query with full conversation context.

When a user sends a message, the system does not immediately search for information. First, it builds up context:

| Step | What it does |
| --- | --- |
| Retrieve Chat History | Loads the full conversation history, including previous messages and any prior attachments |
| Retrieve Memory | If memory is enabled, fetches relevant long-term memories from previous sessions for this user |
| Cache Check | Checks whether a semantically similar query was recently answered and cached |
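The three preprocessing steps can be sketched as a single context-building function. Everything here is illustrative: the store objects, the function name, and the exact-match cache key (a real system would compare embeddings for semantic similarity) are assumptions, not GLChat's actual API.

```python
import hashlib

def build_context(query, history_store, memory_store, cache, session_id,
                  memory_enabled=True):
    """Assemble everything the retrieval stage needs for one user query."""
    # Step 1: load the full conversation, including prior attachments.
    history = history_store.get(session_id, [])

    # Step 2: long-term memories are fetched only when memory is enabled.
    memories = memory_store.get(session_id, []) if memory_enabled else []

    # Step 3: check whether a similar query was answered recently.
    # Simplification: exact-match hash instead of embedding similarity.
    cache_key = hashlib.sha256(query.lower().strip().encode()).hexdigest()
    cached_answer = cache.get(cache_key)

    return {"query": query, "history": history,
            "memories": memories, "cached": cached_answer}
```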


Stage 2: Retrieval

Goal: Find the most relevant information from the configured knowledge sources.

The retrieval stage searches for context that the AI will use to generate its answer.

Query Transformation

Before searching, the query may be split or reformulated to improve recall — especially for complex multi-part questions.
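As a rough sketch of splitting a multi-part question, the rule-based decomposition below stands in for what is, in practice, a model-driven reformulation step; the function name and splitting heuristic are assumptions for illustration only.

```python
import re

def decompose_query(query):
    """Split a multi-part question into independent sub-queries."""
    # Break on coordinating "and" (word-bounded) and on ';' or '?' separators.
    parts = re.split(r"\band\b|[;?]", query)
    subqueries = [p.strip() for p in parts if p.strip()]
    return subqueries or [query]
```

Each sub-query can then be searched independently, which tends to improve recall on questions that bundle several information needs.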

Search Types

Depending on the chatbot configuration, one of the following strategies is used:

| Search Type | Description |
| --- | --- |
| Normal (Hybrid / Vector Search) | Searches an internal knowledge base using a combination of keyword matching (BM25) and semantic vector similarity. Best for document-grounded answers |
| Smart Search | Searches external web sources or integrated connectors (e.g. Google, Calendar) |
| SQL Search | Queries structured data by executing SQL against connected databases |
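To make hybrid search concrete, the sketch below merges a BM25 keyword ranking with a vector-similarity ranking using Reciprocal Rank Fusion (RRF). The document only says the two signals are combined; RRF is one common fusion choice and is assumed here, not confirmed as GLChat's method.

```python
def rrf_fuse(bm25_ranked, vector_ranked, k=60):
    """Merge two ranked lists of document ids into one hybrid ranking.

    Each document scores 1 / (k + rank) per list it appears in; documents
    found by both keyword and vector search accumulate a higher total.
    """
    scores = {}
    for ranking in (bm25_ranked, vector_ranked):
        for rank, doc_id in enumerate(ranking):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank + 1)
    return sorted(scores, key=scores.get, reverse=True)
```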

Post-Search Processing

After results are retrieved, they go through three steps:

| Step | What it does |
| --- | --- |
| Reranker | Reorders retrieved chunks by relevance to the query using a reranking model |
| Context Enricher | Adds extra metadata to the chunks |
| Repacker | Formats the final ranked chunks into clean context strings ready for the LLM |
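The three post-search steps can be sketched together. The scoring function here is a stand-in for a real reranking model, and the chunk shape and rank metadata are illustrative assumptions.

```python
def rerank_and_repack(query, chunks, score_fn, top_k=3):
    """Rerank chunks, enrich them with rank metadata, and repack into context."""
    # Reranker: order chunks by model-assigned relevance to the query.
    ranked = sorted(chunks, key=lambda c: score_fn(query, c["text"]), reverse=True)
    # Context Enricher: attach extra metadata (here, just the final rank).
    enriched = [dict(c, rank=i + 1) for i, c in enumerate(ranked[:top_k])]
    # Repacker: format the kept chunks into one clean context string.
    context = "\n\n".join(f"[{c['source']}] {c['text']}" for c in enriched)
    return enriched, context
```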


Stage 3: Generation

Goal: Generate an accurate, grounded response using the retrieved context.

| Step | What it does |
| --- | --- |
| Generate Response | Sends the full context and query to the LLM. The response is streamed to the user in real time as it is produced |
| Reference Formatter | Identifies which specific source chunks the response was based on and attaches them as references |
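A minimal sketch of this stage, assuming a hypothetical `llm_stream` token iterator (not a real GLChat API): tokens are pushed to the caller as they arrive, then a naive word-overlap reference formatter keeps the chunks the answer appears to draw on.

```python
def generate_response(query, context_chunks, llm_stream, on_token=lambda t: None):
    """Stream an answer, then attach the source chunks it was based on."""
    prompt = ("Answer using only this context:\n"
              + "\n".join(c["text"] for c in context_chunks)
              + f"\n\nQuestion: {query}")
    answer = ""
    for token in llm_stream(prompt):
        on_token(token)       # streamed to the user in real time
        answer += token
    # Reference Formatter (naive): keep chunks whose words occur in the answer.
    references = [c["source"] for c in context_chunks
                  if any(word in answer.lower()
                         for word in c["text"].lower().split())]
    return answer, references
```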


Stage 4: Postprocessing

Goal: Persist the interaction so it can be used in future conversations.

| Step | What it does |
| --- | --- |
| Save Chat History | Stores the full turn (user query, assistant response, references, events) in the database |
| Save Memory | Stores a summary of this interaction so it can be retrieved in future sessions |

Postprocessing happens in the background after the response has already started streaming — it does not block the user from seeing their answer.
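One way to picture the non-blocking behavior is to run persistence on a background thread, as in this sketch; the store objects and the truncation-based "summary" are illustrative stand-ins, not GLChat's actual persistence layer.

```python
import threading

def postprocess_async(turn, history_db, memory_db):
    """Persist a finished turn without blocking the streamed response."""
    def _persist():
        history_db.append(turn)                            # Save Chat History
        memory_db.append({"summary": turn["query"][:80]})  # Save Memory (naive)
    worker = threading.Thread(target=_persist, daemon=True)
    worker.start()
    return worker  # callers may join() for tests; users never wait on it
```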


Agentic Pipelines

Agentic pipelines enable dynamic, AI-driven decision-making within your pipeline workflows. Unlike standard pipelines that follow a fixed sequence of steps, agentic pipelines use AI agents to reason and decide actions at runtime.

What is an Agentic Pipeline?

An agentic pipeline combines the reliability of pipelines with the flexibility of AI agents:

| Approach | Description | Example |
| --- | --- | --- |
| Pipeline | Predictable, auditable, repeatable steps | "Always search the database, then summarize" |
| Agent | LLM reasons and decides actions dynamically at runtime | "Figure out if the user needs a search or a calculation, and do it" |

Agentic pipelines allow you to:

  • Build workflows that adapt to different scenarios

  • Make intelligent decisions based on user input

  • Combine multiple tools and capabilities dynamically

  • Handle complex, multi-step reasoning tasks
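The contrast with a fixed pipeline can be sketched as a tiny routing loop, where a decision function (standing in for the reasoning LLM) picks a tool per query at runtime. All names here are illustrative.

```python
def run_agent(query, tools, decide):
    """Route a query to a tool chosen at runtime.

    `decide` maps a query to a tool name (in a real agent, the LLM reasons
    about this); `tools` maps names to callables.
    """
    choice = decide(query)            # dynamic, per-query decision
    if choice not in tools:
        return f"No tool available for: {query}"
    return tools[choice](query)
```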

Default Agentic Pipeline — Agentic RAG

GLChat includes a built-in agentic pipeline called Agentic RAG that can be enabled directly within the Standard RAG pipeline.

Availability:

  • Only available in the Standard RAG pipeline

  • Currently works for web search — uses an AI agent to intelligently search the web

  • Will be expanded to support additional search capabilities in the future

How to enable:

  1. Open your chatbot's Preset Config in the Admin Dashboard

  2. Find the enable_agentic_smart_search configuration option

  3. Set it to true
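The resulting preset entry might look like the fragment below; only the `enable_agentic_smart_search` key comes from this page, and the surrounding structure is an assumption for illustration.

```json
{
  "preset_config": {
    "enable_agentic_smart_search": true
  }
}
```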

Once enabled, the Standard RAG pipeline will use an AI agent to decide how to search the web when processing queries, providing more dynamic and context-aware results.

Custom Agentic Pipelines

If you need custom behavior, tools, or reasoning logic beyond what Agentic RAG provides, you can build your own agentic pipeline.

Integration patterns:

| Pattern | Description |
| --- | --- |
| Pipeline-as-a-Tool | Expose a robust pipeline as a tool that an agent can call |
| Agent-as-a-Step | Embed an agent as a standardized step within a pipeline |
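The Pipeline-as-a-Tool pattern can be sketched as wrapping a fixed sequence of steps behind a single callable that an agent invokes like any other tool; the helper name and step shape are assumptions, not part of GLChat's SDK.

```python
def make_pipeline_tool(steps):
    """Turn an ordered list of step functions into one callable tool."""
    def tool(payload):
        for step in steps:     # the pipeline itself stays fixed and auditable
            payload = step(payload)
        return payload
    return tool
```

The agent decides *when* to call the tool, while the tool's internal sequence stays predictable, combining agent flexibility with pipeline auditability.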

Key building blocks:

| Concept | Description |
| --- | --- |
| Agent Orchestrator | Manages agent execution and strategy resolution |
| Agent Strategies | Execution strategies: Native, AIP, Remote A2A |
| Tools | Functions that agents can call to perform actions |
| Agent Configuration | Settings that define agent behavior and capabilities |

For a step-by-step guide, refer to:

When to Use Agentic vs Standard Pipelines

| Use agentic pipelines when… | Use standard pipelines when… |
| --- | --- |
| You need dynamic decision-making based on user input | You need predictable, repeatable workflows |
| The workflow depends on complex reasoning | The sequence of steps is fixed and well-defined |
| You want to combine multiple tools or capabilities intelligently | You want full control over execution flow |
| The sequence of actions should be determined at runtime | Auditability and reproducibility are critical |
