Preprocessing

Preprocessing runs automatically for every message after guardrails pass. It enriches the raw user message with everything the system needs to give an accurate, context-aware answer.

What Problem Does Preprocessing Solve?

A user's message is often incomplete on its own. Consider this exchange:

User: What is the capital of France?
Assistant: It is Paris.
User: What about Germany?

The second message, "What about Germany?", is ambiguous without the conversation history. Preprocessing resolves this by transforming the raw message into a self-contained, context-aware query before any search or generation happens.
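This page does not say how the rewrite is produced. One common approach is to hand the recent history and the new message to a language model and ask for a standalone query. The sketch below only assembles such a prompt. The names Turn, build_rewrite_prompt, and max_turns are hypothetical and chosen for illustration; they are not part of the documented system.

```python
from dataclasses import dataclass


@dataclass
class Turn:
    role: str      # "user" or "assistant"
    content: str


def build_rewrite_prompt(history: list[Turn], message: str, max_turns: int = 6) -> str:
    """Assemble a prompt asking a model to rewrite `message` as a standalone query.

    Hypothetical helper: the real system may rewrite queries differently.
    """
    # Keep only the most recent turns so the prompt stays short.
    recent = history[-max_turns:] if max_turns > 0 else []
    transcript = "\n".join(f"{t.role.capitalize()}: {t.content}" for t in recent)
    return (
        "Rewrite the final user message so it can be understood without "
        "the conversation. Return only the rewritten query.\n\n"
        f"Conversation:\n{transcript}\n\n"
        f"Final user message: {message}"
    )


history = [
    Turn("user", "What is the capital of France?"),
    Turn("assistant", "It is Paris."),
]
prompt = build_rewrite_prompt(history, "What about Germany?")
print(prompt)
```

A model given this prompt would be expected to return something like "What is the capital of Germany?", which can then be searched or cached without reference to earlier turns.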

What Happens Before Preprocessing

Before preprocessing runs, two earlier stages have already done important work:

  • Attachment stage: Uploaded files have been processed. If DPO is enabled, documents have been converted into structured content and stored in attachments.

  • URL stage: Any URLs in the message have been extracted and their content is available in processed_urls.

Preprocessing receives all of this as input and builds on top of it.
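As a rough picture of that input, the sketch below models it as a small data class. Only the field names attachments and processed_urls come from this page. Their shapes (a list of dicts and a URL-to-text mapping), the PreprocessingInput class, and the gather_context helper are assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class PreprocessingInput:
    message: str
    # Assumed shape: one dict per processed file, with "name" and "content" keys.
    attachments: list[dict] = field(default_factory=list)
    # Assumed shape: extracted page text keyed by URL.
    processed_urls: dict[str, str] = field(default_factory=dict)


def gather_context(inp: PreprocessingInput) -> list[str]:
    """Flatten attachment and URL content into labelled context snippets."""
    parts = []
    for att in inp.attachments:
        parts.append(f"[attachment:{att.get('name', 'unnamed')}] {att.get('content', '')}")
    for url, text in inp.processed_urls.items():
        parts.append(f"[url:{url}] {text}")
    return parts


inp = PreprocessingInput(
    message="Summarize the report and the linked page.",
    attachments=[{"name": "report.pdf", "content": "Q3 revenue grew."}],
    processed_urls={"https://example.com": "Example Domain"},
)
for snippet in gather_context(inp):
    print(snippet)
```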

Preprocessing Flow

What Each Step Does

  • Safety Check on User Query: Runs a safety check on the raw user input. If the input violates configured policies, a blocked response is returned immediately. If the input is safe, the process continues.

  • Retrieve Chat History: Loads the full conversation history, including previous messages and any prior attachments.

  • Retrieve Memory: If memory is enabled, fetches relevant long-term memories from previous sessions for this user.

  • Cache Check: Checks whether a semantically similar query was recently answered and cached.
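The four steps above can be sketched as one function that runs them in the order listed. This is a minimal illustration under stated assumptions, not the system's implementation: preprocess, cache_lookup, PreprocessResult, the injected callables, and the 0.9 similarity threshold are all hypothetical, and the cache is modelled as a list of (embedding, answer) pairs compared by cosine similarity.

```python
import math
from dataclasses import dataclass, field
from typing import Callable, Optional

Vector = list[float]


@dataclass
class PreprocessResult:
    blocked: bool = False
    history: list[str] = field(default_factory=list)
    memories: list[str] = field(default_factory=list)
    cached_answer: Optional[str] = None


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def cache_lookup(query_vec: Vector, cache: list[tuple[Vector, str]],
                 threshold: float = 0.9) -> Optional[str]:
    """Return the cached answer most similar to the query, if it clears the threshold."""
    best_answer, best_score = None, threshold
    for vec, answer in cache:
        score = cosine(query_vec, vec)
        if score >= best_score:
            best_answer, best_score = answer, score
    return best_answer


def preprocess(message: str, *, is_safe: Callable[[str], bool],
               load_history: Callable[[], list[str]],
               load_memories: Callable[[str], list[str]],
               embed: Callable[[str], Vector],
               cache: list[tuple[Vector, str]],
               memory_enabled: bool = True) -> PreprocessResult:
    # 1. Safety check: stop immediately if the input violates policy.
    if not is_safe(message):
        return PreprocessResult(blocked=True)
    # 2. Retrieve chat history.
    result = PreprocessResult(history=load_history())
    # 3. Retrieve long-term memory, only when the feature is enabled.
    if memory_enabled:
        result.memories = load_memories(message)
    # 4. Cache check against previously answered, semantically similar queries.
    result.cached_answer = cache_lookup(embed(message), cache)
    return result


# Toy stand-ins for the real safety, storage, and embedding services.
cache = [([1.0, 0.0], "Berlin is the capital of Germany.")]
result = preprocess(
    "What is the capital of Germany?",
    is_safe=lambda m: "forbidden" not in m,
    load_history=lambda: ["User: What is the capital of France?", "Assistant: It is Paris."],
    load_memories=lambda m: [],
    embed=lambda m: [1.0, 0.0],
    cache=cache,
)
print(result.cached_answer)
```

Passing the safety check, history loader, memory loader, and embedder as callables keeps the sketch independent of any particular backend. A blocked input returns early, so no history, memory, or cache work is done for it.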
