Postprocessing

Postprocessing runs automatically after every pipeline and agent. It persists the interaction and prepares the system for future queries — without blocking the user's response stream.

What Postprocessing Does

Once a response has been generated and is streaming to the user, there is still important work to do: storing the conversation for history continuity and updating the user's long-term memory. Postprocessing handles both automatically.

Pipeline Flow

What Each Step Does

| Step | What it does |
| --- | --- |
| Save Chat History | Persists the full turn: user query, assistant response, references, related questions, PII anonymization mappings, events, and step indicators |
| Write Memory | Stores a meaningful memory of this interaction for retrieval in future sessions |

Key Behaviors

Postprocessing is asynchronous. It runs after the response has already streamed to the user. The user never waits for history saving or memory writing.
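
The non-blocking behavior can be sketched with `asyncio`. This is an illustration under assumptions, not the system's actual implementation: `stream_response`, `postprocess`, and `handle_request` are hypothetical names, and the `sleep` call merely simulates slow storage I/O.

```python
import asyncio
from typing import AsyncIterator


async def stream_response(chunks: list[str]) -> AsyncIterator[str]:
    """Yield response chunks one at a time (stand-in for the real stream)."""
    for chunk in chunks:
        yield chunk


async def postprocess(log: list[str], query: str, response: str) -> None:
    """Hypothetical background work: save history, then write memory."""
    await asyncio.sleep(0.01)  # simulate slow storage I/O
    log.append("history saved")
    log.append("memory written")


async def handle_request(log: list[str], query: str) -> tuple[str, asyncio.Task]:
    parts = []
    async for chunk in stream_response(["Hello", ", ", "world"]):
        parts.append(chunk)
    response = "".join(parts)
    log.append("response streamed")
    # Schedule postprocessing without awaiting it, so the caller is not blocked.
    task = asyncio.create_task(postprocess(log, query, response))
    return response, task


async def main() -> list[str]:
    log: list[str] = []
    response, task = await handle_request(log, "hi")
    log.append("user has full response")
    await task  # awaited here only so the demo does not exit early
    return log
```

Running `asyncio.run(main())` shows the user-facing steps completing before the history and memory steps are logged.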

Postprocessing is automatic. You do not need to implement it in your pipeline or agent. As long as your pipeline returns the required fields (response, references, events, related, generation_query), the system handles persistence.
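
As a rough illustration, a pipeline result carrying the required fields might look like the following. The field names come from the list above; the dict return type, the value types, and the `example_pipeline` and `missing_fields` helpers are assumptions made for this sketch.

```python
from typing import Any

# Field names taken from the documentation above.
REQUIRED_FIELDS = ("response", "references", "events", "related", "generation_query")


def example_pipeline(query: str) -> dict[str, Any]:
    """Hypothetical pipeline that returns every field postprocessing needs."""
    return {
        "response": f"Answer to: {query}",
        "references": [],           # assumed: list of source documents
        "events": [],               # assumed: list of emitted events
        "related": [],              # assumed: list of related questions
        "generation_query": query,  # assumed: the query used for generation
    }


def missing_fields(result: dict[str, Any]) -> list[str]:
    """Return the required field names that are absent from a pipeline result."""
    return [name for name in REQUIRED_FIELDS if name not in result]
```

A check like `missing_fields(result)` in a unit test is one way to catch a pipeline that silently omits a field before it reaches postprocessing.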

Both pipelines and agents go through postprocessing. The same postprocessing logic runs regardless of whether the request was handled by Standard RAG, No-Op, Gemini Live, Deep Research, or an Agent.

There is no cache saving in postprocessing. The response cache is checked during preprocessing, and a hit is returned immediately. Saving a new cache entry is not part of the default postprocessing pipeline; an entry is saved when an FAQ is generated.
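
The split of responsibilities can be sketched as follows. Everything here is hypothetical: the `ResponseCache` class, keying on normalized query text, and the function names are assumptions for illustration, and the actual FAQ generation mechanism is not described on this page.

```python
from typing import Optional


class ResponseCache:
    """Toy in-memory cache keyed by normalized query text (an assumption)."""

    def __init__(self) -> None:
        self._entries: dict[str, str] = {}

    @staticmethod
    def _key(query: str) -> str:
        # Lowercase and collapse whitespace so trivially different queries match.
        return " ".join(query.lower().split())

    def lookup(self, query: str) -> Optional[str]:
        return self._entries.get(self._key(query))

    def save(self, query: str, response: str) -> None:
        self._entries[self._key(query)] = response


def preprocess(cache: ResponseCache, query: str) -> Optional[str]:
    """Return a cached response on a hit, or None so the pipeline runs."""
    return cache.lookup(query)


def run_postprocessing(cache: ResponseCache, query: str, response: str) -> None:
    """History and memory work would happen here; the cache is not written."""


def generate_faq(cache: ResponseCache, question: str, answer: str) -> None:
    """Hypothetical FAQ generation step: the only place a cache entry is saved."""
    cache.save(question, answer)
```

In this sketch a query answered by a pipeline is still a cache miss on its next occurrence, and only becomes a hit after `generate_faq` has stored an entry for it.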
