File Processing Guide

Attach files to agent runs, reuse artifacts from prior attachments, and manage chunk IDs for long-form analysis. Reach for this guide when agents need to consume documents, transcripts, or datasets across REST, the Python SDK, and the CLI.

File-handling support is summarised in the AIP capability matrix. The main limitation today is regenerating presigned URLs: use the REST helper until SDK and CLI shortcuts are available.

aip agents run accepts either an agent ID or a unique name. Use --select to pick from partial name matches or provide the ID directly when scripting.

Attach Files to an Agent Run

When to use: Collect fresh documents from users or pipelines and supply them during execution.

Local Document Processing: For local execution, you can use document loader tools like PDFReaderTool, DocxReaderTool, and ExcelReaderTool from aip-agents to read files directly from disk without uploading to the server. Attach the file in the same run so the tool has access to it. See the Local vs Remote guide for the document loader tools quickstart and example (main_with_docproc_pdf.py).

from glaip_sdk import Agent

# Define the agent that will receive the attachments.
agent = Agent(name="analysis-agent", instruction="You analyze documents.")

# Pass local file paths via files to attach them to this run.
response = agent.run(
    "Summarise the document and extract key metrics",
    files=["./reports/q1.pdf", "./reports/q2.pdf"],
)
print(response)

Common attachment errors

| Symptom | Likely cause | Fix |
| --- | --- | --- |
| 413 Payload Too Large | File exceeds backend attachment/upload limits. | Compress the file or split it into smaller chunks. |
| Missing file in run logs | File path incorrect or permissions denied. | Double-check the path, ensure the process can read the file, or use absolute paths. |
| Duplicate chunks created | Run attaches files without reusing artifact_id. | Pass the stored chunk IDs using the reuse workflows in the next section. |
| Unsupported media type errors | File type not allowed for ingestion. | Convert to a supported format (PDF, TXT, DOCX) or register a custom ingestion pipeline. |
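
Several of the common attachment errors above can be caught before a run starts. The following is a minimal pre-flight sketch, not part of glaip_sdk: check_attachments is a hypothetical helper, and MAX_ATTACHMENT_BYTES is an assumed placeholder because this guide does not state the real backend limit.

```python
import os
from pathlib import Path

# Assumed placeholder: the real backend limit is not stated in this guide.
MAX_ATTACHMENT_BYTES = 25 * 1024 * 1024

# Formats named as supported in the errors above; custom pipelines may allow more.
SUPPORTED_SUFFIXES = {".pdf", ".txt", ".docx"}


def check_attachments(paths, max_bytes=MAX_ATTACHMENT_BYTES):
    """Return (ok, problems): absolute paths that passed, and messages for the rest."""
    ok, problems = [], []
    for raw in paths:
        # Resolve to an absolute path so the result does not depend on the cwd.
        path = Path(raw).expanduser().resolve()
        if not path.is_file():
            problems.append(f"{raw}: not found or not a regular file")
        elif not os.access(path, os.R_OK):
            problems.append(f"{raw}: not readable by this process")
        elif path.suffix.lower() not in SUPPORTED_SUFFIXES:
            problems.append(f"{raw}: unsupported type {path.suffix or '(none)'}")
        elif path.stat().st_size > max_bytes:
            problems.append(f"{raw}: {path.stat().st_size} bytes exceeds the limit")
        else:
            ok.append(str(path))
    return ok, problems
```

A caller would typically pass the returned ok list as files= and log or raise on any problems before calling agent.run.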

Reuse Chunk IDs from Prior Attachments

When to use: Avoid re-ingesting the same files while keeping chunk IDs stable across runs.

When the backend returns chunk_ids, store them for later runs:

# Placeholder IDs; use the chunk_ids the backend returned for an earlier run.
chunk_ids = ["chunk-abc", "chunk-def"]
agent.run(
    "Compare the latest reports with previous attachments",
    chunk_ids=chunk_ids,
)

The CLI does not yet support passing chunk_ids. Use the SDK or REST API to avoid re-attaching large files.
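
One way to keep chunk IDs available across runs is a small local store keyed by file content. The sketch below is illustrative only: remember_chunks, lookup_chunks, and the JSON file layout are hypothetical, and how you read chunk_ids from a run response depends on the response shape, which this guide does not specify.

```python
import hashlib
import json
from pathlib import Path


def file_digest(path):
    """SHA-256 of the file contents, used as a stable lookup key."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for block in iter(lambda: fh.read(1 << 20), b""):
            digest.update(block)
    return digest.hexdigest()


def load_store(store_path):
    """Read the JSON store, or return an empty mapping if it does not exist yet."""
    store = Path(store_path)
    return json.loads(store.read_text()) if store.exists() else {}


def remember_chunks(store_path, file_path, chunk_ids):
    """Record the chunk IDs returned for a file so later runs can reuse them."""
    store = load_store(store_path)
    store[file_digest(file_path)] = list(chunk_ids)
    Path(store_path).write_text(json.dumps(store, indent=2))


def lookup_chunks(store_path, file_path):
    """Return stored chunk IDs for this exact file content, or None."""
    return load_store(store_path).get(file_digest(file_path))
```

If lookup_chunks returns a list, pass it as chunk_ids= to agent.run; if it returns None, attach the file with files= and store the IDs the backend returns. Keying on a content hash means an edited file is treated as new rather than matched to stale chunks.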

Retrieve Artifacts and Output

When to use: Capture the processed results, enriched files, or generated reports after execution.

Capture the run ID from the streaming response (X-Run-ID).

List run history:

curl -sL "$AIP_API_URL/agents/$AGENT_ID/runs" -H "X-API-Key: $AIP_API_KEY" | jq

Download artifacts directly from the presigned URLs in the response. If a URL has expired, regenerate it with /utils/regenerate_presigned_url.
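
For scripts that avoid shelling out to curl, the run-history call can be sketched with the Python standard library. This assumes only what the curl command shows (the /agents/{id}/runs path and the X-API-Key header); build_runs_request and list_runs are hypothetical helper names, and the shape of the JSON response is not specified in this guide, so the sketch returns it unmodified. The request format for /utils/regenerate_presigned_url is likewise not documented here and is left out.

```python
import json
import urllib.request


def build_runs_request(api_url, agent_id, api_key):
    """Build the GET request for an agent's run history."""
    url = f"{api_url.rstrip('/')}/agents/{agent_id}/runs"
    return urllib.request.Request(url, headers={"X-API-Key": api_key})


def list_runs(api_url, agent_id, api_key, timeout=30):
    """Send the request and return the decoded JSON body as-is."""
    request = build_runs_request(api_url, agent_id, api_key)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Calling list_runs with the values of AIP_API_URL, AGENT_ID, and AIP_API_KEY from the environment should return the same payload the curl command pipes into jq.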

Best Practices

When to use: Create organisation-wide guardrails for storage, retention, and compliance.

Compress large files: keep attachments efficient and within allowable limits.
Track chunk IDs: store them alongside run metadata so you can reference prior attachments without retransmitting data.
Sanitise inputs: redact or mask PII before attaching sensitive documents; see the Security & privacy guide.
Automate clean-up: if you store artifacts locally for auditing, put rotation policies in place.
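
As one possible shape for a rotation policy on locally stored artifacts, the sketch below deletes files older than a given age. prune_artifacts is a hypothetical helper, and the retention period is an assumption to be set by your own compliance requirements.

```python
import time
from pathlib import Path


def prune_artifacts(directory, max_age_days, now=None):
    """Delete regular files under directory older than max_age_days.

    Returns the sorted list of removed paths so the caller can log them.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    removed = []
    # Materialise the listing first so deletions do not disturb iteration.
    for path in list(Path(directory).rglob("*")):
        if path.is_file() and path.stat().st_mtime < cutoff:
            path.unlink()
            removed.append(str(path))
    return sorted(removed)
```

Running this from a scheduled job, and logging the returned paths, keeps an audit trail of what was rotated out.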

Related Documentation

Local vs Remote: local vs remote file processing comparison and built-in tools overview.
Agents guide: streaming behaviour and runtime overrides.
Automation & scripting: capture outputs in CI pipelines.
Configuration management: export/import agents that rely on file workflows.
