File Processing Guide

Attach files to agent runs, reuse uploaded artifacts, and manage chunk IDs for long-form analysis. Reach for this guide when agents need to consume documents, transcripts, or datasets across REST, the Python SDK, and the CLI.

File-handling support is summarised in the AIP capability matrix. The main limitation today is that expired presigned URLs can only be regenerated through REST: use the REST helper (/utils/regenerate_presigned_url) until SDK and CLI shortcuts arrive.

aip agents run accepts either an agent ID or a unique name. Use --select to pick from partial name matches or provide the ID directly when scripting.

Upload Files with an Agent Run

When to use: Collect fresh documents from users or pipelines and supply them during execution.

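from glaip_sdk import Client

client = Client()

# Look up the agent that will receive the files.
agent = client.agents.get_agent_by_id("analysis-agent")

# Attach local files to the run; relative paths resolve against the current working directory.
response = client.agents.run_agent(
    agent.id,
    "Summarise the document and extract key metrics",
    files=["./reports/q1.pdf", "./reports/q2.pdf"],
)
print(response)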

Common upload errors

| Symptom | Likely cause | Fix |
| --- | --- | --- |
| 413 Payload Too Large | File exceeds backend upload limits. | Compress the file, split it into smaller chunks, or raise the limit with the platform team. |
| Missing file in run logs | File path incorrect or permissions denied. | Double-check the path, ensure the process can read the file, or use absolute paths. |
| Duplicate chunks created | Upload run without reusing artifact_id. | Pass the stored chunk IDs using the reuse workflows in the next section. |
| Unsupported media type errors | File type not allowed for ingestion. | Convert to a supported format (PDF, TXT, DOCX) or register a custom ingestion pipeline. |
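
Several of the errors listed above can be caught locally before a run starts. The following is a minimal sketch rather than part of the SDK: the preflight helper and the MAX_UPLOAD_BYTES value are hypothetical (this guide does not state the real backend limit), and the suffix list simply mirrors the formats named above.

```python
import os
from pathlib import Path

# Formats named in this guide; extend if a custom ingestion pipeline is registered.
SUPPORTED_SUFFIXES = {".pdf", ".txt", ".docx"}

# Hypothetical limit: the real backend upload limit is not stated in this guide.
MAX_UPLOAD_BYTES = 25 * 1024 * 1024


def preflight(paths, max_bytes=MAX_UPLOAD_BYTES):
    """Return (ok, problems): absolute paths that passed, and messages for the rest."""
    ok, problems = [], []
    for raw in paths:
        path = Path(raw).expanduser().resolve()
        if not path.is_file():
            problems.append(f"{raw}: not found or not a regular file")
        elif not os.access(path, os.R_OK):
            problems.append(f"{raw}: not readable by this process")
        elif path.suffix.lower() not in SUPPORTED_SUFFIXES:
            problems.append(f"{raw}: unsupported type {path.suffix or '(none)'}")
        elif path.stat().st_size > max_bytes:
            problems.append(f"{raw}: {path.stat().st_size} bytes exceeds {max_bytes}")
        else:
            ok.append(str(path))
    return ok, problems
```

The returned ok list holds absolute paths, so it can be passed as the files argument of run_agent, while problems can be logged or shown to the user before any upload is attempted.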

Reuse Uploaded Chunks

When to use: Avoid re-ingesting the same files while keeping chunk IDs stable across runs.

When the backend returns chunk_ids, store them for later runs:

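# Chunk IDs returned by an earlier run (placeholder values shown here).
chunk_ids = ["chunk-abc", "chunk-def"]

# Reference the earlier uploads by ID instead of attaching the files again.
client.agents.run_agent(
    agent.id,
    "Compare the latest reports with previous uploads",
    chunk_ids=chunk_ids,
)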

The CLI does not yet support passing chunk_ids. Until it does, use the SDK or REST API to avoid re-uploading large files.
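
To keep chunk IDs available between runs, they need to be persisted somewhere. One possible approach, sketched here with a local JSON file, is shown below. The helper names and the file layout are hypothetical, and how chunk_ids are read out of a run response is not covered by this guide, so the sketch starts from a list you already hold.

```python
import json
from pathlib import Path


def save_chunk_ids(store_path, label, chunk_ids):
    """Record chunk IDs under a label (for example a file name) in a JSON file."""
    store = Path(store_path)
    data = json.loads(store.read_text()) if store.exists() else {}
    data[label] = list(chunk_ids)
    store.write_text(json.dumps(data, indent=2))


def load_chunk_ids(store_path, label):
    """Return the chunk IDs stored under label, or an empty list if none exist."""
    store = Path(store_path)
    if not store.exists():
        return []
    return json.loads(store.read_text()).get(label, [])
```

A later run can then call load_chunk_ids("chunks.json", "q1.pdf") and pass the result as chunk_ids to run_agent, falling back to a fresh upload when the list is empty.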

Retrieve Artifacts and Output

When to use: Capture the processed results, enriched files, or generated reports after execution.

Capture the run ID

Read it from the streaming response (X-Run-ID).

List run history

Query the runs endpoint for the agent:

curl -sL "$AIP_API_URL/agents/$AGENT_ID/runs" -H "X-API-Key: $AIP_API_KEY" | jq
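
For scripts that cannot shell out to curl, the same request can be made with the Python standard library. This is a sketch that mirrors the endpoint and X-API-Key header from the curl call; the function names are hypothetical, and the JSON shape of the response is not specified in this guide, so the result is returned undecoded beyond JSON parsing.

```python
import json
import urllib.request


def build_runs_request(api_url, agent_id, api_key):
    """Build a GET request for an agent's run history, mirroring the curl call."""
    url = f"{api_url.rstrip('/')}/agents/{agent_id}/runs"
    return urllib.request.Request(url, headers={"X-API-Key": api_key})


def list_runs(api_url, agent_id, api_key, timeout=30):
    """Fetch the run history and parse it as JSON (redirects are followed by default)."""
    request = build_runs_request(api_url, agent_id, api_key)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Splitting request construction from the network call keeps the URL and header logic easy to check without contacting the API.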

Download artifacts

Download artifacts directly from the presigned URLs in the response. If a URL has expired, regenerate it with /utils/regenerate_presigned_url.
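
Downloading from a presigned URL needs no API key, so a plain HTTP client is enough. The sketch below streams an artifact to disk with the standard library. The function and exception names are hypothetical, and treating HTTP 403 as a sign of expiry is an assumption based on how many object stores behave, not something this guide confirms for the AIP backend.

```python
import shutil
import urllib.error
import urllib.request
from pathlib import Path


class PresignedUrlExpired(Exception):
    """Raised when a download is rejected in a way that suggests an expired URL."""


def download_artifact(url, dest, timeout=60):
    """Stream the artifact at url to dest and return the destination path."""
    dest = Path(dest)
    dest.parent.mkdir(parents=True, exist_ok=True)
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response, \
                open(dest, "wb") as out:
            shutil.copyfileobj(response, out)
    except urllib.error.HTTPError as exc:
        # Assumption: an expired presigned URL surfaces as HTTP 403.
        if exc.code == 403:
            raise PresignedUrlExpired(url) from exc
        raise
    return dest
```

A caller can catch PresignedUrlExpired, request a fresh URL through /utils/regenerate_presigned_url, and retry the download.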

Best Practices

When to use: Create organisation-wide guardrails for storage, retention, and compliance.

Compress large files — keep uploads efficient and within platform limits.
Track chunk IDs — store them alongside run metadata so you can reference prior uploads without retransmitting data.
Sanitise inputs — redaction or PII masking should occur before uploading sensitive documents; see the Security & privacy guide.
Automate clean-up — if you are storing artifacts locally for auditing, ensure rotation policies are in place.
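
As an illustration of the clean-up practice, a rotation policy for locally stored artifacts can be as small as the sketch below. The prune_artifacts helper is hypothetical, and the retention period is whatever your audit requirements dictate.

```python
import time
from pathlib import Path


def prune_artifacts(directory, max_age_days, now=None):
    """Delete regular files under directory older than max_age_days; return removed paths."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    removed = []
    # Materialise the listing first so deletions do not disturb the directory walk.
    for path in list(Path(directory).rglob("*")):
        if path.is_file() and path.stat().st_mtime < cutoff:
            path.unlink()
            removed.append(path)
    return removed
```

Running this from a scheduled job, for example prune_artifacts("./artifacts", max_age_days=30), keeps the local audit copy bounded without touching anything stored on the platform.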

Agents guide — streaming behaviour and runtime overrides.
Automation & scripting — capture outputs in CI pipelines.
Configuration management — export/import agents that rely on file workflows.
