Repacker

gllm-generation | Tutorial: Repacker | Use Case: Your First RAG Pipeline | API Reference

What is a Repacker

A repacker rearranges a list of content chunks into an order that is more effective for downstream model consumption. It can return the reordered chunks as a list or merge them into a single context string. This helps with long inputs, where models tend to focus on the beginning and the end and overlook material in the middle (the “lost-in-the-middle” effect).

  • Purpose: Mitigate lost-in-the-middle by reordering chunks so that long inputs are read more effectively by models.

  • Outputs: Either a reordered list of chunks (chunk mode) or one prompt-ready string (context mode).

  • Key features: Pluggable strategies, two output modes, optional size limits, configurable delimiter, custom size metric.

  • When to use: Whenever you have multiple chunks and want better model recall across the whole input.

What a Repacker Can Do

A repacker lets you choose a repacking strategy and an output mode, optionally enforce a size limit, and control how the final context is joined with a delimiter.

  • Strategies: forward, reverse, sides.

  • Modes: chunk returns a list; context returns a string.

  • Size limit: Chunks are trimmed from the end of the input before reordering; the delimiter's size does not count toward the limit.

  • Delimiter: Customizable in context mode (e.g., \n\n, |).

  • Size function: Default is character length; can be customized to approximate tokens.

Prerequisites

Before running this example, complete all setup steps listed on the Prerequisites page.

Installation

Quickstart

This quickstart shows a basic pass-through (forward order) and how to produce a single context string.

Repacker Method

Repacker provides three repacking methods (forward, reverse, and sides) so you can tune the chunk order to your needs.

Forward Method (default)

This method preserves the original order. Use it when chronology matters or your source sequence is already ideal.

Reverse Method

This method flips the order so the most recent or concluding information appears first.
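To make the difference between the forward and reverse methods concrete, here is a minimal, library-independent sketch. The function names `repack_forward` and `repack_reverse` are hypothetical and are not part of the gllm-generation API; plain strings stand in for chunks.

```python
from typing import List, Sequence


def repack_forward(chunks: Sequence[str]) -> List[str]:
    """Hypothetical helper: keep the chunks in their original order."""
    return list(chunks)


def repack_reverse(chunks: Sequence[str]) -> List[str]:
    """Hypothetical helper: flip the order so the last chunk comes first."""
    return list(reversed(chunks))


chunks = ["intro", "details", "conclusion"]
print(repack_forward(chunks))  # ['intro', 'details', 'conclusion']
print(repack_reverse(chunks))  # ['conclusion', 'details', 'intro']
```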

Sides Method

This method works by alternating the chunks from the end and start, emphasizing both beginning and end to reduce “lost-in-the-middle”.
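The exact ordering produced by the sides method is not spelled out here, so the sketch below shows one plausible reading only. It assumes chunks arrive in priority order and places them alternately at the front and the back of the output, so the first chunks end up at the two edges and the last ones in the middle. The name `repack_sides` is hypothetical, and the library's actual ordering may differ.

```python
from typing import List, Sequence


def repack_sides(chunks: Sequence[str]) -> List[str]:
    """Hypothetical helper: alternate chunks between the front and the back.

    Assumption: even-indexed chunks fill the output from the start,
    odd-indexed chunks fill it from the end.
    """
    front: List[str] = []
    back: List[str] = []
    for index, chunk in enumerate(chunks):
        if index % 2 == 0:
            front.append(chunk)
        else:
            back.append(chunk)
    # Reverse the back half so the second chunk lands in the last position.
    return front + back[::-1]


print(repack_sides(["c1", "c2", "c3", "c4", "c5"]))
# ['c1', 'c3', 'c5', 'c4', 'c2']
```

Under this reading, the two highest-priority chunks (`c1` and `c2`) occupy the first and last positions, which are the positions models tend to attend to most.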

Repacker Mode

Repacker provides two repacking modes (chunk and context) so you can choose the format of the output.

Chunk Mode (default)

This mode returns a reordered list of Chunk objects. Best when you need per-chunk operations (filtering, windowing, budgeting, scoring) downstream.

Context Mode

This mode returns a single string joined by your delimiter. Best when you want a prompt-ready context immediately.
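The two modes differ only in the shape of the result, which the following sketch illustrates. The `Chunk` dataclass here is a simplified stand-in that models only a `content` field, and `format_output` is a hypothetical helper, not the library's own function.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class Chunk:
    """Simplified stand-in for a chunk; only `content` is modeled."""
    content: str


def format_output(
    chunks: List[Chunk], mode: str = "chunk", delimiter: str = "\n\n"
) -> Union[List[Chunk], str]:
    """Hypothetical helper: return a list (chunk mode) or a string (context mode)."""
    if mode == "chunk":
        return list(chunks)
    if mode == "context":
        # The delimiter is only used when joining into a single string.
        return delimiter.join(chunk.content for chunk in chunks)
    raise ValueError(f"unknown mode: {mode!r}")


chunks = [Chunk("Paris is in France."), Chunk("Berlin is in Germany.")]
print(format_output(chunks))  # a list of two Chunk objects
print(format_output(chunks, mode="context", delimiter=" | "))
# Paris is in France. | Berlin is in Germany.
```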

Advanced: size limits and custom size functions

Repacker also accepts several keyword arguments for further customization.

  • Limit behavior: Trims from the end until total size (by size_func) fits the budget.

  • Default metric: Character length of chunk.content.

  • Custom metric: Provide your own size_func (e.g., a rough token estimator).

  • Delimiter not counted: In context mode, delimiter length is excluded from the limit.
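The limit behavior described above can be sketched as follows. This is an illustration under stated assumptions, not the library's implementation: `trim_to_limit`, `char_size`, and `rough_token_size` are hypothetical names, whole chunks are assumed to be dropped rather than truncated, and the four-characters-per-token ratio is only a rough heuristic.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Chunk:
    """Simplified stand-in for a chunk; only `content` is modeled."""
    content: str


def char_size(chunk: Chunk) -> int:
    """Default metric: character length of chunk.content."""
    return len(chunk.content)


def rough_token_size(chunk: Chunk) -> int:
    """Custom metric (assumption): roughly four characters per token."""
    return max(1, len(chunk.content) // 4)


def trim_to_limit(
    chunks: List[Chunk], limit: int, size_func: Callable[[Chunk], int] = char_size
) -> List[Chunk]:
    """Drop whole chunks from the end until the total size fits the limit.

    Delimiter length is deliberately not counted toward the total.
    """
    kept = list(chunks)
    total = sum(size_func(chunk) for chunk in kept)
    while kept and total > limit:
        total -= size_func(kept.pop())
    return kept


chunks = [Chunk("aaaa"), Chunk("bbbbbb"), Chunk("cc")]
print(len(trim_to_limit(chunks, limit=10)))  # 2 (sizes 4 + 6 fit, "cc" is dropped)
print(len(trim_to_limit(chunks, limit=2, size_func=rough_token_size)))  # 2
```

Because trimming happens before reordering, the chunks dropped are always the ones at the end of the original input, regardless of which repacking method is applied afterward.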
