Aurelio Backend

The Aurelio Backend provides advanced semantic routing capabilities through Aurelio Labs' semantic-router library. This guide covers the encoder and index components that power the Aurelio backend.

Overview

The Aurelio backend consists of three main components:

  1. Encoders - convert text or images into embeddings for semantic similarity

  2. Index - stores and retrieves routes efficiently

  3. Adapter - orchestrates encoders and indexes for routing

Installation

pip install gllm-pipeline gllm-inference semantic-router

Encoders

Encoders convert input text or images into embeddings for semantic similarity matching. The Aurelio backend supports multiple encoder types.

EM Invoker Encoder

Use any GLLM embedding model as an encoder:
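
A minimal sketch of the wiring, assuming an `EMInvokerEncoder` that wraps a GLLM EM invoker (the module paths and parameter names below are illustrative assumptions, not confirmed API):

```python
# Illustrative only: module paths and parameter names are assumed.
from gllm_inference.em_invoker import OpenAIEMInvoker  # assumed path
from gllm_pipeline.routers import EMInvokerEncoder     # assumed path

# Wrap any GLLM-supported embedding model as a router encoder.
em_invoker = OpenAIEMInvoker(model_name="text-embedding-3-small")
encoder = EMInvokerEncoder(em_invoker=em_invoker)
```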

Advantages:

  • Use any GLLM-supported embedding model

  • Consistent with your application's embedding infrastructure

  • Supports all EM invoker features (caching, retries, etc.)

Langchain Encoder

Use Langchain embedding models:
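
A sketch of the intended usage, assuming a `LangchainEncoder` that accepts a LangChain embeddings object (the GLLM import path and constructor parameter are assumptions):

```python
# Illustrative only: the gllm_pipeline import path and parameter are assumed.
from langchain_openai import OpenAIEmbeddings       # standard LangChain package
from gllm_pipeline.routers import LangchainEncoder  # assumed path

embeddings = OpenAIEmbeddings(model="text-embedding-3-small")
encoder = LangchainEncoder(embeddings=embeddings)
```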

Advantages:

  • Leverage existing Langchain integrations

  • Access to Langchain's embedding ecosystem

  • Compatible with Langchain-based applications

TEI Encoder

Use Text Embeddings Inference (TEI) for local or remote embeddings:
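
A sketch of pointing the encoder at a running TEI server (the import path and constructor parameter are assumptions for illustration):

```python
# Illustrative only: import path and parameter name are assumed.
from gllm_pipeline.routers import TEIEncoder  # assumed path

# Point at a TEI server, local (no API costs, privacy-preserving) or remote.
encoder = TEIEncoder(base_url="http://localhost:8080")
```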

Advantages:

  • Local embedding inference (privacy-preserving)

  • No API costs

  • Full control over the embedding model

  • Supports various model architectures

Index Options

Indexes store and retrieve routes efficiently. Choose based on your use case and scale.

Local Index

In-memory index suitable for development and small deployments:
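
To make the behavior concrete, here is a pure-Python illustration of what an in-memory route index does: it holds (route, embedding) pairs in a process-local list and answers queries by cosine similarity. This is a teaching sketch, not the library's LocalIndex implementation:

```python
from math import sqrt

class TinyLocalIndex:
    """Minimal in-memory route index: illustrates the idea only,
    not the library's LocalIndex implementation."""

    def __init__(self):
        # Entries live in process memory: fast, but no persistence
        # and not shared across processes.
        self._entries = []  # (route_name, embedding) pairs

    def add(self, route_name, embeddings):
        for emb in embeddings:
            self._entries.append((route_name, emb))

    def query(self, embedding, top_k=1):
        def cos(a, b):
            dot = sum(x * y for x, y in zip(a, b))
            na = sqrt(sum(x * x for x in a))
            nb = sqrt(sum(x * x for x in b))
            return dot / (na * nb)

        # Linear scan: fine for small datasets, slow at scale.
        scored = sorted(
            ((cos(embedding, emb), name) for name, emb in self._entries),
            reverse=True,
        )
        return scored[:top_k]

index = TinyLocalIndex()
index.add("greeting", [[1.0, 0.0], [0.9, 0.1]])
index.add("billing", [[0.0, 1.0]])
print(index.query([0.95, 0.05]))  # best match: "greeting"
```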

Characteristics:

  • In-memory storage

  • Fast for small datasets

  • No persistence

  • Single-process only

Aurelio Index

Base Aurelio index for custom implementations:
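
A skeleton of what subclassing might look like (the base-class import path and method signatures are assumptions for illustration):

```python
# Illustrative only: base-class path and method signatures are assumed.
from gllm_pipeline.routers import AurelioIndex  # assumed path

class MyCustomIndex(AurelioIndex):
    """Custom index backed by your own storage."""

    def add(self, routes):
        ...  # persist route embeddings to your store

    def query(self, embedding, top_k=5):
        ...  # return the top_k closest routes from your store
```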

Azure AI Search Index

Use Azure AI Search for scalable, cloud-hosted indexing:
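
A sketch of the configuration (class name is from this guide; the import path and constructor parameters are assumptions):

```python
# Illustrative only: import path and parameter names are assumed.
from gllm_pipeline.routers import AzureAISearchAurelioIndex  # assumed path

index = AzureAISearchAurelioIndex(
    endpoint="https://<your-search-service>.search.windows.net",
    index_name="semantic-routes",
    api_key="<azure-search-api-key>",
)
```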

Advantages:

  • Scalable cloud storage

  • Full-text and vector search

  • High availability

  • Enterprise-grade security

Datastore Adapter Index

Use GLLM datastores (Pinecone, Weaviate, etc.) as indexes:
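
A sketch of wrapping an existing datastore (class name is from this guide; the import path and parameter are assumptions):

```python
# Illustrative only: import path and parameter name are assumed.
from gllm_pipeline.routers import DataStoreAdapterIndex  # assumed path

# Reuse an already-configured GLLM datastore (e.g. Pinecone or Weaviate)
# as the routing index, through the unified datastore interface.
index = DataStoreAdapterIndex(datastore=my_pinecone_datastore)
```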

Advantages:

  • Leverage existing GLLM datastore infrastructure

  • Support for multiple vector databases

  • Unified interface across datastores

  • Reuse datastore configurations

Advanced Configuration

Sync Modes

Control how routes are synchronized:

Available modes:

  • LOCAL - Synchronous local updates

  • REMOTE - Synchronous remote updates

  • ASYNC - Asynchronous updates

Custom Route Configuration

Define routes with additional metadata:
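
The shape of such a definition can be illustrated with a plain dataclass; the real Route class comes from semantic-router and its exact fields may differ:

```python
from dataclasses import dataclass, field

@dataclass
class RouteSpec:
    """Illustrative route definition: name, example utterances,
    an optional per-route threshold, and free-form metadata."""
    name: str
    utterances: list
    score_threshold: float = 0.75
    metadata: dict = field(default_factory=dict)

support = RouteSpec(
    name="support",
    utterances=["my order is late", "I need help with billing"],
    score_threshold=0.8,
    metadata={"team": "customer-success", "priority": "high"},
)
```

Metadata attached this way can carry routing decisions downstream, e.g. which team or pipeline should handle the matched request.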

Similarity Threshold

Control matching sensitivity:
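
The mechanism is simple: a query only matches a route when its similarity score clears the threshold; otherwise the router falls back to a default route. A pure-Python illustration (not the library's implementation):

```python
from math import sqrt

def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (sqrt(sum(x * x for x in a)) * sqrt(sum(x * x for x in b)))

def route_or_fallback(score, threshold):
    # Above the threshold -> accept the matched route;
    # below it -> fall back to the default route.
    return "matched" if score >= threshold else "default"

score = cosine([0.9, 0.1], [1.0, 0.0])  # ~0.994
print(route_or_fallback(score, threshold=0.75))   # matched
print(route_or_fallback(score, threshold=0.999))  # default
```

Lowering the threshold increases recall (more queries match a route) at the cost of precision; raising it does the opposite, which is the accuracy/recall tradeoff noted under Best Practices.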

Complete Example: Production Setup
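
A sketch of how the pieces above might be combined for a production deployment; apart from semantic-router's `Route`, the class names, module paths, and parameters are illustrative assumptions, not confirmed API:

```python
# Illustrative only: GLLM class names, module paths, and parameters are assumed.
from gllm_inference.em_invoker import OpenAIEMInvoker  # assumed path
from gllm_pipeline.routers import (                    # assumed path
    AurelioRouterAdapter,
    AzureAISearchAurelioIndex,
    EMInvokerEncoder,
)
from semantic_router import Route

# Encoder: reuse the application's GLLM embedding infrastructure.
encoder = EMInvokerEncoder(
    em_invoker=OpenAIEMInvoker(model_name="text-embedding-3-small"),
)

# Index: scalable, cloud-hosted Azure AI Search for production.
index = AzureAISearchAurelioIndex(
    endpoint="https://<your-search-service>.search.windows.net",
    index_name="semantic-routes",
)

# Routes with example utterances for each destination.
routes = [
    Route(name="support", utterances=["my order is late", "refund status"]),
    Route(name="sales", utterances=["pricing for teams", "enterprise plan"]),
]

# Adapter: orchestrates the encoder and index for routing.
router = AurelioRouterAdapter(encoder=encoder, index=index, routes=routes)
```

This pairing (EMInvokerEncoder for consistency, Azure AI Search for scale) follows the selection guidance under Best Practices.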

Best Practices

  1. Encoder Selection

    • Use EMInvokerEncoder for consistency with GLLM

    • Use TEIEncoder for privacy-critical applications

    • Use LangchainEncoder if already using Langchain

  2. Index Selection

    • Use LocalIndex for development/testing

    • Use AzureAISearchAurelioIndex for production at scale

    • Use DataStoreAdapterIndex to leverage existing datastores

  3. Performance Tuning

    • Adjust similarity_threshold based on your accuracy/recall tradeoff

    • Use larger embedding models for better accuracy

    • Cache embeddings when possible

  4. Monitoring

    • Log routing decisions for analysis

    • Monitor index size and query latency

    • Track fallback to default route frequency

Troubleshooting

Encoder initialization fails?

  • Verify API credentials are correct

  • Check network connectivity for remote encoders

  • Ensure required packages are installed

Index operations slow?

  • Consider switching to a faster index type

  • Optimize similarity threshold

  • Check index size and clean up old routes

Routes not matching?

  • Lower the similarity threshold

  • Add more diverse route examples

  • Verify encoder is working correctly
