Aurelio Backend
The Aurelio Backend provides advanced semantic routing capabilities through Aurelio Labs' semantic router library. This guide covers the encoder, index, and adapter components that power the Aurelio backend.
Overview
The Aurelio backend consists of three main components:
Encoders - Convert text/images into embeddings for semantic similarity
Index - Store and retrieve routes efficiently
Adapter - Orchestrates encoders and indexes for routing
Installation
pip install gllm-pipeline gllm-inference semantic-router
Prerequisites
This tutorial requires familiarity with these concepts:
https://github.com/GDP-ADMIN/gl-sdk/blob/docs/gitbook-sync/gitbook/gen-ai-sdk/tutorials/orchestration/inference/em-invoker.md - For understanding embedding model invocation
https://github.com/GDP-ADMIN/gl-sdk/blob/docs/gitbook-sync/gitbook/gen-ai-sdk/tutorials/orchestration/routing/semantic-router.md - For understanding the router interface and basic usage
Encoders
Encoders convert input text or images into embeddings for semantic similarity matching. The Aurelio backend supports multiple encoder types.
EM Invoker Encoder
Use any GLLM embedding model as an encoder:
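As a minimal sketch of the wrapping pattern (the real `EMInvokerEncoder` lives in the GLLM SDK and its constructor and method names may differ — the stub invoker below is a hypothetical stand-in for a GLLM embedding model), the encoder adapts an EM invoker's embed call to the `list[str] -> list[list[float]]` interface the router expects:

```python
# Illustrative sketch only: class and method names are assumptions,
# not the SDK's documented API.
class StubEMInvoker:
    """Stands in for a GLLM embedding model invoker."""

    def invoke(self, texts: list[str]) -> list[list[float]]:
        # A real invoker would call the embedding model;
        # here we fake tiny 3-dimensional vectors.
        return [[float(len(t)), 1.0, 0.0] for t in texts]


class EMInvokerEncoder:
    """Adapts an EM invoker to the encoder interface the router expects."""

    def __init__(self, em_invoker):
        self.em_invoker = em_invoker

    def __call__(self, docs: list[str]) -> list[list[float]]:
        return self.em_invoker.invoke(docs)


encoder = EMInvokerEncoder(StubEMInvoker())
vectors = encoder(["hello", "hi"])
```

Because the encoder only forwards to the invoker, features such as caching and retries configured on the invoker carry over automatically.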
Advantages:
Use any GLLM-supported embedding model
Consistent with your application's embedding infrastructure
Supports all EM invoker features (caching, retries, etc.)
Langchain Encoder
Use Langchain embedding models:
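A minimal sketch of the same pattern for Langchain: Langchain embedding models expose `embed_documents` and `embed_query`, and the encoder forwards batches to `embed_documents` (the `LangchainEncoder` signature here is an assumption; the fake embeddings class stands in for e.g. `OpenAIEmbeddings`):

```python
class FakeLangchainEmbeddings:
    """Stands in for a real Langchain embeddings model."""

    def embed_documents(self, texts: list[str]) -> list[list[float]]:
        return [[float(len(t)), 0.0] for t in texts]

    def embed_query(self, text: str) -> list[float]:
        return [float(len(text)), 0.0]


class LangchainEncoder:
    """Sketch: forward batches to the Langchain embeddings interface."""

    def __init__(self, embeddings):
        self.embeddings = embeddings

    def __call__(self, docs: list[str]) -> list[list[float]]:
        return self.embeddings.embed_documents(docs)


encoder = LangchainEncoder(FakeLangchainEmbeddings())
vecs = encoder(["route me"])
```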
Advantages:
Leverage existing Langchain integrations
Access to Langchain's embedding ecosystem
Compatible with Langchain-based applications
TEI Encoder
Use Text Embeddings Inference (TEI) for local or remote embeddings:
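A hedged client sketch, assuming a TEI server is running at a local URL: TEI exposes an `/embed` endpoint that accepts a JSON body of the form `{"inputs": [...]}` and returns a list of embedding vectors. The `TEI_URL` value and the helper names below are illustrative:

```python
import json
import urllib.request

TEI_URL = "http://localhost:8080"  # assumption: a locally running TEI server


def build_embed_request(texts: list[str]) -> bytes:
    # TEI's /embed endpoint takes {"inputs": [...]} as its JSON payload.
    return json.dumps({"inputs": texts}).encode("utf-8")


def tei_embed(texts: list[str]) -> list[list[float]]:
    """POST the texts to the TEI server and return the embedding vectors."""
    req = urllib.request.Request(
        f"{TEI_URL}/embed",
        data=build_embed_request(texts),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())


payload = build_embed_request(["hello"])
```

Because everything runs against your own server, no text leaves your infrastructure.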
Advantages:
Local embedding inference (privacy-preserving)
No API costs
Full control over the embedding model
Supports various model architectures
Index Options
Indexes store and retrieve routes efficiently. Choose based on your use case and scale.
Local Index
In-memory index suitable for development and small deployments:
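Conceptually, a local index is just route vectors in memory plus a nearest-neighbour lookup. A self-contained sketch (not the library's `LocalIndex` implementation) of what that amounts to:

```python
import math


class TinyLocalIndex:
    """Minimal in-memory index: lists of routes and vectors, cosine lookup."""

    def __init__(self):
        self.routes: list[str] = []
        self.vectors: list[list[float]] = []

    def add(self, route: str, vector: list[float]) -> None:
        self.routes.append(route)
        self.vectors.append(vector)

    def query(self, vector: list[float]) -> tuple[str, float]:
        def cos(a, b):
            dot = sum(x * y for x, y in zip(a, b))
            na = math.sqrt(sum(x * x for x in a))
            nb = math.sqrt(sum(x * x for x in b))
            return dot / (na * nb)

        scores = [cos(vector, v) for v in self.vectors]
        best = max(range(len(scores)), key=scores.__getitem__)
        return self.routes[best], scores[best]


index = TinyLocalIndex()
index.add("greeting", [1.0, 0.0])
index.add("weather", [0.0, 1.0])
best_route, score = index.query([0.9, 0.1])
```

The brute-force scan over all vectors is why this approach is fast only for small datasets, and keeping everything in process memory is why there is no persistence or multi-process sharing.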
Characteristics:
In-memory storage
Fast for small datasets
No persistence
Single-process only
Aurelio Index
Base Aurelio index for custom implementations:
Azure AI Search Index
Use Azure AI Search for scalable, cloud-hosted indexing:
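As a hedged configuration sketch: connecting to Azure AI Search generally requires a service endpoint, a credential, and a target index name. The keyword names below are assumptions for illustration, not `AzureAISearchAurelioIndex`'s documented parameters:

```python
# Assumed parameter names -- check the SDK reference for the real signature.
azure_index_config = {
    "endpoint": "https://<your-service>.search.windows.net",
    "api_key": "<api-key>",  # or a managed-identity credential
    "index_name": "semantic-routes",
}
```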
Advantages:
Scalable cloud storage
Full-text and vector search
High availability
Enterprise-grade security
Datastore Adapter Index
Use GLLM datastores (Pinecone, Weaviate, etc.) as indexes:
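The adapter's job is to translate index operations into datastore calls. A self-contained sketch of that translation, with a stub standing in for a real GLLM datastore client (all names here are illustrative):

```python
class StubDatastore:
    """Stands in for a GLLM datastore client (Pinecone, Weaviate, etc.)."""

    def __init__(self):
        self.records = {}

    def upsert(self, record_id, vector, metadata):
        self.records[record_id] = (vector, metadata)

    def search(self, vector, top_k=1):
        ranked = sorted(
            self.records.items(),
            key=lambda kv: -sum(x * y for x, y in zip(kv[1][0], vector)),
        )
        return [(rid, meta) for rid, (vec, meta) in ranked[:top_k]]


class DataStoreAdapterIndex:
    """Sketch: translate index add/query calls into datastore upsert/search."""

    def __init__(self, datastore):
        self.datastore = datastore

    def add(self, route, vector):
        self.datastore.upsert(route, vector, {"route": route})

    def query(self, vector):
        hits = self.datastore.search(vector, top_k=1)
        return hits[0][1]["route"] if hits else None


index = DataStoreAdapterIndex(StubDatastore())
index.add("refund", [1.0, 0.0])
index.add("shipping", [0.0, 1.0])
route = index.query([0.1, 0.9])
```

Because only the stub would change, the same adapter interface works across any datastore that exposes upsert and search.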
Advantages:
Leverage existing GLLM datastore infrastructure
Support for multiple vector databases
Unified interface across datastores
Reuse datastore configurations
Advanced Configuration
Sync Modes
Control how routes are synchronized:
Sync Modes:
LOCAL - Synchronous local updates
REMOTE - Synchronous remote updates
ASYNC - Asynchronous updates
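The three modes above can be sketched as an enum plus a dispatch function (a toy model of the behaviour, with assumed names — not the library's sync implementation):

```python
from enum import Enum


class SyncMode(Enum):
    """Sketch of the three sync strategies (names assumed)."""

    LOCAL = "local"    # apply route updates to the in-process index immediately
    REMOTE = "remote"  # push updates to the remote index before returning
    ASYNC = "async"    # queue updates and apply them in the background


def apply_update(mode: SyncMode, update: str, queue: list[str]) -> str:
    # Toy dispatch showing where each mode would do its work.
    if mode is SyncMode.ASYNC:
        queue.append(update)  # a background worker would drain this queue
        return "queued"
    return "applied"          # LOCAL and REMOTE both block until done


pending: list[str] = []
status = apply_update(SyncMode.ASYNC, "add route: billing", pending)
```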
Custom Route Configuration
Define routes with additional metadata:
Similarity Threshold
Control matching sensitivity:
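The threshold decides when a best match is "good enough" versus when to fall back. A self-contained sketch of that decision (route names and scores here are made up for illustration):

```python
def route_or_default(score_by_route: dict[str, float], similarity_threshold: float) -> str:
    """Pick the best-scoring route, falling back to a default below the threshold."""
    best_route = max(score_by_route, key=score_by_route.get)
    if score_by_route[best_route] < similarity_threshold:
        return "default"  # nothing matched confidently enough
    return best_route


scores = {"billing": 0.62, "chitchat": 0.48}
strict = route_or_default(scores, similarity_threshold=0.7)   # high threshold: more precision
lenient = route_or_default(scores, similarity_threshold=0.5)  # low threshold: more recall
```

Raising the threshold trades recall for precision: more queries fall through to the default route, but the ones that match are matched more confidently.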
Complete Example: Production Setup
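An end-to-end sketch wiring the three components together: encoder, index, and a thresholded router. Every class here is a self-contained stand-in so the flow is visible; in production you would swap in real SDK components (e.g. `EMInvokerEncoder` and `AzureAISearchAurelioIndex`):

```python
import math


class KeywordEncoder:
    """Toy encoder: one dimension per keyword instead of a real embedding model."""

    KEYWORDS = ["refund", "weather"]

    def __call__(self, docs: list[str]) -> list[list[float]]:
        return [[float(kw in d.lower()) for kw in self.KEYWORDS] for d in docs]


class MemoryIndex:
    """Toy in-memory index with cosine-similarity lookup."""

    def __init__(self):
        self.entries = []  # (route_name, vector) pairs

    def add(self, route, vector):
        self.entries.append((route, vector))

    def query(self, vector):
        def cos(a, b):
            dot = sum(x * y for x, y in zip(a, b))
            na = math.sqrt(sum(x * x for x in a))
            nb = math.sqrt(sum(x * x for x in b))
            return dot / (na * nb) if na and nb else 0.0

        return max(((r, cos(vector, v)) for r, v in self.entries), key=lambda t: t[1])


class Router:
    """Glue: encode the query, look it up, apply the similarity threshold."""

    def __init__(self, encoder, index, similarity_threshold=0.5, default="fallback"):
        self.encoder, self.index = encoder, index
        self.similarity_threshold, self.default = similarity_threshold, default

    def add_route(self, name, utterances):
        for vec in self.encoder(utterances):
            self.index.add(name, vec)

    def route(self, query):
        vec = self.encoder([query])[0]
        name, score = self.index.query(vec)
        return name if score >= self.similarity_threshold else self.default


router = Router(KeywordEncoder(), MemoryIndex(), similarity_threshold=0.5)
router.add_route("billing", ["I want a refund"])
router.add_route("smalltalk", ["how is the weather"])
decision = router.route("please refund my order")
```

Queries that match no route score below the threshold and land on the fallback route, which is the behaviour to monitor in production.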
Best Practices
Encoder Selection
Use EMInvokerEncoder for consistency with GLLM
Use TEIEncoder for privacy-critical applications
Use LangchainEncoder if already using Langchain
Index Selection
Use LocalIndex for development/testing
Use AzureAISearchAurelioIndex for production at scale
Use DataStoreAdapterIndex to leverage existing datastores
Performance Tuning
Adjust similarity_threshold based on your accuracy/recall tradeoff
Use larger embedding models for better accuracy
Cache embeddings when possible
Monitoring
Log routing decisions for analysis
Monitor index size and query latency
Track how often queries fall back to the default route
Troubleshooting
Encoder initialization fails?
Verify API credentials are correct
Check network connectivity for remote encoders
Ensure required packages are installed
Index operations slow?
Consider switching to a faster index type
Optimize similarity threshold
Check index size and clean up old routes
Routes not matching?
Lower the similarity threshold
Add more diverse route examples
Verify encoder is working correctly
See Also
Semantic Router - Main Semantic Router documentation
Similarity-Based Router - Simpler alternative
LM-Based Router - Language model-based routing