Encryption
Encryption in Data Store
The Encryption Capability provides transparent field-level encryption for data store chunks. It encrypts chunk content and metadata fields automatically during write operations and decrypts them during read operations, working seamlessly with fulltext and vector capabilities.
Encryption operates transparently, so you never call it directly. Once configured, the fulltext and vector capabilities apply it automatically whenever you create, update, or retrieve chunks.
Semantic Search Support: Even with encryption enabled, vector search remains fully functional. Embeddings are generated from the plaintext content before encryption, ensuring semantic search accuracy without compromising security.
Prerequisites
Complete all setup steps listed on the Prerequisites page before following this guide.
You should be familiar with these concepts and components:
The fulltext capability overview in Fulltext capability
The vector capability overview in Vector capability
How Encryption Works
Transparent Operation
Encryption integrates directly with fulltext and vector capabilities. When you enable encryption on a data store:
During Write Operations: Content and metadata fields specified in the encryption configuration are encrypted before being stored.
During Read Operations: Encrypted fields are automatically decrypted when chunks are retrieved.
Embedding Generation: For vector capability, embeddings are generated from plaintext content before encryption.
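The ordering described above (embed the plaintext first, then encrypt for storage, and decrypt on read) can be sketched as follows. This is an illustration under stated assumptions, not the data store's implementation: `StoredChunk`, `toy_embed`, `placeholder_encrypt`, and `placeholder_decrypt` are hypothetical names, and base64 stands in for a real cipher only to keep the example self-contained.

```python
import base64
from dataclasses import dataclass


def placeholder_encrypt(text: str) -> str:
    # Stand-in for a real cipher. base64 is NOT encryption.
    return base64.b64encode(text.encode("utf-8")).decode("ascii")


def placeholder_decrypt(blob: str) -> str:
    return base64.b64decode(blob.encode("ascii")).decode("utf-8")


def toy_embed(text: str) -> list[float]:
    # Hypothetical embedding (vowel frequencies); a real system calls an embedding model.
    lowered = text.lower()
    total = max(len(lowered), 1)
    return [lowered.count(vowel) / total for vowel in "aeiou"]


@dataclass
class StoredChunk:
    content: str            # ciphertext at rest
    embedding: list[float]  # derived from the plaintext


def write_chunk(plaintext: str) -> StoredChunk:
    embedding = toy_embed(plaintext)             # 1. embed the plaintext
    ciphertext = placeholder_encrypt(plaintext)  # 2. then encrypt for storage
    return StoredChunk(content=ciphertext, embedding=embedding)


def read_chunk(stored: StoredChunk) -> str:
    # Decryption happens on the way out, so callers only see plaintext.
    return placeholder_decrypt(stored.content)


stored = write_chunk("Quarterly revenue forecast")
assert stored.content != "Quarterly revenue forecast"
assert read_chunk(stored) == "Quarterly revenue forecast"
```

Because the embedding is computed before the content is transformed, the vector stored next to the ciphertext is identical to the one an unencrypted store would hold.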
Field-Level Configuration
You can encrypt specific fields:
Content field: Encrypt the chunk content using "content".
Metadata fields: Encrypt specific metadata fields using dot notation, e.g., "metadata.secret_api_key".
Nested metadata: Nested metadata fields are supported, e.g., "metadata.user.email".
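One way to picture how dot-notation paths select fields is the sketch below. It is not the library's code: `encrypt_fields` and `placeholder_encrypt` are hypothetical names, the placeholder transform is not a cipher, and silently skipping missing paths is a choice made for this sketch rather than documented behavior.

```python
import copy
from typing import Any, Callable


def encrypt_fields(
    chunk: dict[str, Any],
    paths: list[str],
    encrypt: Callable[[str], str],
) -> dict[str, Any]:
    """Return a copy of `chunk` with the value at each dot-notation path encrypted."""
    result = copy.deepcopy(chunk)
    for path in paths:
        *parents, leaf = path.split(".")
        node: Any = result
        for key in parents:
            # Walk down one level; stop if the path leaves the dict structure.
            node = node.get(key) if isinstance(node, dict) else None
            if node is None:
                break
        if isinstance(node, dict) and leaf in node:
            node[leaf] = encrypt(str(node[leaf]))
    return result


def placeholder_encrypt(value: str) -> str:
    # Stand-in only: marks and reverses the value. NOT encryption.
    return "enc:" + value[::-1]


chunk = {
    "id": "chunk-1",
    "content": "Patient notes",
    "metadata": {
        "secret_api_key": "sk-123",
        "user": {"email": "a@example.com", "role": "admin"},
    },
}
protected = encrypt_fields(
    chunk,
    ["content", "metadata.secret_api_key", "metadata.user.email"],
    placeholder_encrypt,
)
assert protected["id"] == "chunk-1"                         # untouched
assert protected["metadata"]["user"]["role"] == "admin"     # untouched
assert protected["metadata"]["user"]["email"].startswith("enc:")
```

Fields that are not listed, such as `id` and `metadata.user.role` here, stay in plaintext and remain usable for filtering and sorting.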
Choose an Encryptor
The data store supports multiple encryptor types:
AES-GCM Encryptor
Use AESGCMEncryptor for simple encryption with a direct key:
Key Management: Store your encryption key securely. If you lose the key, you cannot decrypt your data. Consider using a key management service for production applications.
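As a minimal sketch of the key-handling advice above, the snippet below generates a random 256-bit key and loads it from an environment variable instead of hard-coding it. The variable name `DATASTORE_ENCRYPTION_KEY` and both helper functions are hypothetical, and the source does not state which key format AESGCMEncryptor expects, so check whether it takes raw bytes or an encoded string before wiring this in.

```python
import base64
import os
import secrets

ENV_VAR = "DATASTORE_ENCRYPTION_KEY"  # hypothetical variable name


def generate_key_b64() -> str:
    """Create a random 256-bit key, base64-encoded for storage in a secret manager."""
    return base64.b64encode(secrets.token_bytes(32)).decode("ascii")


def load_key(env_var: str = ENV_VAR) -> bytes:
    """Read the key from the environment and validate its length."""
    encoded = os.environ.get(env_var)
    if not encoded:
        raise RuntimeError(f"{env_var} is not set; refusing to start without a key")
    key = base64.b64decode(encoded, validate=True)
    if len(key) != 32:
        raise ValueError(f"expected a 32-byte key, got {len(key)} bytes")
    return key


# Demo only: in a real deployment the secret manager sets this variable.
os.environ[ENV_VAR] = generate_key_b64()
key = load_key()
assert len(key) == 32
```

Failing fast when the key is missing or malformed avoids writing data under an unintended key that cannot be recovered later.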
Key Rotating Encryptor
Use KeyRotatingEncryptor for scenarios requiring key rotation:
KMS Encryptor
Use KmsEncryptor for production scenarios where encryption keys are managed by a Key Management Service (KMS). This encryptor implements envelope encryption: a Data Encryption Key (DEK) is generated and encrypted by the KMS, and the DEK is then used with AES-GCM to encrypt the actual data.
KmsEncryptor delegates key generation and key encryption to the KMS, so the plaintext DEK is never stored. This is the recommended approach for regulated environments.
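The envelope-encryption flow described above can be sketched with the standard library alone. Everything here is illustrative: `FakeKms`, `toy_seal`, `toy_open`, `envelope_encrypt`, and `envelope_decrypt` are hypothetical names, the in-memory `FakeKms` replaces a real KMS client, and the toy cipher (a SHAKE-256 keystream plus an HMAC tag) is a stand-in for AES-GCM that must not be used in production.

```python
import hashlib
import hmac
import secrets


def toy_seal(key: bytes, plaintext: bytes) -> bytes:
    """Stand-in for AES-GCM: random nonce, keystream XOR, HMAC tag. Not for production."""
    nonce = secrets.token_bytes(12)
    stream = hashlib.shake_256(key + nonce).digest(len(plaintext))
    body = bytes(p ^ s for p, s in zip(plaintext, stream))
    tag = hmac.new(key, nonce + body, hashlib.sha256).digest()
    return nonce + body + tag


def toy_open(key: bytes, blob: bytes) -> bytes:
    nonce, body, tag = blob[:12], blob[12:-32], blob[-32:]
    expected = hmac.new(key, nonce + body, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("authentication failed")
    stream = hashlib.shake_256(key + nonce).digest(len(body))
    return bytes(c ^ s for c, s in zip(body, stream))


class FakeKms:
    """In-memory stand-in for a KMS; the master key never leaves this object."""

    def __init__(self) -> None:
        self._master_key = secrets.token_bytes(32)

    def generate_data_key(self) -> tuple[bytes, bytes]:
        dek = secrets.token_bytes(32)
        return dek, toy_seal(self._master_key, dek)  # (plaintext DEK, wrapped DEK)

    def decrypt_data_key(self, wrapped_dek: bytes) -> bytes:
        return toy_open(self._master_key, wrapped_dek)


def envelope_encrypt(kms: FakeKms, plaintext: bytes) -> dict[str, bytes]:
    dek, wrapped_dek = kms.generate_data_key()
    # Only the wrapped DEK is persisted; the plaintext DEK is discarded after use.
    return {"wrapped_dek": wrapped_dek, "ciphertext": toy_seal(dek, plaintext)}


def envelope_decrypt(kms: FakeKms, record: dict[str, bytes]) -> bytes:
    dek = kms.decrypt_data_key(record["wrapped_dek"])
    return toy_open(dek, record["ciphertext"])


kms = FakeKms()
record = envelope_encrypt(kms, b"confidential chunk content")
assert envelope_decrypt(kms, record) == b"confidential chunk content"
```

The stored record holds only the wrapped DEK and the ciphertext, so reading the data requires a call to the KMS, which is where access control and audit logging apply.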
Enable Encryption
Enable encryption using the .with_encryption() method. This method can be chained with other capability registration methods.
Example: Chroma Data Store
Using Encrypted Data Store
Once encryption is enabled, use the data store normally. Encryption and decryption happen automatically:
⚠️ STRICT WARNING: Filtering & Sorting
DO NOT use encrypted fields for filtering or sorting.
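Filtering or sorting on encrypted fields (order_by, F.eq, etc.) will fail or yield incorrect results because the database stores randomized ciphertext, not the plaintext value. The same plaintext produces a different ciphertext on every write, so stored values neither match a query value nor preserve the plaintext sort order.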
ALWAYS use plaintext fields (like id or non-sensitive metadata) for filtering and sorting.
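The reason behind this warning can be reproduced in a few lines. The sketch below uses a toy nonce-based cipher (`toy_encrypt`, a hypothetical stand-in built from SHAKE-256, not the data store's encryptor and not safe for real use) to show that encrypting the same value twice yields different bytes, while a lookup by a plaintext `id` still works.

```python
import hashlib
import secrets


def toy_encrypt(key: bytes, plaintext: str) -> bytes:
    """Toy randomized cipher: a fresh nonce per call changes the whole ciphertext."""
    nonce = secrets.token_bytes(12)
    data = plaintext.encode("utf-8")
    stream = hashlib.shake_256(key + nonce).digest(len(data))
    return nonce + bytes(d ^ s for d, s in zip(data, stream))


key = secrets.token_bytes(32)

# What the database holds: a plaintext id and an encrypted email.
rows = [
    {"id": "chunk-1", "email": toy_encrypt(key, "alice@example.com")},
    {"id": "chunk-2", "email": toy_encrypt(key, "bob@example.com")},
]

# Incorrect: an equality filter on the encrypted field compares ciphertexts.
query_value = toy_encrypt(key, "alice@example.com")
by_encrypted_field = [row for row in rows if row["email"] == query_value]

# Correct: filter on a plaintext field, then let the read path decrypt the rest.
by_plaintext_id = [row for row in rows if row["id"] == "chunk-1"]

assert by_encrypted_field == []   # same plaintext, different ciphertext, no match
assert len(by_plaintext_id) == 1
```

Sorting is affected for the same reason: ciphertext bytes are ordered by their random nonce, not by the value they conceal.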
Correct vs. Incorrect Usage
✅ Correct: Filter by id or plaintext metadata, then read encrypted data.
❌ Incorrect: Attempting to filter by the encrypted field itself.
❌ Incorrect: Sorting by an encrypted field.