Encryption

Encryption in Data Store

The Encryption Capability provides transparent field-level encryption for data store chunks. It encrypts chunk content and metadata fields automatically during write operations and decrypts them during read operations, working seamlessly with fulltext and vector capabilities.

Encryption operates transparently, so you don't need to call it directly. Once configured, the fulltext and vector capabilities use it automatically whenever you create, update, or retrieve chunks.

Semantic Search Support: Even with encryption enabled, vector search remains fully functional. Embeddings are generated from the plaintext content before encryption, ensuring semantic search accuracy without compromising security.

Prerequisites

This example specifically requires completion of all setup steps listed on the Prerequisites page.

You should be familiar with these concepts and components:

  1. The fulltext capability overview in Fulltext capability

  2. The vector capability overview in Vector capability

How Encryption Works

Transparent Operation

Encryption integrates directly with fulltext and vector capabilities. When you enable encryption on a data store:

  1. During Write Operations: Content and metadata fields specified in the encryption configuration are encrypted before being stored.

  2. During Read Operations: Encrypted fields are automatically decrypted when chunks are retrieved.

  3. Embedding Generation: For vector capability, embeddings are generated from plaintext content before encryption.
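
The three steps above can be sketched with a small in-memory store. This is only an illustration of the ordering (embed the plaintext, encrypt, store; decrypt on read). `ToyEncryptedStore`, `toy_embed`, and the SHA-256 keystream cipher are hypothetical stand-ins, not the library's classes, and the cipher is not suitable for real data.

```python
import hashlib
import os


def _xor_stream(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """Toy stream cipher (stand-in for AES-GCM): XOR data with a SHA-256 keystream."""
    stream, counter = b"", 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + nonce + counter.to_bytes(4, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


def toy_encrypt(key: bytes, plaintext: str) -> bytes:
    nonce = os.urandom(12)
    return nonce + _xor_stream(key, nonce, plaintext.encode("utf-8"))


def toy_decrypt(key: bytes, blob: bytes) -> str:
    return _xor_stream(key, blob[:12], blob[12:]).decode("utf-8")


def toy_embed(text: str) -> list[float]:
    """Hypothetical embedding (vowel counts); a real store would call an embedding model."""
    return [float(text.lower().count(c)) for c in "aeiou"]


class ToyEncryptedStore:
    def __init__(self, key: bytes) -> None:
        self._key = key
        self.rows: dict[str, dict] = {}

    def write(self, chunk_id: str, content: str) -> None:
        embedding = toy_embed(content)                  # embed the plaintext first
        ciphertext = toy_encrypt(self._key, content)    # then encrypt before storing
        self.rows[chunk_id] = {"content": ciphertext, "embedding": embedding}

    def read(self, chunk_id: str) -> str:
        return toy_decrypt(self._key, self.rows[chunk_id]["content"])  # decrypt on retrieval

    def nearest(self, query: str) -> str:
        """Return the id whose stored embedding is closest to the query embedding."""
        q = toy_embed(query)
        def dist(row: dict) -> float:
            return sum((a - b) ** 2 for a, b in zip(q, row["embedding"]))
        return min(self.rows, key=lambda cid: dist(self.rows[cid]))


store = ToyEncryptedStore(os.urandom(32))
store.write("c1", "banana bread")
store.write("c2", "evergreen trees")
best_match = store.nearest("a banana")
```

Similarity search here runs entirely on the stored embeddings, which is why it keeps working even though the stored content is ciphertext.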

Field-Level Configuration

You can encrypt specific fields:

  1. Content field: Encrypt the chunk content using "content".

  2. Metadata fields: Encrypt specific metadata fields using dot notation, e.g., "metadata.secret_api_key".

  3. Nested metadata: Support for nested metadata fields, e.g., "metadata.user.email".
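
To make the dot notation concrete, here is a minimal sketch of how field paths such as "metadata.user.email" could be resolved against a chunk represented as a nested dictionary. `transform_fields` is a hypothetical helper, not part of the library, and Base64 is used purely as a reversible placeholder so the example runs without extra dependencies; it is not encryption.

```python
import base64
import copy
from typing import Any, Callable


def transform_fields(
    chunk: dict[str, Any], paths: list[str], fn: Callable[[str], str]
) -> dict[str, Any]:
    """Return a copy of `chunk` with `fn` applied to each field named by a dotted path."""
    result = copy.deepcopy(chunk)
    for path in paths:
        *parents, leaf = path.split(".")
        node: Any = result
        for key in parents:
            node = node.get(key) if isinstance(node, dict) else None
            if node is None:
                break  # path does not exist in this chunk; skip it
        if isinstance(node, dict) and leaf in node:
            node[leaf] = fn(str(node[leaf]))
    return result


# Placeholder transforms (NOT encryption); a real encryptor would go here.
def placeholder_encrypt(value: str) -> str:
    return base64.b64encode(value.encode("utf-8")).decode("ascii")


def placeholder_decrypt(value: str) -> str:
    return base64.b64decode(value.encode("ascii")).decode("utf-8")


chunk = {
    "content": "Reset instructions for the billing portal",
    "metadata": {
        "source": "wiki",
        "secret_api_key": "sk-123",
        "user": {"email": "a@example.com", "team": "billing"},
    },
}
fields = ["content", "metadata.secret_api_key", "metadata.user.email"]

stored = transform_fields(chunk, fields, placeholder_encrypt)
restored = transform_fields(stored, fields, placeholder_decrypt)
```

Fields that are not listed (`metadata.source`, `metadata.user.team`) pass through unchanged, so they stay available for filtering.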

Choose an Encryptor

The data store supports multiple encryptor types:

AES-GCM Encryptor

Use AESGCMEncryptor for simple encryption with a direct key.

Key Rotating Encryptor

Use KeyRotatingEncryptor for scenarios requiring key rotation.
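
One common way to support rotation is to tag each ciphertext with the id of the key that produced it, encrypt new data with the newest key, and keep older keys available for decryption. The sketch below illustrates that idea only; it is an assumption about the general technique and may differ from what KeyRotatingEncryptor does internally. `ToyRotatingCipher` and its toy cipher (SHA-256 keystream plus an HMAC tag, reusing one key for both, which a real design would avoid) are hypothetical and not for production use.

```python
import hashlib
import hmac
import os


class ToyRotatingCipher:
    """Illustrative only: stands in for an AES-GCM based rotating encryptor."""

    def __init__(self, keys: dict[str, bytes], active_key_id: str) -> None:
        self._keys = dict(keys)
        self._active = active_key_id

    @staticmethod
    def _xor(key: bytes, nonce: bytes, data: bytes) -> bytes:
        stream, counter = b"", 0
        while len(stream) < len(data):
            stream += hashlib.sha256(key + nonce + counter.to_bytes(4, "big")).digest()
            counter += 1
        return bytes(a ^ b for a, b in zip(data, stream))

    def encrypt(self, plaintext: str) -> str:
        key = self._keys[self._active]          # always encrypt with the active key
        nonce = os.urandom(12)
        body = self._xor(key, nonce, plaintext.encode("utf-8"))
        tag = hmac.new(key, nonce + body, hashlib.sha256).digest()
        return f"{self._active}:{(nonce + body + tag).hex()}"

    def decrypt(self, token: str) -> str:
        key_id, _, payload = token.partition(":")
        key = self._keys[key_id]                # KeyError if that key was retired
        raw = bytes.fromhex(payload)
        nonce, body, tag = raw[:12], raw[12:-32], raw[-32:]
        expected = hmac.new(key, nonce + body, hashlib.sha256).digest()
        if not hmac.compare_digest(tag, expected):
            raise ValueError("ciphertext failed integrity check")
        return self._xor(key, nonce, body).decode("utf-8")


old_key, new_key = os.urandom(32), os.urandom(32)

before_rotation = ToyRotatingCipher({"v1": old_key}, active_key_id="v1")
legacy_token = before_rotation.encrypt("customer note")

after_rotation = ToyRotatingCipher({"v1": old_key, "v2": new_key}, active_key_id="v2")
fresh_token = after_rotation.encrypt("customer note")
```

After rotation, data written under the old key remains readable while all new writes use the new key.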

KMS Encryptor

Use KmsEncryptor for production scenarios where encryption keys are managed by a Key Management Service (KMS). This encryptor implements envelope encryption: a Data Encryption Key (DEK) is generated and encrypted by the KMS, and the DEK is then used with AES-GCM to encrypt the actual data.

KmsEncryptor delegates key generation and key encryption to the KMS, so the plaintext DEK is never stored. This is the recommended approach for regulated environments.
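
The envelope pattern described above can be sketched as follows. `FakeKms`, `Envelope`, and the helper functions are hypothetical names introduced for illustration; a real KMS is a remote service, and the toy XOR keystream stands in for AES-GCM so the example needs only the standard library.

```python
import hashlib
import os
from dataclasses import dataclass


def xor_stream(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """Toy cipher (stand-in for AES-GCM): XOR data with a SHA-256 keystream."""
    stream, counter = b"", 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + nonce + counter.to_bytes(4, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


class FakeKms:
    """Hypothetical in-memory KMS; the master key never leaves this object."""

    def __init__(self) -> None:
        self._master_key = os.urandom(32)

    def generate_data_key(self) -> tuple[bytes, bytes]:
        """Return (plaintext DEK, DEK encrypted under the master key)."""
        dek, nonce = os.urandom(32), os.urandom(12)
        return dek, nonce + xor_stream(self._master_key, nonce, dek)

    def decrypt_data_key(self, wrapped_dek: bytes) -> bytes:
        return xor_stream(self._master_key, wrapped_dek[:12], wrapped_dek[12:])


@dataclass
class Envelope:
    wrapped_dek: bytes   # stored next to the data; useless without the KMS
    nonce: bytes
    ciphertext: bytes


def envelope_encrypt(kms: FakeKms, plaintext: str) -> Envelope:
    dek, wrapped_dek = kms.generate_data_key()
    nonce = os.urandom(12)
    ciphertext = xor_stream(dek, nonce, plaintext.encode("utf-8"))
    return Envelope(wrapped_dek, nonce, ciphertext)  # plaintext DEK is discarded here


def envelope_decrypt(kms: FakeKms, envelope: Envelope) -> str:
    dek = kms.decrypt_data_key(envelope.wrapped_dek)
    return xor_stream(dek, envelope.nonce, envelope.ciphertext).decode("utf-8")


kms = FakeKms()
envelope = envelope_encrypt(kms, "patient record 42")
recovered = envelope_decrypt(kms, envelope)
```

Only the wrapped DEK and the ciphertext are persisted, so reading the data always requires a call to the KMS.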

Enable Encryption

Enable encryption using the .with_encryption() method. This method can be chained with other capability registration methods.
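
As a rough illustration of the chaining pattern, the sketch below uses a hypothetical `ToyDataStore` whose registration methods each return the store itself. The class, the `with_fulltext()` and `with_vector()` method names, and the `encryptor` and `fields` parameters are assumptions made for this example; consult the library reference for the actual signature of `.with_encryption()`.

```python
from typing import Callable, Optional


class ToyDataStore:
    """Hypothetical stand-in for a data store; not the library's class."""

    def __init__(self) -> None:
        self.capabilities: list[str] = []
        self.encrypted_fields: list[str] = []
        self.encryptor: Optional[Callable[[str], str]] = None

    def with_fulltext(self) -> "ToyDataStore":
        self.capabilities.append("fulltext")
        return self  # returning self is what makes chaining possible

    def with_vector(self) -> "ToyDataStore":
        self.capabilities.append("vector")
        return self

    def with_encryption(
        self, encryptor: Callable[[str], str], fields: list[str]
    ) -> "ToyDataStore":
        # Assumed parameters: an encryptor plus the list of fields to encrypt.
        self.encryptor = encryptor
        self.encrypted_fields = list(fields)
        return self


store = (
    ToyDataStore()
    .with_fulltext()
    .with_vector()
    .with_encryption(
        encryptor=lambda text: text[::-1],  # placeholder transform, NOT encryption
        fields=["content", "metadata.user.email"],
    )
)
```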

Example: Chroma Data Store

Using Encrypted Data Store

Once encryption is enabled, use the data store normally. Encryption and decryption happen automatically.

⚠️ STRICT WARNING: Filtering & Sorting

Correct vs. Incorrect Usage

Correct: Filter by id or plaintext metadata, then read encrypted data.

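Incorrect: Filtering by the encrypted field itself. The store holds only ciphertext, so a plaintext filter value cannot match.

Incorrect: Sorting by an encrypted field. Ciphertext order is unrelated to plaintext order.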
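
The difference between these cases can be sketched with plain dictionaries standing in for stored chunks. The `toy_encrypt` and `toy_decrypt` helpers are hypothetical stand-ins for a randomized cipher such as AES-GCM (not for real use), and the row layout is an assumption made for illustration.

```python
import hashlib
import os

KEY = os.urandom(32)  # hypothetical key for this example


def _xor_stream(key: bytes, nonce: bytes, data: bytes) -> bytes:
    stream, counter = b"", 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + nonce + counter.to_bytes(4, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


def toy_encrypt(key: bytes, plaintext: str) -> str:
    nonce = os.urandom(12)  # fresh random nonce on every call
    return (nonce + _xor_stream(key, nonce, plaintext.encode("utf-8"))).hex()


def toy_decrypt(key: bytes, token: str) -> str:
    raw = bytes.fromhex(token)
    return _xor_stream(key, raw[:12], raw[12:]).decode("utf-8")


# What the store holds: "department" is plaintext, "email" is encrypted.
rows = [
    {"id": "c1", "metadata": {"department": "finance", "email": toy_encrypt(KEY, "ana@example.com")}},
    {"id": "c2", "metadata": {"department": "legal", "email": toy_encrypt(KEY, "bo@example.com")}},
    {"id": "c3", "metadata": {"department": "finance", "email": toy_encrypt(KEY, "cy@example.com")}},
]

# Incorrect: the stored value is ciphertext, so the plaintext never matches.
by_plaintext = [r for r in rows if r["metadata"]["email"] == "ana@example.com"]

# Also incorrect: re-encrypting the filter value uses a new nonce, so it differs too.
by_reencrypted = [r for r in rows if r["metadata"]["email"] == toy_encrypt(KEY, "ana@example.com")]

# Incorrect: this sorts by random-looking ciphertext, not by email address.
sorted_by_ciphertext = sorted(rows, key=lambda r: r["metadata"]["email"])

# Correct: filter on a plaintext field, then decrypt the encrypted field.
finance_rows = [r for r in rows if r["metadata"]["department"] == "finance"]
finance_emails = [toy_decrypt(KEY, r["metadata"]["email"]) for r in finance_rows]
```

Any field you need to filter or sort on should therefore be left out of the encryption configuration.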
