[BETA] Realtime Chat


What’s a Realtime Chat?

The realtime chat is a unified interface designed to help you interact with language models that support realtime interactions. In this tutorial, you'll learn how to perform realtime chat using the GoogleRealtimeChat module in just a few lines of code.

Prerequisites

This example specifically requires:

  1. Completion of all setup steps listed on the Prerequisites page.

  2. Setting a Gemini API key in the GOOGLE_API_KEY environment variable.

Installation

# you can use a Conda environment
pip install --extra-index-url https://oauth2accesstoken:$(gcloud auth print-access-token)@glsdk.gdplabs.id/gen-ai-internal/simple/ gllm-inference

Quickstart

Let’s jump into a basic example using GoogleRealtimeChat.

import asyncio

from dotenv import load_dotenv
from gllm_inference.realtime_chat import GoogleRealtimeChat

# Load the GOOGLE_API_KEY from your .env file.
load_dotenv()

realtime_chat = GoogleRealtimeChat(model_name="gemini-live-2.5-flash-preview")
asyncio.run(realtime_chat.start())

Notice that after the realtime chat starts, the following message appears in the console:

The conversation starts:

The realtime chat modules utilize a set of input and output streamers to define the input sources and output destinations when interacting with the language model. By default, the following IO streamers are used:

  1. KeyboardInputStreamer : Sends text inputs typed via the keyboard to the model.

  2. ConsoleOutputStreamer : Displays text outputs from the model in the console.

This means that, by default, the GoogleRealtimeChat module supports text inputs and text outputs. Try typing through your keyboard to start interacting with the model!

Interaction Example:

When you're done, you can type /quit to end the conversation.
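Conceptually, the default text loop pairs the two streamers above: read a line from the keyboard, forward it to the model, print the reply, and stop when the quit command is typed. The sketch below mimics that flow with a stand-in echo "model"; the real streamers are asynchronous and talk to the Gemini Live API, so treat this purely as an illustration of the loop's control flow.

```python
# Conceptual sketch of the default KeyboardInputStreamer ->
# ConsoleOutputStreamer loop. The echo "model" is a stand-in;
# the real GoogleRealtimeChat streams to the Gemini Live API.
QUIT_COMMAND = "/quit"

def run_text_loop(inputs, model=lambda text: f"Echo: {text}"):
    """Forward each input to the model until the quit command is typed."""
    outputs = []
    for text in inputs:
        if text.strip() == QUIT_COMMAND:
            break
        outputs.append(model(text))
    return outputs

# The loop stops as soon as "/quit" is entered; later inputs are ignored.
print(run_text_loop(["Hello!", "/quit", "ignored"]))  # ['Echo: Hello!']
```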

Ending the conversation:

IO Streamer Customization

Now that we've learned the basics, let's try using other kinds of IO streamers! In the example below, we're going to utilize the LinuxMicInputStreamer and LinuxSpeakerOutputStreamer.

The conversation starts:

Try speaking through your microphone and have fun conversing with the language models in realtime!

After you're done, try combining them with our default IO streamers and see what happens!

Future Plans

In the future, more IO streamers may be added to enable a more robust realtime experience. These may include, but are not limited to:

  1. Input streamers

    1. FileInputStreamer

    2. ScreenCaptureInputStreamer

    3. CameraInputStreamer

    4. WindowsMicInputStreamer

    5. MacMicInputStreamer

  2. Output streamers

    1. FileOutputStreamer

    2. WindowsSpeakerOutputStreamer

    3. MacSpeakerOutputStreamer
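As a rough illustration of where a FileInputStreamer could go, here is a minimal sketch assuming a simple async interface in which an input streamer yields chunks to the chat session. This is a hypothetical design; the actual gllm-inference streamer interface may differ.

```python
import asyncio
import os
import tempfile
from pathlib import Path
from typing import AsyncIterator

# Hypothetical sketch: the real gllm-inference streamer interface may differ.
class FileInputStreamer:
    """Streams a text file line by line, as a future FileInputStreamer might."""

    def __init__(self, path: str):
        self.path = Path(path)

    async def stream(self) -> AsyncIterator[str]:
        for line in self.path.read_text().splitlines():
            yield line
            await asyncio.sleep(0)  # Yield control, as a real streamer would.

async def collect(streamer: FileInputStreamer) -> list[str]:
    """Drain the streamer into a list (for demonstration only)."""
    return [chunk async for chunk in streamer.stream()]

# Example usage with a temporary file.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as f:
    f.write("hello\nworld\n")
    tmp_path = f.name
print(asyncio.run(collect(FileInputStreamer(tmp_path))))  # ['hello', 'world']
os.remove(tmp_path)
```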
