[BETA] Realtime Session
The realtime session modules are currently in beta and are subject to change. They are intended only for quick prototyping in local environments; please avoid using them in production.
What’s a Realtime Session?
The realtime session is a unified interface designed to help you interact with language models that support realtime interactions. In this tutorial, you'll learn how to run a realtime session using the GoogleRealtimeSession module in just a few lines of code.
Installation
Linux / Mac:

# you can use a Conda environment
pip install --extra-index-url https://oauth2accesstoken:$(gcloud auth print-access-token)@glsdk.gdplabs.id/gen-ai-internal/simple/ gllm-inference

Windows:

FOR /F "tokens=*" %T IN ('gcloud auth print-access-token') DO pip install --extra-index-url "https://oauth2accesstoken:%T@glsdk.gdplabs.id/gen-ai-internal/simple/" "gllm-inference"

Quickstart
Let’s jump into a basic example using GoogleRealtimeSession.
from dotenv import load_dotenv
load_dotenv()
from gllm_inference.realtime_session import GoogleRealtimeSession
import asyncio
realtime_session = GoogleRealtimeSession(model_name="gemini-2.5-flash-native-audio-preview-12-2025")
asyncio.run(realtime_session.start())

Notice that after the realtime session starts, the following message appears in the console:
The conversation starts:
The realtime session modules utilize a set of input and output streamers to define the input sources and output destinations when interacting with the language model. Notice that by default, it uses the following IO streamers:
KeyboardInputStreamer: Sends text inputs typed via the keyboard to the model.
ConsoleOutputStreamer: Displays text outputs from the model in the console.
This means that by default, the GoogleRealtimeSession module supports text inputs and text outputs. Try typing on your keyboard to start interacting with the model!
Interaction Example:
When you're done, you can type /quit to end the conversation.
Ending the conversation:
IO Streamer Customization
Now that we've learned the basics, let's try using other kinds of IO streamers! In the example below, we're going to utilize the LinuxMicInputStreamer and LinuxSpeakerOutputStreamer.
Limitation: As the names suggest, LinuxMicInputStreamer and LinuxSpeakerOutputStreamer are only supported on Linux systems. Similar support for other operating systems, such as Windows and Mac, is not yet available.
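As a sketch, a session wired to these streamers might look like the snippet below. Note that the streamer import path, constructor signatures, and the input_streamers/output_streamers keyword arguments are assumptions for illustration; consult the API Reference for the exact signature. The model name is reused from the quickstart.

```python
import asyncio

from dotenv import load_dotenv

# NOTE: the exact import path of the streamer classes is an assumption;
# adjust it according to the API Reference.
from gllm_inference.realtime_session import (
    GoogleRealtimeSession,
    LinuxMicInputStreamer,
    LinuxSpeakerOutputStreamer,
)

load_dotenv()

# Hypothetical wiring: `input_streamers` / `output_streamers` are assumed
# parameter names, not confirmed by the documentation.
realtime_session = GoogleRealtimeSession(
    model_name="gemini-2.5-flash-native-audio-preview-12-2025",
    input_streamers=[LinuxMicInputStreamer()],
    output_streamers=[LinuxSpeakerOutputStreamer()],
)

asyncio.run(realtime_session.start())
```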
The conversation starts:
Try speaking through your microphone and have fun conversing with the language models in realtime!
After you're done, try combining them with our default IO streamers and see what happens!
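One way such a combination might look, assuming the session accepts lists of streamers (again, the import path and the input_streamers/output_streamers parameter names are unverified assumptions):

```python
import asyncio

from dotenv import load_dotenv

# NOTE: import path and constructor signatures are assumptions; check the
# API Reference for the actual module layout.
from gllm_inference.realtime_session import (
    ConsoleOutputStreamer,
    GoogleRealtimeSession,
    KeyboardInputStreamer,
    LinuxMicInputStreamer,
    LinuxSpeakerOutputStreamer,
)

load_dotenv()

# Hypothetical wiring: accept both typed and spoken input, and emit both
# console text and speaker audio.
realtime_session = GoogleRealtimeSession(
    model_name="gemini-2.5-flash-native-audio-preview-12-2025",
    input_streamers=[KeyboardInputStreamer(), LinuxMicInputStreamer()],
    output_streamers=[ConsoleOutputStreamer(), LinuxSpeakerOutputStreamer()],
)

asyncio.run(realtime_session.start())
```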
Future Plans
In the future, more IO streamers may be added to enable a richer realtime experience. These may include, but are not limited to:
Input streamers
FileInputStreamer
ScreenCaptureInputStreamer
CameraInputStreamer
WindowsMicInputStreamer
MacMicInputStreamer
Output streamers
FileOutputStreamer
WindowsSpeakerOutputStreamer
MacSpeakerOutputStreamer