Overview
The OpenAI Realtime conversation engine enables natural, low-latency voice conversations between users and avatars. Unlike text-based conversation engines, it handles both speech-to-text and text-to-speech natively through OpenAI’s Realtime API, providing a seamless conversational experience.
Key Features
- Bidirectional Audio: Users can speak directly to the avatar and receive spoken responses
- Natural Interruptions: Users can interrupt the avatar mid-sentence, just like in real conversations
- Low Latency: Minimal delay between user speech and avatar response
- Built-in Speech Processing: No separate TTS/STT configuration needed
When to Use
Use the OpenAI Realtime engine when you need:
- Natural, conversational interactions with interruption support
- Real-time voice-to-voice communication
- Low-latency responses for interactive experiences
Prerequisites
Before getting started, ensure you have:
- An OpenAI API key with access to the Realtime API
- An Avaturn API key for creating sessions
- Familiarity with OpenAI’s Realtime API basics
How It Works
1. Create OpenAI Client Secret: Your backend creates an ephemeral client secret from OpenAI’s API.
2. Create Avaturn Session: Your backend creates an Avaturn session configured with the OpenAI Realtime conversation engine, passing the ephemeral client secret.
3. Connect with Web SDK: Your frontend uses the Avaturn Web SDK to initialize and connect to the session.
Architecture Overview
Creating OpenAI Client Secrets
OpenAI’s Realtime API uses ephemeral client secrets for secure, temporary access. These secrets are created server-side and passed to Avaturn when creating a session.
Code Examples
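A minimal server-side sketch of minting a client secret, using only the Python standard library. The endpoint path, the `expires_after` and `session` body fields, and the `gpt-realtime` model name reflect one reading of OpenAI's client secrets endpoint and should be treated as assumptions; confirm them against OpenAI's reference before relying on them. The function names are illustrative.

```python
import json
import urllib.request

# Assumed endpoint for minting ephemeral Realtime client secrets.
OPENAI_CLIENT_SECRETS_URL = "https://api.openai.com/v1/realtime/client_secrets"


def build_client_secret_request(
    api_key: str, ttl_seconds: int = 7200, model: str = "gpt-realtime"
) -> urllib.request.Request:
    """Build (but do not send) the request that mints an ephemeral client secret."""
    body = {
        # Assumed shape: lifetime counted from the moment of creation.
        "expires_after": {"anchor": "created_at", "seconds": ttl_seconds},
        "session": {"type": "realtime", "model": model},
    }
    return urllib.request.Request(
        OPENAI_CLIENT_SECRETS_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def create_client_secret(api_key: str, ttl_seconds: int = 7200) -> dict:
    """Send the request and return the parsed JSON response."""
    request = build_client_secret_request(api_key, ttl_seconds)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)
```

Call `create_client_secret` only from your backend, so the long-lived OpenAI API key never reaches the browser. The secret itself is expected in the response body, and the exact field name should be checked in OpenAI's documentation.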
Customizing Session Configuration
When creating an ephemeral client secret, you can customize the OpenAI Realtime session by configuring instructions, tools, voice, temperature, and other parameters. You have full control over this configuration: Avaturn passes it through to OpenAI.
Available Configuration Options
- instructions: Custom system prompts to guide the AI’s behavior
- tools: Function calling tools for extended capabilities
- voice: Voice selection (alloy, echo, shimmer, etc.)
- temperature: Response randomness (0.0 - 1.0)
- prompt: Reference to a stored prompt by ID (see below)
- Other session parameters: See OpenAI’s session configuration reference
Example: Custom Instructions and Tools
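The following sketch shows what a session configuration with custom instructions and one function tool might look like when embedded in the client secret request body. The `get_order_status` tool, the instruction text, and the `gpt-realtime` model name are hypothetical, and the exact placement of fields in OpenAI's schema may differ between API versions, so verify against the session configuration reference.

```python
import json


def build_session_config(
    instructions: str, tools: list, model: str = "gpt-realtime"
) -> dict:
    """Assemble a Realtime session configuration (assumed field layout)."""
    return {
        "type": "realtime",
        "model": model,
        "instructions": instructions,
        "tools": tools,
    }


# Hypothetical function tool, described with a JSON Schema for its arguments.
get_order_status_tool = {
    "type": "function",
    "name": "get_order_status",
    "description": "Look up the shipping status of an order by its ID.",
    "parameters": {
        "type": "object",
        "properties": {"order_id": {"type": "string"}},
        "required": ["order_id"],
    },
}

session_config = build_session_config(
    instructions="You are a friendly support agent. Keep answers short.",
    tools=[get_order_status_tool],
)

# The session configuration travels inside the client secret request body.
client_secret_body = {
    "expires_after": {"anchor": "created_at", "seconds": 7200},
    "session": session_config,
}
print(json.dumps(client_secret_body, indent=2))
```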
Using Stored Prompts
OpenAI allows you to save and reuse prompts across sessions. Instead of passing instructions inline, you can reference a stored prompt by its ID (format: pmpt_xxx).
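As a rough illustration, a session configuration that references a stored prompt could be built as below. The `prompt` object shape (`id` plus optional `variables`) and the `gpt-realtime` model name are assumptions; `pmpt_xxx` is a placeholder, not a real prompt ID.

```python
from typing import Optional


def build_prompt_session_config(
    prompt_id: str,
    variables: Optional[dict] = None,
    model: str = "gpt-realtime",
) -> dict:
    """Build a session configuration that points at a stored prompt."""
    if not prompt_id.startswith("pmpt_"):
        raise ValueError("stored prompt IDs are expected to start with 'pmpt_'")
    prompt: dict = {"id": prompt_id}
    if variables:
        # Assumed field for filling in template variables of the stored prompt.
        prompt["variables"] = variables
    return {"type": "realtime", "model": model, "prompt": prompt}


config = build_prompt_session_config("pmpt_xxx")
```

The resulting dictionary takes the place of inline `instructions` in the `session` part of the client secret request.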
- OpenAI Realtime API Guide - Comprehensive guide to the Realtime API
- Session Configuration Reference - Complete list of configuration options
- Session Update Events - Dynamically update session configuration
Client Secret Expiration
Ephemeral client secrets expire after the duration specified at creation (for example, 7200 seconds = 2 hours). Plan your session lifecycle accordingly:
- Create a new ephemeral client secret for each user session
- Handle expiration by creating new sessions
- Don’t reuse expired client secrets
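One way to avoid handing out a secret that is about to expire is a small guard like the one below. It assumes your backend has recorded the secret's expiry as a Unix timestamp (for instance from an `expires_at` field in OpenAI's response, which is an assumption here); the 60-second safety margin is an arbitrary illustrative choice.

```python
import time
from typing import Optional


def secret_is_usable(
    expires_at: float, now: Optional[float] = None, margin_seconds: float = 60.0
) -> bool:
    """Return True if the secret stays valid for at least `margin_seconds` more."""
    current = time.time() if now is None else now
    return current + margin_seconds < expires_at


# Example with fixed timestamps: 100 seconds of validity left, 60-second margin.
print(secret_is_usable(expires_at=1000, now=900))
```

When the guard returns False, create a fresh client secret and a new session rather than reusing the old one.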
Configuring the Conversation Engine
When creating an Avaturn session, configure the conversation engine with type "openai-realtime" and pass the OpenAI client secret.
Configuration Schema
The conversation engine configuration requires two fields:
- type: Must be "openai-realtime"
- client_secret: The ephemeral client secret from OpenAI (see above)
Session Creation Examples
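A hedged sketch of the session creation request follows. Only the `type` and `client_secret` fields come from the schema above. The URL is a deliberate placeholder, and the `conversation_engine` wrapper key and Bearer authentication are assumptions; take the real values from the Avaturn API Reference.

```python
import json
import urllib.request

# Hypothetical placeholder URL; replace with the endpoint from the Avaturn API Reference.
AVATURN_SESSIONS_URL = "https://api.avaturn.example/v1/sessions"


def build_avaturn_session_request(
    avaturn_api_key: str, openai_client_secret: str
) -> urllib.request.Request:
    """Build (but do not send) a session creation request."""
    body = {
        # Assumed wrapper key; the two inner fields follow the schema above.
        "conversation_engine": {
            "type": "openai-realtime",
            "client_secret": openai_client_secret,
        }
    }
    return urllib.request.Request(
        AVATURN_SESSIONS_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {avaturn_api_key}",  # assumed auth scheme
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Sending the request with `urllib.request.urlopen` and parsing the JSON body would yield the response fields described in the next section.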
Response
The session creation endpoint returns:
- session_id: Unique identifier for the session
- token: Session token to pass to your frontend for SDK initialization
Pass the session token to your frontend to initialize the Avaturn Web SDK. See the Web SDK documentation for details on connecting and managing the session from the client side.
Session Management
Session Lifecycle
- Created: Session is initialized but not yet active
- Active: User has connected and conversation is ongoing
- Terminated: Session has been explicitly ended or expired
Terminating Sessions
Sessions can be ended programmatically from your backend.
Sessions automatically terminate when the ephemeral client secret expires. Always handle expiration gracefully in your application.
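Programmatic termination might look like the sketch below. The URL, the use of HTTP DELETE on a per-session path, and the Bearer header are all hypothetical stand-ins; consult the Avaturn API Reference for the actual termination call.

```python
import urllib.request
from urllib.parse import quote

# Hypothetical placeholder URL; replace with the endpoint from the Avaturn API Reference.
AVATURN_SESSIONS_URL = "https://api.avaturn.example/v1/sessions"


def build_terminate_request(
    avaturn_api_key: str, session_id: str
) -> urllib.request.Request:
    """Build (but do not send) a request that ends the given session."""
    # Percent-encode the ID so it cannot alter the URL path.
    url = f"{AVATURN_SESSIONS_URL}/{quote(session_id, safe='')}"
    return urllib.request.Request(
        url,
        headers={"Authorization": f"Bearer {avaturn_api_key}"},  # assumed auth scheme
        method="DELETE",  # assumed HTTP method
    )
```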
Additional Resources
- OpenAI Realtime API Documentation
- Avaturn Web SDK Documentation
- Avaturn API Reference (for complete session creation parameters)