
Overview

The OpenAI Realtime conversation engine enables natural, low-latency voice conversations between users and avatars. Unlike text-based conversation engines, it handles both speech-to-text and text-to-speech natively through OpenAI’s Realtime API, providing a seamless conversational experience.

Key Features

  • Bidirectional Audio: Users can speak directly to the avatar and receive spoken responses
  • Natural Interruptions: Users can interrupt the avatar mid-sentence, just like in real conversations
  • Low Latency: Minimal delay between user speech and avatar response
  • Built-in Speech Processing: No separate TTS/STT configuration needed

When to Use

Use the OpenAI Realtime engine when you need:
  • Natural, conversational interactions with interruption support
  • Real-time voice-to-voice communication
  • Low-latency responses for interactive experiences
For scripted content with precise timing control, consider using the text-echo conversation engine instead.

Prerequisites

Before getting started, ensure you have:
  • An OpenAI API key with access to the Realtime API
  • An Avaturn API key for creating sessions
  • Familiarity with OpenAI’s Realtime API basics

How It Works

  1. Create OpenAI Client Secret: Your backend creates an ephemeral client secret from OpenAI’s API.
  2. Create Avaturn Session: Your backend creates an Avaturn session configured with the OpenAI Realtime conversation engine, passing the ephemeral client secret.
  3. Connect with Web SDK: Your frontend uses the Avaturn Web SDK to initialize and connect to the session.
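
The ordering of these steps can be sketched as a single backend function. This is a minimal outline under stated assumptions, not Avaturn's reference implementation: `start_conversation` and the two callables it receives are hypothetical names that stand in for the OpenAI and Avaturn calls shown later on this page.

```python
from typing import Callable, Dict


def start_conversation(
    create_client_secret: Callable[[], str],
    create_avaturn_session: Callable[[str], Dict[str, str]],
) -> Dict[str, str]:
    """Run steps 1 and 2 on the backend and return the data needed for step 3.

    Both callables are hypothetical placeholders for the real API calls.
    """
    # Step 1: create a short-lived OpenAI client secret (server-side only).
    client_secret = create_client_secret()

    # Step 2: create the Avaturn session, handing it the client secret.
    session = create_avaturn_session(client_secret)

    # Step 3 input: the token goes to the frontend; the session_id can stay
    # on the backend for later termination. No API key is returned.
    return {"session_id": session["session_id"], "token": session["token"]}
```

Passing the two API calls in as arguments keeps the ordering visible and makes the function easy to exercise with stand-ins before real keys are available.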

Creating OpenAI Client Secrets

OpenAI’s Realtime API uses ephemeral client secrets for secure, temporary access. These secrets are created server-side and passed to Avaturn when creating a session.
Never expose your OpenAI API key to the frontend. Always create ephemeral client secrets on your backend server.

Code Examples

import asyncio

from openai import AsyncOpenAI

# Initialize OpenAI client with your API key
client = AsyncOpenAI(api_key="your-openai-api-key")


async def create_client_secret() -> str:
    # Create an ephemeral client secret that expires 2 hours after creation
    session = await client.realtime.client_secrets.create(
        expires_after={"seconds": 7200, "anchor": "created_at"},
        session={"type": "realtime", "model": "gpt-realtime"},
    )
    # session.value is the client secret to pass to Avaturn
    return session.value


client_secret = asyncio.run(create_client_secret())

Customizing Session Configuration

When creating an ephemeral client secret, you have full control over the OpenAI Realtime session configuration: instructions, tools, voice, temperature, and other parameters are all set at this point. Avaturn passes this configuration through to OpenAI.

Available Configuration Options

  • instructions - Custom system prompts to guide the AI’s behavior
  • tools - Function calling tools for extended capabilities
  • voice - Voice selection (alloy, echo, shimmer, etc.)
  • temperature - Response randomness (0.0 - 1.0)
  • prompt - Reference to a stored prompt by ID (see below)
  • Other session parameters - See OpenAI’s session configuration reference

Example: Custom Instructions and Tools

from openai import AsyncOpenAI

client = AsyncOpenAI(api_key="your-openai-api-key")

# Create ephemeral client secret with custom configuration
# (await must run inside an async function, e.g. one started with asyncio.run)
session = await client.realtime.client_secrets.create(
    expires_after={"seconds": 7200, "anchor": "created_at"},
    session={
        "type": "realtime",
        "model": "gpt-realtime",
        "instructions": "You are a helpful AI assistant representing a company. Be professional and friendly.",
        "voice": "alloy",
        "temperature": 0.8,
        "tools": [
            {
                "type": "function",
                "name": "get_weather",
                "description": "Get the current weather for a location",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "location": {"type": "string", "description": "City name"}
                    },
                    "required": ["location"]
                }
            }
        ]
    }
)

client_secret = session.value

Using Stored Prompts

OpenAI allows you to save and reuse prompts across sessions. Instead of passing instructions inline, you can reference a stored prompt by its ID (format: pmpt_xxx).
Stored prompts can include instructions, tools, variables, and example messages. This helps maintain consistency and simplifies prompt management.

from openai import AsyncOpenAI

client = AsyncOpenAI(api_key="your-openai-api-key")

# Create ephemeral client secret using a stored prompt
# (await must run inside an async function, e.g. one started with asyncio.run)
session = await client.realtime.client_secrets.create(
    expires_after={"seconds": 7200, "anchor": "created_at"},
    session={
        "type": "realtime",
        "model": "gpt-realtime",
        "prompt": {
            "id": "pmpt_abc123",  # Your stored prompt ID
            "version": "6",       # Optional: pin to specific version
            "variables": {        # Optional: pass variables to prompt
                "company_name": "Acme Corp",
                "tone": "professional"
            }
        }
    }
)

client_secret = session.value

Client Secret Expiration

Ephemeral client secrets expire after the specified duration (7200 seconds = 2 hours in the example above). Plan your session lifecycle accordingly:
  • Create a new ephemeral client secret for each user session
  • Handle expiration by creating new sessions
  • Don’t reuse expired client secrets
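
One way to follow these rules is to record when each secret was created and check it before use. The sketch below assumes you track the creation time yourself; `TrackedSecret`, its fields, and the 60-second safety margin are hypothetical choices, not part of either API.

```python
import time
from dataclasses import dataclass
from typing import Optional


@dataclass
class TrackedSecret:
    """A client secret plus the bookkeeping needed to detect expiry locally."""

    value: str
    created_at: float        # Unix time recorded when the secret was created
    ttl_seconds: int = 7200  # Matches the expires_after value used above

    def is_expired(self, now: Optional[float] = None, safety_margin: float = 60.0) -> bool:
        """Return True once the secret is within safety_margin seconds of expiring."""
        now = time.time() if now is None else now
        return now >= self.created_at + self.ttl_seconds - safety_margin
```

When `is_expired()` returns True, create a fresh client secret and a new session rather than retrying with the old value.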

Configuring the Conversation Engine

When creating an Avaturn session, configure the conversation engine with type "openai-realtime" and pass the OpenAI client secret.

Configuration Schema

The conversation engine configuration requires two fields:
  • type: Must be "openai-realtime"
  • client_secret: The ephemeral client secret from OpenAI (see above)
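
The two-field schema can be captured in a small builder that rejects an empty secret before any request is sent. This is an illustrative sketch; `build_conversation_engine` is a hypothetical helper name, and only the two field names and the type value come from the schema above.

```python
from typing import Dict

ENGINE_TYPE = "openai-realtime"


def build_conversation_engine(client_secret: str) -> Dict[str, str]:
    """Return the conversation_engine object for a session creation request."""
    # Catch a missing or blank secret locally instead of at the API.
    if not isinstance(client_secret, str) or not client_secret.strip():
        raise ValueError("client_secret must be a non-empty string")
    return {"type": ENGINE_TYPE, "client_secret": client_secret}
```

The returned dictionary can be used as the value of the `conversation_engine` key in the session creation request body.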

Session Creation Examples

import requests

# After creating the OpenAI client secret (see above)
avaturn_api_key = "your-avaturn-api-key"  # Your Avaturn API key
response = requests.post(
    "https://api.avaturn.live/api/v1/sessions",
    headers={
        "Authorization": f"Bearer {avaturn_api_key}",
        "Content-Type": "application/json"
    },
    json={
        "conversation_engine": {
            "type": "openai-realtime",
            "client_secret": client_secret  # From OpenAI
        }
    }
)

response.raise_for_status()  # Stop here if session creation failed
data = response.json()
session_id = data["session_id"]
session_token = data["token"]

Response

The session creation endpoint returns:
  • session_id: Unique identifier for the session
  • token: Session token to pass to your frontend for SDK initialization
Pass the token (stored as session_token in the example above) to your frontend to initialize the Avaturn Web SDK. See the Web SDK documentation for details on connecting and managing the session from the client side.
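
Before forwarding the token, it can help to confirm that both fields are actually present in the response body. The following is a minimal sketch; `parse_session_response` is a hypothetical helper, and it assumes the response has already been decoded from JSON into a dictionary.

```python
from typing import Any, Mapping, Tuple


def parse_session_response(data: Mapping[str, Any]) -> Tuple[str, str]:
    """Extract (session_id, token) from a decoded session creation response."""
    # Treat absent or empty fields as an error rather than passing None along.
    missing = [key for key in ("session_id", "token") if not data.get(key)]
    if missing:
        raise ValueError(f"session response is missing: {', '.join(missing)}")
    return data["session_id"], data["token"]
```

A typical call would be `session_id, session_token = parse_session_response(response.json())`.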

Session Management

Session Lifecycle

  1. Created: Session is initialized but not yet active
  2. Active: User has connected and conversation is ongoing
  3. Terminated: Session has been explicitly ended or expired
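
If your backend tracks session state itself, the lifecycle can be modeled as a small state machine. This sketch is an assumption about how you might mirror the states locally: `SessionState`, `ALLOWED_TRANSITIONS`, and `advance` are hypothetical names, and the Created to Terminated transition assumes a session can expire before any user connects.

```python
from enum import Enum


class SessionState(Enum):
    CREATED = "created"
    ACTIVE = "active"
    TERMINATED = "terminated"


# Which states may follow each state; Terminated is final.
ALLOWED_TRANSITIONS = {
    SessionState.CREATED: {SessionState.ACTIVE, SessionState.TERMINATED},
    SessionState.ACTIVE: {SessionState.TERMINATED},
    SessionState.TERMINATED: set(),
}


def advance(current: SessionState, target: SessionState) -> SessionState:
    """Return target if the transition is allowed, otherwise raise ValueError."""
    if target not in ALLOWED_TRANSITIONS[current]:
        raise ValueError(f"cannot move from {current.value} to {target.value}")
    return target
```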

Terminating Sessions

To end a session programmatically:

import requests

response = requests.delete(
    f"https://api.avaturn.live/api/v1/sessions/{session_id}",
    headers={"Authorization": f"Bearer {avaturn_api_key}"}
)
response.raise_for_status()  # Surface HTTP errors instead of ignoring them

Sessions automatically terminate when the ephemeral client secret expires. Have your application detect the ended session and start a new one with a fresh client secret.
