Updated Feb 23, 2026

Function Calling in Voice

The Realtime API does not just speak—it acts. Through function calling, your voice agent executes real operations: creating tasks, querying databases, sending notifications. But voice-triggered actions require different UX patterns than chat-based tools.

This lesson teaches you to implement voice functions that feel natural: filler speech during async operations, confirmation for destructive actions, and graceful error recovery.

Voice Function Calling: The Difference

In chat interfaces, tool calls are invisible—users wait for a response. In voice, silence is awkward. A 3-second database query in chat is fine. A 3-second silence in voice feels broken.

Chat Tool Call

User: "Create a task to review the proposal"
[2 second wait - acceptable]
Agent: "I've created your task."

Voice Tool Call (Wrong)

User: "Create a task to review the proposal"
[2 seconds of silence - feels broken]
Agent: "I've created your task."

Voice Tool Call (Right)

User: "Create a task to review the proposal"
Agent: "Creating that task for you..."
[function executes]
Agent: "Done. I've added 'review the proposal' to your task list."

The pattern: Fill silence with acknowledgment. Confirm completion with results.

Defining Voice Functions

Function definitions for the Realtime API follow the OpenAI function calling schema, but require voice-specific considerations.

Basic Function Schema

task_functions = [
    {
        "type": "function",
        "name": "create_task",
        "description": "Create a new task for the user. Use when the user asks to add, create, or make a new task.",
        "parameters": {
            "type": "object",
            "properties": {
                "title": {
                    "type": "string",
                    "description": "The task title. Extract the core action from the user's request."
                },
                "due_date": {
                    "type": "string",
                    "description": "When the task is due, in YYYY-MM-DD format. Parse natural language like 'tomorrow', 'next Friday', 'end of week'."
                },
                "priority": {
                    "type": "string",
                    "enum": ["low", "medium", "high"],
                    "description": "Task priority. Default to 'medium' if not specified."
                }
            },
            "required": ["title"]
        }
    },
    {
        "type": "function",
        "name": "list_tasks",
        "description": "List the user's tasks. Use when user asks what tasks they have, what's on their list, or what's due.",
        "parameters": {
            "type": "object",
            "properties": {
                "status": {
                    "type": "string",
                    "enum": ["all", "pending", "completed", "overdue"],
                    "description": "Which tasks to show. Default to 'pending' if not specified."
                },
                "due_filter": {
                    "type": "string",
                    "enum": ["today", "tomorrow", "this_week", "all"],
                    "description": "Filter by due date. Default to 'all' if not specified."
                }
            }
        }
    },
    {
        "type": "function",
        "name": "complete_task",
        "description": "Mark a task as complete. Use when user says they finished, completed, or done with a task.",
        "parameters": {
            "type": "object",
            "properties": {
                "task_identifier": {
                    "type": "string",
                    "description": "How the user referred to the task. Could be title, number, or description."
                }
            },
            "required": ["task_identifier"]
        }
    },
    {
        "type": "function",
        "name": "delete_task",
        "description": "Permanently delete a task. This is destructive and cannot be undone. Requires confirmation.",
        "parameters": {
            "type": "object",
            "properties": {
                "task_identifier": {
                    "type": "string",
                    "description": "How the user referred to the task to delete."
                },
                "confirmed": {
                    "type": "boolean",
                    "description": "Whether the user has confirmed deletion. Must be true to proceed."
                }
            },
            "required": ["task_identifier", "confirmed"]
        }
    }
]

Voice-Specific Description Guidelines

Guideline	Why	Example
Include trigger phrases	Model knows when to call	"Use when user asks to add, create, or make..."
Mention defaults	Reduces clarification requests	"Default to 'medium' if not specified"
Flag destructive actions	Enables confirmation logic	"This is destructive and cannot be undone"
Describe natural language parsing	Improves extraction	"Parse natural language like 'tomorrow', 'next Friday'"

Registering Functions with the Session

Functions are registered during session configuration:

def send_session_config(data_channel, instructions: str, functions: list):
    """Configure session with voice functions."""

    config = {
        "type": "session.update",
        "session": {
            "instructions": f"""
            {instructions}

            When using tools:
            - Acknowledge the request before calling the function
            - Provide brief filler while waiting ("Let me check that...")
            - Confirm results clearly and concisely
            - If an error occurs, explain what happened and suggest next steps
            """,

            "tools": functions,

            "turn_detection": {
                "type": "server_vad",
                "threshold": 0.5,
                "silence_duration_ms": 500
            },

            "input_audio_format": "pcm16",
            "output_audio_format": "pcm16"
        }
    }

    data_channel.send(json.dumps(config))

Handling Function Calls

When the model decides to call a function, you receive events through the DataChannel:

Function Call Event Flow

Model decides to call function
    │
    ├── response.function_call_arguments.delta (streaming arguments)
    │
    ├── response.function_call_arguments.done (arguments complete)
    │
    └── You execute function and send result

Implementation

class FunctionHandler:
    """Handles voice function calls."""

    def __init__(self, data_channel, task_service):
        self.data_channel = data_channel
        self.task_service = task_service
        self.pending_calls = {}  # call_id -> accumulated arguments

    async def handle_event(self, event: dict):
        """Process function-related events."""
        event_type = event.get("type")

        if event_type == "response.function_call_arguments.delta":
            # Accumulate streaming arguments
            call_id = event["call_id"]
            delta = event.get("delta", "")

            if call_id not in self.pending_calls:
                self.pending_calls[call_id] = {
                    "name": event["name"],
                    "arguments": ""
                }
            self.pending_calls[call_id]["arguments"] += delta

        elif event_type == "response.function_call_arguments.done":
            # Arguments complete, execute function
            call_id = event["call_id"]
            call_info = self.pending_calls.pop(call_id, None)

            if call_info:
                result = await self._execute_function(
                    call_info["name"],
                    json.loads(call_info["arguments"])
                )
                self._send_result(call_id, result)

    async def _execute_function(self, name: str, args: dict) -> dict:
        """Execute the function and return result."""
        print(f"[function] Executing {name} with {args}")

        try:
            if name == "create_task":
                task = await self.task_service.create(
                    title=args["title"],
                    due_date=args.get("due_date"),
                    priority=args.get("priority", "medium")
                )
                return {
                    "success": True,
                    "message": f"Created task: {task.title}",
                    "task_id": task.id,
                    "due_date": task.due_date
                }

            elif name == "list_tasks":
                tasks = await self.task_service.list(
                    status=args.get("status", "pending"),
                    due_filter=args.get("due_filter", "all")
                )
                return {
                    "success": True,
                    "count": len(tasks),
                    "tasks": [
                        {"title": t.title, "due": t.due_date, "priority": t.priority}
                        for t in tasks
                    ]
                }

            elif name == "complete_task":
                task = await self.task_service.complete(args["task_identifier"])
                return {
                    "success": True,
                    "message": f"Completed: {task.title}"
                }

            elif name == "delete_task":
                if not args.get("confirmed"):
                    return {
                        "success": False,
                        "needs_confirmation": True,
                        "message": "Deletion requires confirmation. Ask user to confirm."
                    }

                await self.task_service.delete(args["task_identifier"])
                return {
                    "success": True,
                    "message": "Task deleted permanently"
                }

            else:
                return {"success": False, "error": f"Unknown function: {name}"}

        except Exception as e:
            return {
                "success": False,
                "error": str(e),
                "suggestion": "Please try again or rephrase your request."
            }

    def _send_result(self, call_id: str, result: dict):
        """Send function result back to the model."""
        response = {
            "type": "conversation.item.create",
            "item": {
                "type": "function_call_output",
                "call_id": call_id,
                "output": json.dumps(result)
            }
        }
        self.data_channel.send(json.dumps(response))

        # Trigger response generation
        self.data_channel.send(json.dumps({"type": "response.create"}))

Output:

[function] Executing create_task with {'title': 'review proposal', 'due_date': '2025-01-03'}
[realtime] Function result sent
[agent] Done. I've created a task to review the proposal, due on Friday.

Filler Speech Patterns

Users expect feedback during async operations. The model can generate filler speech before executing functions.

Instructing the Model for Filler

Include filler guidance in your instructions:

instructions = """
You are a voice assistant for Task Manager.

**Before calling any function:**
1. Acknowledge the request naturally: "Sure, I'll do that" or "Let me check"
2. For operations that might take time, add brief filler: "One moment..."

**After function results:**
1. Summarize the result concisely
2. If there are multiple items, list the top 3 and mention the total
3. Offer next steps: "Would you like to add another?" or "Anything else?"

**Filler examples:**
- Creating task: "Sure, creating that task for you..."
- Listing tasks: "Let me pull up your tasks..."
- Completing task: "Marking that as done..."
- Error: "Hmm, I ran into an issue..."
"""

Filler Timing

The model generates filler before the function call event:

User: "What tasks do I have today?"
                │
                ▼
Model generates filler: "Let me check your tasks for today..."
                │
                ▼
Function call event: list_tasks(due_filter="today")
                │
                ▼
Your code executes function (may take 500ms)
                │
                ▼
Send result to model
                │
                ▼
Model generates response: "You have 3 tasks today: review proposal,
                          send invoices, and team standup."

The user hears filler immediately, then the result—no awkward silence.

Custom Filler for Long Operations

For operations exceeding 2-3 seconds, consider multi-stage filler:

async def execute_with_progress(self, name: str, args: dict):
    """Execute function with progress updates."""

    # Stage 1: Immediate acknowledgment (model-generated)
    # Already handled by instructions

    # Stage 2: If operation is slow, send interim status
    operation = asyncio.create_task(self._execute_function(name, args))

    await asyncio.sleep(2)  # Wait 2 seconds

    if not operation.done():
        # Send interim status through DataChannel
        self._speak("Still working on that, almost there...")

    return await operation

def _speak(self, text: str):
    """Make the model speak specific text."""
    event = {
        "type": "conversation.item.create",
        "item": {
            "type": "message",
            "role": "assistant",
            "content": [{"type": "text", "text": text}]
        }
    }
    self.data_channel.send(json.dumps(event))
    self.data_channel.send(json.dumps({"type": "response.create"}))

Confirmation Patterns for Destructive Actions

Not all actions should execute immediately. Destructive operations need confirmation.

Identifying Destructive Actions

Action	Destructive?	Why
Create task	No	Easily undone
Complete task	Maybe	Could revert, but context-dependent
Delete task	Yes	Permanent data loss
Clear all tasks	Yes	Massive data loss
Send email	Yes	Cannot unsend
Cancel subscription	Yes	May have consequences

Two-Step Confirmation Pattern

# Step 1: Function called without confirmation
def handle_delete_request(self, task_identifier: str, confirmed: bool):
    if not confirmed:
        # Tell model to ask for confirmation
        return {
            "success": False,
            "needs_confirmation": True,
            "task_to_delete": task_identifier,
            "message": "Please confirm deletion by asking: 'Are you sure you want to delete this task?'"
        }

    # Step 2: User confirms, function called again with confirmed=True
    self.task_service.delete(task_identifier)
    return {"success": True, "message": "Task deleted"}

Conversation Flow

User: "Delete the proposal review task"

Agent: "You want to delete 'review proposal'. This can't be undone.
        Should I go ahead and delete it?"

User: "Yes, delete it"

Agent: "Done. I've permanently deleted the proposal review task."

Timeout for Confirmation

Confirmations should expire:

class ConfirmationManager:
    """Manages pending confirmations with timeout."""

    def __init__(self, timeout_seconds: int = 30):
        self.pending = {}  # action_id -> (action_data, timestamp)
        self.timeout = timeout_seconds

    def request_confirmation(self, action_id: str, action_data: dict) -> dict:
        """Store pending confirmation."""
        self.pending[action_id] = (action_data, time.time())
        return {
            "needs_confirmation": True,
            "action_id": action_id,
            "message": f"Confirm: {action_data['description']}?"
        }

    def confirm(self, action_id: str) -> dict | None:
        """Retrieve confirmed action if still valid."""
        if action_id not in self.pending:
            return None

        action_data, timestamp = self.pending.pop(action_id)

        if time.time() - timestamp > self.timeout:
            return {
                "expired": True,
                "message": "Confirmation expired. Please try again."
            }

        return action_data

    def cleanup_expired(self):
        """Remove expired confirmations."""
        now = time.time()
        expired = [
            action_id for action_id, (_, ts) in self.pending.items()
            if now - ts > self.timeout
        ]
        for action_id in expired:
            del self.pending[action_id]

Error Handling in Voice

Errors in voice context need different handling than chat:

Error Categories

Category	Voice Response	Example
User Error	Clarify and suggest	"I couldn't find a task called 'meeting'. Did you mean 'team meeting'?"
Temporary Failure	Apologize and retry offer	"I'm having trouble reaching the task system. Want me to try again?"
Permanent Failure	Explain and provide alternative	"I can't delete tasks right now. Try again in a few minutes, or use the app."
Unknown Error	Graceful fallback	"Something went wrong. Can you try rephrasing that?"

Error Response Implementation

def format_error_for_voice(self, error: Exception, function_name: str) -> dict:
    """Convert error to voice-appropriate response."""

    error_str = str(error).lower()

    # User error: bad input
    if "not found" in error_str:
        return {
            "success": False,
            "error_type": "not_found",
            "spoken_message": "I couldn't find that task. Could you describe it differently?",
            "suggestion": "Try using the exact task title or a unique phrase from it."
        }

    # Temporary failure: service unavailable
    if "timeout" in error_str or "connection" in error_str:
        return {
            "success": False,
            "error_type": "temporary",
            "spoken_message": "I'm having trouble reaching the task system right now.",
            "suggestion": "Would you like me to try again?"
        }

    # Permission error
    if "permission" in error_str or "unauthorized" in error_str:
        return {
            "success": False,
            "error_type": "permission",
            "spoken_message": "You don't have permission to do that.",
            "suggestion": "You may need to check your account settings."
        }

    # Unknown error: graceful fallback
    return {
        "success": False,
        "error_type": "unknown",
        "spoken_message": "Something went wrong with that request.",
        "suggestion": "Could you try rephrasing? Or we can try again."
    }

Retry Pattern

async def execute_with_retry(
    self,
    name: str,
    args: dict,
    max_retries: int = 2
) -> dict:
    """Execute function with automatic retry on temporary failures."""

    for attempt in range(max_retries + 1):
        try:
            result = await self._execute_function(name, args)

            if result.get("success"):
                return result

            # Check if retryable
            if result.get("error_type") == "temporary" and attempt < max_retries:
                await asyncio.sleep(1)  # Brief pause
                continue

            return result

        except Exception as e:
            if attempt < max_retries:
                await asyncio.sleep(1)
                continue

            return self.format_error_for_voice(e, name)

    return {
        "success": False,
        "error_type": "exhausted",
        "spoken_message": "I tried a few times but couldn't complete that. Please try again later."
    }

Complete Function Calling Example

Here is a complete voice agent with function calling:

import asyncio
import json
from datetime import datetime, timedelta

# Simulated task service
class TaskService:
    def __init__(self):
        self.tasks = [
            {"id": 1, "title": "Review proposal", "due": "2025-01-02", "priority": "high", "status": "pending"},
            {"id": 2, "title": "Send invoices", "due": "2025-01-02", "priority": "medium", "status": "pending"},
            {"id": 3, "title": "Team standup", "due": "2025-01-02", "priority": "low", "status": "pending"},
        ]
        self.next_id = 4

    async def create(self, title: str, due_date: str = None, priority: str = "medium"):
        await asyncio.sleep(0.5)  # Simulate database
        task = {
            "id": self.next_id,
            "title": title,
            "due": due_date or (datetime.now() + timedelta(days=1)).strftime("%Y-%m-%d"),
            "priority": priority,
            "status": "pending"
        }
        self.tasks.append(task)
        self.next_id += 1
        return type("Task", (), task)()

    async def list(self, status: str = "pending", due_filter: str = "all"):
        await asyncio.sleep(0.3)  # Simulate database
        filtered = [t for t in self.tasks if status == "all" or t["status"] == status]
        return [type("Task", (), t)() for t in filtered]

    async def complete(self, identifier: str):
        await asyncio.sleep(0.3)
        for task in self.tasks:
            if identifier.lower() in task["title"].lower():
                task["status"] = "completed"
                return type("Task", (), task)()
        raise ValueError(f"Task not found: {identifier}")


class VoiceFunctionAgent:
    """Voice agent with function calling."""

    def __init__(self, data_channel):
        self.data_channel = data_channel
        self.task_service = TaskService()
        self.pending_calls = {}

    def get_functions(self) -> list:
        """Return function definitions for session config."""
        return [
            {
                "type": "function",
                "name": "create_task",
                "description": "Create a new task. Use when user asks to add, create, or make a task.",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "title": {"type": "string", "description": "Task title"},
                        "due_date": {"type": "string", "description": "Due date in YYYY-MM-DD"},
                        "priority": {"type": "string", "enum": ["low", "medium", "high"]}
                    },
                    "required": ["title"]
                }
            },
            {
                "type": "function",
                "name": "list_tasks",
                "description": "List user's tasks. Use when user asks what tasks they have.",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "status": {"type": "string", "enum": ["all", "pending", "completed"]},
                        "due_filter": {"type": "string", "enum": ["today", "all"]}
                    }
                }
            },
            {
                "type": "function",
                "name": "complete_task",
                "description": "Mark task complete. Use when user says they finished a task.",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "task_identifier": {"type": "string", "description": "Task name or description"}
                    },
                    "required": ["task_identifier"]
                }
            }
        ]

    def get_instructions(self) -> str:
        """Return voice-optimized instructions."""
        return """
        You are a voice assistant for Task Manager.

        When calling functions:
        1. Acknowledge first: "Sure, I'll check that" or "Creating that now"
        2. After results, summarize concisely
        3. For errors, explain simply and suggest next steps

        Keep responses under 30 words. Be conversational, not robotic.
        """

    async def handle_event(self, event: dict):
        """Process events from DataChannel."""
        event_type = event.get("type")

        if event_type == "response.function_call_arguments.delta":
            call_id = event["call_id"]
            if call_id not in self.pending_calls:
                self.pending_calls[call_id] = {"name": event["name"], "arguments": ""}
            self.pending_calls[call_id]["arguments"] += event.get("delta", "")

        elif event_type == "response.function_call_arguments.done":
            call_id = event["call_id"]
            call_info = self.pending_calls.pop(call_id, None)

            if call_info:
                result = await self._execute(call_info["name"], call_info["arguments"])
                self._send_result(call_id, result)

    async def _execute(self, name: str, args_str: str) -> dict:
        """Execute function and return result."""
        try:
            args = json.loads(args_str) if args_str else {}
            print(f"[function] {name}({args})")

            if name == "create_task":
                task = await self.task_service.create(**args)
                return {"success": True, "message": f"Created: {task.title}", "due": task.due}

            elif name == "list_tasks":
                tasks = await self.task_service.list(**args)
                if not tasks:
                    return {"success": True, "message": "No tasks found", "count": 0}
                return {
                    "success": True,
                    "count": len(tasks),
                    "tasks": [{"title": t.title, "due": t.due} for t in tasks[:5]]
                }

            elif name == "complete_task":
                task = await self.task_service.complete(args["task_identifier"])
                return {"success": True, "message": f"Completed: {task.title}"}

            else:
                return {"success": False, "error": f"Unknown: {name}"}

        except Exception as e:
            return {"success": False, "error": str(e), "spoken": "I couldn't do that. Try again?"}

    def _send_result(self, call_id: str, result: dict):
        """Send function result to model."""
        self.data_channel.send(json.dumps({
            "type": "conversation.item.create",
            "item": {
                "type": "function_call_output",
                "call_id": call_id,
                "output": json.dumps(result)
            }
        }))
        self.data_channel.send(json.dumps({"type": "response.create"}))

Sample Conversation:

User: "What tasks do I have?"
Agent: "Let me check your tasks... You have 3 pending tasks: review proposal
        due tomorrow, send invoices also tomorrow, and team standup. Want me
        to mark any as complete?"

User: "Create a task to call the client"
Agent: "Sure, creating that... Done. I added 'call the client' to your list,
        due tomorrow. Anything else?"

User: "Mark the standup as done"
Agent: "Got it... Team standup is now marked complete. You have 3 tasks left."

Try With AI

Prompt 1: Design Voice Functions

I'm building voice function calling for my Task Manager. Help me design:

1. Function schemas for: create, list, complete, delete, and reschedule tasks
2. Voice-optimized descriptions that trigger correctly
3. Parameter defaults to minimize clarification questions
4. Destructive action flags for confirmation logic

Consider edge cases:
- User says "remind me to..." (is this a task?)
- User says "I'm done with everything" (complete all?)
- User mentions time without date ("at 3pm")

What you are learning: Function design for voice context. Good schemas reduce clarification loops and improve user experience.

Prompt 2: Implement Filler Patterns

My voice agent has these latency characteristics:
- create_task: 500ms average, 2s worst case
- list_tasks: 300ms average, 1s worst case
- external_api_call: 1-5 seconds

Help me design filler speech patterns:
1. What should the model say before each function type?
2. How do I handle operations that might take 5+ seconds?
3. Should filler be model-generated or code-generated?
4. How do I avoid filler for fast operations (feels patronizing)?

Provide example instructions and code.

What you are learning: UX timing patterns. Voice interfaces have stricter timing requirements than visual interfaces.

Prompt 3: Error Recovery Flows

My voice agent encounters these errors. Design voice-appropriate recovery:

1. Task not found: User said "complete the meeting task" but no exact match
2. Network timeout: Database didn't respond in 3 seconds
3. Validation error: User tried to create task with due date in the past
4. Rate limit: Too many requests, need to slow down
5. Permission denied: User tried to delete someone else's task

For each:
- What should the agent say?
- What action should the agent offer?
- How do we avoid user frustration?

Consider: Users are on the phone, can't see a screen, can't easily re-type.

What you are learning: Error UX design. Voice errors need immediate, actionable guidance—users cannot see error codes or retry buttons.

Safety Note

Function calling in voice carries unique risks:

Misheard commands: "Delete all tasks" vs "Delete Paul's tasks"—implement confirmation for bulk operations
Unauthorized callers: Voice does not prove identity—require PIN for sensitive operations
Background noise triggers: TV says "buy ten shares"—your agent should not execute
Social engineering: Caller pretends to be account owner—implement verification

Test with adversarial scenarios before production. Monitor early calls for unexpected function invocations.

Voice Function Calling: The Difference​

Chat Tool Call​

Voice Tool Call (Wrong)​

Voice Tool Call (Right)​

Defining Voice Functions​

Basic Function Schema​

Voice-Specific Description Guidelines​

Registering Functions with the Session​

Handling Function Calls​

Function Call Event Flow​

Implementation​

Filler Speech Patterns​

Instructing the Model for Filler​

Filler Timing​

Custom Filler for Long Operations​

Confirmation Patterns for Destructive Actions​

Identifying Destructive Actions​

Two-Step Confirmation Pattern​

Conversation Flow​

Timeout for Confirmation​

Error Handling in Voice​

Error Categories​

Error Response Implementation​

Retry Pattern​

Complete Function Calling Example​

Try With AI​

Prompt 1: Design Voice Functions​

Prompt 2: Implement Filler Patterns​

Prompt 3: Error Recovery Flows​

Safety Note​