What’s New: Week of May 18, 2026
Responsive UI Polish: A round of UI adjustments to make the app work better across viewport sizes.
New Composer-Based Onboarding Flow: A new onboarding experience built on top of the Assistant Builder and powered by Composer is rolling out to select users as part of a phased release.
What’s New: Week of May 11, 2026
What’s New: Week of May 4, 2026
assistant.transcriber (provider: soniox) for low-latency, multilingual real-time speech-to-text.What’s New: Week of April 27, 2026
What’s New: Week of April 20, 2026
Logs UX Refresh: New filter layout plus a round of UX improvements — improved date picker, active row is clearly highlighted across all log views when the flyout is opened, log tables are fully keyboard-accessible, sortable cost and duration columns, pagination, and more.
Squads contextEngineeringPlan Handoff Type — previousAssistantMessages: Forwards only the conversation history from before the current assistant’s session. The current assistant’s own messages and tool calls are excluded entirely from the handoff payload. See the updated handoff context configuration docs.
assistant.speechStarted Event — Live Captions & Word-Level Timing (GA): A new opt-in message fires as the assistant begins speaking each segment, carrying the full turn text, turn, source (model / force-say / custom-voice), and optional timing:
voice.subtitleType: "word", with correct CJK handling)Subscribe by adding "assistant.speechStarted" to your assistant’s clientMessages and/or serverMessages — now GA with no feature flag. Use it for live captions, karaoke-style highlighting, or any UI that needs to stay in sync with assistant audio. Fully backward-compatible; no existing messages changed.
Autofallbacks on Transcribers: Let Vapi pick the best transcriber to fall back to if your primary one fails — even mid-call. Opt in by setting assistant.transcriber.fallbackPlan.autoFallback.enabled to true. See the updated transcriber fallback plan docs.
What’s New: Week of April 13, 2026
Monitoring — GA: Automated call quality monitoring is now generally available. Detect issues with trigger-based rules, get alerts when something goes wrong, and surface resolution suggestions — all from the dashboard.
What’s New: October 2025 – March 2026
Here’s a summary of major items shipped from October 2025 through March 2026.
Squads v2: Visual builder to simplify sophisticated multi-assistant orchestration with seamless handoffs between specialized agents.
Composer (Alpha): Intelligent assistant inside the dashboard that allows you to describe what you need through plain text prompts to help build, adjust, and debug voice agents.
Simulations (Alpha): Voice agent testing feature to build confidence through enabling systematic, AI-powered testing in specific scenarios with evaluation of outcomes.
Monitoring & Issues: Automated call quality monitoring with trigger-based issue detection, alerting, and resolution suggestions.
HIPAA with Data Retention: New compliance mode with private storage and in-dashboard toggle/purchase flow — available for additional cost.
Zero Data Retention: Compliance mode that keeps context data during call as needed to execute tasks and retains no data afterwards.
Consolidated Logs: Unified log viewing into a single page.
Vapi Voices: 12 new ultra-realistic voices released, optimized for latency and cost with adjustable speed controls exposed. 8 legacy voices deprecated.
Deepgram Nova-3 Languages: Added Hebrew, Urdu, Tagalog, and Arabic bilingual support.
Cartesia Transcriber: ink-whisper.
Soniox: stt-rt-v4.
GPT-5 Family: OpenAI’s latest intelligence models, including GPT-5, 5-Mini, 5-Nano, 5.1, 5.2, 5.4, 5.4-Mini, 5.4-Nano.
Claude 4.5–4.6: Anthropic’s latest intelligence models Sonnet 4.5, Opus 4.5, Opus 4.6, Sonnet 4.6.
Gemini 3 Flash: Google’s latest intelligence models.
Grok 4 Fast: Reasoning and non-reasoning variants.
GPT Realtime Mini: OpenAI’s lightweight realtime model.
Cartesia: sonic-3, sonic-3-2026-01-12, sonic-3-2025-10-27.
WellSaid: Caruso (new), legacy.
Inworld: inworld-tts-1 (REST, original), inworld-tts-1.5-max (WebSocket, $10/M chars), inworld-tts-1.5-mini (WebSocket, $5/M chars).
ElevenLabs Scribe v2: Latest version of ElevenLabs speech-to-text.
Structured Outputs Improvements: Updates to our AI-powered analysis and data extraction tool, including transient structured outputs, audio-based extraction, and regex extraction.
SIP Request Tool + DTMF over SIP INFO: Send SIP requests and DTMF tones via SIP INFO messages during calls.
Variable Passing Between Tool Calls: Pass output variables from one tool call as input to subsequent tool calls.
Encrypted Tool Arguments: Encrypt sensitive tool arguments to protect data in transit.
Low Confidence Speech Hook: Hook that triggers when the transcriber returns low-confidence speech results.
Time Elapsed Hook: Hook that triggers at specified time intervals during a call.
assistant.speechStarted Event: New event fired when the assistant begins speaking.
MCP Improvements: Bearer auth, $ref dereferencing, child tool messages/discovery.
Warm Transfer Improvements: SIP support, caller ID, context engineering, variable filling.
Breaking Changes & API Cleanup
Legacy Endpoint Removal: The following deprecated endpoints have been removed as part of our API modernization effort:
/logs - Use call artifacts and monitoring instead/workflow/{id} - Access workflows through the main workflow endpoints/test-suite and related paths - Replaced by the new evaluation system/knowledge-base and related paths - Integrated into model configurationsKnowledge Base Architecture Change: The knowledgeBaseId property has been removed from all model configurations. This affects:
XaiModel, GroqModel, GoogleModelOpenAIModel, AnthropicModel, CustomLLMModelTranscriber Property Deprecation: AssemblyAITranscriber.wordFinalizationMaxWaitTime and FallbackAssemblyAITranscriber.wordFinalizationMaxWaitTime are now deprecated:
Schema Path Cleanup: Removed numerous unused schema paths from model configurations to simplify the API structure and improve performance. This cleanup affects internal schema references but doesn’t impact your existing integrations.
New v2 API: We are introducing a new API version v2. These changes are part of our ongoing effort to:
For details on the new features that replace these deprecated endpoints, see our recent changelog entries:
If you’re currently using any of the removed endpoints or properties, you must migrate to the new alternatives before this release. Contact support if you need assistance with migration strategies.
Replace /logs endpoint usage with call artifacts, monitoring plans, and end-of-call reports for comprehensive logging.
Migrate from test-suite endpoints to the new evaluation system with mock conversations and comprehensive result tracking.
Update model configurations to use the integrated knowledge base system instead of separate knowledgeBaseId references.
Replace deprecated transcriber timing properties with smart endpointing plans for better conversation flow control.
The following endpoints are no longer available:
GET /logs - Use call artifacts insteadGET /workflow/{id} - Use main workflow endpointsGET /test-suite, POST /test-suite - Use evaluation endpointsGET /test-suite/{id}, PUT /test-suite/{id}, DELETE /test-suite/{id} - Use evaluation managementPOST /test-suite/{testSuiteId}/run - Use evaluation runsGET /knowledge-base, POST /knowledge-base - Integrated into model configurationsSee Also:
Evaluation Execution & Results Processing
Evaluation Execution Engine: Run comprehensive assistant evaluations with EvalRun and CreateEvalRunDTO. Execute your mock conversations against live assistants and squads to validate performance and behavior in controlled environments.
Multiple Evaluation Models: Choose from various AI models for LLM-as-a-judge evaluation:
EvalOpenAIModel: GPT models including GPT-4.1, o1-mini, o3, and regional variantsEvalAnthropicModel: Claude models with optional thinking features for complex evaluationsEvalGoogleModel: Gemini models from 1.0 Pro to 2.5 Pro for diverse evaluation needsEvalGroqModel: High-speed inference models including Llama and custom optionsEvalCustomModel: Your own evaluation models with custom endpointsEvaluation Results: Comprehensive result tracking with EvalRunResult:
status: Pass/fail evaluation outcomesmessages: Complete conversation transcript from the evaluationstartedAt and endedAt: Precise timing information for performance analysisTarget Flexibility: Run evaluations against different targets:
EvalRunTargetAssistant: Test individual assistants with optional overridesEvalRunTargetSquad: Evaluate entire squad performance and coordinationEvaluation Status Tracking: Monitor evaluation progress with detailed status information:
running: Evaluation in progressended: Evaluation completedqueued: Evaluation waiting to startendedReason including success, error, timeout, and cancellation statesJudge Configuration: Optimize evaluation accuracy with model-specific settings:
maxTokens: Recommended 50-10000 tokens (1 token for simple pass/fail responses)temperature: 0-0.3 recommended for LLM-as-a-judge to reduce hallucinationsFor LLM-as-a-judge evaluations, the judge model must respond with exactly “pass” or “fail”. Design your evaluation prompts to ensure clear, deterministic responses.
Choose from OpenAI, Anthropic, Google, Groq, or custom models for evaluation, matching your quality and performance requirements.
Detailed pass/fail results with complete conversation transcripts and timing information for thorough analysis.
Test individual assistants or entire squads with optional configuration overrides for comprehensive validation.
Real-time evaluation status tracking with detailed reason codes for failures, timeouts, and cancellations.
Voicemail Detection & Handling Improvements
Enhanced Beep Detection: Improve voicemail detection accuracy with CreateVoicemailToolDTO.beepDetectionEnabled specifically for Twilio-based calls. This feature detects the characteristic beep sound that indicates voicemail recording has started.
Workflow Voicemail Integration: Configure comprehensive voicemail handling in workflows with enhanced message and detection capabilities:
Workflow.voicemailMessage: Custom messages for voicemail scenarios (up to 1000 characters)Workflow.voicemailDetection: Configurable detection methods for different providersAssistant Voicemail Enhancement: Improved voicemail handling in assistant configurations with Assistant.voicemailMessage and Assistant.voicemailDetection for consistent behavior across all conversation types.
Multiple Detection Methods: Choose from various voicemail detection providers:
GoogleVoicemailDetectionPlan for AI-powered detectionOpenAIVoicemailDetectionPlan for intelligent voicemail recognitionTwilioVoicemailDetectionPlan for carrier-level detectionVapiVoicemailDetectionPlan for integrated detectionBeep Detection for Call Flows: The new beep detection capability works specifically with Twilio transport, providing reliable voicemail identification when traditional detection methods may not be sufficient.
Voicemail Tool Configuration: Enhanced tool rejection and messaging capabilities ensure appropriate handling when voicemail is detected, with configurable responses based on your business requirements.
Beep detection is currently available only for Twilio-based calls. If you’re using other providers, consider combining multiple detection methods for better accuracy.
Support for Google, OpenAI, Twilio, and Vapi detection methods, allowing you to choose the best option for your use case.
Advanced audio analysis to detect voicemail beeps on Twilio calls for more reliable voicemail identification.
Configure personalized voicemail messages up to 1000 characters for better user experience and brand consistency.
Comprehensive voicemail handling throughout workflow nodes with consistent configuration across conversation flows.