Web calls

Build voice interfaces and backend integrations using Vapi's Web and Server SDKs

Overview

This guide covers both sides of a Vapi voice application: client-side voice interfaces for web browsers and mobile apps, and server-side call management for backend systems. The client examples use Vapi’s Web SDK and the backend examples use its Server SDKs.

In this quickstart, you’ll learn to:

Create real-time voice interfaces for web and mobile
Build automated outbound and inbound call systems
Handle events and webhooks for call management
Implement voice widgets and backend integrations

Developing locally? The Vapi CLI makes it easy to initialize projects and test webhooks:

# Initialize Vapi in your project
$ vapi init

# Forward webhooks to local server
$ vapi listen --forward-to localhost:3000/webhook

Learn more about the Vapi CLI →

Choose your integration approach

Client-Side Voice Interfaces

Best for: User-facing applications, voice widgets, mobile apps

Browser-based voice assistants and widgets
Real-time voice conversations
Mobile voice applications (iOS, Android, React Native, Flutter)
Direct user interaction with assistants

Server-Side Call Management

Best for: Backend automation, bulk operations, system integrations

Automated outbound call campaigns
Inbound call routing and management
CRM integrations and bulk operations
Webhook processing and real-time events

Web voice interfaces

Build browser-based voice assistants and widgets for real-time user interaction.

Installation and setup

Client SDKs: Web, React Native, Flutter, and iOS. The Web SDK is shown here.

Build browser-based voice interfaces:

$ npm install @vapi-ai/web

import Vapi from '@vapi-ai/web';

const vapi = new Vapi('YOUR_PUBLIC_API_KEY');

// Register listeners before starting so no early events are missed
vapi.on('call-start', () => console.log('Call started'));
vapi.on('call-end', () => console.log('Call ended'));
vapi.on('message', (message) => {
  // Transcript messages carry the speaker role and the recognized text
  if (message.type === 'transcript') {
    console.log(`${message.role}: ${message.transcript}`);
  }
});

// Start the voice conversation
vapi.start('YOUR_ASSISTANT_ID');
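To render transcript messages as a conversation rather than a stream of log lines, they can be folded into a running log. The following is a minimal sketch, not part of the SDK. The `transcriptType` field and its `"partial"` / `"final"` values are assumptions for illustration, so verify them against the messages your client actually receives. `TranscriptLine` and `applyTranscript` are hypothetical names.

```typescript
// Message shape used in the snippet above. `transcriptType` is an ASSUMPTION
// made for this sketch; check it against real payloads.
interface TranscriptMessage {
  type: "transcript";
  role: string;
  transcript: string;
  transcriptType?: "partial" | "final";
}

// One rendered line of the conversation (hypothetical helper type).
interface TranscriptLine {
  role: string;
  text: string;
  final: boolean;
}

// Returns a new log with the message applied. A non-final line from the same
// role is overwritten in place so partial updates do not pile up.
function applyTranscript(
  log: TranscriptLine[],
  msg: TranscriptMessage
): TranscriptLine[] {
  const line: TranscriptLine = {
    role: msg.role,
    text: msg.transcript,
    final: msg.transcriptType !== "partial",
  };
  const next = log.slice();
  const last = next[next.length - 1];
  if (last && !last.final && last.role === msg.role) {
    next[next.length - 1] = line; // replace the pending partial line
  } else {
    next.push(line); // start a new line
  }
  return next;
}
```

Inside the `message` handler, you would call `log = applyTranscript(log, message)` for transcript messages and re-render from `log`.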

Live captions and word-level timing

For UIs that need to render live captions or karaoke-style word highlighting as the assistant speaks, subscribe to the opt-in assistant.speechStarted message. Add it to your assistant’s clientMessages:

{
  "clientMessages": ["assistant.speechStarted", "transcript", "speech-update"]
}

Each event carries the full assistant turn text, the turn number, the source ("model", "force-say", or "custom-voice"), and optional timing data whose shape depends on your voice provider:

vapi.on('message', (message) => {
  if (message.type !== 'assistant.speechStarted') return;

  const { text, turn, source, timing } = message;

  if (timing?.type === 'word-alignment') {
    // ElevenLabs: per-word timestamps at playback cadence (~50-200ms apart).
    // timing.words includes spaces; join them into a char cursor and
    // highlight `text` up to that position.
  } else if (timing?.type === 'word-progress') {
    // Minimax with voice.subtitleType: "word". Cursor-based:
    // wordsSpoken / totalWords. See the note below: events arrive in
    // segment-sized jumps, not word-by-word ticks.
  } else {
    // Cartesia, Deepgram, Azure, OpenAI, etc.: text-only event tied
    // to audio playback. Display `text` as a caption block.
  }
});
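The char-cursor idea from the word-alignment branch can be sketched as two small helpers. This is an illustration only. It assumes the spoken tokens are available as plain strings with spaces included, while the real `timing.words` entries may be objects with timestamps (see the event schema). `charCursor` and `splitAtCursor` are hypothetical names.

```typescript
// ASSUMPTION: `spokenTokens` holds the tokens received so far as plain
// strings, spaces included, for example ["Hello", " ", "world"].
function charCursor(spokenTokens: string[]): number {
  // Total characters spoken so far = sum of token lengths
  return spokenTokens.reduce((sum, token) => sum + token.length, 0);
}

// Splits the full turn text into the part to highlight and the rest.
function splitAtCursor(
  text: string,
  cursor: number
): { spoken: string; pending: string } {
  // Clamp so an out-of-range cursor never throws or over-highlights
  const clamped = Math.max(0, Math.min(cursor, text.length));
  return { spoken: text.slice(0, clamped), pending: text.slice(clamped) };
}
```

A UI would render `spoken` in the highlighted style and `pending` in the default style each time a new event arrives.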

Cadence and granularity vary significantly by voice provider — pick the one that matches your UI requirements:

ElevenLabs (word-alignment) is the only provider that emits at true playback cadence with real per-word timestamps. Best for smooth karaoke-style highlighting with no client-side interpolation.
Minimax (word-progress) with subtitleType: "word" emits once per synthesis segment, near the end of that segment’s playback. The per-word timing.words[] array carries timestamps for the segment that just finished — useful for retroactive animation or forward extrapolation, but not for driving real-time highlighting during that segment. See the Minimax provider page for details.
All other providers emit text-only events (no timing), one event per TTS chunk. For an approximate word cursor, advance it at a flat rate (~3.5 words/sec) between events.
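For text-only providers, the flat-rate interpolation can be sketched as a pure function of the caption text and the time since the event arrived. The function name and the rounding-down behavior are choices made for this sketch. Only the ~3.5 words/sec rate comes from the guidance above.

```typescript
// Flat speaking rate suggested above for text-only events
const DEFAULT_WORDS_PER_SECOND = 3.5;

// Estimates how many words of `text` have been spoken `elapsedMs` after the
// event arrived. The result never exceeds the number of words in `text`.
function estimateWordsSpoken(
  text: string,
  elapsedMs: number,
  wordsPerSecond: number = DEFAULT_WORDS_PER_SECOND
): number {
  const words = text.trim().split(/\s+/).filter(Boolean);
  const seconds = Math.max(0, elapsedMs) / 1000;
  const estimate = Math.floor(seconds * wordsPerSecond);
  return Math.min(estimate, words.length);
}
```

Calling this from an animation loop with the time since the last event gives a cursor that advances steadily and stops at the end of the chunk until the next event resets it.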

force-say events (your firstMessage, say actions) always emit as text-only, even on ElevenLabs and Minimax. On user barge-in, no further events fire for the interrupted turn — pair with the user-interrupted message to know what was actually spoken.

For the full event schema and field reference, see Server events → Assistant Speech Started.

Create a voice widget for your website:

Two embed options are available: an HTML script tag and React/TypeScript. The HTML script tag, shown here, is the fastest way to get started. Copy this snippet into your website:

<script>
  var vapiInstance = null;
  const assistant = "assistant_id"; // Substitute with your assistant ID
  const apiKey = "your_public_api_key"; // Substitute with your Public key from Vapi Dashboard.
  const buttonConfig = {}; // Modify this as required

  (function (d, t) {
    var g = document.createElement(t),
      s = d.getElementsByTagName(t)[0];
    g.src =
      "https://cdn.jsdelivr.net/gh/VapiAI/html-script-tag@latest/dist/assets/index.js";
    g.defer = true;
    g.async = true;
    s.parentNode.insertBefore(g, s);

    g.onload = function () {
      vapiInstance = window.vapiSDK.run({
        apiKey: apiKey, // mandatory
        assistant: assistant, // mandatory
        config: buttonConfig, // optional
      });
    };
  })(document, "script");
</script>

Server-side call management

Automate outbound calls and handle inbound call processing with server-side SDKs.

Installation and setup

Server SDKs: TypeScript, Python, Java, Ruby, C#, and Go. The examples in this section use TypeScript.

Install the TypeScript Server SDK:

$ npm install @vapi-ai/server-sdk

import { VapiClient } from "@vapi-ai/server-sdk";

const vapi = new VapiClient({
  token: process.env.VAPI_API_KEY!
});

// Create an outbound call
const call = await vapi.calls.create({
  phoneNumberId: "YOUR_PHONE_NUMBER_ID",
  customer: { number: "+1234567890" },
  assistantId: "YOUR_ASSISTANT_ID"
});

console.log(`Call created: ${call.id}`);
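The customer number in the example is written with a leading `+` and digits only, which matches the E.164 format. Assuming the API expects numbers in that form, a small pre-flight check can reject badly formatted input before a call is attempted. `isE164` is a hypothetical helper for this sketch, not an SDK function.

```typescript
// E.164: a leading "+", a non-zero first digit, and at most 15 digits total
const E164_PATTERN = /^\+[1-9]\d{1,14}$/;

// Returns true when `phoneNumber` looks like a valid E.164 number.
// This checks format only; it cannot tell whether the number is reachable.
function isE164(phoneNumber: string): boolean {
  return E164_PATTERN.test(phoneNumber);
}
```

For example, `isE164("+1234567890")` passes, while `"1234567890"` (no `+`) and `"+1 (234) 567-890"` (spaces and punctuation) are rejected.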

Creating assistants

const assistant = await vapi.assistants.create({
  name: "Sales Assistant",
  firstMessage: "Hi! I'm calling about your interest in our software solutions.",
  model: {
    provider: "openai",
    model: "gpt-4o",
    temperature: 0.7,
    messages: [{
      role: "system",
      content: "You are a friendly sales representative. Keep responses under 30 words."
    }]
  },
  voice: {
    provider: "11labs",
    voiceId: "21m00Tcm4TlvDq8ikWAM"
  }
});

Bulk operations

Run automated call campaigns for sales, surveys, or notifications:

async function runBulkCallCampaign(assistantId: string, phoneNumberId: string) {
  const prospects = [
    { number: "+1234567890", name: "John Smith" },
    { number: "+1234567891", name: "Jane Doe" },
    // ... more prospects
  ];

  const calls = [];
  for (const prospect of prospects) {
    const call = await vapi.calls.create({
      assistantId,
      phoneNumberId,
      customer: prospect,
      metadata: { campaign: "Q1_Sales" }
    });
    calls.push(call);

    // Rate limiting
    await new Promise(resolve => setTimeout(resolve, 2000));
  }

  return calls;
}
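In the loop above, a single rejected `vapi.calls.create` would abort the whole campaign. One way to keep going is a generic sequential runner that records each outcome and pauses between items. This is a sketch under that assumption. `runThrottled` and `Outcome` are hypothetical names, and the task is passed in so the helper does not depend on the SDK.

```typescript
// Result of one attempt: either the task's value or the error message
type Outcome<T, R> =
  | { item: T; ok: true; value: R }
  | { item: T; ok: false; error: string };

// Runs `task` for each item in order, waiting `delayMs` between items.
// Failures are recorded instead of thrown, so later items still run.
async function runThrottled<T, R>(
  items: T[],
  task: (item: T) => Promise<R>,
  delayMs: number
): Promise<Outcome<T, R>[]> {
  const outcomes: Outcome<T, R>[] = [];
  for (let i = 0; i < items.length; i++) {
    const item = items[i];
    try {
      outcomes.push({ item, ok: true, value: await task(item) });
    } catch (err) {
      const error = err instanceof Error ? err.message : String(err);
      outcomes.push({ item, ok: false, error });
    }
    // Skip the pause after the last item
    if (i < items.length - 1) {
      await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
  }
  return outcomes;
}
```

With the campaign above, the task would be `(prospect) => vapi.calls.create({ assistantId, phoneNumberId, customer: prospect })` and `delayMs` would be `2000`. Entries with `ok: false` can then be retried or reported.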

Webhook integration

Handle real-time events for both client and server applications:

import express from 'express';

const app = express();
app.use(express.json());

app.post('/webhook/vapi', async (req, res) => {
  const { message } = req.body;

  switch (message.type) {
    case 'status-update':
      console.log(`Call ${message.call.id}: ${message.call.status}`);
      break;
    case 'transcript':
      console.log(`${message.role}: ${message.transcript}`);
      break;
    case 'function-call':
      return handleFunctionCall(message, res);
  }

  // Acknowledge all other events
  res.status(200).json({ received: true });
});

function handleFunctionCall(message: any, res: express.Response) {
  const { functionCall } = message;

  switch (functionCall.name) {
    // Braces scope the const declaration to this case
    case 'lookup_order': {
      const orderData = { orderId: functionCall.parameters.orderId, status: 'shipped' };
      return res.json({ result: orderData });
    }
    default:
      return res.status(400).json({ error: 'Unknown function' });
  }
}

app.listen(3000, () => console.log('Webhook server running on port 3000'));
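The function-call branch is easier to unit test when the decision logic is separated from Express. The sketch below mirrors the handler above as a pure function that returns a status and body. `buildWebhookReply`, `WebhookReply`, and `IncomingMessage` are hypothetical names. The message fields are only those used in the example above, and the extra validation is an addition for illustration.

```typescript
// Only the fields this sketch reads; real payloads carry more
interface IncomingMessage {
  type: string;
  functionCall?: { name: string; parameters?: Record<string, unknown> };
}

interface WebhookReply {
  status: number;
  body: unknown;
}

// Decides the HTTP reply for one webhook message without touching Express.
function buildWebhookReply(message: IncomingMessage): WebhookReply {
  if (message.type !== "function-call") {
    return { status: 200, body: { received: true } };
  }
  const call = message.functionCall;
  if (!call) {
    return { status: 400, body: { error: "Missing functionCall" } };
  }
  switch (call.name) {
    case "lookup_order": {
      const orderId = call.parameters?.orderId;
      if (typeof orderId !== "string") {
        return { status: 400, body: { error: "Missing orderId" } };
      }
      // Placeholder data, as in the handler above
      return { status: 200, body: { result: { orderId, status: "shipped" } } };
    }
    default:
      return { status: 400, body: { error: "Unknown function" } };
  }
}
```

In the route, this reduces to `const reply = buildWebhookReply(req.body.message)` followed by `res.status(reply.status).json(reply.body)`.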

Next steps

Now that you understand both client and server SDK capabilities:

Explore use cases: Check out our examples section for complete implementations
Add tools: Connect your voice agents to external APIs and databases with custom tools
Configure models: Try different speech and language models for better performance
Scale with squads: Use Squads for multi-assistant setups and complex processes
