diff --git a/examples/voice_solutions/arduino_ai_speech_assets/flowchart.png b/examples/voice_solutions/arduino_ai_speech_assets/flowchart.png
new file mode 100644
index 0000000000..e7e7ada909
Binary files /dev/null and b/examples/voice_solutions/arduino_ai_speech_assets/flowchart.png differ
diff --git a/examples/voice_solutions/arduino_ai_speech_assets/mockups.png b/examples/voice_solutions/arduino_ai_speech_assets/mockups.png
new file mode 100644
index 0000000000..5cd2ab0f60
Binary files /dev/null and b/examples/voice_solutions/arduino_ai_speech_assets/mockups.png differ
diff --git a/examples/voice_solutions/arduino_ai_speech_assets/structure.png b/examples/voice_solutions/arduino_ai_speech_assets/structure.png
new file mode 100644
index 0000000000..d7c9c4496a
Binary files /dev/null and b/examples/voice_solutions/arduino_ai_speech_assets/structure.png differ
diff --git a/examples/voice_solutions/arduino_ai_speech_assets/thumbnail.png b/examples/voice_solutions/arduino_ai_speech_assets/thumbnail.png
new file mode 100644
index 0000000000..90e40e5714
Binary files /dev/null and b/examples/voice_solutions/arduino_ai_speech_assets/thumbnail.png differ
diff --git a/examples/voice_solutions/running_realtime_api_speech_on_esp32_arduino_edge_runtime_elatoai.md b/examples/voice_solutions/running_realtime_api_speech_on_esp32_arduino_edge_runtime_elatoai.md
index b1af288ac0..6a651a806c 100644
--- a/examples/voice_solutions/running_realtime_api_speech_on_esp32_arduino_edge_runtime_elatoai.md
+++ b/examples/voice_solutions/running_realtime_api_speech_on_esp32_arduino_edge_runtime_elatoai.md
@@ -1,36 +1,22 @@
-Elato Logo
+![Elato Logo](https://raw.githubusercontent.com/openai/openai-cookbook/refs/heads/main/examples/voice_solutions/arduino_ai_speech_assets/elato-alien.png)
 
-# 👾 ElatoAI: Running OpenAI Realtime API Speech on ESP32 on Arduino with Deno Edge Functions
+## 👾 ElatoAI: Running OpenAI Realtime API Speech on ESP32 on Arduino with Deno Edge Functions
 
 This guide shows how to build an AI voice agent device with Realtime AI Speech powered by OpenAI Realtime API, ESP32, Secure WebSockets, and Deno Edge Functions for >10-minute uninterrupted global conversations. An active version of this README is available at [ElatoAI](https://github.com/akdeb/ElatoAI).
 
 <div align="center">
- -[![Discord Follow](https://dcbadge.vercel.app/api/server/KJWxDPBRUj?style=flat)](https://discord.gg/KJWxDPBRUj) -[![License: MIT](https://img.shields.io/badge/license-MIT-blue)](https://www.gnu.org/licenses/gpl-3.0.en.html)    -![Node.js](https://img.shields.io/badge/Node.js-22.13.0-yellow.svg) -![Next.js](https://img.shields.io/badge/Next.js-14.2.7-brightgreen.svg) -![React](https://img.shields.io/badge/React-18.2.0-blue.svg) - -
- -## Demo Video - -https://github.com/user-attachments/assets/aa60e54c-5847-4a68-80b5-5d6b1a5b9328 - -
- - Watch Demo on YouTube - + + Elato AI Demo Video +
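The guide above hinges on a thin relay: the ESP32 streams audio over a secure WebSocket to a Deno edge function, which holds a second WebSocket to the OpenAI Realtime API and forwards traffic in both directions. The sketch below illustrates that relay pattern only; it is not the ElatoAI edge function, and the model name, subprotocol-based auth, and buffering strategy are assumptions made for the example.

```typescript
// Sketch only: a bare-bones Deno edge relay between an ESP32 WebSocket client
// and the OpenAI Realtime API. Model name, auth style, and buffering are
// illustrative; the real ElatoAI edge function adds device auth, Supabase
// lookups, and Opus-framed audio handling on top of this pattern.
Deno.serve((req) => {
  if (req.headers.get("upgrade") !== "websocket") {
    return new Response("Expected a WebSocket upgrade", { status: 426 });
  }

  // Accept the incoming connection from the device.
  const { socket: device, response } = Deno.upgradeWebSocket(req);

  // Dial the Realtime API. Subprotocol-based auth is used here because the
  // standard WebSocket constructor cannot set an Authorization header.
  const openai = new WebSocket(
    "wss://api.openai.com/v1/realtime?model=gpt-4o-realtime-preview",
    [
      "realtime",
      `openai-insecure-api-key.${Deno.env.get("OPENAI_API_KEY")}`,
      "openai-beta.realtime-v1",
    ],
  );

  // Buffer device traffic until the upstream socket opens, then forward it.
  const pending: (string | ArrayBuffer | Blob)[] = [];
  openai.onopen = () => {
    for (const msg of pending) openai.send(msg);
    pending.length = 0;
  };
  device.onmessage = (e) => {
    if (openai.readyState === WebSocket.OPEN) openai.send(e.data);
    else pending.push(e.data);
  };

  // Forward model audio/events straight back down to the device.
  openai.onmessage = (e) => {
    if (device.readyState === WebSocket.OPEN) device.send(e.data);
  };

  // Close both legs together so the device can reconnect cleanly.
  const closeBoth = () => {
    try { device.close(); } catch { /* already closed */ }
    try { openai.close(); } catch { /* already closed */ }
  };
  device.onclose = closeBoth;
  openai.onclose = closeBoth;

  return response;
});
```

Run it with `deno run --allow-net --allow-env relay.ts` (file name illustrative) and point the device's WebSocket URL at the deployed function; everything below about the hardware and app builds on this device-to-edge-to-OpenAI path.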
-## Hardware Design +## ⚡️ DIY Hardware Design The reference implementation uses an ESP32-S3 microcontroller with minimal additional components: -Hardware Setup +Hardware Setup **Required Components:** - ESP32-S3 development board @@ -40,13 +26,23 @@ The reference implementation uses an ESP32-S3 microcontroller with minimal addit - RGB LED for visual feedback - Optional: touch sensor for alternative control -**Optional hardware:** +**Hardware options:** A fully assembled PCB and device is available in the [ElatoAI store](https://www.elatoai.com/products). -## 🚀 Quick Start Guide +## 📱 App Design + +Control your ESP32 AI device from your phone with your own webapp. + +App Screenshots + +| Select from a list of AI characters | Talk to your AI with real-time responses | Create personalized AI characters | +|:--:|:--:|:--:| + + +## ✨ Quick Start Tutorial - Watch Demo on YouTube + Watch Demo on YouTube 1. **Clone the repository** @@ -192,35 +188,13 @@ We have a [Usecases.md](https://github.com/akdeb/ElatoAI/tree/main/Usecases.md) ## 🗺️ High-Level Flow -```mermaid -flowchart TD - User[User Speech] --> ESP32 - ESP32[ESP32 Device] -->|WebSocket| Edge[Deno Edge Function] - Edge -->|OpenAI API| OpenAI[OpenAI Realtime API] - OpenAI --> Edge - Edge -->|WebSocket| ESP32 - ESP32 --> User[AI Generated Speech] -``` +App Screenshots ## Project Structure -```mermaid -graph TD - repo[ElatoAI] - repo --> frontend[Frontend Vercel NextJS] - repo --> deno[Deno Edge Function] - repo --> esp32[ESP32 Arduino Client] - deno --> supabase[Supabase DB] - - frontend --> supabase - esp32 --> websockets[Secure WebSockets] - esp32 --> opus[Opus Codec] - esp32 --> audio_tools[arduino-audio-tools] - esp32 --> libopus[arduino-libopus] - esp32 --> ESPAsyncWebServer[ESPAsyncWebServer] -``` +App Screenshots -## ⚙️ PlatformIO Configuration +## ⚙️ PlatformIO Config ```ini [env:esp32-s3-devkitc-1] @@ -264,4 +238,4 @@ This project is licensed under the MIT License - see the [LICENSE](LICENSE) file --- -**If you find this project interesting or useful, drop a GitHub ⭐️ at [ElatoAI](https://github.com/akdeb/ElatoAI). It helps a lot!** \ No newline at end of file +**This example is part of the [OpenAI Cookbook](https://github.com/openai/openai-cookbook). For the full project and latest updates, check out [ElatoAI](https://github.com/akdeb/ElatoAI) and consider giving it a ⭐️ if you find it useful!**
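As a way to exercise the high-level flow above without flashing any hardware, the hypothetical script below (not part of the ElatoAI repository) impersonates the ESP32: it connects to a relay like the earlier sketch, assumed to be listening on `ws://localhost:8000`, sends a `response.create` event, and logs the `response.audio.delta` chunks that stream back — the same events the device would decode and play through its I2S amplifier.

```typescript
// Hypothetical smoke test: pretend to be the ESP32 and talk to a local relay.
// Run with: deno run --allow-net smoke_test.ts   (file name illustrative)
const ws = new WebSocket("ws://localhost:8000"); // assumed relay address

ws.onopen = () => {
  // Ask the Realtime API (via the relay) for a short spoken reply.
  ws.send(JSON.stringify({
    type: "response.create",
    response: {
      modalities: ["audio", "text"],
      instructions: "Say hello in one short sentence.",
    },
  }));
};

ws.onmessage = (e) => {
  if (typeof e.data !== "string") return; // ignore binary frames in this test
  const event = JSON.parse(e.data);
  if (event.type === "response.audio.delta") {
    // Audio arrives as base64-encoded PCM16 chunks; the real device decodes
    // and plays these through its I2S amplifier instead of logging them.
    console.log(`audio chunk received (${event.delta.length} base64 chars)`);
  } else if (event.type === "response.done") {
    console.log("response complete");
    ws.close();
  }
};

ws.onerror = (err) => console.error("websocket error", err);
```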