⚡ Real-time · 📡 WebRTC · 🎙️ Multi-Voice · 🧠 EVA-flash · 🔌 Embedded Ready
Multimodal interaction powering intelligent devices · Instantly upgrade any hardware into a real-time multimodal AI terminal
[2025-12-07] Release · EVA OS v1.0.0 Officially Launched!
- 🎧 Full Duplex Interaction: millisecond-level latency, supports barge-in during conversations, achieving truly human-like real-time dialogue
- 🛠️ MCP Tools Suite: built-in weather forecast, web search, smart maps, and more utilities; works out of the box
- 🎙️ Multi-Voice TTS Engine: added 10+ humanlike voices (emotional, professional broadcast, and more); switch voices with one click within a Solution
- 🧠 Agent Workflow / Multi-Agent Collaboration System: enables cross-capability cooperation for complex tasks such as poem creation and story generation
- 🔑 SDK Release / Fully Open Source: iOS & Android SDKs are now fully open-sourced, enabling zero-barrier hardware integration
- 🌐 LiveKit Deep Optimization: ultra-low latency full-duplex communication
EVA OS is an open-source, multimodal, low-latency real-time AI Agent engine designed for next-generation AI hardware. It is deeply optimized for mobile devices, IoT hardware, and embedded systems. It fills the gap between “device intelligence” and “user experience.”
Through the EVA platform, developers can quickly create Solutions (AI Agents) with real-time multimodal interaction and, with a single API Key, achieve "develop once, run on all devices." Any hardware can instantly become a "real-time multimodal interactive, agent-cooperative, memory-capable AI hub."
Core Benefits: EVA-flash model permanently free | Mobile SDK 100% open source | ESP32/RK/MCU embedded SDK coming soon
Our belief:
- Foundation models are merely “neurons”
- AI hardware needs a “nervous system”
- EVA OS aims to become the nervous system of next-generation AI devices
- Interaction Layer: from single modality to multimodality. The system can simultaneously understand text, speech, images, video, and more, and supports barge-in, speech interruption, and real-time responsiveness, making AI "converse like a human" (see the sketch after this list).
- Memory Layer: from text memory to multimodal memory, and from storage-based memory to parameterized memory. The AI remembers not only text but also images, audio, and other contextual modalities, making memory deeper and more integrated.
- Execution Layer: from simple API calls to complex reasoning. EVA OS evolves from executing fixed commands to performing logical reasoning and solving complex tasks.
- Persona Layer (Representation): from programmed behavior to model-driven dynamics. The AI's "persona" becomes dynamic, adaptive, and humanlike, shaped by large-model reasoning rather than rigid code.
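As a rough illustration of the Interaction Layer's barge-in behavior, here is a minimal client-side sketch built on the livekit-client API (the stack this README names). The interrupt message payload is a hypothetical placeholder, not EVA's actual protocol; the SDKs handle the real signaling internally.

```typescript
import { Room, ParticipantEvent } from 'livekit-client';

// When the local mic detects speech, notify the agent so it can stop its
// current utterance and let the user cut in (barge-in). The payload shape
// below is an illustrative assumption only.
function enableBargeIn(room: Room): void {
  const encoder = new TextEncoder();
  room.localParticipant.on(ParticipantEvent.IsSpeakingChanged, (speaking: boolean) => {
    if (speaking) {
      room.localParticipant.publishData(
        encoder.encode(JSON.stringify({ type: 'interrupt' })), // hypothetical payload
        { reliable: true },
      );
    }
  });
}
```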
- Streaming architecture with response latency as low as 300ms
- Native multimodal support: speech, vision, text
- Lightweight optimization for mid-range hardware
- iOS/Android SDK fully open source with detailed demos
- ESP32/RK/MCU embedded SDK in final testing
- Best-practice samples for IoT / toys / speakers
- Real-time lip-sync
- Multiple default avatars included
- Custom user-generated avatars coming soon
- 10+ humanlike voices with emotional modulation
- Full-duplex communication with natural barge-in
- Smart auto-response + customizable greetings
- Real-time audio/video streaming
- Truly full-duplex: no more “waiting for reply” gaps
- Custom prompt templates for education, home, office, etc.
- Multi-Agent routing for complex task orchestration
- Built-in MCP tools for personalized functions
- Unified Solution API Key supporting mainstream hardware
- Fully open resources: SDKs, sample projects, docs
- [Coming Soon]: Embedded system SDK, edge-cloud collaboration, embodied intelligence APIs
1️⃣ Register an EVA Platform Account
Visit: https://eva.autoarkai.com
Then:
- Create a Solution
- Configure voices, prompts, tools, and agents
- Obtain an API Key
2️⃣ Use API Key to Generate a LiveKit Token
Refer to the example in the mobile SDK:
eva-client.ts
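For orientation, a minimal TypeScript sketch of this step might look like the following. The endpoint path, request body, and response fields here are illustrative assumptions only; see eva-client.ts in the mobile SDK for the actual token exchange.

```typescript
// Exchange a Solution API Key for a LiveKit token and server URL.
// All endpoint/field names below are assumptions for illustration.
interface EvaTokenResponse {
  token: string; // LiveKit access token (a JWT)
  url: string;   // LiveKit server URL to connect to
}

async function fetchLiveKitToken(apiKey: string, userId: string): Promise<EvaTokenResponse> {
  const res = await fetch('https://eva.autoarkai.com/api/v1/token', { // hypothetical endpoint
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      Authorization: `Bearer ${apiKey}`, // Solution API Key from step 1
    },
    body: JSON.stringify({ userId }),
  });
  if (!res.ok) {
    throw new Error(`Token request failed with HTTP ${res.status}`);
  }
  return (await res.json()) as EvaTokenResponse;
}
```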
3️⃣ Client Connection Example
Refer to React Native example:
React Native Example Docs
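The React Native example docs cover the full flow; as a condensed sketch, connecting with the plain livekit-client API looks roughly like this (the React Native SDK mirrors the same Room API). That the agent joins the room as a remote participant is assumed here, based on the usual LiveKit agent architecture.

```typescript
import { Room, RoomEvent, Track } from 'livekit-client';

// Connect to the room using the token and URL from step 2,
// then start full-duplex audio.
async function connectToEva(url: string, token: string): Promise<Room> {
  const room = new Room();

  // Play the agent's audio track as soon as it is subscribed.
  room.on(RoomEvent.TrackSubscribed, (track) => {
    if (track.kind === Track.Kind.Audio) {
      track.attach(); // creates an <audio> element and starts playback (web)
    }
  });

  await room.connect(url, token);
  await room.localParticipant.setMicrophoneEnabled(true); // keep the mic open for barge-in
  return room;
}
```

Calling enableBargeIn(room) from the earlier sketch after connecting would complete the full-duplex loop.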
- 🎧 Real-time full-duplex audio/video interaction
- ⚡ EVA-flash real-time multimodal LM
- 🧩 Solution (AI Agent) framework
- 🔑 API Key device access
- 📱 iOS & Android SDK fully open source
- 🛠️ Built-in MCP Tools
- 🔌 Embedded SDK: ESP32/RK/MCU hardware-level integration
- ☁️ Edge-Cloud Collaboration: cloud compute scheduling + precise device-side command delivery
- 🧠 Intelligent Memory: short-term interaction memory + long-term preference memory
- 🔧 MCP Tool Extensions: third-party tool integration
- 🎭 Custom Digital Avatars: generate avatars from user-provided photos
- 🏢 Enterprise Features: hybrid model deployment + high-concurrency solutions
EVA’s vision: empower every device with autonomous interaction, execution, and memory, awakening true hardware-level AI intelligence.
We welcome all kinds of contributors:
- App developers
- IoT / toy / smart home device / robotics creators
- Embedded engineers (ESP32 / RK / STM32)
- DIY makers
You can contribute:
- PRs (e.g. mobile SDK optimizations)
- Embedded integration examples
- Tutorials / documentation
- New language SDKs
- Device-side demos
- Issues / feature requests
Let’s build the most open and complete real-time multimodal AI hardware ecosystem—together!
EVA OS is released under the MIT License.
This means you are free to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the software, as long as you include the original copyright and permission notice in all copies or substantial portions of the software.