A next-generation, hyper-personalized desktop AI Computer Use Agent (CUA) and assistant that seamlessly blends web, native, and 3D interfaces
🌐 Try Enteract | 📚 Documentation | 🐛 Report Issues | 💬 Discussions
- Advanced Speech Recognition - Real-time transcription with Whisper integration
- Multi-Modal AI - Vision analysis, document understanding, and conversational intelligence
- System Integration - OS-level automation, screenshot capture, and application control
- Personal Knowledge Base - RAG system with document embedding and semantic search (sketched below)
- Beautiful UI - Frameless windows with 3D visuals, glassmorphism effects, and smooth animations
- High Performance - Rust backend with optimized data storage (JSON → SQLite migration)
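To make the knowledge-base feature concrete, here is a minimal sketch of the retrieval half of a RAG pipeline: ranking document embeddings against a query embedding by cosine similarity. The embedding values and file names are placeholders; the project's actual implementation (in src-tauri/src/rag_system.rs) is not reproduced here.

```rust
// Toy semantic search: rank stored document embeddings against a query embedding.
fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 { 0.0 } else { dot / (norm_a * norm_b) }
}

fn main() {
    // Placeholder vectors standing in for real embedding-model output.
    let docs = vec![("notes.md", vec![0.1, 0.9, 0.0]), ("todo.md", vec![0.8, 0.1, 0.2])];
    let query = vec![0.2, 0.8, 0.1];

    // Rank documents by similarity to the query embedding.
    let mut ranked: Vec<_> = docs
        .iter()
        .map(|(name, emb)| (*name, cosine_similarity(emb, &query)))
        .collect();
    ranked.sort_by(|a, b| b.1.partial_cmp(&a.1).unwrap());
    println!("best match: {} (score {:.2})", ranked[0].0, ranked[0].1);
}
```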
Windows
- Windows 10 (1903 or later) - Fully supported
- Windows 11 - Fully supported and optimized
- Advanced features: Eye tracking, audio loopback, system automation
- Native Windows API integration for seamless OS interaction
- Hybrid Storage System - Seamless migration from JSON to SQLite with zero downtime (sketched below)
- Modular AI Agents - Specialized agents for different tasks (coding, research, conversation)
- Cross-Platform - Windows, macOS, and Linux support via Tauri
- Audio Processing - Loopback capture, noise reduction, and speech-to-text
- Comprehensive Testing - Full test suite with Vitest and Vue Test Utils
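As a rough illustration of the auto-selecting storage idea, the sketch below assumes the hybrid store prefers SQLite when a database file already exists, keeps serving a legacy JSON store until migration has run, and starts fresh installs directly on SQLite. The file names and enum are hypothetical; this is not the actual API of hybrid_store.rs.

```rust
use std::path::Path;

// Hypothetical backend selector for a JSON-to-SQLite hybrid store.
enum Backend {
    Sqlite,
    LegacyJson,
}

fn select_backend(data_dir: &Path) -> Backend {
    if data_dir.join("enteract.db").exists() {
        // SQLite database already present: use it directly.
        Backend::Sqlite
    } else if data_dir.join("store.json").exists() {
        // Old JSON store found: keep reading it until migration completes.
        Backend::LegacyJson
    } else {
        // Fresh install: start on SQLite right away.
        Backend::Sqlite
    }
}

fn main() {
    match select_backend(Path::new("./data")) {
        Backend::Sqlite => println!("using SQLite store"),
        Backend::LegacyJson => println!("using legacy JSON store, migration pending"),
    }
}
```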
Minimum Requirements:
- Windows: Windows 10 (build 1903+) or Windows 11
- RAM: 4GB minimum, 8GB recommended
- Storage: 8GB available space
- Camera: Any USB or integrated camera (for eye tracking)
- Microphone: Any audio input device (for speech recognition)
Development Requirements:
- Node.js 18+
- Rust (latest stable)
- Platform-specific build tools:
- Windows: Visual Studio Build Tools 2019/2022 or Visual Studio Community
# Clone the repository
git clone <repository-url>
cd embedded-agentic-assistant
# Install dependencies
npm install
# Run in development mode
npm run tauri dev
# Build for production
npm run tauri build
- Configure AI Models - Set up your preferred Ollama models or OpenAI API keys
- Calibrate Eye Tracking - Follow the brief calibration process for gaze controls
- Explore Features - Open different windows and try voice commands
embedded-agentic-assistant/
├── src/                              # Vue 3 + TypeScript frontend
│   ├── components/                   # UI components
│   │   ├── ControlPanel.vue          # Main control interface
│   │   ├── ChatWindow.vue            # AI chat interface
│   │   └── ConversationalWindow.vue  # Voice interaction UI
│   ├── composables/                  # Vue composables
│   │   ├── useEyeTracking.ts         # Eye tracking system
│   │   ├── useSpeechTranscription.ts # Speech recognition
│   │   └── useWindowManager.ts       # Window management
│   ├── types/                        # TypeScript definitions
│   └── tests/                        # Comprehensive test suite
├── src-tauri/                        # Rust backend
│   ├── src/
│   │   ├── ai_commands.rs            # AI model integration
│   │   ├── data/                     # Storage system
│   │   │   ├── json_store.rs         # Legacy JSON storage
│   │   │   ├── sqlite_store.rs       # Modern SQLite storage
│   │   │   ├── migration.rs          # Migration utilities
│   │   │   └── hybrid_store.rs       # Auto-selecting storage
│   │   ├── rag_system.rs             # Document embedding & search
│   │   ├── speech.rs                 # Whisper integration
│   │   └── screenshot.rs             # Screen capture
│   └── capabilities/                 # Tauri permissions
└── resources/                        # Documentation & assets
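For orientation, here is a minimal, hypothetical example of how a frontend window reaches the Rust backend through a Tauri command. The command name capture_screenshot is illustrative only; the project's real commands live in the modules listed above (ai_commands.rs, screenshot.rs, and so on).

```rust
// The Vue frontend calls invoke("capture_screenshot") and Tauri routes it here.
#[tauri::command]
fn capture_screenshot() -> Result<String, String> {
    // Placeholder: the real implementation captures the screen and returns a file path.
    Ok("screenshot.png".to_string())
}

fn main() {
    tauri::Builder::default()
        .invoke_handler(tauri::generate_handler![capture_screenshot])
        .run(tauri::generate_context!())
        .expect("error while running the Tauri application");
}
```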
Long term, the project aims for test-driven development (TDD); if you have Rust or UX testing expertise, please contribute!
Configure your AI models in the settings:
- Ollama - Local models for privacy-focused AI (example call sketched below)
- OpenAI / DeepSeek API - Cloud-based models for advanced capabilities (support pending)
- Whisper - Local speech recognition
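For reference, this is roughly what a local Ollama request looks like, assuming Ollama's default endpoint at http://localhost:11434 and an example model name; it is a standalone sketch, not the project's actual integration code in ai_commands.rs.

```rust
use serde_json::json;

// Send a single non-streaming prompt to a locally running Ollama model.
fn main() -> Result<(), Box<dyn std::error::Error>> {
    let client = reqwest::blocking::Client::new();
    let response: serde_json::Value = client
        .post("http://localhost:11434/api/generate")
        .json(&json!({
            "model": "llama3.2",
            "prompt": "Summarize the last meeting transcript.",
            "stream": false
        }))
        .send()?
        .json()?;
    println!("{}", response["response"]);
    Ok(())
}
```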
We welcome contributions! Whether you're fixing bugs, adding features, or improving documentation, your help makes this project better. Please review the (short) Contributing Guide first.
- Fork and Clone the repository
- Create a feature branch (git checkout -b feature/amazing-feature)
- Install dependencies (npm install)
- Run tests (npm run test) to ensure everything works
- Make your changes with proper test coverage
- Commit your changes (git commit -m 'Add amazing feature')
- Push to the branch (git push origin feature/amazing-feature)
- Open a Pull Request
- UI/UX Improvements - Enhanced visual design and user experience
- AI Capabilities - New AI agents and improved prompts
- Integrations - Connect with more external services
- Documentation - Tutorials, guides, and API docs
- Testing - Expanded test coverage and performance benchmarks
- Windows Features - Advanced Windows-specific integrations
- Eye Tracking (experimental) - Better calibration and gaze accuracy
- Platform Support (experimental) - macOS/Linux compatibility
- TypeScript/Vue - Use Composition API with TypeScript
- Rust - Follow standard Rust conventions with rustfmt
- Tests - Write tests for new features and bug fixes (example below)
- Documentation - Update relevant documentation
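As a small example of the expected convention on the Rust side, unit tests sit next to the code in a #[cfg(test)] module and run with cargo test; the function below is a placeholder, not project code.

```rust
// Placeholder function with a colocated unit test, formatted with rustfmt.
fn word_count(text: &str) -> usize {
    text.split_whitespace().count()
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn counts_words() {
        assert_eq!(word_count("hello brave new world"), 4);
    }
}
```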
This project is licensed under the Apache 2.0 License - see the LICENSE file for details.
- Rust Community - For the powerful systems language and crate development
- Tauri Team - For the amazing cross-platform framework
- Vue.js Community - For the reactive frontend framework
- Whisper - For speech recognition technology
- Core UI components and window management
- Speech recognition integration
- SQL-based RAG
- Computer Use Agent (CUA) MCP - work in progress; currently regex-based
- Cloud integration (OpenAI API, DeepSeek API, Azure AI)
- Enhanced AI agent capabilities (CUA)
- Basic eye tracking implementation with model context
- Improved RAG system with better embeddings and file references
- Multi-modal AI interactions
- Multi-platform support
- Advanced automation workflows + CUA
- Plugin system for extensibility
- Website - Visit tryenteract.com for more information
- Issues - Report bugs and request features on GitHub Issues
- Discussions - Join conversations on GitHub Discussions
- Documentation - Check the /resources directory for detailed guides