A production-ready voice assistant with facial recognition authentication, built on modern Python architecture and web technologies.
Jarvis is an intelligent voice assistant that combines speech recognition, natural language processing, and computer vision to provide a seamless user experience. The system features biometric authentication, hotword detection, and extensive integration with popular platforms.
| Voice Control | Face Recognition | Hotword Detection | Web Integration |
|---|---|---|---|
| Advanced speech-to-text | Secure biometric auth | Always-on wake word | Modern responsive UI |
| Voice & AI | Smart Integrations |
|---|---|
graph TD
A[Web Frontend] -->|Eel Bridge| B[Main Process]
B --> C[Speech Recognition]
B --> D[Face Authentication]
B --> E[Hotword Detection]
C --> F[Command Parser]
F --> G[Feature Handlers]
G --> H[SQLite Database]
G --> I[WhatsApp Integration]
G --> J[YouTube Control]
G --> K[AI Chatbot]
%% Consistent style for all nodes
style A fill:#ede7f6,stroke:#4a148c,stroke-width:1px,color:#212121
style B fill:#ede7f6,stroke:#4a148c,stroke-width:1px,color:#212121
style C fill:#ede7f6,stroke:#4a148c,stroke-width:1px,color:#212121
style D fill:#ede7f6,stroke:#4a148c,stroke-width:1px,color:#212121
style E fill:#ede7f6,stroke:#4a148c,stroke-width:1px,color:#212121
style F fill:#ede7f6,stroke:#4a148c,stroke-width:1px,color:#212121
style G fill:#ede7f6,stroke:#4a148c,stroke-width:1px,color:#212121
style H fill:#ede7f6,stroke:#4a148c,stroke-width:1px,color:#212121
style I fill:#ede7f6,stroke:#4a148c,stroke-width:1px,color:#212121
style J fill:#ede7f6,stroke:#4a148c,stroke-width:1px,color:#212121
style K fill:#ede7f6,stroke:#4a148c,stroke-width:1px,color:#212121
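As a rough illustration of the Command Parser to Feature Handlers path in the diagram, the sketch below routes a transcript to a handler through a lookup table. The handler names, the `parse` rules, and the fallback message are all hypothetical; the project's real parser and handlers live in `backend/command.py` and `backend/feature.py` and may look quite different.

```python
from typing import Callable, Optional


# Hypothetical handlers standing in for the "Feature Handlers" node.
def open_youtube(params: dict) -> str:
    return f"Playing {params['query']} on YouTube"


def send_whatsapp(params: dict) -> str:
    return f"Messaging {params['contact']} on WhatsApp"


# Lookup table from action name to handler (assumed shape, not the project's).
HANDLERS: dict[str, Callable[[dict], str]] = {
    "youtube": open_youtube,
    "whatsapp": send_whatsapp,
}


def parse(query: str) -> Optional[dict]:
    """Toy command parser: map a transcript to an action and its parameters."""
    text = query.lower().strip()
    if text.startswith("play "):
        return {"action": "youtube", "params": {"query": text[len("play "):]}}
    if text.startswith("message "):
        return {"action": "whatsapp", "params": {"contact": text[len("message "):]}}
    return None


def dispatch(query: str) -> str:
    """Route a transcript through the parser to the matching handler."""
    command = parse(query)
    if command is None:
        return "Sorry, I did not understand that."
    return HANDLERS[command["action"]](command["params"])


print(dispatch("play lofi beats"))
```

Keeping the parser and the handlers decoupled through a table means a new feature only needs a parse rule and one table entry.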
OS: Windows 10/11, Linux, macOS
Python: 3.10+
RAM: 4GB minimum
Storage: 500MB free space
Microphone: Required for voice input
Webcam: Required for face recognition
Internet: Active connection needed
Audio Output: Speakers/Headphones
git clone https://github.com/vannu07/jarvis.git
cd jarvis
Windows:
python -m venv venv
venv\Scripts\activate
Linux/Mac:
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
Create a .env file:
# API Keys
HUGGINGFACE_TOKEN=your_token_here
PORCUPINE_ACCESS_KEY=your_key_here
NEWSAPI_KEY=your_newsapi_key
# Voice Settings
TTS_RATE=150
TTS_VOICE=0
# Recognition Settings
FACE_CONFIDENCE_THRESHOLD=50
HOTWORD_SENSITIVITY=0.5
Train the face recognition model:
python backend/auth/trainer.py
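The voice and recognition settings in the `.env` file are plain strings, so they need converting and checking before use. The sketch below is one possible way to do that. It assumes the values have already been exported into the process environment (for example by a loader such as python-dotenv; the README does not say which mechanism the project uses). The `Settings` class and `load_settings` function are hypothetical names, and the [0, 1] range check assumes the sensitivity is handed to Porcupine.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    tts_rate: int
    tts_voice: int
    face_confidence_threshold: int
    hotword_sensitivity: float


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read the voice and recognition settings, defaulting to the README values."""
    settings = Settings(
        tts_rate=int(env.get("TTS_RATE", "150")),
        tts_voice=int(env.get("TTS_VOICE", "0")),
        face_confidence_threshold=int(env.get("FACE_CONFIDENCE_THRESHOLD", "50")),
        hotword_sensitivity=float(env.get("HOTWORD_SENSITIVITY", "0.5")),
    )
    # Assumption: the sensitivity goes to Porcupine, which expects a value in [0, 1].
    if not 0.0 <= settings.hotword_sensitivity <= 1.0:
        raise ValueError("HOTWORD_SENSITIVITY must be between 0 and 1")
    return settings


print(load_settings({"TTS_RATE": "180"}))
```

Failing fast on a bad value at startup is usually easier to debug than a misbehaving wake word later.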
| Shortcut | Action |
|---|---|
| Win + J (Windows) | Manual Activation |
| Cmd + J (macOS) | Manual Activation |
| Ctrl + Q | Quit Application |
| F11 | Fullscreen Toggle |
Say "Jarvis" or "Alexa" followed by your command
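To make the "wake word, then command" pattern concrete, here is a text-level sketch that strips a leading wake word from a transcript. It is only an illustration: in the application the wake word is presumably detected on the audio stream (the `.env` file references a Porcupine key), and `extract_command` is a hypothetical helper, not part of the project.

```python
import re
from typing import Optional

WAKE_WORDS = ("jarvis", "alexa")  # the two wake words named above

# Match a wake word at the start of the transcript, plus trailing punctuation.
_WAKE_RE = re.compile(
    r"^\s*(?:" + "|".join(WAKE_WORDS) + r")\b[\s,.!:]*",
    re.IGNORECASE,
)


def extract_command(transcript: str) -> Optional[str]:
    """Return the text after the wake word, or None if there is no command."""
    match = _WAKE_RE.match(transcript)
    if match is None:
        return None  # no wake word: ignore the utterance
    command = transcript[match.end():].strip()
    return command or None  # wake word alone carries no command


print(extract_command("Jarvis, open YouTube"))
```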
jarvis/
├── backend/
│ ├── auth/
│ │ ├── haarcascade_frontalface_default.xml
│ │ ├── recognize.py # Face recognition
│ │ ├── trainer.py # Model training
│ │ └── trainer/ # Trained models
│ ├── command.py # Command parser
│ ├── config.py # Configuration
│ ├── db.py # Database ops
│ ├── feature.py # Feature handlers
│ └── helper.py # Utilities
├── frontend/
│ ├── assets/
│ │ ├── audio/ # Sound files
│ │ ├── img/ # Images & icons
│ │ └── vendor/ # Third-party libs
│ ├── index.html # Main UI
│ ├── style.css # Styles
│ ├── script.js # Particle effects
│ ├── main.js # Core logic
│ └── controller.js # Event handlers
├── main.py # Entry point
├── run.py # Launcher
├── requirements.txt # Dependencies
└── jarvis.db # SQLite DB
1. Define Command Pattern
Edit backend/command.py:
def parse_command(query: str) -> dict:
    if "my custom action" in query.lower():
        return {
            "action": "custom_action",
            "params": {"param1": "value1"}
        }
2. Implement Handler
Edit backend/feature.py:
def handle_custom_action(params: dict) -> str:
    result = do_something(params)
    return f"Action completed: {result}"
3. Register Command
COMMAND_HANDLERS = {
    "custom_action": handle_custom_action,
    # ... other handlers
}
-- Contacts Table
CREATE TABLE contacts (
id INTEGER PRIMARY KEY AUTOINCREMENT,
name TEXT NOT NULL,
phone TEXT,
whatsapp TEXT,
email TEXT,
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
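The contacts table just defined can be exercised with Python's built-in `sqlite3` module. The sketch below uses an in-memory database and made-up sample data; `find_contact` is a hypothetical helper, and the project's own queries in `backend/db.py` may differ.

```python
import sqlite3

# Same contacts schema as above, kept inline so the example is self-contained.
SCHEMA = """
CREATE TABLE contacts (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    name TEXT NOT NULL,
    phone TEXT,
    whatsapp TEXT,
    email TEXT,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
"""


def find_contact(conn: sqlite3.Connection, name: str):
    """Return (name, phone, whatsapp) for the first contact whose name contains `name`."""
    return conn.execute(
        "SELECT name, phone, whatsapp FROM contacts "
        "WHERE name LIKE ? ORDER BY id LIMIT 1",
        (f"%{name}%",),  # parameterized to avoid SQL injection
    ).fetchone()


conn = sqlite3.connect(":memory:")  # throwaway database, not jarvis.db
conn.executescript(SCHEMA)
conn.execute(
    "INSERT INTO contacts (name, phone, whatsapp) VALUES (?, ?, ?)",
    ("Alice Example", "+10000000000", "+10000000000"),
)
print(find_contact(conn, "alice"))
```

SQLite's `LIKE` is case-insensitive for ASCII by default, which suits names coming from speech recognition.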
-- Applications Table
CREATE TABLE apps (
id INTEGER PRIMARY KEY AUTOINCREMENT,
name TEXT NOT NULL,
path TEXT NOT NULL,
keywords TEXT,
icon TEXT,
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
-- Web Commands Table
CREATE TABLE web_commands (
id INTEGER PRIMARY KEY AUTOINCREMENT,
command TEXT NOT NULL,
url TEXT NOT NULL,
description TEXT,
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
# Run all tests
pytest tests/ -v
# Run with coverage
pytest --cov=backend --cov-report=html tests/
# Run specific test file
pytest tests/test_command.py -v
# Linting
black backend/ frontend/ --check
flake8 backend/
pylint backend/
FROM python:3.10-slim
WORKDIR /app
# Install system dependencies
RUN apt-get update && apt-get install -y \
portaudio19-dev \
python3-pyaudio \
libopencv-dev \
&& rm -rf /var/lib/apt/lists/*
# Copy and install Python dependencies
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
# Copy application
COPY . .
EXPOSE 8000
CMD ["python", "run.py"]
Build & Run:
docker build -t jarvis-ai .
docker run -p 8000:8000 -v $(pwd)/jarvis.db:/app/jarvis.db jarvis-ai

| Metric | Value |
|---|---|
| Cold Start Time | ~3.5s |
| Response Latency | <200ms |
| Face Recognition Accuracy | 94.2% |
| Memory Footprint | ~150MB |
| CPU Usage (Idle) | 2-5% |
Benchmarked on Windows 11, Intel i5-10400, 16GB RAM
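The README does not describe how these figures were measured. As one possible way to check a latency number on your own hardware, the sketch below times repeated calls to a function and reports the median and worst case. `measure_latency_ms` and the stand-in workload are hypothetical; substitute a real handler call.

```python
import statistics
import time
from typing import Callable


def measure_latency_ms(func: Callable[[], object], runs: int = 20) -> dict:
    """Call `func` repeatedly and summarize wall-clock latency in milliseconds."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        func()
        samples.append((time.perf_counter() - start) * 1000.0)
    return {"median_ms": statistics.median(samples), "max_ms": max(samples)}


# Stand-in workload; replace with a call into the command pipeline.
print(measure_latency_ms(lambda: sum(range(10_000))))
```

The median is less sensitive than the mean to one-off stalls such as a cold cache on the first call.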
Windows:
pip install pipwin
pipwin install pyaudio
Linux:
sudo apt-get install portaudio19-dev python3-pyaudio
pip install pyaudio
macOS:
brew install portaudio
pip install pyaudio
- Ensure good lighting conditions
- Position face 2-3 feet from camera
- Retrain model: `python backend/auth/trainer.py`
- Check camera permissions in system settings
- Check microphone permissions
- Test microphone: `python -m speech_recognition`
- Verify internet connection
- Try different microphone device
pip install --upgrade --force-reinstall -r requirements.txt
# Windows
set JARVIS_DEBUG=1
python run.py
# Linux/Mac
export JARVIS_DEBUG=1
python run.py
- Fork the repository
- Create a feature branch (`git checkout -b feature/amazing-feature`)
- Commit your changes (`git commit -m 'feat: add amazing feature'`)
- Push to the branch (`git push origin feature/amazing-feature`)
- Open a Pull Request
type(scope): subject
[optional body]
[optional footer]
Types: feat, fix, docs, style, refactor, test, chore
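The header format can be checked mechanically, for instance in a commit hook. The sketch below is a hypothetical validator, not something shipped with the project. It treats the scope as optional, since this README also uses a scopeless message (`feat: add amazing feature`).

```python
import re

TYPES = ("feat", "fix", "docs", "style", "refactor", "test", "chore")

# type, optional (scope), then ": " and a non-empty subject.
HEADER_RE = re.compile(
    r"^(?P<type>" + "|".join(TYPES) + r")"
    r"(?:\((?P<scope>[^()\s]+)\))?"
    r": (?P<subject>\S.*)$"
)


def is_valid_header(line: str) -> bool:
    """Return True if the first line of a commit message follows the format."""
    return HEADER_RE.match(line) is not None


print(is_valid_header("feat(voice): add support for multiple languages"))
```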
Example:
git commit -m "feat(voice): add support for multiple languages"
git commit -m "fix(face): improve recognition accuracy in low light"
git commit -m "docs(readme): update installation instructions"
- Follow PEP 8 for Python code
- Use type hints where applicable
- Write docstrings for public functions
- Run `black` and `flake8` before committing
- Add unit tests for new features
Project Link: github.com/vannu07/jarvis
For issues, questions, or feature requests, please open an issue on GitHub
If you find this project helpful, please consider starring the repository
Made with Python
Copyright 2025