Jarvis AI Assistant

Your Personal Voice-Controlled AI Companion

A voice assistant with facial recognition authentication, built with a Python backend and a web frontend connected through an Eel bridge.

Features • Installation • Usage • Documentation • Contributing

Overview

Jarvis is a voice assistant that combines speech recognition, natural language processing, and computer vision. It provides face-based biometric authentication, hotword detection, and integrations with WhatsApp, YouTube, and desktop applications.

Key Features

Voice Control	Face Recognition	Hotword Detection	Web Integration
Advanced speech-to-text	Secure biometric auth	Always-on wake word	Modern responsive UI

Core Capabilities

Voice & AI

Real-time Speech Recognition using Google STT
Natural Language Processing with Hugging Face
Text-to-Speech with customizable voices
Audio Visualization in real-time
Wake Word Detection ("Jarvis", "Alexa")

Smart Integrations

WhatsApp Automation (messages, calls, video)
YouTube Control via voice commands
System Control (apps, windows, shortcuts)
Contact Management with voice lookup
Web Browsing through voice

Technology Stack

Backend Technologies

Frontend Technologies

AI & ML

Tools & Libraries

System Architecture

graph TD
    A[Web Frontend] -->|Eel Bridge| B[Main Process]
    B --> C[Speech Recognition]
    B --> D[Face Authentication]
    B --> E[Hotword Detection]
    C --> F[Command Parser]
    F --> G[Feature Handlers]
    G --> H[SQLite Database]
    G --> I[WhatsApp Integration]
    G --> J[YouTube Control]
    G --> K[AI Chatbot]

    %% Consistent style for all nodes
    classDef default fill:#ede7f6,stroke:#4a148c,stroke-width:1px,color:#212121

Prerequisites

System Requirements

OS: Windows 10/11, Linux, macOS
Python: 3.10+
RAM: 4GB minimum
Storage: 500MB free space

Hardware

Microphone: Required for voice input
Webcam: Required for face recognition
Internet: Active connection needed
Audio Output: Speakers/Headphones

Installation

Step 1: Clone Repository

git clone https://github.com/vannu07/jarvis.git
cd jarvis

Step 2: Setup Virtual Environment

Windows

python -m venv venv
venv\Scripts\activate

Linux/Mac

python3 -m venv venv
source venv/bin/activate

Step 3: Install Dependencies

pip install -r requirements.txt

Step 4: Configure Environment

Create a .env file:

# API Keys
HUGGINGFACE_TOKEN=your_token_here
PORCUPINE_ACCESS_KEY=your_key_here
NEWSAPI_KEY=your_newsapi_key

# Voice Settings
TTS_RATE=150
TTS_VOICE=0

# Recognition Settings
FACE_CONFIDENCE_THRESHOLD=50
HOTWORD_SENSITIVITY=0.5
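
A minimal sketch of how these settings could be read into typed values, using only the standard library. The names `parse_env_text`, `Settings`, and `load_settings` are hypothetical and not taken from backend/config.py; the defaults simply mirror the sample values above.

```python
from dataclasses import dataclass


def parse_env_text(text: str) -> dict[str, str]:
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values: dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


@dataclass(frozen=True)
class Settings:
    # Hypothetical container; defaults mirror the sample .env above.
    tts_rate: int = 150
    tts_voice: int = 0
    face_confidence_threshold: int = 50
    hotword_sensitivity: float = 0.5


def load_settings(env: dict[str, str]) -> Settings:
    """Convert string values into typed settings, falling back to defaults."""
    defaults = Settings()
    return Settings(
        tts_rate=int(env.get("TTS_RATE", defaults.tts_rate)),
        tts_voice=int(env.get("TTS_VOICE", defaults.tts_voice)),
        face_confidence_threshold=int(
            env.get("FACE_CONFIDENCE_THRESHOLD", defaults.face_confidence_threshold)
        ),
        hotword_sensitivity=float(
            env.get("HOTWORD_SENSITIVITY", defaults.hotword_sensitivity)
        ),
    )
```

Passing the contents of the .env file through `parse_env_text` and then `load_settings` yields a `Settings` object. Converting values once at startup means a malformed number fails immediately rather than in the middle of a voice command.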

Step 5: Train Face Recognition (Optional)

python backend/auth/trainer.py

Quick Start

python run.py

Jarvis will launch at http://localhost:8000

Usage

Voice Commands

System Control

Jarvis, open Chrome
Jarvis, launch VS Code
Jarvis, close window
Jarvis, shutdown computer

Media Control

Jarvis, play Metallica
Jarvis, pause video
Jarvis, next song
Jarvis, volume up

Communication

Jarvis, message John
Jarvis, call Sarah
Jarvis, video call Mike
Jarvis, open WhatsApp

Keyboard Shortcuts

Shortcut	Action
`Win + J` (Windows)	Manual Activation
`Cmd + J` (macOS)	Manual Activation
`Ctrl + Q`	Quit Application
`F11`	Fullscreen Toggle

Wake Words

Say "Jarvis" or "Alexa" followed by your command
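
To illustrate the "wake word, then command" pattern at the text level, here is a small sketch that splits a transcript into wake word and command. It is an assumption for illustration only: the `PORCUPINE_ACCESS_KEY` setting suggests the application detects the wake word on audio, and `strip_wake_word` is a hypothetical helper, not a function from the repository.

```python
import re
from typing import Optional

WAKE_WORDS = ("jarvis", "alexa")  # the two wake words listed above

# A leading wake word, then any separating punctuation or whitespace.
_WAKE_RE = re.compile(
    r"^\s*(?:" + "|".join(WAKE_WORDS) + r")\b[\s,.:!]*",
    flags=re.IGNORECASE,
)


def strip_wake_word(transcript: str) -> Optional[str]:
    """Return the command after the wake word, or None if none leads the text."""
    match = _WAKE_RE.match(transcript)
    if match is None:
        return None
    return transcript[match.end():].strip()
```

With this helper, `strip_wake_word("Jarvis, open Chrome")` yields `"open Chrome"`, while a transcript that does not start with a wake word yields `None` and can be ignored.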

Project Structure

jarvis/
├── backend/
│   ├── auth/
│   │   ├── haarcascade_frontalface_default.xml
│   │   ├── recognize.py        # Face recognition
│   │   ├── trainer.py          # Model training
│   │   └── trainer/            # Trained models
│   ├── command.py              # Command parser
│   ├── config.py               # Configuration
│   ├── db.py                   # Database ops
│   ├── feature.py              # Feature handlers
│   └── helper.py               # Utilities
├── frontend/
│   ├── assets/
│   │   ├── audio/              # Sound files
│   │   ├── img/                # Images & icons
│   │   └── vendor/             # Third-party libs
│   ├── index.html              # Main UI
│   ├── style.css               # Styles
│   ├── script.js               # Particle effects
│   ├── main.js                 # Core logic
│   └── controller.js           # Event handlers
├── main.py                     # Entry point
├── run.py                      # Launcher
├── requirements.txt            # Dependencies
└── jarvis.db                   # SQLite DB

Development

Adding Custom Commands

1. Define Command Pattern

Edit backend/command.py:

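def parse_command(query: str) -> dict | None:
    """Map a spoken query to an action dict, or None if nothing matches."""
    if "my custom action" in query.lower():
        return {
            "action": "custom_action",
            "params": {"param1": "value1"}
        }
    return None  # no pattern matched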

2. Implement Handler

Edit backend/feature.py:

def handle_custom_action(params: dict) -> str:
    # do_something is a placeholder for your own logic
    result = do_something(params)
    return f"Action completed: {result}"

3. Register Command

COMMAND_HANDLERS = {
    "custom_action": handle_custom_action,
    # ... other handlers
}
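
The three steps above can be tied together as in the following self-contained sketch. The `dispatch` function and the fallback messages are hypothetical glue written for this example; how the repository actually routes parsed commands to handlers is not shown here.

```python
from typing import Callable, Optional


def parse_command(query: str) -> Optional[dict]:
    """Toy parser: recognise one phrase, as in step 1."""
    if "my custom action" in query.lower():
        return {"action": "custom_action", "params": {"param1": "value1"}}
    return None


def handle_custom_action(params: dict) -> str:
    """Toy handler standing in for real work."""
    return f"Action completed: {params['param1']}"


# Registry mapping action names to handler functions, as in step 3.
COMMAND_HANDLERS: dict[str, Callable[[dict], str]] = {
    "custom_action": handle_custom_action,
}


def dispatch(query: str) -> str:
    """Parse the query and call the registered handler (hypothetical glue)."""
    command = parse_command(query)
    if command is None:
        return "Sorry, I did not understand that."
    handler = COMMAND_HANDLERS.get(command["action"])
    if handler is None:
        return f"No handler registered for {command['action']!r}."
    return handler(command.get("params", {}))
```

Keeping the registry as a plain dictionary means a new command needs only a parser branch and one extra entry, with no changes to `dispatch` itself.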

Database Schema

-- Contacts Table
CREATE TABLE contacts (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    name TEXT NOT NULL,
    phone TEXT,
    whatsapp TEXT,
    email TEXT,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

-- Applications Table
CREATE TABLE apps (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    name TEXT NOT NULL,
    path TEXT NOT NULL,
    keywords TEXT,
    icon TEXT,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

-- Web Commands Table
CREATE TABLE web_commands (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    command TEXT NOT NULL,
    url TEXT NOT NULL,
    description TEXT,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
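
As an illustration of how the contacts table could back voice lookup, the sketch below creates the table in an in-memory SQLite database and searches it by a spoken name. The `find_phone` helper and the sample row are made up for this example and are not taken from backend/db.py.

```python
import sqlite3
from typing import Optional

CONTACTS_DDL = """
CREATE TABLE IF NOT EXISTS contacts (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    name TEXT NOT NULL,
    phone TEXT,
    whatsapp TEXT,
    email TEXT,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
"""


def find_phone(conn: sqlite3.Connection, spoken_name: str) -> Optional[str]:
    """Return the phone of the first contact whose name contains spoken_name."""
    row = conn.execute(
        "SELECT phone FROM contacts WHERE name LIKE ? ORDER BY id LIMIT 1",
        (f"%{spoken_name}%",),  # parameter binding avoids SQL injection
    ).fetchone()
    return row[0] if row else None


# An in-memory database keeps the example from touching jarvis.db.
conn = sqlite3.connect(":memory:")
conn.execute(CONTACTS_DDL)
conn.execute(
    "INSERT INTO contacts (name, phone) VALUES (?, ?)",
    ("John Smith", "+10000000000"),  # made-up sample data
)
print(find_phone(conn, "john"))
```

SQLite's `LIKE` is case-insensitive for ASCII characters by default, so a lower-cased transcript such as "john" still matches "John Smith".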

Testing

# Run all tests
pytest tests/ -v

# Run with coverage
pytest --cov=backend --cov-report=html tests/

# Run specific test file
pytest tests/test_command.py -v

# Formatting and linting (black only checks Python files)
black backend/ --check
flake8 backend/
pylint backend/

Docker Deployment

FROM python:3.10-slim

WORKDIR /app

# Install system dependencies
RUN apt-get update && apt-get install -y \
    portaudio19-dev \
    python3-pyaudio \
    libopencv-dev \
    && rm -rf /var/lib/apt/lists/*

# Copy and install Python dependencies
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Copy application
COPY . .

EXPOSE 8000

CMD ["python", "run.py"]

Build & Run:

docker build -t jarvis-ai .
docker run -p 8000:8000 -v "$(pwd)/jarvis.db:/app/jarvis.db" jarvis-ai

Performance Metrics

Metric	Value
Cold Start Time	~3.5s
Response Latency	<200ms
Face Recognition Accuracy	94.2%
Memory Footprint	~150MB
CPU Usage (Idle)	2-5%

Benchmarked on Windows 11, Intel i5-10400, 16GB RAM

Troubleshooting

PyAudio Installation Fails

Windows:

pip install pipwin
pipwin install pyaudio

Linux:

sudo apt-get install portaudio19-dev python3-pyaudio
pip install pyaudio

macOS:

brew install portaudio
pip install pyaudio

Face Recognition Not Working

Ensure good lighting conditions
Position face 2-3 feet from camera
Retrain model:
```
python backend/auth/trainer.py
```
Check camera permissions in system settings

Voice Commands Unresponsive

Check microphone permissions
Test microphone:
```
python -m speech_recognition
```
Verify internet connection
Try different microphone device

Module Import Errors

pip install --upgrade --force-reinstall -r requirements.txt

Enable Debug Mode

# Windows
set JARVIS_DEBUG=1
python run.py

# Linux/Mac
export JARVIS_DEBUG=1
python run.py
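
How run.py consumes `JARVIS_DEBUG` is not shown here. As a sketch under that caveat, an environment flag like this is commonly mapped to a logging level as follows; the helper names and the accepted values other than `1` are assumptions.

```python
import logging
import os


def debug_enabled(env=os.environ) -> bool:
    """Treat JARVIS_DEBUG=1/true/yes/on (any letter case) as enabled."""
    return env.get("JARVIS_DEBUG", "").strip().lower() in {"1", "true", "yes", "on"}


def configure_logging(env=os.environ) -> int:
    """Set the root log level from JARVIS_DEBUG and return the chosen level."""
    level = logging.DEBUG if debug_enabled(env) else logging.INFO
    logging.basicConfig(
        level=level,
        format="%(asctime)s %(levelname)s %(name)s: %(message)s",
    )
    # basicConfig is a no-op if handlers already exist, so set the level explicitly.
    logging.getLogger().setLevel(level)
    return level
```

Calling `configure_logging()` once at startup, before other modules log anything, is enough for `logging.getLogger(__name__).debug(...)` calls elsewhere to appear when the flag is set.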

Contributing

We welcome contributions from the community.

Contribution Guidelines

Fork the repository
Create a feature branch (git checkout -b feature/amazing-feature)
Commit your changes (git commit -m 'feat: add amazing feature')
Push to the branch (git push origin feature/amazing-feature)
Open a Pull Request

Commit Convention

type(scope): subject

[optional body]

[optional footer]

Types: feat, fix, docs, style, refactor, test, chore

Example:

git commit -m "feat(voice): add support for multiple languages"
git commit -m "fix(face): improve recognition accuracy in low light"
git commit -m "docs(readme): update installation instructions"
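
The header format above can be checked mechanically. The following is a minimal sketch of such a check, for example for use in a commit-msg hook; the `is_valid_header` helper is hypothetical, and treating the scope as optional is an assumption based on the `feat: add amazing feature` example in the guidelines.

```python
import re

COMMIT_TYPES = ("feat", "fix", "docs", "style", "refactor", "test", "chore")

# type, optional (scope), colon, one space, non-empty subject
HEADER_RE = re.compile(
    r"^(?P<type>" + "|".join(COMMIT_TYPES) + r")"
    r"(?:\((?P<scope>[^()\s]+)\))?"
    r": (?P<subject>\S.*)$"
)


def is_valid_header(message: str) -> bool:
    """Check only the first line of a commit message against the convention."""
    lines = message.splitlines()
    return bool(lines) and HEADER_RE.match(lines[0]) is not None
```

Only the first line is validated, so the optional body and footer remain free-form.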

Code Style

Follow PEP 8 for Python code
Use type hints where applicable
Write docstrings for public functions
Run black and flake8 before committing
Add unit tests for new features

Top Contributors

Roadmap

Short Term

Multi-language support
Mobile companion app
Theme customization
Plugin system

Medium Term

Cloud synchronization
Home automation
Voice training
Analytics dashboard

Long Term

Advanced AI models
Cross-platform support
Multi-user profiles
End-to-end encryption

License

This project is licensed under the MIT License. See the LICENSE file for details.

Acknowledgments

Special thanks to these amazing projects:

Support

Project Link: https://github.com/vannu07/jarvis

For issues, questions, or feature requests, please open an issue on GitHub.

Show Your Support

If you find this project helpful, please consider starring the repository.

Made with Python
