

VYOM (Virtual Yet Omnipotent Machine) is a futuristic AI-powered personal assistant, inspired by J.A.R.V.I.S. from Iron Man. Designed to simplify your digital life, VYOM uses advanced language models and browser automation to handle complex tasks with simple voice or text commands.


Th-Shivam/VYOM

🤖 VYOM – Virtual Yet Omnipotent Machine


Python Version MIT License Contributions Welcome SWOC'26

🚀 A Futuristic AI-Powered Personal Assistant Inspired by J.A.R.V.I.S.


🏗️ Technical Architecture

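VYOM is built on a modular, multi-threaded architecture. Unlike assistants that listen, process, and respond in one sequential loop, VYOM decouples peripheral I/O (voice capture and listening) from core logic (NLP and action execution), so a slow model call or system task does not freeze the UI or stop the assistant from listening.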

System Flow & Data Lifecycle

The following diagram illustrates how a voice command propagates through the modular layers:

graph TD
    subgraph Input_Layer [Perception]
        A[🎤 Voice Input] -->|PyAudio / SpeechRecognition| B(Audio Stream)
        B -->|Whisper / Google API| C{Speech-to-Text}
    end

    subgraph Brain_Layer [Processing]
        C -->|Raw Text| D[🧠 NLP Engine]
        D -->|Intent Extraction| E{Action Router}
    end

    subgraph Execution_Layer [Action]
        E -->|System Cmd| F[OS Controller]
        E -->|Web Query| G[Browser Automation]
        E -->|API Call| H[Weather/IoT/News]
    end

    subgraph Output_Layer [Feedback]
        F & G & H --> I[🗣️ TTS Engine]
        I --> J[🔊 Speaker Output]
    end
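
The Processing and Action layers in the diagram can be pictured as a small dispatch table. The following is only an illustrative sketch, not VYOM's actual code: the handler functions, the keyword table, and the `extract_intent` and `route` names are all hypothetical, and simple keyword matching stands in for the real NLP engine.

```python
from typing import Callable

# Hypothetical handlers standing in for the Execution layer in the diagram.
def os_controller(text: str) -> str:
    return f"[system] would run: {text}"

def browser_automation(text: str) -> str:
    return f"[web] would search: {text}"

def api_call(text: str) -> str:
    return f"[api] would fetch: {text}"

# Assumed keyword table; a naive stand-in for intent extraction.
INTENT_KEYWORDS: dict[str, tuple[str, ...]] = {
    "system": ("open", "close", "shutdown", "volume"),
    "api": ("weather", "news", "light"),
}

# The "Action Router": maps an intent name to its handler.
ROUTES: dict[str, Callable[[str], str]] = {
    "system": os_controller,
    "web": browser_automation,
    "api": api_call,
}

def extract_intent(text: str) -> str:
    """Return the first intent whose keyword appears in the text, else 'web'."""
    words = text.lower().split()
    for intent, keywords in INTENT_KEYWORDS.items():
        if any(word in keywords for word in words):
            return intent
    return "web"

def route(text: str) -> str:
    """Dispatch recognized text to the handler for its intent."""
    return ROUTES[extract_intent(text)](text)

if __name__ == "__main__":
    print(route("open notepad"))
    print(route("weather in Delhi"))
    print(route("who wrote Dune"))
```

Keeping the router as a plain dictionary means a new capability only needs a handler function and one table entry, which fits the modular layout described here.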

🧠 Multi-Threading Logic

To maintain the "Always Listening" capability while executing heavy AI tasks, VYOM utilizes Python's threading and asyncio modules:

  • Thread 1 (Listener): Continuously monitors the microphone for the wake word.
  • Thread 2 (Processor): Handles API calls to Groq/Cohere without blocking the listener.
  • Thread 3 (Executor): Manages OS-level tasks and GUI updates.
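
The three roles above can be sketched with the standard `threading` and `queue` modules. This is a simplified illustration under stated assumptions, not the project's implementation: the function names, the `STOP` sentinel, and the fake "heard" commands are hypothetical, the microphone and the Groq/Cohere calls are replaced by stand-ins, and `asyncio` is left out for brevity.

```python
import queue
import threading

STOP = object()  # sentinel that tells the next worker to shut down

def listener(commands: list[str], to_processor: queue.Queue) -> None:
    """Stand-in for the microphone loop: forwards each 'heard' command."""
    for text in commands:
        to_processor.put(text)
    to_processor.put(STOP)

def processor(to_processor: queue.Queue, to_executor: queue.Queue) -> None:
    """Stand-in for the LLM call: turns text into an action description."""
    while (text := to_processor.get()) is not STOP:
        to_executor.put(f"action for: {text}")
    to_executor.put(STOP)

def executor(to_executor: queue.Queue, results: list[str]) -> None:
    """Stand-in for OS-level tasks and GUI updates: records each action."""
    while (action := to_executor.get()) is not STOP:
        results.append(action)

def run_pipeline(commands: list[str]) -> list[str]:
    """Run the three workers concurrently and return the executed actions."""
    to_processor: queue.Queue = queue.Queue()
    to_executor: queue.Queue = queue.Queue()
    results: list[str] = []
    threads = [
        threading.Thread(target=listener, args=(commands, to_processor)),
        threading.Thread(target=processor, args=(to_processor, to_executor)),
        threading.Thread(target=executor, args=(to_executor, results)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results

if __name__ == "__main__":
    print(run_pipeline(["open notepad", "what's the weather"]))
```

Because the threads only communicate through queues, a slow processing step delays the executor but never blocks the listener from accepting the next command.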

📂 Project Structure

For SWOC contributors, please refer to this modular map before submitting PRs:

VYOM/
│
├── Backend/                           # Core backend logic for the assistant
│   │
│   ├── Automation.py                  # Handles task automation (system tasks, workflows)
│   ├── ChatBot.py                     # Manages chatbot logic and conversational flow
│   ├── ImageGeneration.py             # Generates images using AI models/APIs
│   ├── Model.py                       # Loads and manages AI/ML models
│   ├── Productivity.py                # Productivity features (notes, reminders, utilities)
│   ├── RealTimeSearchEngine.py        # Performs real-time web/search queries
│   ├── SpeechToText.py                # Converts spoken audio input into text
│   └── TextToSpeech.py                # Converts text responses into spoken audio
│
├── Frontend/                          # User interface and client-side logic
│   │
│   ├── Files/                         # Runtime data and application state storage
│   │   │
│   │   ├── Database.data              # Stores persistent application data
│   │   ├── ImageGeneration.data       # Stores image generation history/results
│   │   ├── Mic.data                   # Stores microphone state and audio metadata
│   │   ├── Responses.data             # Stores chatbot responses
│   │   └── Status.data                # Tracks application and system status
│   │
│   ├── Graphics/                      # UI assets and visual resources
│   │   │
│   │   ├── Chats.png                  # Chat interface icon/image
│   │   ├── Close.png                  # Close window button icon
│   │   ├── GUI.py                     # GUI layout logic using graphical assets
│   │   ├── Home.png                   # Home screen icon/image
│   │   ├── Mic_off.png                # Microphone disabled icon
│   │   ├── Mic_on.png                 # Microphone enabled icon
│   │   ├── Minimize.png               # Minimize window icon
│   │   ├── maximize.png               # Maximize window icon
│   │   ├── minimize2.png              # Alternate minimize icon
│   │   ├── settings.png               # Settings icon
│   │   ├── VYOM.jpeg                  # Project logo / branding image
│   │   └── jarvis.gif                 # Animated assistant graphic
│   │
│   ├── automation/                    # Frontend automation tests
│   │   └── test_gui.py                # Automated tests for GUI behavior
│   │
│   ├── playwright_tests/              # Playwright-based UI testing
│   │   ├── homepage.png               # Screenshot of homepage during tests
│   │   ├── index.html                 # Static test page for UI validation
│   │   └── test_gui.py                # Playwright test cases for GUI
│   │
│   ├── tests/                         # Frontend test specifications
│   │   └── test_issue4.spec.js        # Test case for reported issue #4
│   │
│   ├── GUI.py                         # Main frontend GUI controller
│   └── test_gui.py                    # Manual/functional GUI test script
│
├── config/                            # Configuration and environment settings
│   │
│   ├── __init__.py                    # Marks config as a Python package
│   └── settings.py                    # Centralized configuration variables
│
├── utils/                             # Shared utility functions
│   │
│   ├── logger.py                      # Logging utilities for debugging and monitoring
│   └── memory.py                      # Memory management and context handling
│
├── .env.example                       # Sample environment variables file
├── .gitignore                         # Files and folders ignored by Git
├── CODE_OF_CONDUCT.md                 # Community guidelines and behavior rules
├── CONTRIBUTING.md                    # Contribution guidelines for developers
├── LICENSE                            # Project licensing information
├── README.md                          # Project overview and documentation
│
├── main.py                            # Application entry point
├── requirements.txt                   # Python dependencies list
│
├── test_logger.py                     # Unit tests for logger utility
└── test_memory.py                     # Unit tests for memory utility


🛠️ Installation & Setup

Prerequisites

  • Python 3.13+
  • FFmpeg (Required for audio processing)
  • C++ Build Tools (Required for PyAudio on Windows)
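
Before installing, the first two prerequisites can be verified with a short script. This is a sketch rather than part of the project: `check_prerequisites` is a hypothetical helper, it only checks the Python version and whether an `ffmpeg` executable is on PATH, and it does not attempt to detect C++ build tools.

```python
import shutil
import sys
from typing import Callable, Optional

MIN_PYTHON = (3, 13)  # minimum version from the Prerequisites list

def check_prerequisites(
    version: Optional[tuple[int, int]] = None,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> list[str]:
    """Return a list of problems found; an empty list means both checks passed."""
    if version is None:
        version = (sys.version_info.major, sys.version_info.minor)
    problems: list[str] = []
    if version < MIN_PYTHON:
        problems.append(
            f"Python {MIN_PYTHON[0]}.{MIN_PYTHON[1]}+ required, "
            f"found {version[0]}.{version[1]}"
        )
    # shutil.which returns None when the executable is not on PATH.
    if which("ffmpeg") is None:
        problems.append("FFmpeg not found on PATH")
    return problems

if __name__ == "__main__":
    issues = check_prerequisites()
    for issue in issues:
        print("Missing:", issue)
    if not issues:
        print("Python and FFmpeg prerequisites look fine.")
```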

🐧 Linux/Mac Setup (Audio Dependencies)

Most setup errors occur due to missing audio driver headers. Run the following before pip install:

  • For Ubuntu/Debian:
sudo apt-get update
sudo apt-get install python3-pyaudio portaudio19-dev libasound2-dev espeak
  • For macOS:
brew install portaudio
pip install pyaudio

📦 Standard Installation

1. Clone & Environment

git clone https://github.com/th-shivam/vyom.git && cd vyom
python -m venv .venv
source .venv/bin/activate  # Mac/Linux
# .venv\Scripts\activate   # Windows

2. Install & Run

pip install -r requirements.txt
python main.py

🤝 Contributing

We are proud to be an official part of Social Winter of Code (SWOC) 2026! 🚀

We welcome contributors of all skill levels. To ensure a smooth collaboration, please identify your path:

  • 🌱 Beginners: Look for issues labeled good-first-issue and documentation. Perfect for your first PR!
  • 🛠️ Advanced: Check for modular-enhancement and threading-optimization to work on the core engine.

🛣️ Quick Workflow

  1. Fork the repository and create your branch.
  2. Follow the PEP 8 style guide for Python code.
  3. Ensure your module is placed in the correct directory (see Project Structure).
  4. Open a PR with a clear description of your changes.

📋 Full Contributing Guide | 🏗️ Architecture Deep Dive


📄 License

This project is licensed under the MIT License. You are free to use, modify, and distribute this software, provided the original copyright and license notice are included.

TL;DR: Open-source, permissive, and community-friendly.

See the LICENSE file for the full legal text.


If you find VYOM helpful, don't forget to give it a ⭐!

VYOM v2.0 • Built with 🐍 Python • Focused on 🏗️ Modular Architecture

