
Accessibility software with minimal system permissions and on-device processing for users with limited mobility.


Vocalance Logo

💡 Overview

Vocalance offers hands-free control of your computer, enabling you to switch tabs, move the cursor on screen, dictate anywhere, and much more!

🚀 Website

To find out more about what Vocalance can do, including detailed instructions and guides, refer to the official website.

💻 Installation

Vocalance can be set up entirely from the source code in this repository. To do so, follow the instructions below (currently only supported on Windows):

1. Set Up UV

  1. Open Windows PowerShell and run the command below to install UV (a Python package manager):

    powershell -ExecutionPolicy ByPass -c "irm https://astral.sh/uv/install.ps1 | iex"
  2. Add UV to your PATH (this applies only to the current terminal session; repeat this step each time you open a new session, or add UV to your permanent PATH to skip it):

    $env:Path = "$HOME\.local\bin;$env:Path"

2. Set Up Vocalance

  1. Create a Python 3.13.9 virtual environment named vocalance_env with UV:

    uv venv --python 3.13.9 vocalance_env
  2. Activate the environment:

    vocalance_env\Scripts\activate
  3. Clone the repository:

    git clone https://github.com/rick12000/vocalance.git
  4. Go to the repository directory:

    cd vocalance
  5. Install Vocalance from uv.lock:

    uv sync --active
  6. Run the application:

    python vocalance.py

The application will start and, on first run, download any required models (such as speech recognition models) from Hugging Face or other reputable hosts. This may take several minutes depending on your internet connection.
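The first-run model download described above follows a standard cache-if-missing pattern: fetch once, reuse the local copy afterwards. A minimal illustrative sketch of that pattern (function and path names are hypothetical, not Vocalance's actual code) could look like this:

```python
import tempfile
from pathlib import Path


def fetch_model(name: str, cache_dir: Path, download) -> Path:
    """Return the cached model file, downloading it only on first run."""
    cache_dir.mkdir(parents=True, exist_ok=True)
    target = cache_dir / name
    if not target.exists():
        # First run: fetch from Hugging Face (or another host) and cache it.
        target.write_bytes(download(name))
    return target


# Example with a stubbed downloader and a throwaway cache directory:
cache = Path(tempfile.mkdtemp())
calls = []


def fake_download(name):
    calls.append(name)
    return b"model-bytes"


first = fetch_model("asr.bin", cache, fake_download)
second = fetch_model("asr.bin", cache, fake_download)  # cache hit: no re-download
```

On the second call the file already exists, so the downloader is never invoked again; this is why only the first launch of the application is slow.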

Then you're good to go! If you haven't already, refer to Vocalance's official website for instructions on how to get started.

3. Reopen Vocalance

If you want to reopen Vocalance after closing it, you can repeat the steps above, skipping the one-time installation steps.

Specifically, open a new Windows PowerShell window and enter the following chained commands (taken from the setup sections above):

$env:Path = "$HOME\.local\bin;$env:Path"; vocalance_env\Scripts\activate; cd vocalance; uv sync --active; python vocalance.py

This will start Vocalance.

An Aside on Pip

The recommended approach is to install Vocalance with uv, since the developers can freeze and document all recommended dependencies in a uv.lock file, which you then install with uv sync --active.

If you're more familiar with a virtual environment manager (e.g. venv, conda, or pyenv) plus pip, however, you can absolutely replace the uv steps above with your environment manager and replace uv sync --active with pip install . to install Vocalance as a package. Note that this is at your discretion, and the license disclosures in this repository pertain to the pinned package versions in uv.lock.

🔧 System Requirements

  • Operating System: Windows 10/11 (macOS and Linux support planned)
  • RAM: 2 GB
  • Disk: 5 GB of free space
  • Hardware: A reasonably good headset or microphone is strongly recommended to improve recognition accuracy, but Vocalance will still work without one.

🤝 Contributing

Reach out at [email protected] with the subject "Contribution" if:

  • You have software engineering experience and have feedback on how the architecture of the application could be improved.
  • You want to add an original or pre-approved feature.

For now, contributions will be handled on an ad-hoc basis; in the future, contribution guidelines will be set up depending on the number of contributors.

📚 Technical Documentation

If you want to find out more about Vocalance's architecture, refer to the technical documentation pages.

📈 Upcoming Features

The following features are planned additions to Vocalance, with some in early development and others under consideration:

  • Eye Tracking for Cursor Control: Control the cursor via eye movements.

    • Gaze Tracking Accuracy: Merge gaze tracking with historical screen click data and screen contents to improve accuracy, aiming for good performance even with webcam tracking.
    • Zoom Option: Add a zoom option to better direct gaze on screen contents.
  • Context-Aware Commands: Implement context bucketing for commands, allowing the same command phrase (e.g., "previous") to map to different hotkeys depending on the active application (e.g., VSCode vs. Chrome). This aims to avoid disambiguation phrases.

  • LLM-Powered Text Refactoring: Ability to select any text and reformat it via an LLM by speaking a prompt.

  • Improved Text Editing & Navigation: Further enhancements to text editing and text navigation tools.

  • Enhanced Predictive Features: Improve predictive capabilities based on window contents, recent context, gaze patterns, and more.

    • Privacy Note: Any feature requiring local storage of potentially sensitive data (e.g., screenshots, window contents) will be deployed as an opt-in feature and disabled by default.
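The context-aware command bucketing described above amounts to a per-application lookup table: the active window selects a bucket, and the bucket maps the spoken phrase to a hotkey. A minimal illustrative sketch (command phrases and hotkey bindings here are hypothetical examples, not Vocalance's actual bindings):

```python
# Hypothetical per-application command buckets: the same spoken phrase
# resolves to different hotkeys depending on the active application.
COMMAND_BUCKETS = {
    "vscode": {"previous": "ctrl+shift+tab"},
    "chrome": {"previous": "alt+left"},
}

# Fallback bindings when the active application has no dedicated bucket.
DEFAULT_BUCKET = {"previous": "ctrl+shift+tab"}


def resolve_hotkey(phrase: str, active_app: str) -> str:
    """Map a spoken phrase to a hotkey for the currently focused application."""
    bucket = COMMAND_BUCKETS.get(active_app, DEFAULT_BUCKET)
    return bucket.get(phrase, DEFAULT_BUCKET.get(phrase, ""))
```

With this table, saying "previous" in Chrome resolves to alt+left, while the same phrase in VSCode (or any unlisted application) resolves to ctrl+shift+tab, which is how one phrase can serve both applications without a disambiguation step.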
