Desktop app that indexes videos with AI (object detection, face recognition, emotion analysis), enables semantic search through natural language queries, and generates rough cuts

🎬 Edit Mind: AI-Powered Video Indexing & Semantic Search

License: MIT PRs Welcome Made with Electron ChromaDB

⚠️ Development Status: Edit Mind is currently in active development and not yet production-ready.
Expect incomplete features and occasional bugs. We welcome contributors to help us reach v1.0!

🧠 Your Video Library, Reimagined

Edit Mind is a cross-platform desktop app that acts as an editor’s second brain.


It locally indexes your entire video library, generating deep metadata with AI analysis, including:

  • 🎙 Full transcriptions
  • 👤 Recognized faces
  • 🎨 Dominant colors
  • 📦 Detected objects
  • 🔤 On-screen text (OCR)

This creates a fully searchable, offline-first video database, letting you find the exact shot you need in seconds.


📺 See It In Action

Edit Mind Demo

Click to watch a walkthrough of Edit Mind's core features


βš™οΈ How It Works

When you add a video, Edit Mind runs a complete AI-powered local analysis pipeline:

  1. 🎙 Full Transcription: extracts and transcribes the audio track with a local OpenAI Whisper model, producing time-stamped dialogue.
  2. 🎞 Scene Segmentation: splits the video into 2-second "Scenes" for precise frame-level indexing.
  3. 🧩 Deep Frame Analysis: each Scene is analyzed by Python plugins to:
    • Recognize faces
    • Detect objects
    • Perform OCR (on-screen text)
    • Analyze colors and composition
  4. 🧠 Data Consolidation: aligns spoken text with visual content using timestamps.
  5. 🔍 Vector Embedding & Storage: all extracted data (transcripts, tags, and metadata) is embedded using Google text embedding models and stored locally in ChromaDB.
  6. 🗣 Semantic Search Parsing: when you search in natural language (e.g. "show me all clips where Ilias looks happy"), Edit Mind uses Google Gemini 2.5 Pro to convert your prompt into a structured JSON query, which is then executed locally against the ChromaDB vector store to retrieve relevant scenes.
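To make step 6 concrete, here is a minimal sketch of what executing a Gemini-style structured query against locally indexed scene metadata could look like. The query schema, field names, and scene shape are all illustrative assumptions, not Edit Mind's actual format (the real query runs against ChromaDB):

```python
# Hypothetical sketch: a structured JSON query (as Gemini might return it)
# filtered against local scene metadata. Schema and field names are
# illustrative assumptions, not Edit Mind's real internals.

def match_scenes(structured_query: dict, scenes: list[dict]) -> list[dict]:
    """Return scenes whose metadata satisfies every filter in the query."""
    filters = structured_query.get("filters", {})
    return [
        scene for scene in scenes
        if all(value in scene.get(field, []) for field, value in filters.items())
    ]

# "show me all clips where Ilias looks happy" -> structured filters
query = {"filters": {"faces": "ilias", "emotions": "happy"}}
scenes = [
    {"start": 0.0, "end": 2.0, "faces": ["ilias"], "emotions": ["happy"]},
    {"start": 2.0, "end": 4.0, "faces": ["ilias"], "emotions": ["neutral"]},
    {"start": 4.0, "end": 6.0, "faces": [], "emotions": []},
]

print(match_scenes(query, scenes))  # only the first scene matches
```

The point of the intermediate JSON step is that the LLM never sees the video data; it only translates intent into filters that run locally.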

💡 Privacy by Design:
All video files, frames, and extracted metadata remain fully local.
The only cloud-based components are the Gemini API call for search prompt interpretation and Google text embedding generation; no raw video is ever uploaded.
A future update will add the option to use offline embedding and query models for completely disconnected operation.


✨ Features

| Category | Description |
| --- | --- |
| 🔒 Privacy-First | 100% local AI processing. Your videos never leave your device. |
| 🧠 Deep Indexing | Extracts transcription, faces, objects, text, and colors automatically. |
| 🔍 Semantic Search | Search your videos by meaning, not just filenames (e.g. "scenes with two people talking at a table"). |
| 🎬 AI-Generated Rough Cuts | Describe your desired sequence in natural language ("Give me all clips where @ilias looks happy") and Edit Mind finds matching scenes and assembles a rough cut. |
| 💻 Cross-Platform | Runs on macOS, Windows, and Linux (Electron). |
| 🧩 Plugin-Based Architecture | Easily extend analysis capabilities with Python plugins (e.g. logo detection, emotion analysis). |
| 🪄 Modern UI | Built with React, TypeScript, and shadcn/ui for a clean, responsive experience. |
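As a sketch of how rough-cut assembly might work once search has returned matching scenes: sort them by source position and merge back-to-back segments from the same file into single clips. The data shapes here are assumptions for illustration, not Edit Mind's real internals:

```python
# Illustrative rough-cut assembly: order matched scenes and merge
# adjacent segments from the same source file. Scene dict shape is an
# assumption, not the app's actual data model.

def assemble_rough_cut(matches: list[dict]) -> list[dict]:
    """Turn search matches into an ordered list of merged clips."""
    clips: list[dict] = []
    for scene in sorted(matches, key=lambda s: (s["file"], s["start"])):
        last = clips[-1] if clips else None
        if last and last["file"] == scene["file"] and last["end"] == scene["start"]:
            last["end"] = scene["end"]      # extend the previous clip
        else:
            clips.append(dict(scene))       # start a new clip
    return clips

matches = [
    {"file": "a.mp4", "start": 4.0, "end": 6.0},
    {"file": "a.mp4", "start": 2.0, "end": 4.0},
    {"file": "b.mp4", "start": 0.0, "end": 2.0},
]
print(assemble_rough_cut(matches))
# the two contiguous a.mp4 scenes merge into one 2.0-6.0 clip
```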

🧭 Roadmap

v0.2.0

  • Advanced search filters (date range, camera type)
  • Export rough cuts as Adobe Premiere Pro and Final Cut Pro projects
  • Improved indexing performance

v0.3.0

  • New analysis plugins (e.g., audio event detection)
  • Plugin documentation and examples

Future

  • Optional cloud sync for indexes
  • Collaborative tagging and shared libraries
  • Plugin marketplace

πŸ› οΈ Tech Stack

| Area | Technology |
| --- | --- |
| App Framework | Electron |
| Frontend | React, TypeScript, Vite |
| UI / Styling | shadcn/ui, Tailwind CSS |
| Backend (Main) | Node.js |
| AI / ML | Python, OpenCV, PyTorch, Whisper |
| Vector Database | ChromaDB |
| Packaging | Electron Builder |
| Linting / Formatting | ESLint, Prettier |

πŸš€ Getting Started

Prerequisites

  • Node.js v22+
  • Python v3.9+
  • Recommended Hardware: Multi-core CPU, modern GPU, and at least 8GB RAM.

Installation

# Clone the repo
git clone https://github.com/iliashad/edit-mind
cd edit-mind

Install Node.js dependencies

npm install

Set up the Python environment

cd python
python3 -m venv .venv       # any Python 3.9+ interpreter works
source .venv/bin/activate   # (macOS/Linux)
# .\.venv\Scripts\activate  # (Windows)
pip install -r requirements.txt
pip install chromadb
chroma run --host localhost --port 8000 --path .chroma_db   # starts the local ChromaDB server; keep it running

Configuration

Create a .env file in the project root:

GEMINI_API_KEY=your_api_key_here
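A small sketch of how startup code might validate this setting and fail fast with a clear message; Edit Mind's actual config loading may differ, and the function name here is illustrative:

```python
# Hypothetical config check: read GEMINI_API_KEY from the environment
# and raise a clear error if it is missing. Illustrative only; the app's
# real config loading may differ.
import os

def get_gemini_key() -> str:
    key = os.environ.get("GEMINI_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "GEMINI_API_KEY is not set. Add it to the .env file in the project root."
        )
    return key
```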

Running the Application

With the setup complete, you can start the application.

npm run start

πŸ—οΈ Building for Production

To create a distributable package, run the build script for your platform; for example, on macOS:

npm run build:mac

This will generate an installer or executable in the out/ directory, configured according to electron-builder.yml.

πŸ“‚ Project Structure

The project is organized to maintain a clear separation of concerns:

  • app/: Contains all the React frontend code (pages, components, hooks, styles). This is the renderer process.
  • lib/: Contains the core Electron application logic.
    • main/: The Electron main process entry point and core backend services.
    • preload/: The preload script for securely bridging the main and renderer processes.
    • conveyor/: A custom-built, type-safe IPC (Inter-Process Communication) system.
    • services/: Node.js services that orchestrate tasks like calling Python scripts.
  • python/: Home to all Python scripts for AI/ML analysis, transcription, and more.
  • resources/: Static assets that are not part of the web build, like the application icon.

πŸ“Š Performance Benchmarks

To help you understand Edit Mind's resource requirements, here are real-world performance metrics from analyzing large video files.

Test Environment

  • Hardware: MacBook Pro (M1 Max) with 64 GB RAM
  • Enabled Plugins:
    • ObjectDetectionPlugin
    • FaceRecognitionPlugin
    • ShotTypePlugin
    • EnvironmentPlugin
    • DominantColorPlugin

Note: The metrics below reflect frame analysis time and peak memory usage. Transcription and embedding score processing stages are not included in these measurements.

(Lower is better; 1.0× means processing takes the same time as the video duration)

| File Size (MB) | Video Codec | Frame Analysis Time (s) | Video Duration (s) | Processing Rate | Peak Memory (MB) |
| --- | --- | --- | --- | --- | --- |
| 20150.38 | h264 | 7707.29 | 3372.75 | 2.29× | 4995.45 |
| 11012.64 | hevc | 3719.77 | 1537.54 | 2.42× | 10356.77 |
| 11012.24 | hevc | 3326.29 | 1537.54 | 2.16× | 11363.27 |
| 11001.07 | hevc | 1576.47 | 768.77 | 2.05× | 10711.09 |
| 11000.95 | hevc | 1592.94 | 768.77 | 2.07× | 11250.42 |
| 11000.55 | hevc | 1598.97 | 768.77 | 2.08× | 10797.03 |
| 11000.15 | hevc | 2712.68 | 768.77 | 3.53× | 5127.25 |
| 10999.96 | hevc | 1592.72 | 768.77 | 2.07× | 11328.47 |
| 10755.45 | hevc | 3762.24 | 751.65 | 5.01× | 5196.98 |

Key Takeaways

  • Processing Speed: Approximately 2-3 hours of analysis time per hour of video content with all plugins enabled
  • Memory Usage: Peak memory consumption ranges from 5-11 GB depending on video complexity and codec
  • Codec Impact: HEVC videos show varied performance, likely due to differences in encoding parameters and scene complexity
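The "Processing Rate" column above is simply frame-analysis time divided by video duration, so 1.0× means realtime. A quick check against the benchmark rows:

```python
# Reproducing the Processing Rate column: analysis time / video duration,
# rounded to two decimals. Inputs are taken from the benchmark table above.
def processing_rate(analysis_s: float, duration_s: float) -> float:
    return round(analysis_s / duration_s, 2)

print(processing_rate(7707.29, 3372.75))  # first row: 2.29
print(processing_rate(3762.24, 751.65))   # last row: 5.01
```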

💡 Performance Tips:

  • Disable unused plugins to reduce processing time and memory usage
  • Consider processing large files during off-hours
  • Ensure sufficient RAM (16GB+ recommended for optimal performance)
  • SSD storage significantly improves I/O performance during analysis

πŸ§‘β€πŸ’» How to Contribute

We welcome contributions of all kinds! Here are a few ways you can help:

  • Reporting Bugs: If you find a bug, please open an issue.
  • Improving the UI: Have ideas to make the interface better? We'd love to hear them.
  • Creating a Plugin: The analysis pipeline is built on plugins. If you have an idea for a new analyzer (e.g., logo detection, audio event classification), this is a great place to start. Check out the existing plugins in the python/plugins/ directory to see how they work.
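To give a feel for where a new analyzer would slot in, here is a hypothetical sketch of an analysis plugin. The real interface lives in the python/plugins/ directory and may differ; the class names, method signature, and frame representation below are all illustrative assumptions:

```python
# Hypothetical plugin shape: each plugin analyzes one frame and returns
# metadata tags. The real base class in python/plugins/ may differ; names
# and the 2-D-list frame format here are illustrative only.
from abc import ABC, abstractmethod

class AnalysisPlugin(ABC):
    """One analyzer in the per-scene pipeline (e.g. object or logo detection)."""

    name: str = "base"

    @abstractmethod
    def analyze(self, frame) -> dict:
        """Return metadata tags extracted from a single video frame."""

class BrightnessPlugin(AnalysisPlugin):
    """Toy analyzer: tag a frame as 'bright' or 'dark' from its mean pixel value."""

    name = "brightness"

    def analyze(self, frame) -> dict:
        pixels = [p for row in frame for p in row]   # frame: 2-D list of 0-255 values
        mean = sum(pixels) / len(pixels)
        return {"brightness": "bright" if mean >= 128 else "dark"}

plugin = BrightnessPlugin()
print(plugin.analyze([[200, 220], [210, 230]]))  # {'brightness': 'bright'}
```

Keeping each analyzer behind a single `analyze(frame) -> dict` style contract is what lets the pipeline merge results from many plugins into one scene record.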

🀝 Contributing

As an open-source project in its early stages, we are actively looking for contributors. Whether it's fixing bugs, adding new analysis plugins, or improving the UI, your help is invaluable.

Please read CONTRIBUTING.md for details on our code of conduct and the process for submitting pull requests.

πŸ™ Acknowledgements

This project was bootstrapped from the excellent guasam/electron-react-app template. It provided a solid foundation with a modern Electron, React, and Vite setup, which allowed us to focus on building the core features of Edit Mind.

⚠️ Known Challenges & Areas for Contribution

While the core architecture is robust, the project is still in early development. Contributions are welcome in solving these key challenges to make the app production-ready.

  1. Application Packaging & Distribution: The current setup is developer-focused. A major goal is to create a seamless, one-click installer for non-technical users. This involves bundling the Python environment, ML models, and all dependencies into the final Electron application for macOS, Windows, and Linux. Contributions in this area (e.g., using PyInstaller, managing model downloads) are highly welcome.

  2. Performance on Consumer Hardware: The analysis pipeline is resource-intensive. While the code includes memory monitoring and optimizations, further work is needed to ensure smooth operation on a variety of consumer-grade machines. Key areas for improvement include:

    • Implementing a robust background queuing system for video processing.
    • Adding user-configurable "analysis levels" (e.g., "transcription only" vs. "full analysis").
    • Further optimization of the frame processing and ML inference steps.
  3. Data Schema Evolution: As new plugins and features are added, the metadata schema for scenes will evolve. A long-term challenge is to implement a strategy for handling data migrations, allowing users to "upgrade" their existing indexed data to a new schema without having to re-index their entire library from scratch.


📄 License

This project is licensed under the MIT License - see the LICENSE.md file for details.
