
Generate a transcript for your favourite Manga: Detect manga characters, text blocks and panels. Order panels. Cluster characters. Match texts to their speakers. Perform OCR.


BinhPQ2/magi_functional

 
 


Manga Read-Along Video Creator

Overview

This repository contains a set of Python scripts that turn raw manga images and their associated JSON data into read-along videos. Inspired by children's voice books, the videos show each character's speech bubble in sync with the narration, providing an interactive experience for weebs. (Yes, I used ChatGPT to generate this.)

Features

  • Image Processing: Uses MagiV2 to extract the transcript from the manga pages.
  • Bubble Chat Animation: Creates a dynamic speech bubble sequence corresponding to character dialogues.
  • Video Creation: Converts the processed images into a video format using img2mp4.py, enabling smooth playback of the read-along experience.
  • Voice over: Uses a TTS model to read out the dialogue extracted from the images. The TTS engine converts the text into natural-sounding speech, which is then synchronized with the bubble animations.
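One way to picture the synchronization between voice-over and bubble animation is as a list of frame-based cues. The sketch below is illustrative only and is not taken from the repository: `BubbleCue`, `build_cues`, the 24 fps default and the 0.25 s pause are all assumptions. It assumes the duration of each TTS clip is already known and shows each bubble for as long as its audio plays.

```python
from dataclasses import dataclass


@dataclass
class BubbleCue:
    """When a speech bubble appears and how many frames it stays on screen."""
    text: str
    start_frame: int
    frame_count: int


def build_cues(lines, fps=24, pause_s=0.25):
    """Turn (text, audio_seconds) pairs into consecutive frame-based cues.

    Hypothetical helper: each bubble is shown for the length of its audio
    clip plus a short pause, and the next bubble starts where the previous
    one ended.
    """
    cues = []
    next_frame = 0
    for text, audio_seconds in lines:
        # Show every bubble for at least one frame, even with empty audio.
        frame_count = max(1, round((audio_seconds + pause_s) * fps))
        cues.append(BubbleCue(text, next_frame, frame_count))
        next_frame += frame_count
    return cues


if __name__ == "__main__":
    demo = [("Good morning!", 1.0), ("You're late again.", 1.75)]
    for cue in build_cues(demo):
        print(cue)
```

Working in whole frames rather than seconds keeps the picture track and the audio track aligned to the same grid, at the cost of rounding each clip to the nearest frame.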

Files Included

  • main.py: The entry point that runs the whole process.
  • utils/process_raw_from_json.py: Script for processing manga images and generating animated bubble chats based on JSON data.
  • utils/img2mp4.py: Script for converting a series of images into a video file.
  • requirements.txt: Lists the dependencies needed to run the scripts.
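As a rough illustration of the image-to-video step, the sketch below builds an ffmpeg command line for a folder of sequentially numbered frames. It is an assumption that ffmpeg is a suitable tool here; the repository's utils/img2mp4.py may work differently. The function name, the folder layout and the `%02d.jpg` pattern are hypothetical. The function only assembles the argument list and does not run anything.

```python
from pathlib import Path


def build_ffmpeg_command(frames_dir, output_path, seconds_per_image=2.0):
    """Return an ffmpeg argument list that turns 01.jpg, 02.jpg, ... into a video.

    Hypothetical helper: not the repository's img2mp4.py.
    """
    frames_dir = Path(frames_dir)
    return [
        "ffmpeg",
        "-y",                                            # overwrite the output file if it exists
        "-framerate", f"{1 / seconds_per_image:g}",      # input rate: one image every N seconds
        "-i", str(frames_dir / "%02d.jpg"),              # zero-padded, sequential file names
        "-r", "24",                                      # output rate: frames are duplicated to reach 24 fps
        "-c:v", "libx264",                               # H.264 video
        "-pix_fmt", "yuv420p",                           # pixel format most players accept
        str(output_path),
    ]


if __name__ == "__main__":
    print(" ".join(build_ffmpeg_command("output/frames", "output/demo.mp4")))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` on a machine where ffmpeg is installed.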

Usage

  1. Setup Environment:

    • Make sure you have Python installed.
    • Create a virtual environment (optional but recommended) and install the required packages:
    pip install -r requirements.txt
  2. Prepare Your Data:

    • Place your manga pages and character images in a designated folder (e.g., the input/raw folder).
    • Results are best when the raw pages are in English and the files are named in sequential order (e.g., 01.jpg, 02.jpg, 03.jpg).
    • If the text-to-speech feature is used, each character image name should include the character's gender so that a matching voice can be picked from the voice bank. Naming convention: <character_name>_<gender>_<number> (e.g., mom_female_1.jpg).
    • Ensure the corresponding JSON files (generated by MagiV2) are accessible.
  3. Process Images and Create the Video with Voice:

    • Run the main.py script, providing the necessary paths for the images and JSON files:
    python main.py
  4. View the Demo:

    • A video demo showcasing the read-along feature can be found in the repository.
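The file naming rules from the data preparation step can be checked before running the pipeline. The sketch below is a standalone illustration, not code from the repository: `parse_character_image` and `sort_pages` are hypothetical helpers, and the regular expression is one possible reading of the `<character_name>_<gender>_<number>` convention.

```python
import re
from pathlib import Path

# <character_name>_<gender>_<number>; the name itself may contain underscores.
CHARACTER_NAME = re.compile(r"^(?P<name>.+)_(?P<gender>[A-Za-z]+)_(?P<number>\d+)$")


def parse_character_image(filename):
    """Split a name such as 'mom_female_1.jpg' into (name, gender, number)."""
    match = CHARACTER_NAME.match(Path(filename).stem)
    if match is None:
        raise ValueError(f"unexpected character image name: {filename}")
    return match["name"], match["gender"].lower(), int(match["number"])


def sort_pages(filenames):
    """Order page files by the number in their stem, so 10.jpg follows 02.jpg.

    Raises ValueError if a stem is not a plain number.
    """
    return sorted(filenames, key=lambda f: int(Path(f).stem))


if __name__ == "__main__":
    print(parse_character_image("mom_female_1.jpg"))
    print(sort_pages(["10.jpg", "02.jpg", "01.jpg"]))
```

Sorting by the numeric value rather than by the raw string keeps the page order correct even when the file names are not zero-padded consistently.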

Image Demo

page_002_panel_000_bubble_000 page_002_panel_000_bubble_001 page_002_panel_000_bubble_002

Video Demo

Full-Page Demo

ruri_dragon.mov

Panel-View Demo

no_color_panel.mp4

Contribution

Feel free to fork the repository and submit pull requests if you have improvements or suggestions.

License

This project is licensed under the MIT License - see the LICENSE file for details.


Languages

  • Jupyter Notebook 75.6%
  • Python 24.4%