This repository contains a set of Python scripts that turn raw manga images and their associated JSON data into read-along videos. Inspired by children's voice books, the videos show character speech bubbles in sync with the narration, providing an interactive experience for weebs. (Yes, I used ChatGPT to generate this.)
- Image Processing: Uses Magiv2 to extract the transcript from the manga images.
- Bubble Chat Animation: Creates a dynamic speech bubble sequence corresponding to character dialogues.
- Video Creation: Converts the processed images into a video using img2mp4.py, enabling smooth playback of the read-along experience.
- Voice over: Uses a TTS model to read out the dialogue extracted from the images. The TTS engine converts the text into natural-sounding speech, which is then synchronized with the bubble animations.
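
As a rough illustration of what synchronizing bubbles with narration can look like, the sketch below assigns each dialogue line a start and end time from the length of its audio clip. It is not taken from this repository's scripts: `BubbleCue`, `schedule_bubbles`, and the `gap` parameter are hypothetical names, and the durations are assumed to come from whatever TTS engine is used.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class BubbleCue:
    """Hypothetical record: when one speech bubble is shown."""
    text: str
    start: float  # seconds from the start of the page
    end: float    # seconds from the start of the page


def schedule_bubbles(
    lines: Sequence[str],
    durations: Sequence[float],
    gap: float = 0.25,
) -> List[BubbleCue]:
    """Give each dialogue line a time window equal to its audio duration.

    `durations` holds the length in seconds of the narration clip for each
    line (assumed to be measured from the TTS output). `gap` is an assumed
    pause inserted between consecutive bubbles.
    """
    if len(lines) != len(durations):
        raise ValueError("need exactly one duration per dialogue line")

    cues: List[BubbleCue] = []
    cursor = 0.0
    for text, duration in zip(lines, durations):
        cues.append(BubbleCue(text=text, start=cursor, end=cursor + duration))
        cursor += duration + gap  # next bubble starts after the pause
    return cues
```

A bubble would then be drawn on every frame whose timestamp falls inside its cue, so the frame rate of the video only affects how finely the cue boundaries are rounded.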
- main.py: The main file that runs the whole process.
- utils/process_raw_from_json.py: Script for processing manga images and generating animated bubble chats based on JSON data.
- utils/img2mp4.py: Script for converting a series of images into a video file.
- requirements.txt: Lists the dependencies needed to run the scripts.
- Setup Environment:
  - Make sure you have Python installed.
  - Create a virtual environment (optional but recommended) and install the required packages:
    pip install -r requirements.txt
- Prepare Your Data:
  - Place your manga and character images in a designated folder (e.g., the input/raw folder).
  - Works best when the raw is in English and the pages are named in sequential order (e.g., 01.jpg, 02.jpg, 03.jpg).
  - Character image names should include the gender, which is used for the voice bank if the text-to-voice feature is used. Naming convention: <character_name>_<gender>_<number> (e.g., mom_female_1.jpg).
  - Ensure the corresponding JSON files (generated by Magiv2) are accessible.
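
The naming rules above can be checked before a run with a small helper. This is a minimal sketch, not code from this repository: `parse_character_filename` and `sort_pages` are hypothetical names, and the set of accepted gender values (`male`, `female`) is an assumption, since only `female` appears in the example.

```python
import re
from pathlib import Path
from typing import Iterable, List, NamedTuple, Optional

# <character_name>_<gender>_<number>; the gender values are an assumption.
CHARACTER_RE = re.compile(
    r"^(?P<name>.+)_(?P<gender>male|female)_(?P<number>\d+)$",
    re.IGNORECASE,
)


class CharacterImage(NamedTuple):
    name: str
    gender: str
    number: int


def parse_character_filename(filename: str) -> Optional[CharacterImage]:
    """Split a character image name into its parts, or return None."""
    match = CHARACTER_RE.match(Path(filename).stem)
    if match is None:
        return None
    return CharacterImage(
        name=match["name"],
        gender=match["gender"].lower(),
        number=int(match["number"]),
    )


def sort_pages(filenames: Iterable[str]) -> List[str]:
    """Order page images by the first number in their name.

    Sorting numerically keeps 2.jpg ahead of 10.jpg even when the
    names are not zero-padded; names without digits go last.
    """
    def key(name: str):
        digits = re.findall(r"\d+", Path(name).stem)
        return (int(digits[0]) if digits else float("inf"), name)

    return sorted(filenames, key=key)
```

For example, `parse_character_filename("mom_female_1.jpg")` yields the name `mom`, gender `female`, and number `1`, while a page image such as `01.jpg` returns `None` and can be treated as a manga page instead.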
- Process Images and Create Video with Voice:
  - Run the main.py script, providing the necessary paths for images and JSON files:
    python main.py
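
Before running the script, it can help to confirm that every page image has a JSON file. The sketch below is only an illustration under stated assumptions: it assumes each JSON file shares its page's file stem (e.g., 01.jpg with 01.json), which may not match the layout main.py actually expects, and `find_pages_missing_json` is a hypothetical helper, not part of this repository.

```python
from pathlib import Path
from typing import List, Union

# Image extensions treated as manga pages (an assumption).
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def find_pages_missing_json(
    image_dir: Union[str, Path],
    json_dir: Union[str, Path],
) -> List[str]:
    """Return names of page images with no same-stem .json in json_dir."""
    image_dir, json_dir = Path(image_dir), Path(json_dir)
    missing: List[str] = []
    for image in sorted(image_dir.iterdir()):
        if image.suffix.lower() not in IMAGE_SUFFIXES:
            continue  # skip non-image files
        if not (json_dir / f"{image.stem}.json").is_file():
            missing.append(image.name)
    return missing
```

An empty result means every page found in the image folder has a matching JSON file under this assumed pairing.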
- View the Demo:
  - Video demos showcasing the read-along feature can be found in the repository:
    - ruri_dragon.mov
    - no_color_panel.mp4
Feel free to fork the repository and submit pull requests if you have improvements or suggestions.
This project is licensed under the MIT License - see the LICENSE file for details.