ExtractVideoSlides

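Automatically extracts slides from lecture videos: it detects slide changes frame by frame, prefers subtitle-free frames for export, and keeps the original resolution.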

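Features

✅ Streaming processing: reads the video frame by frame without exporting all frames first
✅ Automatic slide-change detection: compares frames with SSIM or MSE to decide when the slide changes
✅ Subtitle-free preference: within one slide, frames detected as subtitle-free are saved first
✅ User hint for subtitles: has_subtitle = "auto" / true / false
✅ Self-calibration from a sample: supply a screenshot of the subtitle style and the color range is extracted automatically to match different styles
✅ Keeps the original resolution; supports PNG/JPEG; optional super-resolution upscaling (Real-ESRGAN)
✅ Centralized configuration: all parameters are adjusted in config.toml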

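Installation

pip install -r requirements.txt

Dependencies: Python 3.8+, opencv-python, numpy, scikit-image, torch (GPU optional), tqdm, toml. Subtitle document generation depends on pytesseract (optional); super-resolution depends on realesrgan, basicsr, and related packages (optional).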

Usage

# Basic usage
python main.py input_video.mp4

# Use a specific config file
python main.py input_video.mp4 -c custom_config.toml

# Use a specific output directory
python main.py input_video.mp4 -o ./my_slides

# Debug logging
python main.py input_video.mp4 --log-level DEBUG
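
The invocations above imply a command line with one positional argument and three options. A minimal argparse sketch of such a parser is shown here. The long option names `--config` and `--output`, the default values, and the `build_parser` helper are assumptions for illustration and are not taken from main.py.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the flags shown in the usage examples."""
    parser = argparse.ArgumentParser(description="Extract slides from a lecture video")
    parser.add_argument("video", help="path to the input video")
    # Long names and defaults are assumptions; only -c, -o and --log-level appear above.
    parser.add_argument("-c", "--config", default="config.toml", help="path to the TOML config")
    parser.add_argument("-o", "--output", default=None, help="output directory override")
    parser.add_argument(
        "--log-level",
        default="INFO",
        choices=["DEBUG", "INFO", "WARNING", "ERROR"],
        help="logging verbosity",
    )
    return parser


# Parse an explicit argument list so the sketch runs without a real command line.
args = build_parser().parse_args(
    ["input_video.mp4", "-o", "./my_slides", "--log-level", "DEBUG"]
)
print(args.video, args.config, args.output, args.log_level)
```

Passing `-o` here would presumably override `output_dir` from config.toml, though the actual precedence is decided by main.py.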

Configuration highlights (config.toml)

Video and output

[video]
input_path = "input.mp4"

[output]
output_dir = "output"
file_format = "{video_stem}-{index:04d}.png"
png_compression = 0
jpeg_quality = 100
enable_upscaling = false
upscale_factor = 2
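
The `file_format` value reads like a Python format string. Assuming it is expanded with `str.format`, and that `video_stem` is the input file name without its extension (neither is confirmed here), output names could be built roughly as follows. `slide_filename` is a hypothetical helper.

```python
from pathlib import Path


def slide_filename(template: str, video_path: str, index: int) -> str:
    """Expand the output template for one slide (hypothetical helper)."""
    # Path.stem drops both the directory part and the extension.
    return template.format(video_stem=Path(video_path).stem, index=index)


name = slide_filename("{video_stem}-{index:04d}.png", "lectures/input_video.mp4", 3)
print(name)  # input_video-0003.png
```

The `:04d` specifier zero-pads the index to four digits, so files sort in slide order in a file browser.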

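Subtitle detection and strategy

[subtitle]
# User hint: "auto" | true | false
has_subtitle = "auto"
location = "bottom"          # or "top"
comparison_ratio = 0.15       # fraction of the frame cropped away as the subtitle band when comparing slides
detection_ratio = 0.2         # fraction of the frame height at the top/bottom searched for subtitles
score_threshold = 0.0         # scores below this value count as no subtitle
clean_threshold = 0.02        # upper bound for a "subtitle-free" frame
require_clean_slide = true    # whether to skip a slide when no subtitle-free frame is found
max_contour_height_ratio = 0.25
focus_ratio = 0.5             # only judge subtitles in the lower/upper half of the detection band
color_filter = true           # color filtering is on by default
sample_image = ""             # optional path to a sample subtitle screenshot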

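has_subtitle: set it to false if the video definitely has no subtitles, which skips subtitle detection; set it to true if it definitely has them; the default "auto" detects them automatically.
sample_image: a screenshot of the subtitle style. At startup the program extracts an HSV range from it to match different subtitle colors.
clean_threshold: a frame counts as "clean" when its subtitle coverage ratio is below this value, and clean frames are preferred when saving.
require_clean_slide: when enabled, a slide for which no clean frame was found is skipped (not exported).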
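
The interaction between `clean_threshold` and `require_clean_slide` can be sketched as a small selection function. This is an illustration under assumptions rather than the project's code: `pick_frame` is a hypothetical name, each slide is assumed to carry a list of (frame index, subtitle coverage) pairs, and the cleanest frame is assumed to win when several are clean.

```python
from typing import List, Optional, Tuple


def pick_frame(
    candidates: List[Tuple[int, float]],
    clean_threshold: float = 0.02,
    require_clean_slide: bool = True,
) -> Optional[int]:
    """Return the frame index to export for one slide, or None to skip it.

    candidates holds (frame_index, subtitle_coverage) pairs for a single slide.
    """
    if not candidates:
        return None
    # The frame with the least subtitle coverage is the best candidate either way.
    best_index, best_score = min(candidates, key=lambda item: item[1])
    if best_score < clean_threshold:
        return best_index  # a clean frame exists
    # No clean frame: skip the slide, or fall back to the least-covered frame.
    return None if require_clean_slide else best_index


print(pick_frame([(120, 0.08), (121, 0.01), (122, 0.05)]))           # 121
print(pick_frame([(10, 0.08), (11, 0.05)]))                          # None
print(pick_frame([(10, 0.08), (11, 0.05)], require_clean_slide=False))  # 11
```

The second call shows why a slide can be missing from the output: every frame exceeds the threshold and the fallback is disabled.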

Processing parameters

[processing]
sample_mode = "every_frame"   # or "by_time"
frame_interval_sec = 1.0       # in by_time mode, sample one frame every this many seconds
comparison_method = "ssim"    # or "mse"
similarity_threshold = 0.95
resize_width = 320
resize_height = 180
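
In by_time mode, `frame_interval_sec` has to be turned into a frame stride using the video's frame rate. The sketch below shows one way to do that, assuming the stride is rounded to the nearest whole frame and never drops below 1. `frame_stride` and `sampled_indices` are hypothetical helpers.

```python
def frame_stride(sample_mode: str, fps: float, frame_interval_sec: float = 1.0) -> int:
    """Number of frames to advance between two inspected frames."""
    if sample_mode == "every_frame":
        return 1
    if sample_mode == "by_time":
        # Never go below 1, otherwise the reader would stall on one frame.
        return max(1, round(fps * frame_interval_sec))
    raise ValueError(f"unknown sample_mode: {sample_mode!r}")


def sampled_indices(total_frames: int, stride: int) -> range:
    """Indices of the frames that are actually inspected."""
    return range(0, total_frames, stride)


stride = frame_stride("by_time", fps=29.97, frame_interval_sec=1.0)
print(stride, list(sampled_indices(100, stride)))  # 30 [0, 30, 60, 90]
```

With a 29.97 fps source and a one-second interval, only one frame in 30 is compared, which is where the speedup over every_frame comes from.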

Subtitle document (optional)

[subtitle_doc]
enable = true                  # requires pytesseract
# subtitle_output_dir / subtitle_file_name can be set as needed

Advanced

[advanced]
enable_black_frame_detection = false
min_slide_interval_sec = 2.0   # reserved

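How it works

Read the video frame by frame and pick frames according to the sampling strategy;
Run color and morphology detection inside the subtitle band, estimate the subtitle coverage ratio, and identify "clean" frames;
Compare frames with SSIM or MSE to decide whether the slide has changed;
For each slide, save a frame detected as subtitle-free first; if fallback is allowed, save the frame with the fewest subtitles instead;
Optionally, generate subtitle text (OCR) or apply super-resolution upscaling to the exported frames.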
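
The slide-change step above can be sketched on synthetic frames. This version uses MSE with numpy only; the project defaults to SSIM and reads real frames with OpenCV. The function names, the MSE threshold of 100, and the choice to compare each frame with the previous one are assumptions made for illustration.

```python
import numpy as np


def crop_subtitle_band(frame: np.ndarray, comparison_ratio: float = 0.15) -> np.ndarray:
    """Drop the bottom band so changing subtitles do not look like a slide change."""
    keep_rows = int(frame.shape[0] * (1.0 - comparison_ratio))
    return frame[:keep_rows]


def mse(a: np.ndarray, b: np.ndarray) -> float:
    """Mean squared error between two equally sized grayscale frames."""
    diff = a.astype(np.float64) - b.astype(np.float64)
    return float(np.mean(diff ** 2))


def split_into_slides(frames, mse_threshold: float = 100.0):
    """Group consecutive frame indices; a new group starts when the MSE jumps."""
    slides, previous = [], None
    for index, frame in enumerate(frames):
        current = crop_subtitle_band(frame)
        if previous is None or mse(previous, current) > mse_threshold:
            slides.append([])  # slide change detected
        slides[-1].append(index)
        previous = current
    return slides


# Synthetic 180x320 grayscale frames: two "slides", with a subtitle bar on frame 1.
slide_a = np.full((180, 320), 40, dtype=np.uint8)
slide_b = np.full((180, 320), 200, dtype=np.uint8)
with_subtitle = slide_a.copy()
with_subtitle[160:175, 60:260] = 255  # bright text inside the cropped band
groups = split_into_slides([slide_a, with_subtitle, slide_a, slide_b, slide_b])
print(groups)  # [[0, 1, 2], [3, 4]]
```

Because the bottom 15% is cropped before comparison, the subtitle on frame 1 does not start a new slide; only the change from `slide_a` to `slide_b` does.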

Tuning tips

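Slide-change sensitivity: assuming frames count as the same slide while their similarity stays at or above similarity_threshold, raising the threshold makes detection more sensitive (more slides), and lowering it yields fewer slides.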
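Sampling speed: a larger frame_interval_sec is faster but may miss slides; every_frame is the most reliable but the slowest.
Subtitle detection:
- If body text is mistaken for subtitles, decrease max_contour_height_ratio or focus_ratio, or provide a sample_image.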
- If subtitles are missed, so that subtitled frames pass as clean, lower clean_threshold (below the default 0.02) or increase detection_ratio.
Videos without subtitles: set has_subtitle = false to avoid unnecessary detection overhead.

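Project structure

ExtractVideoSlides/
├── main.py
├── config.toml
├── requirements.txt
├── src/
│   ├── slide_comparator.py    # SSIM/MSE slide-change decision
│   ├── subtitle_detector.py   # subtitle presence and coverage estimation
│   ├── subtitle_calibrator.py # extracts the subtitle color range from a sample image
│   ├── slide_extractor.py     # main pipeline: sampling, detection, saving
│   ├── subtitle_doc.py        # optional: subtitle OCR document
│   └── upscaler.py            # optional: Real-ESRGAN super-resolution
└── output/                   # output directory (created automatically)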

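FAQ

Slides not exported? No "clean" frame may have been found for the whole slide; check require_clean_slide or raise clean_threshold.
Subtitles still present? Provide a sample_image or lower clean_threshold; if necessary, turn off require_clean_slide to fall back to saving the frame with the fewest subtitles.
Too slow? Increase frame_interval_sec (with sample_mode = "by_time") or disable subtitle_doc, and turn off upscaling if needed.
Video without subtitles? Set has_subtitle = false to speed things up and avoid false positives.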

License

MIT License

Contributing

Issues and pull requests are welcome!
