Conversation

@timerring
Owner

Description

Automatically slice videos based on danmaku density, and summarize the content with the large multimodal model GLM-4V-PLUS.
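As a rough illustration of what "slicing by danmaku density" can mean, here is a minimal sketch that finds the fixed-length time window containing the most danmaku. It is not the code in this PR: the function name `densest_window`, the `window` parameter, and the assumption that danmaku are available as a list of timestamps in seconds are all hypothetical.

```python
def densest_window(timestamps, window=60.0):
    """Return (start, count) for the densest window of `window` seconds.

    `timestamps` is an iterable of danmaku send times in seconds
    (a hypothetical input format, assumed for illustration only).
    """
    ts = sorted(timestamps)
    best_start, best_count = 0.0, 0
    left = 0
    for right, t in enumerate(ts):
        # Advance the left edge until the span [ts[left], t] is shorter than `window`.
        while t - ts[left] >= window:
            left += 1
        count = right - left + 1
        if count > best_count:
            best_start, best_count = ts[left], count
    return best_start, best_count


if __name__ == "__main__":
    start, count = densest_window([1, 2, 50, 100, 101, 102, 103, 300], window=10)
    print(f"densest slice starts at {start}s with {count} danmaku")
```

The returned start time could then be passed to whatever tool cuts the clip; a two-pointer scan over sorted timestamps keeps the search linear after sorting.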

Related Issues

fix #150

Changes Proposed

  • Add a new package, autoslice.
  • Add a branch for the auto-slice process to the original logic.

Who Can Review?

@timerring

Checklist

  • Code has been reviewed
  • Code complies with the project's code standards and best practices
  • Code has passed all tests
  • Code does not affect the normal use of existing features
  • Code has been commented properly
  • Documentation has been updated (if applicable)
  • Demo/checkpoint has been attached (if applicable)

@timerring timerring self-assigned this Dec 30, 2024
Owner Author

@timerring timerring left a comment

LGTM, waiting for tests.

Owner Author

@timerring timerring left a comment

Ready to merge.

@timerring timerring merged commit 4afda49 into main Dec 30, 2024
@timerring timerring deleted the dev branch December 30, 2024 05:22
@moushicheng

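How well does the AI work? I read through some of the source code, and the AI is only used to generate titles... I tried zhipu and the results were honestly underwhelming. It feels like the AI cannot really make out the specific content of the video.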

@timerring
Owner Author

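How well does the AI work? I read through some of the source code, and the AI is only used to generate titles... I tried zhipu and the results were honestly underwhelming. It feels like the AI cannot really make out the specific content of the video.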

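Indeed, it can only serve as a rough gist summary. I have tried quite a few prompts recently, and so far gemini performs far better than zhipu's GLM. GLM feels more like it is recognizing what is happening in the frame, and it seems to have some kind of fallback strategy: several times, different content returned similar titles. My guess is that the zhipu model's gate may be set more strictly? I'm not sure. For practical use, I think gemini's results can sometimes be genuinely impressive. Also, there seem to be very few multimodal models for video understanding on the market right now.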

@timerring
Owner Author

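As for an alternative, my current idea is to combine summary images with whisper subtitles and danmaku analysis, possibly building a workflow that runs the whole pipeline. I don't know whether that would beat a standalone video understanding model; I'll try it once I finish optimizing bilitool.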

@timerring
Owner Author

I retested gemini, and the results are decent.

Development

Successfully merging this pull request may close these issues.

[Feature] Automatically slice the video.

3 participants