Stars
Official repository for BrickGPT, the first approach for generating physically stable toy brick models from text prompts.
Official implementation of the paper: "FlowEdit: Inversion-Free Text-Based Editing Using Pre-Trained Flow Models"
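[❤️ Tech-talk slide decks from major internet companies 👍🏻 continuously updated!] 🍻 A collection of materials from major tech conferences and events, such as 👉 QCon 👉 Global Operations Technology Conference 👉 GDG 👉 Global Technology Leadership Summit 👉 Big Front-End Conference 👉 Architect Summit 👉 Agile Development and DevOps 👉 OpenResty 👉 Elastic. PRs / Issues welcome.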
This is probably the best web presentation tool so far!
Beihang University (BUAA) coursework and materials sharing project
DocLayout-YOLO: Enhancing Document Layout Analysis through Diverse Synthetic Data and Global-to-Local Adaptive Perception
[ACL 2024] ChartAssistant is a chart-based vision-language model for universal chart comprehension and reasoning.
Everything about the SmolLM and SmolVLM family of models
[CVPR 2024] Official RT-DETR (RTDETR paddle pytorch), Real-Time DEtection TRansformer, DETRs Beat YOLOs on Real-time Object Detection. 🔥 🔥 🔥
An arbitrary face-swapping framework on images and videos with one single trained model!
An on-premises, OCR-free unstructured data extraction, markdown conversion and benchmarking toolkit. (https://idp-leaderboard.org/)
[NeurIPS 2025] Mask Image Watermarking (Official Implementation)
This is the official implementation of our paper: "MiniMax-Remover: Taming Bad Noise Helps Video Object Removal"
DiffuEraser is a diffusion model for video inpainting that achieves strong content completeness and temporal consistency while maintaining acceptable efficiency.
AI-Powered Watermark Remover using Florence-2 and LaMA Models: A Python application leveraging state-of-the-art deep learning models to effectively remove watermarks from images with a user-friendl…
AI-based tool for removing hard-coded subtitles and text-like watermarks from images and videos, producing output files at the original resolution. Runs locally; no third-party API required.
[ICCV 2023] ProPainter: Improving Propagation and Transformer for Video Inpainting
A machine learning image inpainting project that removes watermarks from images, producing results indistinguishable from the ground-truth image
🔥 CNN for Watermark Removal using Deep Image Prior with PyTorch 🔥
PyTorch code and models for the DINOv2 self-supervised learning method.
[NeurIPS 2025] YOLOv12: Attention-Centric Real-Time Object Detectors
PyTorch code for Vision Transformers training with the Self-Supervised learning method DINO
This is a third-party implementation of the paper Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection.
MiniCPM-V 4.5: A GPT-4o Level MLLM for Single Image, Multi Image and High-FPS Video Understanding on Your Phone
Code for the Lovász-Softmax loss (CVPR 2018)
The official repo for “Dolphin: Document Image Parsing via Heterogeneous Anchor Prompting”, ACL, 2025.
Kimi K2 is the large language model series developed by the Moonshot AI team