SHI Labs

43 repositories

physical-ai-bench
Public
PAI-Bench: A Comprehensive Benchmark for Physical AI
benchmark world-model physical-ai
Python • MIT License • 0 • 40 • 1 • 0 • Updated Dec 2, 2025
Forget-Me-Not
Public
Forget-Me-Not: Learning to Forget in Text-to-Image Diffusion Models, 2023
Python • MIT License • 8 • 135 • 7 • 0 • Updated Oct 22, 2025
VisPer-LM
Public
[NeurIPS 2025] Elevating Visual Perception in Multimodal LLMs with Visual Embedding Distillation
Python • 1 • 69 • 2 • 0 • Updated Oct 17, 2025
IMG-Multimodal-Diffusion-Alignment
Public
IMG: Calibrating Diffusion Models via Implicit Multimodal Guidance, ICCV 2025
Python • 3 • 30 • 1 • 0 • Updated Oct 1, 2025
StyleNAT
Public
New flexible and efficient image generation framework that sets new SOTA on FFHQ-256 with FID 2.05, 2022
gan image-generation neighborhood-attention
Python • MIT License • 13 • 101 • 0 • 0 • Updated Jun 26, 2025
Slow-Fast-Video-Multimodal-LLM
Public
Python • 1 • 27 • 2 • 0 • Updated Apr 8, 2025
Diffusion-Driven-Test-Time-Adaptation-via-Synthetic-Domain-Alignment
Public
Everything to the Synthetic: Diffusion-driven Test-time Adaptation via Synthetic-Domain Alignment, arXiv 2024 / CVPR 2025
diffusion-models test-time-adaptation
Python • 2 • 38 • 1 • 0 • Updated Mar 1, 2025
Compact-Transformers
Public
Escaping the Big Data Paradigm with Compact Transformers, 2021 (Train your Vision Transformers in 30 mins on CIFAR-10 with a single GPU!)
Python • Apache License 2.0 • 84 • 539 • 7 • 2 • Updated Nov 5, 2024
Smooth-Diffusion
Public
Smooth Diffusion: Crafting Smooth Latent Spaces in Diffusion Models, arXiv 2023 / CVPR 2024
Python • MIT License • 7 • 353 • 13 • 0 • Updated Sep 24, 2024
CuMo
Public
CuMo: Scaling Multimodal LLM with Co-Upcycled Mixture-of-Experts
Python • Apache License 2.0 • 8 • 162 • 0 • 1 • Updated Jun 8, 2024
Neighborhood-Attention-Transformer
Public
Neighborhood Attention Transformer, arXiv 2022 / CVPR 2023. Dilated Neighborhood Attention Transformer, arXiv 2022
pytorch neighborhood-attention
Python • MIT License • 88 • 1.2k • 5 • 0 • Updated May 15, 2024
VCoder
Public
[CVPR 2024] VCoder: Versatile Vision Encoders for Multimodal Large Language Models
Python • Apache License 2.0 • 16 • 280 • 4 • 1 • Updated Apr 17, 2024
Rethinking-Text-Segmentation
Public
[CVPR 2021] Rethinking Text Segmentation: A Novel Dataset and A Text-Specific Refinement Approach
Python • 29 • 270 • 13 • 0 • Updated Dec 2, 2023
Matting-Anything
Public
Matting Anything Model (MAM), an efficient and versatile framework for estimating the alpha matte of any instance in an image with flexible and interactive visual or linguistic user prompt guidance.
Python • MIT License • 49 • 689 • 8 • 1 • Updated Nov 18, 2023
Prompt-Free-Diffusion
Public
Prompt-Free Diffusion: Taking "Text" out of Text-to-Image Diffusion Models, arXiv 2023 / CVPR 2024
Python • MIT License • 38 • 757 • 15 • 1 • Updated Nov 16, 2023
VIM
Public
Python • MIT License • 4 • 63 • 4 • 0 • Updated Nov 8, 2023
Versatile-Diffusion
Public
Versatile Diffusion: Text, Images and Variations All in One Diffusion Model, arXiv 2022 / ICCV 2023
Python • MIT License • 85 • 1.3k • 9 • 1 • Updated Aug 10, 2023
OneFormer-Colab
Public
[Colab Demo Code] OneFormer: One Transformer to Rule Universal Image Segmentation.
transformer coco image-segmentation semantic-segmentation cityscapes instance-segmentation ade20k panoptic-segmentation universal-segmentation oneformer
Python • MIT License • 10 • 14 • 1 • 0 • Updated May 24, 2023
PAIR-Diffusion
Public
PAIR-Diffusion: Object-Level Image Editing with Structure-and-Appearance Paired Diffusion Models, 2023
Python • MIT License • 21 • 3 • 0 • 0 • Updated May 19, 2023
Text2Video-Zero
Public
A copy of "Text-to-Image Diffusion Models are Zero-Shot Video Generators", ICCV 2023
Python • Other • 389 • 2 • 0 • 0 • Updated May 6, 2023
Text2Video-Zero-sd-webui
Public
Python • Other • 15 • 82 • 6 • 0 • Updated Apr 10, 2023
SH-GAN
Public
[WACV 2023] Image Completion with Heterogeneously Filtered Spectral Hints
Python • 4 • 69 • 3 • 0 • Updated Mar 28, 2023
Boosted-Dynamic-Networks
Public
Boosted Dynamic Neural Networks, AAAI 2023
Python • MIT License • 3 • 8 • 1 • 0 • Updated Dec 1, 2022
VMFormer
Public
[Preprint] VMFormer: End-to-End Video Matting with Transformer
video-matting vision-transformer
Python • Other • 9 • 120 • 8 • 0 • Updated Nov 30, 2022
Unsupervised-Domain-Adaptation-with-Differential-Treatment
Public
[CVPR 2020] Differential Treatment for Stuff and Things: A Simple Unsupervised Domain Adaptation Method for Semantic Segmentation
Python • 14 • 92 • 5 • 2 • Updated Nov 22, 2022
Convolutional-MLPs
Public
[Preprint] ConvMLP: Hierarchical Convolutional MLPs for Vision, 2021
Python • Apache License 2.0 • 16 • 167 • 4 • 0 • Updated Oct 11, 2022
LIVE-Layerwise-Image-Vectorization
Public
[CVPR 2022 Oral] Towards Layer-wise Image Vectorization
Python • Apache License 2.0 • 63 • 2 • 0 • 0 • Updated Jun 10, 2022
VideoINR-Continuous-Space-Time-Super-Resolution
Public
[CVPR 2022] VideoINR: Learning Video Implicit Neural Representation for Continuous Space-Time Super-Resolution
Python • 23 • 0 • 0 • 0 • Updated Jun 9, 2022
SinNeRF
Public
"SinNeRF: Training Neural Radiance Fields on Complex Scenes from a Single Image", Dejia Xu, Yifan Jiang, Peihao Wang, Zhiwen Fan, Humphrey Shi, Zhangyang Wang
Python • 25 • 0 • 0 • 0 • Updated May 3, 2022
micromotion-styleGAN
Public
Python • MIT License • 10 • 0 • 0 • 0 • Updated May 1, 2022