
Starred repositories (BestSonny)

The Codes and Data of A Comprehensive Benchmark for Multimodal Large Language Models in Industrial Anomaly Detection [ICLR'25]

Python 183 13 Updated Aug 8, 2025

Cosmos-Drive-Dreams: Scalable Synthetic Driving Data Generation with World Foundation Models

Jupyter Notebook 269 26 Updated Oct 29, 2025

Discover Unknown Unsafe Events via Generative Simulation

Python 121 9 Updated Oct 28, 2025

Skyfall-GS: Synthesizing Immersive 3D Urban Scenes from Satellite Imagery

Python 193 14 Updated Oct 22, 2025

[CVPR 2025 Highlight] Towards Autonomous Micromobility through Scalable Urban Simulation

Python 132 9 Updated Oct 28, 2025

A Gym for Agentic LLMs

Python 344 19 Updated Oct 30, 2025

StreamingVLM: Real-Time Understanding for Infinite Video Streams

Python 622 39 Updated Oct 15, 2025

A fork to add multimodal model training to open-r1

Python 1,413 70 Updated Feb 8, 2025

A very simple GRPO implementation for reproducing r1-like LLM thinking.

Python 1,413 109 Updated Aug 5, 2025

This repo contains the code for "VLM2Vec: Training Vision-Language Models for Massive Multimodal Embedding Tasks" [ICLR 2025]

Python 449 41 Updated Oct 24, 2025

Video-R1: Reinforcing Video Reasoning in MLLMs [🔥 the first paper to explore R1 for video]

Python 724 38 Updated Sep 19, 2025

🌾 OAT: A research-friendly framework for LLM online alignment, including reinforcement learning, preference learning, etc.

Python 546 45 Updated Oct 21, 2025

A modular RL library to fine-tune language models to human preferences

Python 2,364 203 Updated Mar 1, 2024
Python 17 1 Updated Jun 10, 2025

Kimi-VL: Mixture-of-Experts Vision-Language Model for Multimodal Reasoning, Long-Context Understanding, and Strong Agent Capabilities

1,083 52 Updated Jul 15, 2025

Codebase for Aria - an Open Multimodal Native MoE

Jupyter Notebook 1,076 85 Updated Jan 22, 2025

[NeurIPS 2025] 4KAgent: Agentic Any Image to 4K Super-Resolution. An intelligent computer vision agent that can magically restore any image to perfect-4K!

Python 605 25 Updated Sep 24, 2025

A PyTorch and TorchDrug based deep learning library for drug pair scoring. (KDD 2022)

Python 754 97 Updated Sep 11, 2023

An open-source application for biological image analysis

Python 1,056 409 Updated Oct 29, 2025

Official code for TimeSeriesGym: A Scalable Benchmark for (Time Series) Machine Learning Engineering Agents

Python 25 4 Updated Sep 25, 2025

Superlinked is a Python framework for AI Engineers building high-performance search & recommendation applications that combine structured and unstructured data.

Jupyter Notebook 1,408 107 Updated Oct 22, 2025

A curated list of Large Language Model resources, covering model training, serving, fine-tuning, and building LLM applications.

4,302 588 Updated Aug 18, 2025

Agentless🐱: an agentless approach to automatically solve software development problems

Python 1,947 211 Updated Dec 22, 2024

The 100 line AI agent that solves GitHub issues or helps you in your command line. Radically simple, no huge configs, no giant monorepo—but scores >70% on SWE-bench verified!

Python 1,959 211 Updated Oct 27, 2025
Python 172 18 Updated Dec 20, 2024
2 Updated Aug 19, 2025

Video Annotation Tool

Vue 225 29 Updated Jun 18, 2024
Jupyter Notebook 232 632 Updated Oct 28, 2025

Official implementation of MatterGen -- a generative model for inorganic materials design across the periodic table that can be fine-tuned to steer the generation towards a wide range of property c…

Python 1,531 282 Updated Oct 6, 2025

Tarsier -- a family of large-scale video-language models designed to generate high-quality video descriptions, together with good general video understanding capability.

Python 497 28 Updated Aug 14, 2025