- Auburn, Alabama
- bestsonny.github.io
Starred repositories
The Codes and Data of A Comprehensive Benchmark for Multimodal Large Language Models in Industrial Anomaly Detection [ICLR'25]
Cosmos-Drive-Dreams: Scalable Synthetic Driving Data Generation with World Foundation Models
Discover Unknown Unsafe Events via Generative Simulation
Skyfall-GS: Synthesizing Immersive 3D Urban Scenes from Satellite Imagery
[CVPR 2025 Highlight] Towards Autonomous Micromobility through Scalable Urban Simulation
StreamingVLM: Real-Time Understanding for Infinite Video Streams
A fork to add multimodal model training to open-r1
A very simple GRPO implementation for reproducing r1-like LLM thinking.
This repo contains the code for "VLM2Vec: Training Vision-Language Models for Massive Multimodal Embedding Tasks" [ICLR 2025]
Video-R1: Reinforcing Video Reasoning in MLLMs [🔥 the first paper to explore R1 for video]
🌾 OAT: A research-friendly framework for LLM online alignment, including reinforcement learning, preference learning, etc.
A modular RL library to fine-tune language models to human preferences
Kimi-VL: Mixture-of-Experts Vision-Language Model for Multimodal Reasoning, Long-Context Understanding, and Strong Agent Capabilities
Codebase for Aria - an Open Multimodal Native MoE
[NeurIPS 2025] 4KAgent: Agentic Any Image to 4K Super-Resolution. An intelligent computer vision agent that can magically restore any image to perfect-4K!
A PyTorch and TorchDrug based deep learning library for drug pair scoring. (KDD 2022)
An open-source application for biological image analysis
Official code for TimeSeriesGym: A Scalable Benchmark for (Time Series) Machine Learning Engineering Agents
Superlinked is a Python framework for AI Engineers building high-performance search & recommendation applications that combine structured and unstructured data.
A curated list of Large Language Model resources, covering model training, serving, fine-tuning, and building LLM applications.
Agentless🐱: an agentless approach to automatically solve software development problems
The 100 line AI agent that solves GitHub issues or helps you in your command line. Radically simple, no huge configs, no giant monorepo, but scores >70% on SWE-bench verified!
Official implementation of MatterGen -- a generative model for inorganic materials design across the periodic table that can be fine-tuned to steer the generation towards a wide range of property c…
Tarsier -- a family of large-scale video-language models designed to generate high-quality video descriptions, together with good general video understanding capability.