
Starred repositories

TypeScript 3 2 Updated Oct 19, 2025

IROS 2025 Workshop Page

HTML 1 Updated Oct 27, 2025

GLM-4.5V and GLM-4.1V-Thinking: Towards Versatile Multimodal Reasoning with Scalable Reinforcement Learning

Python 1,722 102 Updated Oct 28, 2025

Qwen-Image is a powerful image generation foundation model capable of complex text rendering and precise image editing.

Python 5,882 318 Updated Sep 30, 2025

VILA is a family of state-of-the-art vision language models (VLMs) for diverse multimodal AI tasks across the edge, data center, and cloud.

Python 3,623 301 Updated Oct 20, 2025

Inference, Fine Tuning and many more recipes with Gemma family of models

Jupyter Notebook 274 48 Updated Jul 18, 2025

Fine-tuning & Reinforcement Learning for LLMs. 🦥 Train OpenAI gpt-oss, DeepSeek-R1, Qwen3, Gemma 3, TTS 2x faster with 70% less VRAM.

Python 47,716 3,898 Updated Nov 1, 2025

MiMo-VL

574 27 Updated Aug 21, 2025

[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. An open-source multimodal chat model approaching GPT-4o performance.

Python 9,410 731 Updated Sep 22, 2025

Code for CVPR2025 paper: Generating Multimodal Driving Scenes via Next-Scene Prediction

Python 91 2 Updated Oct 24, 2025

SplatAD: Real-Time Lidar and Camera Rendering with 3D Gaussian Splatting for Autonomous Driving

Cuda 277 20 Updated Oct 9, 2025

Curated list of papers and resources focused on 3D Gaussian Splatting, intended to keep pace with the anticipated surge of research in the coming months.

HTML 7,929 478 Updated Oct 1, 2025

FreeVS: Generative View Synthesis on Free Driving Trajectory

Python 144 2 Updated Feb 22, 2025

Official repo for the paper "Scaling Synthetic Data Creation with 1,000,000,000 Personas"

Python 1,378 110 Updated Feb 19, 2025

Starting point for the Women in AI RAG Hackathon, Jan 25 2025

Jupyter Notebook 8 20 Updated Jan 24, 2025

High-Resolution 3D Assets Generation with Large Scale Hunyuan3D Diffusion Models.

Python 12,233 1,196 Updated Oct 28, 2025

Synthetic data curation for post-training and structured data extraction

Python 1,538 123 Updated Jul 29, 2025

Official Implementation of paper "MonST3R: A Simple Approach for Estimating Geometry in the Presence of Motion"

Python 1,292 78 Updated Jun 16, 2025

New repo collection for NVIDIA Cosmos: https://github.com/nvidia-cosmos

8,066 520 Updated Jun 9, 2025

LLM powered retrieval engine designed to process a ton of sources to collect a comprehensive list of entities.

TypeScript 502 56 Updated May 7, 2024

Vision agent

Python 5,087 577 Updated Aug 30, 2025

[IROS 2023] DualCross: Cross-Modality Cross-Domain Adaptation for Monocular BEV Perception

Python 31 4 Updated Nov 28, 2023
Jupyter Notebook 13 Updated Aug 25, 2023

(T-IV, ITSC) Auto-labeling of point cloud sequences for 3D object detection using an ensemble of experts and temporal refinement

Python 188 21 Updated Aug 15, 2024

Fully typed & consistent chat APIs for OpenAI, Anthropic, Groq, and Azure's chat models for browser, edge, and node environments.

TypeScript 169 11 Updated May 31, 2024

Get structured, fully typed, and validated JSON outputs from OpenAI and Anthropic models.

TypeScript 626 15 Updated Apr 26, 2024

Painter & SegGPT Series: Vision Foundation Models from BAAI

Python 2,582 181 Updated Dec 6, 2024

An open-source project dedicated to tracking and segmenting any objects in videos, either automatically or interactively. The primary algorithms utilized include the Segment Anything Model (SAM) fo…

Jupyter Notebook 3,066 353 Updated Apr 25, 2024

An open-source framework for training large multimodal models.

Python 4,033 316 Updated Aug 31, 2024