lilyzhng

Lily Zhang lilyzhng

Technical Lead, Research Scientist in 3D Generation and Vision Language Models

52 followers · 70 following

Palo Alto, CA
United States
https://lilyzhng.github.io/
in/lilyzhng
https://scholar.google.com/citations?user=la-Mx-UAAAAJ&hl=en

Starred repositories

dzhng / nanobanana-hackathon

TypeScript 3 2 Updated Oct 19, 2025

lilyzhng / robogen-iros.github.io

Forked from robogen-iros/robogen-iros.github.io

IROS 2025 Workshop Page

HTML 1 Updated Oct 27, 2025

zai-org / GLM-V

GLM-4.5V and GLM-4.1V-Thinking: Towards Versatile Multimodal Reasoning with Scalable Reinforcement Learning

Python 1,722 102 Updated Oct 28, 2025

QwenLM / Qwen-Image

Qwen-Image is a powerful image generation foundation model capable of complex text rendering and precise image editing.

Python 5,882 318 Updated Sep 30, 2025

NVlabs / VILA

VILA is a family of state-of-the-art vision language models (VLMs) for diverse multimodal AI tasks across the edge, data center, and cloud.

Python 3,623 301 Updated Oct 20, 2025

huggingface / huggingface-gemma-recipes

Inference, fine-tuning, and many more recipes with the Gemma family of models

Jupyter Notebook 274 48 Updated Jul 18, 2025

unslothai / unsloth

Fine-tuning & Reinforcement Learning for LLMs. 🦥 Train OpenAI gpt-oss, DeepSeek-R1, Qwen3, Gemma 3, TTS 2x faster with 70% less VRAM.

Python 47,716 3,898 Updated Nov 1, 2025

XiaomiMiMo / MiMo-VL

MiMo-VL

574 27 Updated Aug 21, 2025

OpenGVLab / InternVL

[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. An open-source multimodal dialogue model approaching GPT-4o performance.

Python 9,410 731 Updated Sep 22, 2025

YanhaoWu / UMGen

Code for CVPR2025 paper: Generating Multimodal Driving Scenes via Next-Scene Prediction

Python 91 2 Updated Oct 24, 2025

carlinds / splatad

SplatAD: Real-Time Lidar and Camera Rendering with 3D Gaussian Splatting for Autonomous Driving

Cuda 277 20 Updated Oct 9, 2025

MrNeRF / awesome-3D-gaussian-splatting

Curated list of papers and resources focused on 3D Gaussian Splatting, intended to keep pace with the anticipated surge of research in the coming months.

HTML 7,929 478 Updated Oct 1, 2025

esdolo / FreeVS

FreeVS: Generative View Synthesis on Free Driving Trajectory

Python 144 2 Updated Feb 22, 2025

tencent-ailab / persona-hub

Official repo for the paper "Scaling Synthetic Data Creation with 1,000,000,000 Personas"

Python 1,378 110 Updated Feb 19, 2025

stefanwebb / women-in-ai-hackathon

Starting point for the Women in AI RAG Hackathon, Jan 25 2025

Jupyter Notebook 8 20 Updated Jan 24, 2025

Tencent-Hunyuan / Hunyuan3D-2

High-Resolution 3D Assets Generation with Large Scale Hunyuan3D Diffusion Models.

Python 12,233 1,196 Updated Oct 28, 2025

bespokelabsai / curator

Synthetic data curation for post-training and structured data extraction

Python 1,538 123 Updated Jul 29, 2025

Junyi42 / monst3r

Official Implementation of paper "MonST3R: A Simple Approach for Estimating Geometry in the Presence of Motion"

Python 1,292 78 Updated Jun 16, 2025

NVIDIA / Cosmos

New repo collection for NVIDIA Cosmos: https://github.com/nvidia-cosmos

8,066 520 Updated Jun 9, 2025

dzhng / deep-seek

LLM powered retrieval engine designed to process a ton of sources to collect a comprehensive list of entities.

TypeScript 502 56 Updated May 7, 2024

landing-ai / vision-agent

Vision agent

Python 5,087 577 Updated Aug 30, 2025

YunzeMan / DualCross

[IROS 2023] DualCross: Cross-Modality Cross-Domain Adaptation for Monocular BEV Perception

Python 31 4 Updated Nov 28, 2023

towardsautonomy / DatasetEquity

Jupyter Notebook 13 Updated Aug 25, 2023

darrenjkt / MS3D

(T-IV, ITSC) Auto-labeling of point cloud sequences for 3D object detection using an ensemble of experts and temporal refinement

Python 188 21 Updated Aug 15, 2024

dzhng / llm-api

Fully typed & consistent chat APIs for OpenAI, Anthropic, Groq, and Azure's chat models for browser, edge, and node environments.

TypeScript 169 11 Updated May 31, 2024

dzhng / zod-gpt

Get structured, fully typed, and validated JSON outputs from OpenAI and Anthropic models.

TypeScript 626 15 Updated Apr 26, 2024

lilyzhng / lilyzhng.github.io

HTML 2 3 Updated Oct 23, 2025

baaivision / Painter

Painter & SegGPT Series: Vision Foundation Models from BAAI

Python 2,582 181 Updated Dec 6, 2024

z-x-yang / Segment-and-Track-Anything

An open-source project dedicated to tracking and segmenting any objects in videos, either automatically or interactively. The primary algorithms utilized include the Segment Anything Model (SAM) fo…

Jupyter Notebook 3,066 353 Updated Apr 25, 2024

mlfoundations / open_flamingo

An open-source framework for training large multimodal models.

Python 4,033 316 Updated Aug 31, 2024

domain-adaptation

semi-supervised-learning