Stars
A curated collection of fun and creative examples generated with Nano Banana🍌, a Gemini-2.5-flash-image based model. We also release Nano-consistent-150K openly to support the community's development…
🔥[IJCAI 2022, Official Code] for paper "Rethinking Image Aesthetics Assessment: Models, Datasets and Benchmarks". Official Weights and Demos provided. The first aesthetics assessment dataset, algorithm, and benchmark for multi-theme scenarios.
An open-source AI agent that brings the power of Gemini directly into your terminal.
Open Source DeepWiki: AI-Powered Wiki Generator for GitHub/Gitlab/Bitbucket Repositories. Join the discord: https://discord.gg/gMwThUMeme
CLIP+MLP Aesthetic Score Predictor
[NeurIPS 2025] OmniSVG is the first family of end-to-end multimodal SVG generators that leverage pre-trained Vision-Language Models (VLMs), capable of generating complex and detailed SVGs, from sim…
Scan for React performance issues and eliminate slow renders in your app
Convert any website to editable Figma designs
ui-screenshot-to-prompt is an AI-powered tool that analyzes UI images to generate detailed prompts for AI coders. It uses computer vision and natural language processing to break down UI components…
Prompt, run, edit, and deploy full-stack web applications. -- bolt.new -- Help Center: https://support.bolt.new/ -- Community Support: https://discord.com/invite/stackblitz
Qwen3-VL is the multimodal large language model series developed by the Qwen team, Alibaba Cloud.
From Chain-of-Thought prompting to OpenAI o1 and DeepSeek-R1 🍓
[NeurIPS 2023] Tree of Thoughts: Deliberate Problem Solving with Large Language Models
Fast Rust bundler for JavaScript/TypeScript with Rollup-compatible API.
Hung-yi Lee's Deep Learning Tutorial (recommended by Prof. Hung-yi Lee 👍, the "Apple Book" 🍎). PDF download: https://github.com/datawhalechina/leedl-tutorial/releases
An accurate GUI element detection approach based on old-fashioned CV algorithms [Upgraded on 5/July/2021]
Implement a ChatGPT-like LLM in PyTorch from scratch, step by step
[arXiv 2023] Set-of-Mark Prompting for GPT-4V and LMMs
Multimodal Large Language Models for Code Generation under Multimodal Scenarios
Web Extension for saving a faithful copy of a complete web page in a single HTML file
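A curated collection of open-source Chinese large language models, focusing on smaller models that can be privately deployed and are cheaper to train, including base models, domain-specific fine-tuning and applications, datasets, and tutorials.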
A simple screen parsing tool toward a pure vision-based GUI agent
12 Weeks, 24 Lessons, AI for All!
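Data collection and download tool for TikTok posts/likes/mixes/livestreams/videos/image sets/music, and for Douyin posts/likes/favorites/favorite folders/videos/image sets/live photos/livestreams/music/mixes/comments/accounts/search/trending lists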