AI is changing faster than developers can keep up. This repo is a monthly collection of AI news and resources for developers.
- Kimi Code: Kimi CLI Coding
- Kimi K2.5: Aesthetic Coding x Agent Swarm
- NVIDIA PersonaPlex: The most natural speech-to-speech conversational AI.
- Introducing Agentic Vision in Gemini 3 Flash
- Ollama Launch: A new Ollama command that sets up and runs your favorite coding tools like Claude Code, OpenCode, and Codex.
- Clawdbot: The AI that does everything.
- Ollama Image Generation: Generate images on macOS & Windows.
- Ollama + Claude Code: Run Claude Code locally with open-source models.
- Ollama + OpenAI Codex: Run Codex locally with open-source models.
- RalphTUI: An AI Agent Loop Orchestrator.
- Chroma 1.0: Open-source real-time speech-to-speech model. Quick start.
- GLM-4.7-Flash: Local coding and agentic assistant.
- Heartmula: Local and open-source AI music generator. Tutorial. Read the research paper.
- Blackbox CLI: Run Claude Code, Codex, Gemini CLI, and others in a single CLI.
- ChatGPT Go: Low-cost ChatGPT subscription.
- TranslateGemma: A new suite of open translation models.
- openresponses.org: An open-source spec for building multi-provider LLM interfaces. X post.
- GPT-5.2-Codex in the OpenAI API: Also available in GitHub Copilot, Cursor, and Warp.
- The FreeMoCap project: Free motion capture for everyone. X post.
- Anthropic Cowork: Claude Code for non-technical tasks.
- Kyutai Pocket TTS: A high-quality TTS that gives your CPU a voice.
- Advancing Claude: Healthcare and the life sciences.
- UCP: Universal Commerce Protocol on Google.
- NVIDIA Alpamayo: Reason-driven AI model for autonomous vehicles.
- LTX-2 Model: Best open-source multimodal AI video generation.
- ChatGPT Health
- NVIDIA New Open Models
- G0 Plus VLA: Pick up anything AI model.
- FlowDeck: Build and ship SwiftUI apps without leaving Cursor.
- skillseekersweb.com: Automatically convert documentation websites, GitHub repositories, and PDF files into production-ready skills for any LLM platform (Claude, Gemini, OpenAI).
- Manus joins Meta
- MiniMax M2.1: Real-world agentic coding
- GLM-4.7: Advanced AI coding
- Grok Voice Agent API: Realtime Speech-to-Speech
- Gemini 3 Flash
- Introducing GPT-5.2-Codex
- Qwen Image Layered
- Molmo 2: Video understanding AI model
- gpt-image-1.5
- Zoom Federated AI
- SAM Audio: Multimodal Model for Audio Separation
- Qwen Code v0.5.0
- Google Code Wiki: Auto-generate architectural diagrams for code
- Nemotron 3 Family of Open Models
- Gemini Interactions API
- Build iPhone apps on iPhone
- GPT-5.2, X post
- Gemini Text-to-Speech models
- Visual editor for Cursor Browser, X post
- Agentic AI Foundation: Anthropic + OpenAI
- Devstral 2 and Mistral Vibe CLI
- DeepSeek-V3.2
- Mistral 3
- Qwen3-TTS
- VoxCPM Text-to-Speech
- Claude Opus 4.5: World's best coding model
- Gemini 3
- Google Antigravity: AI-Assisted IDE
- Grok 4.1
- Meta SAM 3: Segment Anything AI Model
- GPT-5.1: A smarter, more conversational ChatGPT
- Kimi K2 Thinking
- Gemini Built-In RAG: File Search in the API
- ElevenLabs Scribe v2 Realtime: Speech-to-Text
- Vision Agents: Build vision/voice/video AI apps in Python
- Claude Developer Platform (API): Structured outputs
- Anthropic is building its own AI infrastructure
- Cursor 2.0: Redesigned Agentic UI
- MiniMax M2: For Efficient Agentic Coding
- Neo: Humanoid Robot
- ChatGPT Atlas AI Browser
- Gemini Vibe Coding in Google AI Studio
- Claude Haiku 4.5
- Qwen3-VL in Ollama Cloud
- Gemini 2.5 Computer Use model
- OpenAI Agent Builder
- Apps in ChatGPT
- Apps SDK in ChatGPT
- Moondream 3 Preview on fal
- Grok Imagine
- GPT-5 Pro in the API
- OpenAI DevDay 2025
- Claude Agent SDK
- ElevenLabs Agent Workflow
- OpenAI Sora 2
- Claude Sonnet 4.5
- OpenAI + Stripe Agentic Commerce Protocol
- ChatGPT Parental Control
- Gemini Robotics-ER 1.5, Blog, Research Paper
- GitHub Copilot CLI
- Grok 4 Vision
- OpenAI function calling update
- Subagents in Claude Code
- ChatGPT Pulse
- Kimi OK Computer: Agent mode for Kimi Chat
- Moondream 3 Preview
- Meta Code World Model (CWM)
- Qwen3-VL
- Qwen3-TTS API: X post
- Qwen3-Omni: Text, image, audio & video model. X post
- Ollama Cloud Models: Run larger models with fast, datacenter-grade hardware
- DeepSeek-V3.1-Terminus
- OpenAI & NVIDIA partnership
- ElevenLabs Studio 3.0
- Google Genkit Go 1.0
- Stitch by Google: New features
- Meta Ray-Ban
- Gemini in Chrome
- GPT-5-Codex, Blog post: A version of GPT-5 further optimized for agentic coding in Codex
- GPT-5 now built-in in Xcode 26
- AgentScope: Agent-oriented programming for building LLM apps
- sosumi.ai: Making Apple docs AI-readable
- Kimi K2-0905 update
- Google on-device AI: EmbeddingGemma
- Qwen3-Max-Preview
- Claude Sonnet 4 in Xcode 26 Beta 7
- GPT Realtime
- Google vids.new
- Gemini 2.5 Flash Image Generation: Blog, X
- Claude for Chrome
- VibeVoice
- Agents.md
- GPT-5
- Cursor CLI
- Codex CLI in Cursor & VS Code
- Grok Code Fast 1 in Cursor & kilocode.ai
- Open models by OpenAI
- GPT-OSS Playground
- Claude Opus 4.1
- v0.app, X post
- ElevenLabs Music
- Genie 3, X post
- Gemini Storybook
- Qwen Image, Blog
- Qwen Image Edit
- Gemma 3 270M
- DINOv3
- Swift Agent: Swift SDK for building AI agents
- ElevenLabs Next.js Starter Kit, Next.js Playground
- Eleven v3
- ElevenLabs Video-to-Music, X post
- DeepSeek V3.1
- Qoder: Agentic coding platform
- Fireplexity
- Cartesia Line SDK
- Google Opal: Build mini AI apps
- ChatGPT study mode
- GLM-4.5 model: Reasoning, Coding, and Agentic model
- Ollama for Mac
- Introducing ChatGPT Agent
- Qwen3 Coder
- Kiro: Amazon's Agentic IDE, Kiro.dev
- Gemini Embedding in the API
- Kimi K2
- Grok 4
- Perplexity Comet
- Grok 4 in Zed IDE
- Mistral Voxtral: Speech Recognition models
- Gemma 3n
- Imagen 4
- Andrej Karpathy: Software Is Changing
- Warp 2.0 Agentic Dev
- Gemini CLI, Repo
- WWDC25: Use ChatGPT in Xcode 26
- WWDC25: Apple Foundation Models Framework
- OpenAI o3 Pro
- WWDC25: MLX for Apple Silicon
- Stitch by Google: Web/mobile UI vibe coding
- Anthropic Code with Claude live stream
- Jony/Sam AI-Powered Computers
- Jules Agentic Coding
- GitHub Copilot is now open-source
- OpenAI Codex
- Anthropic API: Web Search Tool
- Gemini-powered coding agent
- Windsurf SWE-1 model
- ElevenLabs Soundboard
- Introducing Qwen3
- Llama API
- OpenAI o3 and o4-mini
- OpenAI Codex CLI
- Introducing GPT-4.1 in the API
- GPT-4.1 Prompting Guide
- AI in Enterprise: OpenAI
- OpenAI: A Practical Guide to Building Agents
- Firebase Studio
- Identifying and Scaling AI Use Cases
- Google Agent Development Kit
- Gemini Cookbook
- Vercel AI Chat SDK, Get started
- The Llama 4 herd
- OpenAI Image Generation in ChatGPT
- OpenAI Responses API and Agents SDK
- OpenAI.FM
- Gemini 2.5
- DeepSeek-V3-0324
- Manus
- Vapi: Voice AI Agents for Developers
- Gemma 3
- QwQ-32B Reasoning Model
- Introducing LM Studio SDK
- FastHTML and MonsterUI
- Mistral Small 3.1
- Claude Web Search
- Krea AI Video Training
- NotebookLM Mind Maps
- Hunyuan 3D Generation AI
- Stability AI New Virtual Camera
- Gemini Canvas & Audio Overview
- OpenAI GPT-4.5
- Claude 3.7 Sonnet and Claude Code
- Google Gemini Code Assist
- Grok 3 Beta
- Hugging Face FastRTC
- Microsoft Phi-4 Multimodal
- ElevenLabs Scribe
- Qwen Chat: Thinking, Web Search, Artifacts, Video
- Alibaba Wan 2.1 AI video
- Perplexity Voice Mode with Grok 3
- Amazon Alexa+
- Mistral Le Chat app
- Anthropic Jailbreaks
- Pika adds Pikaddition
- Google Gemini 2.0 Pro
- Replit free text-to-app
- ByteDance’s AI avatars
- OpenAI’s Deep Research
- Hugging Face AI App Store
- 12 Days of OpenAI
- Day 1: o1 and ChatGPT Pro
- Day 2: Reinforcement Fine-Tuning
- Day 3: Sora
- Day 4: ChatGPT Canvas
- Day 5: Apple Intelligence
- Day 6: Advanced voice with video & Santa mode
- Day 7: Projects in ChatGPT
- Day 8: ChatGPT Search
- Day 9: OpenAI o1 and new tools for developers
- Day 10: 1-800-CHATGPT
- Day 11: Work with apps
- Day 12: o3 Preview
- Gemini 2.0
- Grok Image Generation Release
- Ollama Structured Outputs
- Llama 3.3
- ElevenLabs: Build AI Agents That Speak
- PydanticAI
- Introducing Amazon Nova Models
- DeepSeek V3