multimodal-ai

Here are 547 public repositories matching this topic...

Mano-P: Open-source GUI-VLA agent for edge devices. #1 on OSWorld (specialized, 58.2%). Runs inference locally on an Apple M4 Mac mini/MacBook or via a compute stick, so no data leaves your device. Purely vision-driven, it automates GUIs across platforms and supports planning and executing complex multi-step tasks.

  • Updated May 22, 2026

OpenAI and Anthropic compatible server for Apple Silicon. Run LLMs and vision-language models (Llama, Qwen-VL, LLaVA) with continuous batching, MCP tool calling, and multimodal support. Native MLX backend, 400+ tok/s. Works with Claude Code.

  • Updated May 21, 2026
  • Python

XiaoyaoSearch (小遥搜索): Understands your words, reads your images, and finds any local file with AI. Making search as easy as chatting.

  • Updated May 22, 2026
  • Python
Building-Business-Ready-Generative-AI-Systems

This GitHub repository contains the complete code for building Business-Ready Generative AI Systems (GenAISys) from scratch. It guides you through architecting and implementing advanced AI controllers, intelligent agents, and dynamic RAG frameworks. The projects demonstrate practical applications across various domains.

  • Updated Feb 11, 2026
  • Jupyter Notebook

AI-powered tool to turn long videos into short, viral-ready clips. Combines transcription, speaker diarization, scene detection & 9:16 resizing. Perfect for creators & smart automation.

  • Updated Apr 2, 2025
  • Python

[Nature Machine Intelligence] ImmunoStruct enables multimodal deep learning for immunogenicity prediction

  • Updated Mar 5, 2026
  • Python
