- https://github.com/open-world-agents/MediaRef: Pydantic media reference for images and video frames (with timestamp support) from data URIs, HTTP URLs, file URIs, and local paths. Features lazy loading and optimized batch video decoding.
- https://github.com/open-world-agents/ocap: ocap (Omnimodal CAPture) captures all essential desktop signals in a synchronized format. It records screen video, audio, keyboard/mouse input, and window events.
- https://github.com/open-world-agents/open-world-agents: A versatile and efficient monorepo that embraces and grows multiple projects, containing all the essential building blocks for agent development.
- https://worv-ai.github.io/d2e/: D2E: Scaling Vision-Action Pretraining on Desktop Data for Transfer to Embodied AI. Code coming soon!
open-world-agents
Repositories
- open-world-agents (Public): Everything you need to build a state-of-the-art foundation multimodal desktop agent, end-to-end.
- MediaRef (Public): Pydantic media reference for images and video frames (with timestamp support) from data URIs, HTTP URLs, file URIs, and local paths. Features lazy loading and optimized batch video decoding.
- ocap (Public): High-performance desktop recorder for Windows. Captures screen, audio, keyboard, mouse, and window events.
- .github (Public)
- staged-recipes (Public, forked from conda-forge/staged-recipes): A place to submit conda recipes before they become fully fledged conda-forge feedstocks.
- desktop-env (Public): A real-time, high-frequency, real-world desktop environment that is suitable for desktop-based ML development (agents, world models, etc.).