- Synvo AI
- Singapore
- jingkangyang.com
- @JingkangY
Starred repositories
Demo-ICL: In-Context Learning for Procedural Video Knowledge Acquisition
EgoHandICL: Egocentric 3D Hand Reconstruction with In-Context Learning (ICLR 2026)
Agentic LaTeX Writer - Local-first editor for AI-assisted academic writing
Privacy-first AI memory layer - Signal for AI Memory. E2EE, local-first, works with Claude, Cursor, and any MCP-compatible AI.
A local AI assistant running on your device. It turns your files into actionable memory.
Codec-Aligned Sparsity as a Foundational Principle for Multimodal Intelligence
Open-source Autonomous 3D Characters on the Web
One-for-All Multimodal Evaluation Toolkit Across Text, Image, Video, and Audio Tasks
A multimodal framework for analyzing public street space utilization and inclusiveness using vision-language models.
[NeurIPS 2025] Deep Memory Backtracking for Long Video Understanding
Deploying High-Performance Lean 4 Server in One Click
Code for paper "Half-Physics: Enabling Kinematic 3D Human Model with Physical Interactions". Coming soon.
Official code for the paper "EgoExOR: An Ego-Exo-Centric Operating Room Dataset for Surgical Activity Understanding", accepted at NeurIPS 2025
MMSearch-R1 is an end-to-end RL framework that enables LMMs to perform on-demand, multi-turn search with real-world multimodal search tools.
[ICLR 2025] AvatarGO: Zero-shot 4D Human-Object Interaction Generation and Animation
No fortress, purely open ground. OpenManus is Coming.
[CVPR 2025] EgoLife: Towards Egocentric Life Assistant
A fork to add multimodal model training to open-r1
Octo is a transformer-based robot policy trained on a diverse mix of 800k robot trajectories.
[ICCV 2025] Auto Interpretation Pipeline and many other functionalities for Multimodal SAE Analysis.
[ECCV 2024 Oral] Code for our paper "A Fair Ranking and New Model for Panoptic Scene Graph Generation"
Official Code for "Digital Life Project: Autonomous 3D Characters with Social Intelligence"
Generalized Out-of-Distribution Detection and Beyond in Vision Language Model Era: A Survey [Miyai+, TMLR2025]
VideoLLaMA 2: Advancing Spatial-Temporal Modeling and Audio Understanding in Video-LLMs
WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)