Captr is a screen recording and computer interaction capture tool that records keyboard/mouse input, screen video, DOM snapshots, and accessibility trees. It is designed for building datasets to train and evaluate computer-use AI models.
- Screen & Input Recording: Captures all mouse movements, clicks, scrolls, and keyboard inputs with precise timestamps
- OBS Integration: Automatic screen recording via OBS Studio
- DOM Capture: Automatically captures webpage structure from Chromium browsers (Chrome, Edge, Brave, etc.)
- Accessibility Trees: Records macOS accessibility information from native applications
- System Metadata: Captures detailed system information (OS, screen resolution, installed apps, etc.)
- Privacy Controls: Pause/resume recording to hide sensitive information
- Playback: Replay recorded sessions to verify captures
- Download the latest Captr.dmg from the Releases page
- Open the DMG file and drag Captr.app to your Applications folder
- Install and configure OBS Studio (see OBS Setup)
- Grant required macOS permissions when prompted
Requirements: Python ≥3.11, OBS Studio
# Clone the repository
git clone https://github.com/YOUR_USERNAME/Captr_MacOS.git
cd Captr_MacOS
# Create virtual environment
python3 -m venv venv
source venv/bin/activate
# Install dependencies
pip install -r requirements.txt
# Build the app
python build.py
The built app will be in the dist folder as Captr.dmg.
Captr requires OBS Studio for screen recording:
- Download and install OBS Studio
- Open OBS and go to Tools → WebSocket Server Settings
- Enable WebSocket server and disable Authentication
- Add a macOS Screen Capture source in OBS
- Close OBS (Captr will control it automatically)
For detailed instructions, see OBS_SETUP.md.
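To confirm that the WebSocket server from the steps above is actually listening, a quick TCP check can be run while OBS is still open (before the final step of closing it). This is a minimal sketch and not part of Captr: the helper name `obs_websocket_reachable` is hypothetical, and the port 4455 is an assumption based on the default in recent OBS releases, so adjust it to match your WebSocket Server Settings.

```python
import socket


def obs_websocket_reachable(host: str = "localhost",
                            port: int = 4455,  # assumed OBS default; check your settings
                            timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host not resolvable
        return False


if __name__ == "__main__":
    print("reachable" if obs_websocket_reachable() else "not reachable")
```

This only verifies that something accepts connections on that port. It does not perform the WebSocket handshake or check the authentication setting.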
Captr needs these permissions to function:
- Accessibility: System Settings → Privacy & Security → Accessibility
  - Required for recording keyboard inputs and playing back actions
- Input Monitoring: System Settings → Privacy & Security → Input Monitoring
  - Required for keyboard capture
- Screen Recording: System Settings → Privacy & Security → Screen Recording
  - Required by OBS for screen capture
macOS will prompt for these permissions on first run.
- Launch Captr from Applications
- Click Start Recording
- Perform your computer tasks
- Use Pause/Resume to hide sensitive information (passwords, credit cards, etc.)
- Click Stop Recording
- Optionally name and describe your recording
Recordings are saved to ~/Documents/Captr_Recordings/.
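For scripted post-processing, the newest recording can be located programmatically. The following is a minimal sketch, not Captr's own code: the function name `latest_recording` is hypothetical, and it assumes each recording is a subfolder of the directory above and that folder modification time is a reasonable proxy for "most recent".

```python
from pathlib import Path
from typing import Optional

# Default location taken from this README
DEFAULT_ROOT = Path.home() / "Documents" / "Captr_Recordings"


def latest_recording(root: Path = DEFAULT_ROOT) -> Optional[Path]:
    """Return the most recently modified subfolder of root, or None if there is none."""
    if not root.is_dir():
        return None
    folders = [p for p in root.iterdir() if p.is_dir()]
    # Assumption: modification time reflects recording order
    return max(folders, key=lambda p: p.stat().st_mtime, default=None)


if __name__ == "__main__":
    print(latest_recording())
```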
To capture webpage DOM snapshots:
- Click Launch Browser for DOM Capture in Captr
- Select your preferred Chromium browser (Chrome, Edge, Brave, etc.)
- Click Launch
- Use the launched browser for web browsing during recording
DOM snapshots will be automatically captured when you click or navigate. See DOM_CAPTURE_SETUP.md for details.
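The tools list in this README mentions launching browsers with debugging enabled and testing a Chrome DevTools Protocol connection, which suggests that DOM capture talks to the browser over a remote debugging port. The sketch below illustrates that general technique only. It is not Captr's implementation: the function names, the port 9222, and the profile directory are illustrative assumptions. The `--remote-debugging-port` and `--user-data-dir` flags and the `/json/version` endpoint are standard Chromium features.

```python
import http.client
import json
import urllib.request
from typing import List, Optional


def chromium_debug_command(browser_path: str,
                           port: int = 9222,  # assumed port, not confirmed for Captr
                           profile_dir: str = "/tmp/captr-profile") -> List[str]:
    """Build an argument list that starts a Chromium browser with remote debugging on."""
    return [
        browser_path,
        f"--remote-debugging-port={port}",
        f"--user-data-dir={profile_dir}",  # separate profile (hypothetical path)
    ]


def cdp_version(port: int = 9222, timeout: float = 2.0) -> Optional[dict]:
    """Return the browser's /json/version info, or None if no debug endpoint answers."""
    url = f"http://127.0.0.1:{port}/json/version"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return json.loads(resp.read().decode("utf-8"))
    except (OSError, ValueError, http.client.HTTPException):
        return None
```

The argument list can be passed to `subprocess.Popen`. If `cdp_version()` returns `None` while the browser is open, the browser was most likely started without the debugging flag.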
- Play Latest Recording: Replays the most recent recording
- Play Custom Recording: Choose any recording to replay
- Press Shift+Esc to stop playback
Each recording creates a folder in ~/Documents/Captr_Recordings/ containing:
- events.jsonl - All keyboard/mouse actions with timestamps
- metadata.json - System information
- *.mp4 - Screen recording video from OBS
- dom_snaps/ - DOM snapshots from web pages (if DOM capture enabled)
- a11y_snaps/ - Accessibility tree captures from native apps
- README.md - Optional recording description
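Because events.jsonl stores one JSON object per line, it can be processed with a few lines of Python. The following is a minimal sketch under stated assumptions: the function names `iter_events` and `summarize` are hypothetical, every event is assumed to carry the `time_stamp` and `action` fields, and the duration is reported in whatever unit the file uses, since the unit is not stated here.

```python
import json
from collections import Counter
from pathlib import Path
from typing import Iterator


def iter_events(path: Path) -> Iterator[dict]:
    """Yield one dict per non-empty line of a JSON Lines file."""
    with path.open(encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:
                yield json.loads(line)


def summarize(path: Path) -> dict:
    """Count events per action type and compute the span between first and last timestamp."""
    counts: Counter = Counter()
    first = last = None
    for event in iter_events(path):
        counts[event["action"]] += 1
        ts = event["time_stamp"]
        first = ts if first is None else min(first, ts)
        last = ts if last is None else max(last, ts)
    duration = 0.0 if first is None else last - first
    return {"counts": dict(counts), "duration": duration}
```

Calling `summarize(Path("events.jsonl"))` inside a recording folder returns a dict with per-action counts and the overall time span. The sample lines that follow show the event shapes this sketch expects.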
{"time_stamp": 1234567.89, "action": "move", "x": 100.0, "y": 200.0}
{"time_stamp": 1234568.01, "action": "click", "x": 100.0, "y": 200.0, "button": "left", "pressed": true}
{"time_stamp": 1234568.15, "action": "key", "key": "a", "pressed": true}
Run the diagnostic tool:
cd tools
python3 check_recording.py
If DOM snapshots are not being captured, make sure you're using a browser launched through Captr's Launch Browser feature.
Check detailed logs:
open dist/Captr.app --stdout /tmp/captr.log --stderr /tmp/captr_err.log
cat /tmp/captr.log
See DOM_CAPTURE_SETUP.md for DOM/accessibility tree troubleshooting.
- After many playbacks, a segfault may occur (restart Captr)
- Mouse input not captured in video games that use raw input
- Google Docs and similar canvas-based web apps have limited DOM capture (by design for privacy)
- Banking sites may limit DOM capture content due to security policies
source venv/bin/activate
python main.py
Debugging and diagnostic scripts are located in tools/:
- launch_chrome_debug.py - Launch browsers with debugging enabled
- check_recording.py - Verify recordings and diagnose issues
- debug_accessibility.py - Test accessibility API access
- debug_chrome_cdp.py - Test Chrome DevTools Protocol connection
Captr is derived from DuckTrack by DuckAI, released under the MIT License. We've added significant enhancements including DOM capture, accessibility trees, enhanced macOS support, and improved debugging tools.
See LICENSE for full details.
If you use Captr in your research or project, please cite it as:
@software{howland2025captr,
author = {Howland, Anais},
title = {Captr: Screen Recording and Computer Interaction Capture for Computer-Use Datasets},
year = {2025},
url = {https://github.com/anaishowland/captr_ducktrack}
}
MIT License - see LICENSE file for details.
Created by Anais Howland at Paradigm Shift AI | Based on DuckTrack by DuckAI
