DataSipper is a custom web browser based on Chromium that features a unique slide-out panel for monitoring real-time data streams like REST API calls and WebSocket connections. It provides deep insights into website data flows while maintaining a familiar browsing experience.
- Arch Linux (primary target platform)
- 8GB+ RAM (16GB+ recommended)
- 100GB+ free disk space
- Fast internet connection (Chromium source is ~10GB)
```bash
# Clone the repository
git clone <repository-url>
cd datasipper

# Complete automated setup (1-4 hours depending on system)
./scripts/dev-setup.sh
```

Alternatively, run the setup steps manually:

```bash
# Install dependencies
./scripts/install-deps-arch.sh
# Set up environment
source scripts/setup-env.sh
# Fetch Chromium source
./scripts/fetch-chromium.sh
# Apply DataSipper patches
cd chromium-src/src && python3 ../../scripts/patches.py apply
# Configure and build
cd ../.. && ./scripts/configure-build.sh dev
ninja -C chromium-src/src/out/DataSipper chrome
# Run DataSipper
./chromium-src/src/out/DataSipper/chrome
```

For a containerized setup instead:

```bash
# Build development container
docker build -f Dockerfile.dev --target development -t datasipper:dev .
# Run development environment
docker run -it --rm \
-v $(pwd):/home/datasipper/datasipper \
datasipper:dev
# Inside container
./scripts/dev-setup.sh
```

Current implementation status:

- Development Environment: Complete automated setup with Docker support
- Stream Configuration UI: Advanced routing rules with condition-based filtering (see the sketch after this list)
- Rule Testing System: Built-in testing with sample data and visual feedback
- UI Framework: Modern responsive interface with tabs, modals, and controls
- Patch Management: Full patch management system for Chromium modifications
- Build System: Optimized build configurations (debug, dev, release)
- Chromium Integration: Development environment setup and source fetching
- Network Interception: HTTP/HTTPS and WebSocket traffic capture implementation
- Data Storage: SQLite database and in-memory data structures
- Core Network Hooks: URLLoader and WebSocket interception patches
- IPC Communication: Connect network observers to UI panel
- Real-time Display: Live stream of network events
- External Integrations: Kafka, Redis, MySQL output connectors
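
To make the routing-rule model concrete, here is a minimal C++ sketch of condition-based filtering. The type and field names (`StreamRule`, `RuleCondition`) and the AND-only semantics are assumptions for illustration, not DataSipper's actual rule engine:

```cpp
// Hypothetical rule shape -- illustrative only, not DataSipper's real types.
#include <regex>
#include <string>
#include <vector>

struct RuleCondition {
  std::string field;    // which captured value to test, e.g. "url" or "method"
  std::string pattern;  // regex matched against that value
};

struct StreamRule {
  std::string name;
  std::vector<RuleCondition> conditions;  // assumed AND semantics
  std::string output;                     // e.g. "panel", "kafka", "webhook"
};

// Returns true when every condition's regex matches its field's value.
bool RuleMatches(const StreamRule& rule, const std::string& url,
                 const std::string& method) {
  for (const auto& c : rule.conditions) {
    const std::string& value = (c.field == "method") ? method : url;
    // Compiling the regex per call keeps the sketch short; a real engine
    // would cache compiled patterns.
    if (!std::regex_search(value, std::regex(c.pattern)))
      return false;
  }
  return true;
}
```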
- Network Interception Layer
  - URLLoaderRequestInterceptor: Captures HTTP/HTTPS traffic
  - DataSipperNetworkObserver: Processes request/response data
  - DataSipperWebSocketObserver: Monitors WebSocket connections
- Data Storage System
  - DataSipperDatabase: SQLite-based persistent storage
  - CircularEventBuffer: In-memory real-time event queue (see the sketch after this list)
  - NetworkEventStorage: HTTP event management
  - WebSocketMessageStorage: WebSocket message handling
- User Interface
  - Slide-out monitoring panel (integrated into browser UI)
  - Real-time data visualization
  - Filtering and search capabilities
  - Export functionality
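
A minimal sketch of the in-memory tier, assuming a simple ring-buffer design; the actual CircularEventBuffer interface may differ:

```cpp
// Fixed-capacity event queue in the spirit of CircularEventBuffer.
// The event shape and method names are assumptions for illustration.
#include <cstddef>
#include <deque>
#include <string>
#include <utility>

struct BufferedEvent {
  std::string url;
  int status = 0;
};

class CircularEventBuffer {
 public:
  explicit CircularEventBuffer(std::size_t capacity) : capacity_(capacity) {}

  // Appends an event, evicting the oldest once capacity is reached.
  void Push(BufferedEvent event) {
    if (events_.size() == capacity_)
      events_.pop_front();
    events_.push_back(std::move(event));
  }

  const std::deque<BufferedEvent>& events() const { return events_; }

 private:
  std::size_t capacity_;
  std::deque<BufferedEvent> events_;
};
```

A deque keeps the sketch short; a production buffer would more likely use a preallocated array with head/tail indices to avoid allocating on the capture hot path.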
- Source environment:

  ```bash
  source scripts/setup-env.sh
  source scripts/set_quilt_vars.sh
  cd chromium-src/src
  ```

- Create a new feature patch:

  ```bash
  # Create patch
  qnew core/datasipper/my-feature.patch

  # Add files to patch
  qadd path/to/file.cc
  qadd path/to/file.h

  # Make changes
  qedit path/to/file.cc

  # Update patch
  qrefresh
  ```

- Build and test:

  ```bash
  ninja -C out/DataSipper chrome
  ./out/DataSipper/chrome
  ```
- List patches: `./scripts/patches.py list`
- Apply patches: `./scripts/patches.py apply`
- Remove patches: `./scripts/patches.py reverse`
- Validate patches: `./scripts/patches.py validate`
```
datasipper/
├── build/                     # Build artifacts and depot_tools
├── chromium-src/              # Chromium source code
├── docs/                      # Documentation
│   ├── GETTING_STARTED.md     # Detailed setup guide
│   └── PATCH_DEVELOPMENT.md   # Patch development workflow
├── patches/                   # DataSipper modifications
│   ├── series                 # Patch application order
│   ├── core/                  # Essential functionality
│   │   ├── datasipper/        # Core infrastructure
│   │   ├── network-interception/  # Network capture
│   │   └── ui-panel/          # User interface
│   ├── extra/                 # Optional features
│   └── upstream-fixes/        # Chromium bug fixes
├── scripts/                   # Development tools
│   ├── setup-env.sh           # Environment configuration
│   ├── fetch-chromium.sh      # Source fetching
│   ├── configure-build.sh     # Build configuration
│   ├── patches.py             # Patch management
│   └── install-deps-arch.sh   # Dependency installation
└── todo.md                    # Detailed development roadmap
```
- HTTP/HTTPS Interception: Complete request/response capture
- WebSocket Monitoring: Bidirectional message logging
- Real-time Display: Live stream of network events
- Historical Storage: SQLite database for persistence
- Filtering: By URL patterns, content type, method
- Grouping: By domain, API endpoint, content type
- Search: Full-text search across captured data
- Export: JSON, CSV formats
- Kafka Producer: Stream events to Kafka topics
- MySQL Storage: Direct database integration
- Redis Caching: High-performance data caching
- Webhooks: HTTP endpoint forwarding
- JavaScript API: Custom processing scripts
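
Of these planned connectors, webhook forwarding is the easiest to picture. Below is a hedged sketch using libcurl; the function name, endpoint handling, and JSON payload shape are placeholders, not DataSipper's actual connector API:

```cpp
// Hypothetical webhook connector sketch -- not the real DataSipper API.
// curl_global_init(CURL_GLOBAL_DEFAULT) should be called once at startup.
#include <curl/curl.h>
#include <string>

bool ForwardToWebhook(const std::string& endpoint, const std::string& json) {
  CURL* curl = curl_easy_init();
  if (!curl)
    return false;

  curl_slist* headers =
      curl_slist_append(nullptr, "Content-Type: application/json");
  curl_easy_setopt(curl, CURLOPT_URL, endpoint.c_str());
  curl_easy_setopt(curl, CURLOPT_POSTFIELDS, json.c_str());  // POST body
  curl_easy_setopt(curl, CURLOPT_HTTPHEADER, headers);

  CURLcode res = curl_easy_perform(curl);

  curl_slist_free_all(headers);
  curl_easy_cleanup(curl);
  return res == CURLE_OK;
}
```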
- Getting Started Guide: Complete setup instructions
- Patch Development: How to modify Chromium
- Development Roadmap: Detailed task breakdown
For HTTP/HTTPS traffic, the current implementation captures:
- Request URL, method, headers, body
- Response status, headers, body
- Timing information
- Error codes and failure reasons
For WebSocket connections, it captures:
- Connection establishment and handshake
- Bidirectional message content (text/binary)
- Frame metadata (opcode, FIN bit)
- Connection closure events
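
The fields above suggest record shapes roughly like the following; these structs are illustrative assumptions, not the project's actual storage types:

```cpp
// Illustrative shapes for captured events -- names and fields are assumed.
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

struct HttpEventRecord {
  std::string url;
  std::string method;
  std::vector<std::pair<std::string, std::string>> request_headers;
  std::string request_body;
  int response_status = 0;
  std::vector<std::pair<std::string, std::string>> response_headers;
  std::string response_body;
  std::int64_t start_time_ms = 0;  // timing information
  std::int64_t duration_ms = 0;
  int net_error = 0;               // error code / failure reason
};

struct WebSocketMessageRecord {
  bool outgoing = false;    // message direction
  bool is_binary = false;   // text vs. binary content
  std::uint8_t opcode = 0;  // frame metadata
  bool fin = true;          // FIN bit
  std::string payload;
};
```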
Captured events flow through two storage tiers:
- Real-time circular buffer (10,000 events default)
- SQLite persistence with indexed queries
- Configurable retention policies
- Database maintenance and cleanup
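
A minimal sketch of what the indexed SQLite persistence could look like, using the SQLite C API; the table name, columns, and the seven-day retention window are assumptions, not DataSipper's actual schema:

```cpp
// Assumed schema for illustration -- not DataSipper's real tables.
#include <sqlite3.h>
#include <cstdio>

int main() {
  sqlite3* db = nullptr;
  if (sqlite3_open("datasipper.db", &db) != SQLITE_OK)
    return 1;

  const char* schema =
      "CREATE TABLE IF NOT EXISTS network_events ("
      "  id INTEGER PRIMARY KEY AUTOINCREMENT,"
      "  url TEXT, method TEXT, status INTEGER,"
      "  timestamp_ms INTEGER);"
      // Index the timestamp so range queries and retention cleanup stay fast.
      "CREATE INDEX IF NOT EXISTS idx_events_time "
      "  ON network_events(timestamp_ms);";

  char* err = nullptr;
  if (sqlite3_exec(db, schema, nullptr, nullptr, &err) != SQLITE_OK) {
    std::fprintf(stderr, "sqlite error: %s\n", err);
    sqlite3_free(err);
  }

  // Example retention policy: delete events older than seven days.
  sqlite3_exec(db,
               "DELETE FROM network_events "
               "WHERE timestamp_ms < strftime('%s','now','-7 days') * 1000;",
               nullptr, nullptr, nullptr);

  sqlite3_close(db);
  return 0;
}
```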
On privacy and security:
- All captured data remains local by default
- Optional external forwarding requires explicit configuration
- Request/response bodies can be disabled for sensitive data
- Configurable data retention and cleanup policies
- Follow the patch development workflow in PATCH_DEVELOPMENT.md
- Test thoroughly on clean Chromium source
- Document changes and maintain compatibility
- Submit patches for review
DataSipper is built on Chromium and follows the same BSD-style license. See individual files for specific license information.
Based on the current implementation status, the immediate priorities are:
- Complete UI Panel: Implement the slide-out panel with React/JavaScript
- IPC Communication: Connect network observers to UI panel
- Real-time Updates: WebSocket/MessageChannel for live data
- Basic Filtering: URL patterns and content type filters
- Export Functionality: JSON/CSV export for captured data
The foundation is solid with working network interception and data storage. The next phase focuses on user interface and real-time data presentation.