This project implements a comprehensive monitoring dashboard using Netdata, following the requirements from roadmap.sh. The goal is to help you learn the basics of system monitoring, understand how to monitor system health, and get acquainted with real-time performance tracking.
- Install Netdata on a Linux system
- Configure Netdata to monitor core system metrics:
  - CPU usage
  - Memory usage
  - Disk I/O
- Access the Netdata dashboard via web browser
- Customize dashboard aspects (add/modify charts)
- Set up alerts (e.g., CPU usage > 80%)
- Implemented process-specific monitoring:
  - System processes (sshd, netdata)
  - Web services (nginx, apache, httpd)
  - Database services (mysql, postgresql, mongodb)
  - Custom services (docker, containerd, kubelet)
- Configured detailed metrics for each process:
  - CPU usage percentage
  - Memory usage (private and RSS)
  - Process status and uptime
  - Thread count
  - File descriptors
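The per-process metrics listed above can also be read straight from the Linux `/proc` filesystem, which is a handy way to cross-check what the dashboard shows. The following is a minimal sketch under that assumption; `proc_metrics` is a hypothetical helper name and is not part of this project's scripts or of Netdata.

```shell
#!/bin/sh
# Print thread count, resident memory, and open file descriptors for a PID.
# Reads the Linux /proc filesystem directly; Netdata itself is not involved.
proc_metrics() {
    pid="$1"
    status="/proc/$pid/status"
    [ -r "$status" ] || { echo "no such process: $pid" >&2; return 1; }

    # "Threads:" and "VmRSS:" are standard fields in /proc/<pid>/status.
    threads=$(awk '/^Threads:/ {print $2}' "$status")
    rss_kb=$(awk '/^VmRSS:/ {print $2}' "$status")

    # Each entry in /proc/<pid>/fd is one open file descriptor.
    # Listing another user's fd directory needs root; the count is 0 otherwise.
    fds=$(ls "/proc/$pid/fd" 2>/dev/null | wc -l)

    # VmRSS is absent for kernel threads, so fall back to 0.
    echo "pid=$pid threads=$threads rss_kb=${rss_kb:-0} fds=$fds"
}
```

For example, `proc_metrics "$(pgrep -o sshd)"` prints one line for the oldest `sshd` process. Run it with sudo if the file descriptor count comes back as 0 for a process owned by another user.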
- setup.sh: Automates Netdata installation and custom configuration
- test_dashboard.sh: Generates test load and verifies that monitoring and alerts respond
- cleanup.sh: Removes the Netdata agent, its configuration files, and the test tools
- Automates Netdata installation on Amazon Linux
- Configures basic system monitoring
- Sets up CPU usage alerts
- Enables dashboard access
- Configures custom process monitoring
- Sets proper permissions for configuration files
- Generates test loads:
  - CPU stress testing
  - Disk I/O operations
  - Memory usage simulation
- Verifies monitoring functionality
- Checks alert triggering
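The load-generation steps above can be approximated with standard shell tools. This is a minimal sketch, not the contents of `test_dashboard.sh`: the helper names `cpu_load` and `disk_load` and the default scratch path are assumptions made for illustration, and only CPU and disk load are covered.

```shell
#!/bin/sh
# Keep one CPU core busy for the given number of seconds.
cpu_load() {
    seconds="$1"
    # Start a busy loop in the background and remember its PID.
    sh -c 'while :; do :; done' &
    burner=$!
    sleep "$seconds"
    kill "$burner"
    # wait reports the kill signal as a failure status; that is expected.
    wait "$burner" 2>/dev/null || true
}

# Write a file of the given size in MiB, then delete it, to generate disk I/O.
disk_load() {
    size_mb="$1"
    target="${2:-/tmp/netdata_io_test.bin}"   # hypothetical scratch path
    # conv=fsync flushes the data to disk so the write shows up as real I/O.
    dd if=/dev/zero of="$target" bs=1M count="$size_mb" conv=fsync 2>/dev/null || return 1
    rm -f "$target"
}
```

Running `cpu_load 60` in one terminal per core while watching the CPU chart should be enough to push usage past an 80% threshold. `disk_load 512` produces a short burst on the disk I/O chart.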
- Removes Netdata installation
- Cleans up configuration files
- Removes test tools
- Verifies complete removal
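The final verification step can be sketched as a check that neither the binary nor the configuration directory is left behind. This is an illustration under stated assumptions, not the actual `cleanup.sh` logic: `verify_removed` is a hypothetical name, and the defaults assume a standard install with `netdata` on the PATH and configuration in `/etc/netdata`.

```shell
#!/bin/sh
# Return 0 only if neither the binary nor the config directory remains.
verify_removed() {
    binary="${1:-netdata}"            # assumed binary name
    config_dir="${2:-/etc/netdata}"   # assumed config location
    leftovers=0

    if command -v "$binary" >/dev/null 2>&1; then
        echo "still installed: $binary" >&2
        leftovers=1
    fi
    if [ -d "$config_dir" ]; then
        echo "config directory still present: $config_dir" >&2
        leftovers=1
    fi
    return "$leftovers"
}
```

Calling `verify_removed && echo "Netdata fully removed"` after cleanup gives a simple pass/fail signal. Both arguments can be overridden for non-standard install locations.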
# Custom process groups for monitoring
system: sshd* netdata*
webserver: nginx* apache* httpd*
database: mysql* postgresql* mongod*
custom_services: docker* containerd* kubelet*
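Group lines like the ones above follow the `group: pattern pattern ...` format of Netdata's `apps_groups.conf`. As a rough sketch of how a setup script might add them without duplicating entries on repeated runs, assuming the hypothetical helper name `add_process_group`:

```shell
#!/bin/sh
# Append a "group: patterns" line to an apps_groups.conf-style file,
# skipping the write when a line for that group already exists.
add_process_group() {
    conf_file="$1"
    group="$2"
    patterns="$3"
    if grep -q "^${group}:" "$conf_file" 2>/dev/null; then
        return 0
    fi
    printf '%s: %s\n' "$group" "$patterns" >> "$conf_file"
}
```

A call such as `add_process_group /etc/netdata/apps_groups.conf webserver 'nginx* apache* httpd*'` would add the web server group. The path assumes a standard install, and the agent needs a restart before the new group appears.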
update_every: 1
priority: 60000
process_stats:
  name: 'System Process Stats'
  check_intervals: 1
  processes:
    sshd:
      command: 'sshd'
      metrics:
        - cpu
        - mem
        - threads
        - uptime
    netdata:
      command: 'netdata'
      metrics:
        - cpu
        - mem
        - threads
        - uptime

- Installation
  chmod +x setup.sh
  sudo ./setup.sh
- Testing
  chmod +x test_dashboard.sh
  sudo ./test_dashboard.sh
- Cleanup
  chmod +x cleanup.sh
  sudo ./cleanup.sh

- Access via: http://YOUR_SERVER_IP:19999
- Default port: 19999
- Remember to configure security group/firewall rules
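Before debugging firewall rules, it helps to confirm that the agent answers locally. The sketch below assumes `curl` is installed and uses the agent's `/api/v1/info` endpoint; the function names `dashboard_url` and `dashboard_reachable` are hypothetical.

```shell
#!/bin/sh
# Build the dashboard URL for a host; 19999 is Netdata's default port.
dashboard_url() {
    host="$1"
    port="${2:-19999}"
    echo "http://${host}:${port}"
}

# Return 0 if the agent answers on its info endpoint within 5 seconds.
dashboard_reachable() {
    curl --silent --fail --max-time 5 --output /dev/null \
        "$(dashboard_url "$@")/api/v1/info"
}
```

If `dashboard_reachable localhost` succeeds on the server but `dashboard_reachable YOUR_SERVER_IP` fails from another machine, the security group or firewall is the likely cause.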
- CPU Usage Alerts:
  - Warning: 80% usage
  - Critical: 90% usage
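The thresholds above map onto the `warn` and `crit` lines of a Netdata health definition. The following sketch writes one such definition. It is not necessarily what `setup.sh` generates: the file name `cpu_usage.conf`, the alert name `cpu_usage_high`, and the one-minute averaging window are all assumptions.

```shell
#!/bin/sh
# Write a CPU alert definition into the given health.d directory.
# File name and alert name are hypothetical; adjust them to your setup.
write_cpu_alert() {
    health_dir="${1:-/etc/netdata/health.d}"
    # The quoted 'EOF' keeps $this literal so Netdata, not the shell, expands it.
    cat > "$health_dir/cpu_usage.conf" <<'EOF'
 alarm: cpu_usage_high
    on: system.cpu
lookup: average -1m unaligned of user,system
 units: %
 every: 10s
  warn: $this > 80
  crit: $this > 90
  info: average CPU utilization over the last minute
EOF
}
```

After running `sudo sh -c '. ./alert.sh; write_cpu_alert'` (with the function saved in a hypothetical `alert.sh`), the agent should pick up the new definition after `sudo netdatacli reload-health` or a restart.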
- Real-time anomaly rate monitoring
- Anomalous metrics tracking
- Automated detection and alerting
- Customizable detection thresholds
- Real-time event logging
- Alert status tracking
- System state changes
- Node-specific monitoring
- Customizable alert filters
- Comprehensive alert dashboard
- Multiple alert categories:
  - File descriptor utilization
  - Data collection status
  - Disk usage monitoring
  - CPU load tracking
  - Network metrics
- Role-based alert management (sysadmin, silent)
- Clear/Warning/Critical status tracking
- CPU utilization graphs
- Pressure stall information (PSI)
- Multi-core performance tracking
- System load analysis
- Real-time performance metrics
- Live node status
- Resource utilization:
  - CPU usage
  - Memory availability
  - Disk I/O
  - Network traffic
- Cloud instance information
- System architecture details
- Real-time metrics visualization
These dashboards provide comprehensive monitoring capabilities with:
- Real-time data visualization
- Historical trend analysis
- Customizable alerts and thresholds
- System and process-level monitoring
- Resource utilization tracking
- Anomaly detection and reporting
- Process Group Monitoring
  - System processes status and metrics
  - Web services performance
  - Database services health
  - Container and orchestration services
- Real-time Metrics
  - Process-specific CPU usage
  - Memory utilization patterns
  - Thread management
  - Resource allocation
- Performance Analytics
  - Historical data tracking
  - Resource usage patterns
  - System bottleneck identification
- Understanding of system monitoring basics
- Experience with automated setup and testing
- Foundation for advanced monitoring techniques
- Process-specific monitoring implementation
- Custom dashboard configuration
- Alert system setup and management
- Amazon Linux or compatible distribution
- Root/sudo access
- Minimum 1GB RAM recommended
- Open port 19999 for dashboard access
- Additional process group monitoring
- Enhanced metric collection
- Advanced alerting system
- Backup and restore functionality
- Custom visualization options
- Integration with external monitoring systems
This project is based on the "Simple Monitoring" project from roadmap.sh, which is designed to introduce developers to system monitoring concepts and practices. It extends that project with custom process monitoring and detailed metric collection.