This project aims to detect potential Distributed Denial of Service (DDoS) attacks using machine learning techniques on historical network traffic data. It explores multiple models and compares performance across different feature subsets.
- DDoS_Attack_Prediction.ipynb: Main notebook containing EDA, preprocessing, training, and evaluation
- Data/: All datasets (unprocessed and preprocessed)
We implemented and compared the following classification models:
- Machine Learning
  - Random Forest
  - XGBoost
- Deep Learning
  - TabNet
Each model was trained on:
- Full feature set
- Top 6 highly correlated features
- 6 least correlated features
Data was split into 80% train, 10% validation, 10% test.
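A minimal sketch of an 80/10/10 split is shown below. It shuffles row indices with a fixed seed and slices them; the function name `split_indices`, the seed, and the row count are illustrative assumptions, and the notebook may split the data differently (for example, by calling scikit-learn's `train_test_split` twice).

```python
import random


def split_indices(n_rows, train_frac=0.8, val_frac=0.1, seed=42):
    """Shuffle row indices and cut them into train, validation, and test parts.

    The test set receives whatever remains after the train and validation cuts.
    """
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)  # fixed seed keeps the split reproducible

    n_train = int(n_rows * train_frac)
    n_val = int(n_rows * val_frac)

    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]
    return train, val, test


# Hypothetical row count, for illustration only.
train_idx, val_idx, test_idx = split_indices(1000)
print(len(train_idx), len(val_idx), len(test_idx))
```

The three index lists are disjoint, so no row appears in more than one partition. For imbalanced attack and benign classes, a stratified split may be preferable.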
| Model | Full Features | High Correlation Only | Low Correlation Only |
|---|---|---|---|
| Random Forest | 100% | 99% | 75% |
| XGBoost | 99% | 98% | 74% |
| TabNet | 97% | 92% | 72% |
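The 6 most highly correlated features proved nearly as effective as the full feature set, scoring 92% to 99% against 97% to 100%. The 6 least correlated features performed much worse, at 72% to 75% across all three models, which highlights the importance of feature selection.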
- bytecount
- pktcount
- byteperflow
- pktperflow
- pktrate
- tot_dur
- Real-time implementation using live traffic streams
- Automated alert system for suspicious behavior
- Explainability with SHAP/TabNet attention to interpret model decisions
- Scalability testing on larger datasets or multiple network sources
For questions or collaboration, feel free to reach out:
- Name: Muhammad Hadi Nur Fakhri
- LinkedIn: linkedin.com/in/nur-fakhri/
This project is open-source.