jasoncobra3/Website_Phishing_Detection

🛡️ Website Phishing Detection

This project implements a Machine Learning and Deep Learning hybrid approach to detect phishing websites. By analyzing URLs and their associated features, the system predicts whether a given website is legitimate or phishing, leveraging multiple ML algorithms and neural networks for improved accuracy.


📌 Features

  • 📊 Collects phishing URLs from PhishTank and legitimate URLs from the University of New Brunswick (UNB) dataset
  • 🔍 Extracts 17 handcrafted features (address-bar, domain, and HTML & JavaScript based)
  • 🧠 Trains multiple ML & DL models, including Decision Trees, Random Forest, XGBoost, SVMs, Autoencoders, and MLPs
  • 📈 Evaluates the models and compares their performance
  • 🏆 Best-performing model: XGBoost (86.4% accuracy)
  • 💾 Saves the trained model as a .pickle file for future predictions
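
The feature-extraction step listed above can be illustrated with a small sketch. The README does not enumerate the 17 features, so the feature names, the shortener list, and the function `address_bar_features` below are assumptions chosen for illustration, not the project's actual implementation. The sketch covers only a few address-bar style features and uses the Python standard library.

```python
import ipaddress
import re
from urllib.parse import urlparse

# Illustrative shortener list only; the project's real list is not given in the README.
SHORTENER_PATTERN = re.compile(r"^(bit\.ly|goo\.gl|tinyurl\.com|t\.co|ow\.ly)$", re.IGNORECASE)


def has_ip_host(url: str) -> int:
    """Return 1 if the host part of the URL is a literal IP address, else 0."""
    host = urlparse(url).hostname or ""
    try:
        ipaddress.ip_address(host)
        return 1
    except ValueError:
        return 0


def address_bar_features(url: str) -> dict:
    """Compute a few hypothetical address-bar features for one URL."""
    parsed = urlparse(url)
    host = parsed.hostname or ""
    # Number of non-empty path segments, e.g. "/a/b/login.php" has 3.
    depth = sum(1 for segment in parsed.path.split("/") if segment)
    return {
        "has_ip": has_ip_host(url),
        "has_at": int("@" in url),
        "url_length": len(url),
        "path_depth": depth,
        # A "//" after the scheme separator suggests an embedded redirect.
        "double_slash_redirect": int(url.rfind("//") > 7),
        "hyphen_in_host": int("-" in host),
        "uses_shortener": int(bool(SHORTENER_PATTERN.match(host))),
    }


features = address_bar_features("http://192.168.0.1/a/b/login.php")
```

Each URL becomes a flat dictionary of integers, which can be collected into a Pandas DataFrame row by row before training.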

🛠️ Tech Stack

| Purpose            | Technology Used                           |
|--------------------|-------------------------------------------|
| Data Collection    | PhishTank, UNB Dataset                    |
| Feature Extraction | Python, Regex, Pandas                     |
| ML Models          | Decision Tree, Random Forest, SVM, XGBoost |
| DL Models          | Autoencoder, MLP                          |
| Model Persistence  | Pickle                                    |
| Visualization      | Matplotlib, Seaborn                       |
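
As a rough illustration of the model-persistence row, the sketch below saves and reloads an object with Python's `pickle` module. The helper names `save_model` and `load_model`, the file name `model.pickle`, and the dictionary standing in for a trained model are all assumptions. The README only states that the trained model is stored as a .pickle file.

```python
import pickle
import tempfile
from pathlib import Path


def save_model(model, path):
    """Serialize any picklable object (for example a fitted classifier) to disk."""
    with Path(path).open("wb") as fh:
        pickle.dump(model, fh)


def load_model(path):
    """Load a previously pickled object. Only unpickle files from a trusted source."""
    with Path(path).open("rb") as fh:
        return pickle.load(fh)


# Stand-in for a trained model; a fitted estimator would be passed the same way.
stand_in_model = {"feature_names": ["has_ip", "has_at"], "weights": [0.7, 0.3]}

with tempfile.TemporaryDirectory() as tmp_dir:
    model_path = Path(tmp_dir) / "model.pickle"  # hypothetical file name
    save_model(stand_in_model, model_path)
    restored_model = load_model(model_path)
```

Unpickling can execute arbitrary code, so a saved model file should only be loaded when its origin is known.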

🚀 How to Run

  1. Clone the Repo
    git clone https://github.com/jasoncobra3/Website_Phishing_Detection.git
    cd Website_Phishing_Detection
    
    
  2. Create Virtual Environment
    python -m venv phishing_env
    # Windows:
    phishing_env\Scripts\activate
    # macOS/Linux:
    source phishing_env/bin/activate
    
    
  3. Install Dependencies
    pip install -r requirements.txt
    
    
  4. Run Jupyter Notebooks
    jupyter notebook
    
  • Open "URL Feature Extraction.ipynb" to extract features
  • Open "Phishing Website Detection_Models & Training.ipynb" to train and evaluate the models

📊 Results

  • ✅ XGBoost achieved 86.4% accuracy, outperforming other models
  • ✅ Random Forest and SVM performed moderately well
  • ✅ Autoencoder and MLP showed promising results for DL integration
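
The comparison behind these results can be sketched as a small accuracy harness. This is a minimal illustration, not the notebook's code: the helper names `accuracy` and `compare_models`, the toy data, the two rule-based stand-in predictors, and the label convention (1 = phishing, 0 = legitimate) are all assumptions. With real fitted models, each entry would be the model's `predict` method.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Predictor = Callable[[Sequence[Sequence[int]]], Sequence[int]]


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("cannot compute accuracy on an empty test set")
    correct = sum(1 for truth, pred in zip(y_true, y_pred) if truth == pred)
    return correct / len(y_true)


def compare_models(
    predictors: Dict[str, Predictor],
    X_test: Sequence[Sequence[int]],
    y_test: Sequence[int],
) -> List[Tuple[str, float]]:
    """Score every predictor on the same test set, best accuracy first."""
    scores = {name: accuracy(y_test, predict(X_test)) for name, predict in predictors.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Toy test set: each row is [has_ip, has_at]; labels assume 1 = phishing, 0 = legitimate.
X_test = [[1, 0], [0, 1], [0, 0], [1, 1]]
y_test = [1, 1, 0, 1]

# Rule-based stand-ins for trained models.
predictors = {
    "flag_ip_only": lambda X: [row[0] for row in X],
    "flag_any_feature": lambda X: [int(any(row)) for row in X],
}

ranking = compare_models(predictors, X_test, y_test)
```

Scoring every model on the same held-out test set keeps the accuracies comparable. Accuracy alone can mislead when the phishing and legitimate classes are imbalanced, so precision and recall are worth checking as well.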

🌟 Future Work

  • 🌐 Develop a browser extension to detect phishing in real-time
  • 🎨 Build a GUI/web app for user-friendly phishing detection
  • 🔄 Improve hybrid ML-DL pipeline with larger datasets

🤝 Contributing

Feel free to fork, star, or submit a pull request to contribute improvements!
