Phishing Website Detection by Machine Learning Techniques

Objective

Phishing websites pose a significant threat in today's digital landscape, often masquerading as legitimate entities to deceive users. The aim of this project is to employ machine learning models and deep neural networks to predict phishing websites effectively.

Data Collection

Phishing URLs are sourced from an open-source service that provides regularly updated datasets in formats such as CSV and JSON.
Legitimate URLs are obtained from the open datasets provided by the University of New Brunswick. Specifically, the benign URL dataset is utilized for this project.
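
As a rough illustration of how the two sources could be combined into one labelled dataset, the sketch below reads URLs from two CSV inputs, labels phishing URLs 1 and legitimate URLs 0, and draws the same number from each class. The `url` column name, the helper names, and the sample rows are assumptions made for this example; they are not taken from the project's notebooks.

```python
import csv
import io
import random

# Made-up rows standing in for the real downloads.
# The "url" column name is an assumption, not the actual schema of either source.
PHISHING_CSV = (
    "url\n"
    "http://login.example.test/verify\n"
    "http://203.0.113.7/account\n"
    "http://pay.example.test/update\n"
)
LEGIT_CSV = (
    "url\n"
    "https://www.example.org/\n"
    "https://docs.example.org/guide\n"
)


def read_urls(csv_text, column="url"):
    """Return the non-empty values of one column from CSV text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [row[column] for row in reader if row.get(column)]


def build_labelled_set(phishing, legitimate, per_class, seed=0):
    """Sample up to per_class URLs from each source and label them 1 / 0."""
    rng = random.Random(seed)
    n = min(per_class, len(phishing), len(legitimate))
    rows = [(url, 1) for url in rng.sample(phishing, n)]
    rows += [(url, 0) for url in rng.sample(legitimate, n)]
    rng.shuffle(rows)  # mix the two classes before any later split
    return rows


dataset = build_labelled_set(
    read_urls(PHISHING_CSV), read_urls(LEGIT_CSV), per_class=2
)
for url, label in dataset:
    print(label, url)
```

Sampling the same number of URLs from each class keeps the dataset balanced, so a model cannot score well simply by always predicting the majority class.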

Feature Extraction

Features are extracted from the URL data across three categories:

Address Bar-based Features: Extracting features related to the URL itself.
Domain-based Features: Incorporating features derived from the domain of the URL.
HTML & JavaScript-based Features: Extracting features from the HTML and JavaScript content of the website.
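
To make the first category concrete, here is a minimal sketch of address bar-based feature extraction using only the Python standard library. The specific features and the function names (`has_ip_host`, `url_depth`, `address_bar_features`) are illustrative assumptions; the project's own extraction code may compute a different set.

```python
import ipaddress
from urllib.parse import urlparse


def has_ip_host(url: str) -> int:
    """Return 1 if the host part of the URL is a literal IP address, else 0."""
    host = urlparse(url).hostname or ""
    try:
        ipaddress.ip_address(host)
        return 1
    except ValueError:
        return 0


def url_depth(url: str) -> int:
    """Count the non-empty path segments, e.g. /a/b/c has depth 3."""
    return sum(1 for segment in urlparse(url).path.split("/") if segment)


def address_bar_features(url: str) -> dict:
    """Compute a few illustrative address-bar features for one URL."""
    parsed = urlparse(url)
    host = parsed.hostname or ""
    return {
        "has_ip": has_ip_host(url),           # raw IP instead of a domain name
        "has_at": int("@" in url),            # '@' can hide the real host
        "length": len(url),                   # total URL length in characters
        "depth": url_depth(url),              # number of path segments
        "hyphen_in_host": int("-" in host),   # hyphenated look-alike domains
        "uses_https": int(parsed.scheme == "https"),
    }


print(address_bar_features("http://192.168.0.1/login/verify"))
```

Each URL becomes a fixed-length row of numbers, which is the input shape that tabular classifiers such as decision trees expect.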

Models & Training

The dataset is split 80/20 into training and test sets. The supervised machine learning models considered for training include:

Decision Tree
Random Forest
Multilayer Perceptrons
XGBoost
Autoencoder Neural Network
Support Vector Machines
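
The 80/20 split mentioned above can be sketched in plain Python as follows. This is a standard-library illustration of the idea, not the project's actual training code; the function names, the fixed seed, and the toy data are assumptions made for the example.

```python
import random


def train_test_split_80_20(rows, labels, test_fraction=0.2, seed=42):
    """Shuffle the row indices and hold out test_fraction of them for testing."""
    if len(rows) != len(labels):
        raise ValueError("rows and labels must have the same length")
    indices = list(range(len(rows)))
    random.Random(seed).shuffle(indices)
    n_test = round(len(indices) * test_fraction)
    test_idx, train_idx = indices[:n_test], indices[n_test:]

    def pick(seq, idx):
        return [seq[i] for i in idx]

    return (
        pick(rows, train_idx),
        pick(rows, test_idx),
        pick(labels, train_idx),
        pick(labels, test_idx),
    )


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if not y_true:
        return 0.0
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


# Toy data: 100 rows whose label is the parity of the row value.
X = list(range(100))
y = [value % 2 for value in X]
X_train, X_test, y_train, y_test = train_test_split_80_20(X, y)
print(len(X_train), len(X_test))
```

In practice, scikit-learn's `train_test_split` with `test_size=0.2` does the same job. Each model in the list would then be fitted on the training portion and scored on the held-out portion, so that every model is compared on the same unseen data.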

Presentation

A concise video presentation of the project is available, accompanied by presentation slides.

End Results

Of the models trained, the XGBoost Classifier performed best. Future work may include a browser extension for real-time phishing website detection and a user-friendly GUI application that classifies URLs.

This project utilizes machine learning techniques to combat phishing attacks effectively. If you require any additional clarification, feel free to reach out.

