
A sleek web app powered by Python, Flask, and Scikit-learn to intelligently detect SMS spam. Enter any message and see the verdict in real-time :)


evan-william/sms-spamguard


SMS Spam Detector Web App

A classic machine learning project for text classification, now built with a clean, modern web interface using Python and Flask.

Overview

This project implements a Natural Language Processing (NLP) model to solve a common problem: spam filtering. Originally a command-line tool, it has been transformed into an interactive web application where users can enter any message and instantly see whether it's classified as spam or legitimate ("ham").

It uses a Naive Bayes classifier, a simple yet powerful algorithm well-suited for text-based tasks, to learn the patterns that differentiate spam from ham.
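To make the idea concrete, here is a minimal from-scratch sketch of word counting followed by multinomial Naive Bayes with Laplace smoothing. It is not the project's code (the project relies on scikit-learn); the toy messages and the function names `train_naive_bayes` and `predict` are made up for illustration.

```python
import math
from collections import Counter

# Toy training data, invented for illustration (not taken from spam.csv).
TRAIN = [
    ("win a free prize now", "spam"),
    ("free cash claim your prize", "spam"),
    ("are we still meeting for lunch", "ham"),
    ("see you at home tonight", "ham"),
]


def train_naive_bayes(examples):
    """Count words per class and messages per class."""
    word_counts = {"spam": Counter(), "ham": Counter()}
    class_counts = Counter()
    for text, label in examples:
        class_counts[label] += 1
        word_counts[label].update(text.lower().split())
    vocab = set(word_counts["spam"]) | set(word_counts["ham"])
    return word_counts, class_counts, vocab


def predict(text, word_counts, class_counts, vocab, alpha=1.0):
    """Return the class with the highest smoothed log-posterior."""
    total_docs = sum(class_counts.values())
    scores = {}
    for label in class_counts:
        # Start from the log prior P(class).
        score = math.log(class_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for word in text.lower().split():
            if word not in vocab:
                continue  # words never seen in training carry no evidence
            count = word_counts[label][word]
            # Laplace smoothing keeps unseen (word, class) pairs above zero.
            score += math.log((count + alpha) / (total_words + alpha * len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)


model = train_naive_bayes(TRAIN)
print(predict("claim your free prize", *model))
print(predict("lunch at home", *model))
```

Working in log space avoids numeric underflow when many small word probabilities are multiplied together.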

What I Learned

  • Text Vectorization: Converting text messages into numerical features using CountVectorizer
  • Naive Bayes Classifier: Implementing and training a probabilistic model for text classification
  • Scikit-learn Pipelines: Building a clean, reusable workflow that chains preprocessing and modeling steps
  • Web Development with Flask: Creating routes, handling form submissions, and rendering dynamic templates
  • Frontend Integration: Using HTML and Tailwind CSS to build a responsive and user-friendly interface
  • Full-Stack Connection: Wiring a Python machine learning backend to a web frontend to create a complete application

Technical Stack

  • Python: Core programming language
  • Flask: Web framework for the backend
  • Scikit-learn: For the machine learning pipeline and model
  • Pandas: For data manipulation
  • HTML & Tailwind CSS: For the frontend user interface
  • Pytest: For running the automated tests

Key Concepts Applied

The core of the model is a scikit-learn Pipeline that chains text vectorization and classification into a single estimator. The Flask application trains this pipeline once on startup and uses it to serve predictions through a web interface.

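from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import Pipeline

# Inside the SpamDetector class: word counts feed a multinomial Naive Bayes model
self.model = Pipeline([
    ('vectorizer', CountVectorizer()),
    ('classifier', MultinomialNB()),
])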
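As a rough illustration of how such a pipeline is used, the sketch below fits the same two steps on a handful of invented messages and asks for a prediction together with class probabilities. The toy data and variable names are assumptions; the real application trains on spam.csv.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import Pipeline

# Toy data, invented for illustration (the project trains on data/spam.csv).
messages = [
    "win a free prize now",
    "free cash claim your prize",
    "are we still meeting for lunch",
    "see you at home tonight",
]
labels = ["spam", "spam", "ham", "ham"]

model = Pipeline([
    ("vectorizer", CountVectorizer()),
    ("classifier", MultinomialNB()),
])
model.fit(messages, labels)  # vectorizer and classifier are fitted in one call

text = "free prize, claim now"
prediction = model.predict([text])[0]
# predict_proba returns one row per input, ordered like model.classes_.
probabilities = dict(zip(model.classes_, model.predict_proba([text])[0]))
print(prediction, probabilities)
```

Because the vectorizer lives inside the pipeline, raw strings can be passed straight to `predict`, which is what makes it convenient to call from a web handler.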

How It Works

  1. Model Training: When the Flask application starts, it loads the spam.csv dataset and trains the Naive Bayes model just once
  2. User Input: A user visits the web page and submits a message through an HTML form
  3. Backend Prediction: The Flask backend receives the message, feeds it to the model trained at startup, and gets a prediction (spam or ham) along with the class probabilities
  4. Display Results: The application re-renders the webpage, dynamically displaying the result, confidence scores, and a clean visual summary
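The four steps above can be sketched in a single file. This is a simplified stand-in, not the project's app.py: the route path, the form field name `message`, the inline template (instead of templates/index.html), and the toy training data are all assumptions made to keep the example self-contained.

```python
from flask import Flask, render_template_string, request
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import Pipeline

app = Flask(__name__)

# Step 1: train once at import time (toy data stands in for spam.csv).
messages = [
    "win a free prize now",
    "free cash claim your prize",
    "are we still meeting for lunch",
    "see you at home tonight",
]
labels = ["spam", "spam", "ham", "ham"]
model = Pipeline([
    ("vectorizer", CountVectorizer()),
    ("classifier", MultinomialNB()),
]).fit(messages, labels)

# Hypothetical inline template; the real project renders templates/index.html.
PAGE = """
<form method="post">
  <input name="message">
  <button type="submit">Check</button>
</form>
{% if label %}
<p>{{ label }} ({{ "%.0f"|format(confidence * 100) }}% confident)</p>
{% endif %}
"""


@app.route("/", methods=["GET", "POST"])
def index():
    label = confidence = None
    if request.method == "POST":
        # Steps 2 and 3: read the submitted text and score it.
        text = request.form.get("message", "")
        probabilities = model.predict_proba([text])[0]
        best = probabilities.argmax()
        label = str(model.classes_[best])
        confidence = float(probabilities[best])
    # Step 4: re-render the page, with a result if one exists.
    return render_template_string(PAGE, label=label, confidence=confidence)
```

Training at import time means every request reuses the same fitted model instead of retraining per request.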

Installation & Setup

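# Clone the repository
git clone https://github.com/username/sms-spam-detector.git
cd sms-spam-detector

# Create and activate a virtual environment
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install the dependencies
pip install -r requirements.txt

# Start the development server
flask run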

Running Tests (Optional)

To verify that the data handling and model components are working correctly, you can run the test suite.

pytest
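For a sense of what such tests might look like, here is a hedged sketch in the style pytest collects (plain functions whose names start with `test_`). It does not reproduce the project's test_model.py; the helper `build_model` and the toy data are hypothetical.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import Pipeline


def build_model():
    """Hypothetical helper: fit the pipeline on a tiny invented dataset."""
    messages = [
        "win a free prize now",
        "free cash claim your prize",
        "are we still meeting for lunch",
        "see you at home tonight",
    ]
    labels = ["spam", "spam", "ham", "ham"]
    model = Pipeline([
        ("vectorizer", CountVectorizer()),
        ("classifier", MultinomialNB()),
    ])
    return model.fit(messages, labels)


def test_obvious_spam_is_flagged():
    assert build_model().predict(["free prize claim now"])[0] == "spam"


def test_ordinary_message_is_ham():
    assert build_model().predict(["see you at lunch"])[0] == "ham"


def test_probabilities_sum_to_one():
    probabilities = build_model().predict_proba(["hello"])[0]
    assert abs(probabilities.sum() - 1.0) < 1e-9
```

Saved under a name like tests/test_example.py, these functions would be discovered and run by the `pytest` command shown above.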

Requirements

You can install all dependencies from the requirements.txt file.

requirements.txt

Flask==3.0.0
scikit-learn==1.4.2
pandas==2.2.1
numpy==1.26.4
pytest==7.4.0
pytest-cov==4.1.0

Python 3.9 or higher (the pinned scikit-learn, pandas, and NumPy versions do not support Python 3.8)

Project Structure

The project is organized to separate the Flask application logic from the machine learning model code.

sms-spam-detector/
├── app.py                  # Main Flask application file
├── requirements.txt        # Project dependencies
├── data/
│   └── spam.csv            # The training dataset
├── spam_detector/
│   ├── __init__.py
│   ├── data_handler.py     # Functions for loading and cleaning data
│   └── model.py            # The SpamDetector class and ML logic
└── templates/
    └── index.html          # The HTML file for the user interface

Note: A tests/ directory containing test_model.py can be included for development and validation.
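The loading and cleaning role described for data_handler.py could look roughly like the sketch below. The function name `load_messages` and the column names `label` and `message` are assumptions; the actual layout of spam.csv is not shown here, so adjust them to the real header.

```python
import io

import pandas as pd


def load_messages(source, label_col="label", text_col="message"):
    """Load a labelled message CSV and apply basic cleaning.

    Column names are assumptions; pass the real ones for your dataset.
    """
    df = pd.read_csv(source)
    # Keep only the two needed columns and drop rows with missing values.
    df = df[[label_col, text_col]].dropna().copy()
    # Normalize labels and trim stray whitespace.
    df[label_col] = df[label_col].str.strip().str.lower()
    df[text_col] = df[text_col].str.strip()
    # Discard unexpected labels and exact duplicate rows.
    df = df[df[label_col].isin(["spam", "ham"])]
    return df.drop_duplicates().reset_index(drop=True)


# In-memory sample so the sketch runs without the real dataset.
sample = io.StringIO(
    "label,message\n"
    "ham,See you at lunch\n"
    "SPAM,Free prize! Claim now\n"
    "ham,See you at lunch\n"
    "ham,\n"
)
df = load_messages(sample)
print(df)
```

Keeping this logic in its own module lets the Flask app and the tests share one loading path.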

Things I'd Improve

  • Use TF-IDF Vectorization: Instead of simple word counts, use Term Frequency-Inverse Document Frequency (TF-IDF) for potentially better feature representation
  • Try Other Models: Experiment with other classifiers like Logistic Regression or Support Vector Machines (SVM) to compare performance
  • Containerize: Package the application with Docker for easier deployment and scalability
  • Deploy to the Cloud: Host the application on a service like Heroku, Vercel, or AWS so anyone can access it
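The first two ideas amount to swapping pipeline steps. A minimal sketch, assuming the same toy data as a stand-in for spam.csv (a real comparison would need held-out data and a metric such as F1):

```python
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Toy data, invented for illustration only.
messages = [
    "win a free prize now",
    "free cash claim your prize",
    "are we still meeting for lunch",
    "see you at home tonight",
]
labels = ["spam", "spam", "ham", "ham"]

# Each candidate pairs a vectorizer with a classifier.
candidates = {
    "counts + naive bayes": make_pipeline(CountVectorizer(), MultinomialNB()),
    "tf-idf + naive bayes": make_pipeline(TfidfVectorizer(), MultinomialNB()),
    "tf-idf + logistic regression": make_pipeline(
        TfidfVectorizer(), LogisticRegression()
    ),
    "tf-idf + linear SVM": make_pipeline(TfidfVectorizer(), LinearSVC()),
}

predictions = {}
for name, pipeline in candidates.items():
    pipeline.fit(messages, labels)
    predictions[name] = pipeline.predict(["claim your free prize"])[0]
    print(f"{name}: {predictions[name]}")
```

Note that `LinearSVC` has no `predict_proba`, so a version of the app that displays confidence scores would need a different score (for example `decision_function`) or a calibrated wrapper for that model.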

Author

Evan William
Version 2.0 (2025)

This project was an incredible learning journey that transformed a simple command-line script into a complete ML web application! It's been amazing to see how all the pieces come together, from raw text data to a sleek web interface that anyone can use.

If you have any feedback or suggestions, feel free to open an issue or pull request!


This project is for educational purposes. Feel free to fork, modify, or use it for your own learning.
