SMS Spam Detector Web App

A classic machine learning project for text classification, now built with a clean, modern web interface using Python and Flask.

Overview

This project implements a Natural Language Processing (NLP) model to solve a common problem: spam filtering. Originally a command-line tool, it has been transformed into an interactive web application where users can enter any message and instantly see whether it's classified as spam or legitimate ("ham").

It uses a Naive Bayes classifier, a simple yet powerful algorithm well-suited for text-based tasks, to learn the patterns that differentiate spam from ham.
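To make the idea concrete, here is a minimal from-scratch sketch of a multinomial Naive Bayes classifier with add-one smoothing. It is only an illustration of the algorithm, not the code this project runs (the project relies on scikit-learn), and the function names and toy messages are invented for the example.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(messages, labels):
    """Count words per class and return everything needed for prediction."""
    word_counts = defaultdict(Counter)   # label -> word -> count
    class_counts = Counter(labels)       # label -> number of messages
    vocabulary = set()
    for text, label in zip(messages, labels):
        tokens = text.lower().split()
        word_counts[label].update(tokens)
        vocabulary.update(tokens)
    return word_counts, class_counts, vocabulary


def predict_naive_bayes(model, text):
    """Return the label with the highest log-posterior for `text`."""
    word_counts, class_counts, vocabulary = model
    total_messages = sum(class_counts.values())
    scores = {}
    for label, n_messages in class_counts.items():
        score = math.log(n_messages / total_messages)      # log P(class)
        total_words = sum(word_counts[label].values())
        for token in text.lower().split():
            if token not in vocabulary:
                continue  # ignore words never seen during training
            # log P(word | class) with add-one (Laplace) smoothing
            score += math.log(
                (word_counts[label][token] + 1) / (total_words + len(vocabulary))
            )
        scores[label] = score
    return max(scores, key=scores.get)


# Toy data invented for illustration; not taken from spam.csv
messages = [
    "win a free prize now",
    "free cash claim now",
    "are we still meeting for lunch",
    "see you at lunch tomorrow",
]
labels = ["spam", "spam", "ham", "ham"]

model = train_naive_bayes(messages, labels)
prediction = predict_naive_bayes(model, "claim your free prize")
print(prediction)
```

Working in log space avoids multiplying many small probabilities together, and the smoothing term keeps a word that never appeared in one class from zeroing out that class entirely.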

What I Learned

Text Vectorization: Converting text messages into numerical features using CountVectorizer
Naive Bayes Classifier: Implementing and training a probabilistic model for text classification
Scikit-learn Pipelines: Building a clean, reusable workflow that chains preprocessing and modeling steps
Web Development with Flask: Creating routes, handling form submissions, and rendering dynamic templates
Frontend Integration: Using HTML and Tailwind CSS to build a responsive and user-friendly interface
Full-Stack Connection: Wiring a Python machine learning backend to a web frontend to create a complete application

Technical Stack

Python: Core programming language
Flask: Web framework for the backend
Scikit-learn: For the machine learning pipeline and model
Pandas: For data manipulation
HTML & Tailwind CSS: For the frontend user interface
Pytest: For running the automated tests

Key Concepts Applied

The core of the model is a scikit-learn Pipeline that chains text vectorization and classification into a single estimator. The Flask application trains this pipeline once on startup and then uses it to serve predictions through the web interface.

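from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import Pipeline
# Inside the SpamDetector class: vectorize the text, then classify it
self.model = Pipeline([
    ('vectorizer', CountVectorizer()),
    ('classifier', MultinomialNB()),
])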
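For a sense of how such a pipeline is used, here is a minimal standalone sketch that fits it on a handful of invented messages and asks for a prediction with class probabilities. The training data and variable names are assumptions made for the example; the real application trains on spam.csv.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import Pipeline

model = Pipeline([
    ("vectorizer", CountVectorizer()),
    ("classifier", MultinomialNB()),
])

# Toy stand-in for the real dataset
train_messages = [
    "win a free prize now",
    "free cash claim now",
    "are we still meeting for lunch",
    "see you at lunch tomorrow",
]
train_labels = ["spam", "spam", "ham", "ham"]

# fit() runs the vectorizer and the classifier in sequence
model.fit(train_messages, train_labels)

message = "claim your free prize"
label = model.predict([message])[0]

# predict_proba returns one probability per class, in classes_ order
classes = model.named_steps["classifier"].classes_
probabilities = dict(zip(classes, model.predict_proba([message])[0]))

print(label, probabilities)
```

Because the vectorizer lives inside the pipeline, raw strings can be passed straight to predict without any separate preprocessing call.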

How It Works

Model Training: When the Flask application starts, it loads the spam.csv dataset and trains the Naive Bayes model just once
User Input: A user visits the web page and submits a message through an HTML form
Backend Prediction: The Flask backend receives the message, feeds it to the pre-trained model, and gets a prediction (spam/ham) along with the confidence probabilities
Display Results: The application re-renders the webpage, dynamically displaying the result, confidence scores, and a clean visual summary
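The four steps above can be sketched in a single file roughly as follows. This is a simplified approximation rather than the project's actual app.py: the inline template, the toy training data, and the variable names are all assumptions made to keep the example self-contained (the real app trains on spam.csv and renders templates/index.html).

```python
from flask import Flask, render_template_string, request
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import Pipeline

# Stand-in for spam.csv: a few invented messages
TRAIN_MESSAGES = [
    "win a free prize now",
    "free cash claim now",
    "are we still meeting for lunch",
    "see you at lunch tomorrow",
]
TRAIN_LABELS = ["spam", "spam", "ham", "ham"]

# Hypothetical inline template standing in for templates/index.html
PAGE = """
<form method="post">
  <textarea name="message">{{ message }}</textarea>
  <button type="submit">Check</button>
</form>
{% if label %}
  <p>Prediction: {{ label }}</p>
  {% for name, p in probabilities.items() %}
    <p>{{ name }}: {{ "%.1f"|format(p * 100) }}%</p>
  {% endfor %}
{% endif %}
"""

app = Flask(__name__)

# Step 1: train once at import time so every request reuses the same model
model = Pipeline([
    ("vectorizer", CountVectorizer()),
    ("classifier", MultinomialNB()),
])
model.fit(TRAIN_MESSAGES, TRAIN_LABELS)


@app.route("/", methods=["GET", "POST"])
def index():
    # Step 2: read the submitted message (empty on a plain GET)
    message = request.form.get("message", "").strip()
    label, probabilities = None, {}
    if request.method == "POST" and message:
        # Step 3: predict the label and the per-class probabilities
        label = str(model.predict([message])[0])
        scores = model.predict_proba([message])[0]
        classes = model.named_steps["classifier"].classes_
        probabilities = {str(c): float(p) for c, p in zip(classes, scores)}
    # Step 4: re-render the page with the result
    return render_template_string(
        PAGE, message=message, label=label, probabilities=probabilities
    )
```

Training at import time keeps request handling fast, at the cost of a slower startup; saving the fitted model to disk would be an alternative if the dataset grew.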

Installation & Setup

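# Clone the repository
git clone https://github.com/username/sms-spam-detector.git
cd sms-spam-detector

# Create and activate a virtual environment
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install the dependencies
pip install -r requirements.txt

# Start the development server
flask run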

Running Tests (Optional)

To verify that the data handling and model components are working correctly, you can run the test suite.

pytest
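As a rough idea of what such tests can look like, here is a small pytest-style sketch. It does not reproduce the project's own test file: build_model is a hypothetical helper that trains the same kind of pipeline on invented data, whereas the real tests would exercise the SpamDetector class and the data handler.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import Pipeline


def build_model():
    """Hypothetical helper: train a small pipeline on toy data."""
    model = Pipeline([
        ("vectorizer", CountVectorizer()),
        ("classifier", MultinomialNB()),
    ])
    messages = [
        "win a free prize now",
        "free cash claim now",
        "are we still meeting for lunch",
        "see you at lunch tomorrow",
    ]
    labels = ["spam", "spam", "ham", "ham"]
    return model.fit(messages, labels)


def test_obvious_spam_is_flagged():
    model = build_model()
    assert model.predict(["free prize claim now"])[0] == "spam"


def test_obvious_ham_is_not_flagged():
    model = build_model()
    assert model.predict(["see you at lunch"])[0] == "ham"


def test_probabilities_sum_to_one():
    model = build_model()
    probabilities = model.predict_proba(["see you at lunch"])[0]
    assert abs(probabilities.sum() - 1.0) < 1e-9
```

pytest collects any function whose name starts with test_, so saving this under tests/ is enough for the pytest command to pick it up.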

Requirements

You can install all dependencies from the requirements.txt file.

requirements.txt

Flask==3.0.0
scikit-learn==1.4.2
pandas==2.2.1
numpy==1.26.4
pytest==7.4.0
pytest-cov==4.1.0

Python 3.9 or higher (the pinned scikit-learn, pandas, and NumPy releases do not support Python 3.8)

Project Structure

The project is organized to separate the Flask application logic from the machine learning model code.

sms-spam-detector/
├── app.py                  # Main Flask application file
├── requirements.txt        # Project dependencies
├── data/
│   └── spam.csv            # The training dataset
├── spam_detector/
│   ├── __init__.py
│   ├── data_handler.py     # Functions for loading and cleaning data
│   └── model.py            # The SpamDetector class and ML logic
└── templates/
    └── index.html          # The HTML file for the user interface

Note: A tests/ directory containing test_model.py can be included for development and validation.
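To illustrate the kind of work data_handler.py is responsible for, here is a minimal loading-and-cleaning sketch with pandas. The function name load_messages and the column names label and message are assumptions for the example; the actual layout of spam.csv is not described here and may differ.

```python
import io

import pandas as pd


def load_messages(source):
    """Read a CSV with `label` and `message` columns and return cleaned rows."""
    df = pd.read_csv(source)
    # Keep only the two columns we need and drop rows with missing values
    df = df[["label", "message"]].dropna().copy()
    # Trim whitespace, then discard empty and duplicate messages
    df["message"] = df["message"].str.strip()
    df = df[df["message"] != ""]
    return df.drop_duplicates(subset="message").reset_index(drop=True)


# In-memory stand-in for a real file path such as data/spam.csv
sample_csv = io.StringIO(
    "label,message\n"
    "ham,See you at lunch\n"
    "spam,Free prize claim now\n"
    "spam,Free prize claim now\n"
    "ham,\n"
)

df = load_messages(sample_csv)
print(df)
```

Keeping this logic apart from model.py means the cleaning rules can be tested on their own, without training a model.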

Things I'd Improve

Use TF-IDF Vectorization: Instead of simple word counts, use Term Frequency-Inverse Document Frequency (TF-IDF) for potentially better feature representation
Try Other Models: Experiment with other classifiers like Logistic Regression or Support Vector Machines (SVM) to compare performance
Containerize: Package the application with Docker for easier deployment and scalability
Deploy to the Cloud: Host the application on a service like Heroku, Vercel, or AWS so anyone can access it
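The first two ideas could be explored with a comparison along these lines. This is only a sketch: the messages are invented, the dataset is far too small for the scores to mean anything, and the candidate names are made up for the example. With the real spam.csv the same loop would give a fairer picture.

```python
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import Pipeline

# Candidate pipelines: the current setup plus two possible alternatives
candidates = {
    "counts + naive bayes": Pipeline([
        ("vectorizer", CountVectorizer()),
        ("classifier", MultinomialNB()),
    ]),
    "tf-idf + naive bayes": Pipeline([
        ("vectorizer", TfidfVectorizer()),
        ("classifier", MultinomialNB()),
    ]),
    "tf-idf + logistic regression": Pipeline([
        ("vectorizer", TfidfVectorizer()),
        ("classifier", LogisticRegression()),
    ]),
}

# Toy data invented for illustration; replace with the real dataset
messages = [
    "win a free prize now", "free cash claim now", "claim your free reward",
    "urgent free offer call now", "you won cash prize", "free entry win cash",
    "are we still meeting for lunch", "see you at lunch tomorrow",
    "can you call me later", "thanks for dinner last night",
    "meeting moved to tomorrow", "see you later at home",
]
labels = ["spam"] * 6 + ["ham"] * 6

# Mean accuracy over 3 stratified folds for each candidate
scores = {
    name: cross_val_score(pipeline, messages, labels, cv=3).mean()
    for name, pipeline in candidates.items()
}

for name, score in scores.items():
    print(f"{name}: {score:.2f}")
```

Since every candidate is a Pipeline with the same interface, swapping one into the application would only require changing how the model is constructed.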

Author

Evan William
Version 2.0 (2025)

This project was an incredible learning journey that transformed a simple command-line script into a complete ML web application. It's been amazing to see how all the pieces come together, from raw text data to a sleek web interface that anyone can use.

If you have any feedback or suggestions, feel free to open an issue or pull request!

This project is for educational purposes. Feel free to fork, modify, or use it for your own learning.

evan-william/sms-spamguard

Folders and files

Latest commit

History

Repository files navigation

SMS Spam Detector Web App

Overview

What I Learned

Technical Stack

Key Concepts Applied

How It Works

Installation & Setup

Running Tests (Optional)

Requirements

Project Structure

Things I'd Improve

Author

About

Topics

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages