Classifying the MNIST is a data science project that explores and implements the k-Nearest Neighbors (kNN) method for classifying the MNIST dataset. MNIST is a widely used machine learning dataset of 28x28-pixel grayscale images of handwritten digits (0 through 9).
The goal of this project is to build, fine-tune, and analyze the kNN classification model to achieve accurate and efficient digit recognition. The repository contains code, documentation, and resources related to the kNN classification task.
The project is organized into the following main components:
- Data Preparation:
  - Explore and preprocess the MNIST dataset.
  - Transform the data into a suitable format for kNN model training.
- Model Development:
  - Implement the kNN classification model.
  - Tune hyperparameters for optimal performance.
- Evaluation and Comparison:
  - Evaluate the kNN model's performance using metrics such as accuracy, precision, recall, and F1 score.
  - Compare and analyze the kNN results to draw out the model's strengths and weaknesses.
- Documentation:
  - Detailed documentation explaining the rationale for choosing the kNN model, the preprocessing steps, and the performance metrics.
  - Results and insights gained from the analysis.
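The components above can be sketched end to end in a few lines. The following is a minimal, standard-library-only illustration, not the code from the notebooks: it assumes tiny synthetic 4x4 grids in place of the 28x28 MNIST images, and `make_toy_image`, `flatten`, `knn_predict`, and `macro_scores` are hypothetical helper names introduced here. It flattens and scales the pixels, classifies by majority vote among the k nearest training vectors (Euclidean distance), picks k on a validation split, and reports accuracy with macro-averaged precision, recall, and F1.

```python
import math
import random
from collections import Counter


def make_toy_image(label, rng):
    """Hypothetical stand-in for a digit: a 4x4 grid whose bright row index equals the label."""
    return [
        [rng.randint(200, 255) if row == label else rng.randint(0, 50) for _ in range(4)]
        for row in range(4)
    ]


def flatten(image):
    """Turn a 2-D grid of 0-255 pixel values into a flat list scaled to [0, 1]."""
    return [pixel / 255.0 for row in image for pixel in row]


def knn_predict(train_x, train_y, query, k):
    """Return the majority label among the k training vectors closest to `query`."""
    neighbours = sorted(
        (math.dist(x, query), label) for x, label in zip(train_x, train_y)
    )
    votes = Counter(label for _, label in neighbours[:k])
    return votes.most_common(1)[0][0]


def macro_scores(y_true, y_pred):
    """Accuracy plus macro-averaged precision, recall, and F1 over all labels seen."""
    labels = sorted(set(y_true) | set(y_pred))
    precisions, recalls, f1s = [], [], []
    for c in labels:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)
    return {
        "accuracy": sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true),
        "precision": sum(precisions) / len(labels),
        "recall": sum(recalls) / len(labels),
        "f1": sum(f1s) / len(labels),
    }


rng = random.Random(0)  # fixed seed so the toy data is reproducible


def make_split(n_per_class):
    """Build flattened toy images and labels for the four toy classes 0..3."""
    xs, ys = [], []
    for label in range(4):
        for _ in range(n_per_class):
            xs.append(flatten(make_toy_image(label, rng)))
            ys.append(label)
    return xs, ys


train_x, train_y = make_split(10)
val_x, val_y = make_split(5)
test_x, test_y = make_split(5)


def validation_accuracy(k):
    """Accuracy of kNN with this k on the validation split."""
    preds = [knn_predict(train_x, train_y, x, k) for x in val_x]
    return macro_scores(val_y, preds)["accuracy"]


# Hyperparameter tuning: keep the candidate k with the best validation accuracy.
best_k = max([1, 3, 5], key=validation_accuracy)

# Final evaluation on the held-out test split.
test_pred = [knn_predict(train_x, train_y, x, best_k) for x in test_x]
scores = macro_scores(test_y, test_pred)
print(best_k, scores)
```

On the real dataset the same steps apply to 784-element vectors, where brute-force distance computation becomes the main cost; a library implementation with vectorized distances is the more practical choice at that scale.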
To get started with the project, follow these steps:
- Clone the repository:

      git clone https://github.com/Zane-dev16/Classifying-the-MNIST.git
      cd Classifying-the-MNIST

- Explore the Jupyter notebooks in the notebooks/ directory to understand each step of the machine learning pipeline.
Géron, A. (2019). Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow: Concepts, Tools, and Techniques to Build Intelligent Systems. O'Reilly Media.
Deng, L. (2012). The MNIST database of handwritten digit images for machine learning research. IEEE Signal Processing Magazine, 29(6), 141–142.