ANN Model for Image Classification on MNIST Dataset
1 Sanjay Bohra (Research Scholar)
Department of Computer Science & Engineering, Mandsaur University
2 Dr. Firoj Parwej (Associate Professor)
Department of Computer Science & Engineering,
Faculty of Engineering & Technology,
Mandsaur University, Mandsaur (MP)
Abstract- The MNIST data set is a collection of handwritten digit images. Identifying these digits is a computer vision task that involves recognizing and classifying the digits (0-9) in handwritten images. This task is important for a variety of real-world applications, such as automatic postal code recognition, vehicle number plate recognition, bank check clearance processing, and digit-based form recognition [15]. Many deep learning models have been explored for handwritten character recognition (HCR). The use of deep learning is growing rapidly, in part because it is loosely modeled on the human brain. This paper uses one of the major deep learning algorithms, the Artificial Neural Network (ANN), and considers its feature extraction and classification stages. The model is trained on the MNIST dataset using categorical cross-entropy loss and the Adam optimizer. It is also trained with other optimizers (SGD, RMSprop, Adagrad, and Adadelta), and the resulting training and validation accuracies are compared. The network uses the ReLU activation function, is trained by backpropagation with gradient descent, and performs feature extraction automatically. Among neural networks, ANNs and CNNs are the primary classifiers for image recognition and image classification tasks in computer vision.
Keywords: ANN, MNIST, Adam, activation function

I. INTRODUCTION

This introduction outlines the importance of handwritten digit recognition and the role the MNIST dataset plays in benchmarking algorithms.
Handwritten digit recognition is a fundamental task in computer vision, serving as the cornerstone for many real-world applications such as automated bank check processing, postal address recognition, and digital form scanning [18]. With advances in artificial intelligence (AI) and deep learning, handwritten character recognition has become a benchmark problem for evaluating the performance of machine learning models. The practical implementation in this study is based on the MNIST dataset. The MNIST image dataset is popular for introducing machine learning techniques for several reasons. Inconsistencies in handwriting can make handwritten character recognition (HCR) challenging. Variations in individual writing styles, slant, and character spacing often pose obstacles for deep learning models.
The task becomes simpler under controlled input conditions: well-defined input dimensions, ten output classes, and no other anomalies. In MNIST there is no noise around the handwriting, and all digits are properly aligned.

Fig 1: MNIST Dataset Sample

This paper explores the application of Artificial Neural Networks (ANNs) to handwritten digit recognition and checks their performance in terms of feature extraction and classification. Models are trained on the widely used MNIST dataset, which consists of 28×28 grayscale images of handwritten digits (0–9) [11].

II. LITERATURE REVIEW

The MNIST database (Modified National Institute of Standards and Technology database) is a large collection of handwritten digits that is widely used for training image processing systems [17]. In the field of machine learning, the database is also frequently used for training and testing. Of the 60,000 training images, 55,000 are used for training and 5,000 as test samples; each sample in the MNIST data set is a 28×28 handwritten digit picture. Validation uses randomly chosen training pictures, and hyperparameters are adjusted on the validation set to give the best validation error after 1 million weight changes.

A. LeCun et al. (1998) introduced a multilayer perceptron (MLP) with a single hidden layer. Despite the limited computational resources at the time, they achieved a recognition accuracy of
approximately 95%. This foundational work demonstrated the feasibility of using ANNs for digit classification.

B. Hinton et al. (2012) introduced dropout, a method of randomly deactivating neurons during training to prevent overfitting. This approach significantly improved performance when used with both MLPs and CNNs. Regularization strategies such as dropout and weight decay have since been widely used to improve the generalization of ANN models on MNIST.

C. Kingma and Ba (2014) demonstrated the effectiveness of Adam on MNIST, reaching strong performance in fewer iterations than traditional SGD. Advances in optimization algorithms have also contributed to the success of ANN models on MNIST. Gradient-based methods such as Stochastic Gradient Descent (SGD), along with adaptive techniques like Adam and RMSProp, have enabled faster convergence and better handling of non-convex loss surfaces.

D. With the advent of deep learning, researchers have experimented with deeper ANN architectures on MNIST. Models with multiple hidden layers and advanced activation functions (e.g., ReLU) have achieved near-perfect accuracy. For example, architectures like deep MLPs and Residual Neural Networks (ResNets) have been evaluated on MNIST with reported accuracy surpassing 99.5%.

III. BACKGROUND

Artificial Neural Networks (ANNs) are computational models inspired by the structure and function of the human brain [10]. They consist of interconnected layers of nodes (neurons), including input, hidden, and output layers. The MNIST dataset is particularly suited to ANNs because of its straightforward classification task. Researchers have explored various ANN and CNN architectures and techniques to improve accuracy, generalization, and computational efficiency on this dataset.

3.1 Deep Learning in Digit Recognition
Deep learning has gained immense popularity for its ability to learn features and make decisions in a way loosely inspired by the human brain. One of its major algorithms, the Artificial Neural Network (ANN), has demonstrated significant success in image recognition tasks.
Artificial Neural Networks (ANNs):
o Composed of interconnected layers of neurons.
o Use backpropagation with gradient descent for weight optimization.
o Perform feature extraction implicitly, relying on the fully connected layers.

IV. METHODOLOGY

All experiments in this study were conducted using Google Colab from a laptop with an Intel i5 processor and 8 GB of DDR3 RAM. Colaboratory is a tool for machine learning education and research. It is a Jupyter notebook environment that requires no setup and runs entirely in the cloud. With Google Colaboratory you can write and execute code, save and share your analyses, and access powerful computing resources, all for free from your browser [Shivam S. Kadam et al., 2020, Journal of Scientific Research].

4.1 Data
1) Dataset: The experiments were conducted on the MNIST handwritten image dataset, a standard benchmark for handwritten digit recognition. The dataset is loaded through Keras using tf.keras.datasets.mnist; after loading, the data are split into training and test sets and stored in variables.

2) Dataset Details: 70,000 images in total: 60,000 training images and 10,000 testing images. Each is a 28×28 pixel grayscale image representing a digit from 0 to 9.

3) Preprocessing: Normalization: Pixel values were scaled to the range [0, 1] by dividing by 255, the maximum 8-bit pixel value. The first 5,000 training images are held out for validation:
X_valid, X_train = X_train_full[:5000] / 255., X_train_full[5000:] / 255.
y_valid, y_train = y_train_full[:5000], y_train_full[5000:]

Fig. 2: Image pixel values scaled between 0 and 1

4.2 Model Training
ANN models were trained using the categorical cross-entropy loss and the Adam optimizer. The training code runs for 5 epochs using the training and validation datasets, with the following settings:
EPOCHS = 5:
Specifies the number of times the training process iterates over the entire training dataset.
VALIDATION_SET = (x_valid, y_valid):
x_valid holds the validation input data (features) and y_valid the corresponding validation labels (targets). This gives the model data on which to evaluate its performance after each epoch without influencing the training process.
history1 = model_clf.fit(...):
Trains model_clf (the classifier model) on the training data (x_train, y_train) for EPOCHS epochs [16] and tracks metrics such as loss and accuracy for both the training and validation datasets. The returned history1 object stores these training and validation metrics for each epoch.

An epoch is one complete pass of the entire training dataset through the model during training. Neural networks require multiple passes over the dataset to learn the underlying patterns effectively. At the start of the first epoch, the model's parameters (weights) are typically initialized randomly, so the initial predictions are far from accurate.
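The preprocessing and training steps described above can be sketched in plain NumPy. This is a minimal illustration under stated assumptions, not the code used in the experiments: synthetic images from a hypothetical make_fake_mnist helper stand in for the MNIST download, the hold-out is 100 of 600 samples rather than 5,000 of 60,000, the network uses the two hidden layers of 128 and 64 ReLU units that the paper describes, and plain mini-batch gradient descent with an arbitrarily chosen learning rate replaces Adam. The history dictionary only mimics the kind of per-epoch record that history1 holds.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_fake_mnist(n, prototypes):
    """Hypothetical stand-in for MNIST: noisy copies of 10 random 28x28 prototypes."""
    labels = rng.integers(0, 10, size=n)
    noise = rng.normal(0.0, 25.0, size=(n, 28, 28))
    images = np.clip(prototypes[labels] + noise, 0, 255).astype(np.uint8)
    return images, labels

# Sparse "ink" patterns, one per digit class (an assumption for illustration).
prototypes = 255.0 * (rng.random((10, 28, 28)) < 0.15)
X_train_full, y_train_full = make_fake_mnist(600, prototypes)

# Scale pixels to [0, 1] and hold out the first 100 samples for validation.
X_valid, X_train = X_train_full[:100] / 255.0, X_train_full[100:] / 255.0
y_valid, y_train = y_train_full[:100], y_train_full[100:]

def init_layer(n_in, n_out):
    # He initialization, a common choice for ReLU layers (an assumption here).
    return rng.normal(0.0, np.sqrt(2.0 / n_in), size=(n_in, n_out)), np.zeros(n_out)

params = [init_layer(784, 128), init_layer(128, 64), init_layer(64, 10)]

def forward(X):
    """Return the activations of every layer; the last entry holds softmax probabilities."""
    acts = [X.reshape(len(X), -1)]  # flatten 28x28 -> 784
    for i, (W, b) in enumerate(params):
        z = acts[-1] @ W + b
        if i < len(params) - 1:
            acts.append(np.maximum(z, 0.0))  # ReLU on hidden layers
        else:
            e = np.exp(z - z.max(axis=1, keepdims=True))
            acts.append(e / e.sum(axis=1, keepdims=True))  # softmax output
    return acts

def evaluate(X, y):
    """Mean cross-entropy loss and accuracy for integer labels y."""
    probs = forward(X)[-1]
    loss = -np.log(probs[np.arange(len(y)), y] + 1e-12).mean()
    accuracy = (probs.argmax(axis=1) == y).mean()
    return loss, accuracy

EPOCHS, BATCH_SIZE, LEARNING_RATE = 5, 32, 0.05
history = {"loss": [], "accuracy": [], "val_loss": [], "val_accuracy": []}

for epoch in range(EPOCHS):
    order = rng.permutation(len(X_train))
    for start in range(0, len(order), BATCH_SIZE):
        idx = order[start:start + BATCH_SIZE]
        acts = forward(X_train[idx])
        # Gradient of the mean cross-entropy w.r.t. the logits: (probs - one_hot) / batch.
        delta = acts[-1].copy()
        delta[np.arange(len(idx)), y_train[idx]] -= 1.0
        delta /= len(idx)
        # Backpropagate from the output layer down to the first hidden layer.
        for i in reversed(range(len(params))):
            W, b = params[i]
            grad_W, grad_b = acts[i].T @ delta, delta.sum(axis=0)
            if i > 0:
                delta = (delta @ W.T) * (acts[i] > 0)  # ReLU derivative
            params[i] = (W - LEARNING_RATE * grad_W, b - LEARNING_RATE * grad_b)
    # Metrics are computed after each epoch on the full training and validation sets.
    loss, acc = evaluate(X_train, y_train)
    val_loss, val_acc = evaluate(X_valid, y_valid)
    for key, value in zip(history, (loss, acc, val_loss, val_acc)):
        history[key].append(float(value))
    print(f"epoch {epoch + 1}: loss={loss:.4f} acc={acc:.4f} "
          f"val_loss={val_loss:.4f} val_acc={val_acc:.4f}")
```

To try the sketch on the real data, replace the synthetic arrays with the X_train_full and y_train_full arrays loaded from tf.keras.datasets.mnist and adjust the hold-out size; the rest stays the same.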
TABLE I

Backpropagation: Minimizes the loss function by propagating the error backward through the network.
Gradient Descent: Optimizes the weights using the calculated gradients.
Activation Function: ReLU activation was applied after each hidden layer for non-linearity and efficient training.

3.3 ANN Architecture
Input Layer: Flattens the 28×28 image into a 784-dimensional vector.
Hidden Layers: Two fully connected layers with 128 and 64 neurons, respectively, each followed by the ReLU activation function.
Output Layer: A softmax layer with 10 neurons (one for each digit class).

Fig 3: ANN for image classification

V. RESULT AND ANALYSIS

Over successive epochs, the model gradually adjusts its parameters to minimize the loss and improve its performance. The training process can be visualized by plotting the training loss and validation loss over epochs. During experimentation, the sigmoid, softmax, and ReLU activation functions were tested, and results were recorded for five optimizers: Adam, Adagrad, Adadelta, SGD, and RMSprop.

TABLE II
Performance of optimizers on the MNIST data set

Optimizer   Epochs   Accuracy   Validation accuracy
SGD         50       0.9992     0.9816
RMSprop     50       1.0000     0.9832
Adagrad     50       0.9699     0.9662
Adadelta    50       1.0000     0.9846
Adam        5        0.9891     0.9778
Adam        10       0.9952     0.9836
Adam        50       0.9987     0.9828

TABLE III
No. of epochs: 5
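The optimizers compared in Table II differ mainly in how they turn a gradient into a weight update. The sketch below illustrates this on a toy one-parameter loss, f(w) = (w - 3)^2, rather than on the network itself. It uses textbook forms of the update rules, which may differ in detail from the Keras implementations; the learning rates are assumptions picked for this toy problem, and Adadelta is omitted for brevity.

```python
import math

def grad(w):
    """Gradient of the toy loss f(w) = (w - 3)^2, which is minimized at w = 3."""
    return 2.0 * (w - 3.0)

def sgd(w, g, state, lr=0.1):
    # Plain gradient descent: step against the gradient.
    return w - lr * g

def adagrad(w, g, state, lr=0.5, eps=1e-8):
    # Divide by the root of the running SUM of squared gradients.
    state["sum_sq"] = state.get("sum_sq", 0.0) + g * g
    return w - lr * g / (math.sqrt(state["sum_sq"]) + eps)

def rmsprop(w, g, state, lr=0.01, rho=0.9, eps=1e-8):
    # Divide by the root of a moving AVERAGE of squared gradients.
    state["avg_sq"] = rho * state.get("avg_sq", 0.0) + (1 - rho) * g * g
    return w - lr * g / (math.sqrt(state["avg_sq"]) + eps)

def adam(w, g, state, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8):
    # Moving averages of the gradient (m) and squared gradient (v), bias-corrected.
    t = state["t"] = state.get("t", 0) + 1
    state["m"] = beta1 * state.get("m", 0.0) + (1 - beta1) * g
    state["v"] = beta2 * state.get("v", 0.0) + (1 - beta2) * g * g
    m_hat = state["m"] / (1 - beta1 ** t)
    v_hat = state["v"] / (1 - beta2 ** t)
    return w - lr * m_hat / (math.sqrt(v_hat) + eps)

def run(step, n_steps=500, w0=0.0):
    """Apply one update rule repeatedly, starting from w0."""
    w, state = w0, {}
    for _ in range(n_steps):
        w = step(w, grad(w), state)
    return w

final = {name: run(fn) for name, fn in
         [("SGD", sgd), ("Adagrad", adagrad), ("RMSprop", rmsprop), ("Adam", adam)]}
for name, w in final.items():
    print(f"{name:8s} final w = {w:.4f}")
```

All four rules approach the minimum at w = 3. How quickly they get there depends on the learning rate and on how each rule rescales the gradient, which is one reason different optimizers can reach different accuracies after the same number of epochs.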
Fig 4: Accuracy graph for 5 epochs

TABLE IV
No. of epochs: 10

Fig 5: Accuracy graph for 10 epochs

5.1 Model Prediction
1) Inputs:
X_test: The test dataset's features (e.g., images from the MNIST test set).
y_test: The corresponding true labels for the test dataset.
2) Output:
Loss: This is the value of the loss function (e.g., categorical cross-entropy) computed on the test set. It represents how well the model's predictions match the true labels.
Accuracy: The proportion of correctly predicted samples, assuming accuracy was included as a metric when the model was compiled.
For a model trained on the MNIST dataset, good performance typically means:
Test Loss: A small value (e.g., < 0.1)
Test Accuracy: A high percentage (e.g., > 98%)
The evaluated values are assigned as follows:
test_loss, test_accuracy = model_clf.evaluate(X_test, y_test)
The statement model_clf.predict(X_test[:5]) generates predictions for the first 5 samples of the test set. The output is a probability distribution for each sample, because the model's last layer uses the softmax activation function.
3) Compare Test and Validation Results:
If the test accuracy is close to the validation accuracy, it indicates that the model generalizes well to unseen data.
A significant gap might indicate overfitting or issues with the test set.
4) Analyze Loss:
If the test loss is much higher than the validation loss, the model might not generalize well or the test set might differ significantly from the training set.
5.2 Output Explanation:
1. Predictions (Raw Probabilities):
Each row corresponds to a sample (in this case, 5 rows for 5 samples).
Each column corresponds to the probability of a particular class (e.g., digits 0–9 for MNIST).
2. Predicted Classes:
To predict the class of an image in the MNIST image set, we typically use a machine learning model such as an Artificial Neural Network (ANN) or a pre-trained model. The MNIST dataset consists of handwritten digits (0-9), and the goal is to classify the input image into one of these classes; the predicted class is the one with the highest probability.
5.3 Visualizing Predictions:
The images can be displayed along with the model's predictions for easier interpretation.
5.4 Verify Predictions:
Check whether the predicted classes match the true labels.
5.5 Debug Errors (if any):
If there are mismatches, analyze the misclassifications using a confusion matrix.
5.6 Refine the Model:
Improve training using techniques like data augmentation, regularization, or hyperparameter tuning if predictions are not accurate.
5.7 Performance Metrics:
The models were evaluated using accuracy and loss on the training and testing datasets.
ANN Performance:
Accuracy: ~98.3% on the test data
Strength: Simpler architecture, faster to train
Limitation: Struggles with spatial feature extraction.

VI. CONCLUSION
This study shows that while ANNs are effective for handwritten
digit recognition, they have only a limited capability to extract spatial features from image data. Future work can explore hybrid models, advanced architectures such as ResNet, or transfer learning approaches to further improve performance on the handwritten character dataset. This paper contributes to the understanding of handwritten digit recognition using ANNs. It reviews and analyzes experimental results on the MNIST dataset and provides a resource for researchers and practitioners in the field of image recognition using ANNs. It summarizes the findings, the insights gained, and their implications for handwritten digit recognition, and it makes recommendations for future research directions and improvements in model architectures. We studied several handwritten digit recognition and ANN-based recognition algorithms to identify the best one in terms of accuracy, validation accuracy, and performance.
Many authors have proposed different models and adopted additional criteria, such as execution time. Both random and standard datasets of handwritten digits are used to evaluate the algorithms.
The experimental results of this study show that artificial neural networks (ANNs) are highly effective for handwritten digit recognition when evaluated on the MNIST dataset. In these experiments they achieve high accuracy (validation accuracy between 0.9662 and 0.9846 across the optimizers in Table II) and are computationally efficient, making them suitable for practical applications.

REFERENCES

[1] Convolutional neural networks for speech recognition. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 22(10), 1533-1545.

[2] Agarap, A. F. (2017). An architecture combining convolutional neural network (CNN) and support vector machine (SVM) for image classification. arXiv preprint arXiv:1712.03541.

[3] Agarap, A. F. (2018). Deep learning using rectified linear units (ReLU). arXiv preprint arXiv:1803.08375.

[4] Albawi, S., Mohammed, T. A., & Al-Zawi, S. (2017, August). Understanding of a convolutional neural network. In 2017 International Conference on Engineering and Technology (ICET) (pp. 1-6). IEEE.

[5] Belongie, S., Malik, J., & Puzicha, J. (2001, July). Matching shapes. In Proceedings Eighth IEEE International Conference on Computer Vision. ICCV 2001 (Vol. 1, pp. 454-461). IEEE.

[6] Belongie, S., Malik, J., & Puzicha, J. (2002). Shape matching and object recognition using shape contexts. IEEE Transactions on Pattern Analysis and Machine Intelligence, 24(4), 509-522.

[7] Saeed AL-Mansoori, “Intelligent Handwritten Digit Recognition using Artificial Neural Network”, Journal of Engineering Research and Applications, Vol. 5, Issue 5, (Part -3) May (2015), ISSN: 2248-9622, pp.46-51.

[8] S M Shamim, Mohammad Badrul Alam Miah, Angona Sarker, Masud Rana, Abdullah Al Jobair, “Handwritten Digit Recognition using Machine Learning Algorithms”, Journal of Computer Science and Technology: D Neural & Artificial Intelligence, Vol. 18, Issue 1, (2018), ISSN: 0975-4172, pp.0975-4350.

[9] Md. Anwar Hossain, Md. Mohon Ali, “Recognition of Handwritten Digit using Convolutional Neural Network (CNN)”, Journal of Computer Science and Technology: D Neural & Artificial Intelligence, Vol. 19, Issue 2, (2019), ISSN: 0975-4172, pp.0975-4350.

[10] Saqib Ali, Zeeshan Shaukat, Muhammad Azeem, Zareen Sakhawat, Tariq Mahmood, Khalilur Rehman, “An efficient and improved scheme for handwritten digit recognition based on the convolutional neural network”, (2019), unpublished.

[11] Savita Ahlawat, Amit Choudhary, Anand Nayyar, Saurabh Singh, Byungun Yoon, “Improved Handwritten Digit Recognition Using Convolutional Neural Networks (CNN)”, (2020), unpublished.

[12] Aarti Gupta, Rohit Miri, Hiral Raja, “Recognition of Automated Hand-written Digits on Document Images Making Use of Machine Learning Techniques”, Journal of Engineering and Technology Research, (2021), ISSN: 2736-576X.

[13] V. Gopalakrishan, R. Arun, L. Sasikumar, K. Abhirami, “Handwritten Digit Recognition for Banking System”, Kings College of Engineering, Punalkulam, Pudukottai, Journal of Engineering Research & Technology (IJERT), (2021), ISSN: 2278-0181.

[14] Ritik Dixit, Rishika Kushwah, Samay Pashine, “Handwritten Digit Recognition using Machine and Deep Learning Algorithms”, Journal of Computer Applications, Vol.1.

[15] Drishti Beohar et al., “Handwritten Digit Recognition using Machine and Deep Learning state-of-art ANN and CNN”, 2021 International on Emerging Smart Computing and Informatics.

[16] S. Prasad Jones Christydass, Nurhyati, S. Kannadhasan, “Hybrid and Advance Technology”, CRC Press, 2025.

[17] V. Serisha et al., “Handwritten Digit Classification using CNN”, IJARIIT, Volume 7, Issue 6-V716-1212.

[18] V. Sharmila, S. Kannadhasan, A. Rajiv Kannan, P. Sivakumar, V. Vennila, “Challenges in Information, Communication and Computing Technology”, CRC Press, 2024.