
2023 International Conference on Intelligent and Innovative Technologies in Computing, Electrical and Electronics (IITCEE) | 978-1-6654-9260-7/23/$31.00 ©2023 IEEE | DOI: 10.1109/IITCEE57236.2023.10091020

Breast Cancer Classification Using Neural Networks


V Asha, Binju Saju, Serene Mathew, Athira M V, Y Swapna, S P Sreeja
Master of Computer Applications, New Horizon College of Engineering, Bengaluru, India

Abstract—Owing to a lack of awareness of breast cancer, its signs, and methods for prevention, it has become one of the deadliest types of cancer, and the death rate has increased significantly. Hence, to stop the spread of cancer, identification at an early stage is critical. Breast tumors are classified into two types, malignant and benign. This study used machine learning techniques and neural network methods to classify the breast cancer types, giving an automated system for breast cancer diagnosis. The approach uses a DNN (deep neural network), a CNN (convolutional neural network) and an ANN (artificial neural network) for classification, and RFE (recursive feature elimination) for feature selection. The DNN applies multiple layers of processing functions to categorize the breast cancer data set. The results show that the DNN outperforms the other models, with an accuracy of 97%.

Keywords—Artificial Neural Network, Breast Cancer Detection, Convolutional Neural Network, Deep Neural Network, Malignant Cancer, Benign Cancer.

I. INTRODUCTION

According to the World Health Organization (WHO), more than two million women were diagnosed with breast cancer in 2020, and 685,000 people died from the disease. The death rate from cancer is currently rising. As of the end of 2020, 7.8 million women were alive who had received a diagnosis in the preceding five years, making breast cancer one of the most common illnesses in the world. Improvements in survival rates began in the 1980s in different regions, with techniques for early detection along with various forms of therapy to eliminate invasive disease [1].

A late diagnosis and inadequate treatment may result in death, since a tumor is a growth of abnormal tissue that results from an alteration in the tissue and can spread throughout the body. Invasive and non-invasive breast cancers make up the majority of cases. An invasive cancer is malignant, meaning that it has the ability to spread to other organs. A non-invasive cancer is not malignant and does not spread to other organs. Breast cancer mainly affects the glands and milk ducts of the breast. Quite frequently, cancer spreads through the circulation to many organs [2].

Digital mammography is the most important and critical way of diagnosing a patient with breast cancer, but it is not the only way. Computer-aided diagnosis (CAD) systems can be used to reach an accurate conclusion when diagnosing a disease [7-10].

Benign and malignant are the two main types of tumor. A person with a benign tumor has a higher chance of survival, whereas a person with a malignant tumor has a lower chance of survival, mainly because the tumor is diagnosed late.

To develop a system that can diagnose breast cancer in its earlier stages and save lives, we use different neural networks. DNN, CNN and ANN are used as classifier models, and RFE is used to select the features. A DNN is used in particular to increase the accuracy of a machine learning model. DNNs are often referred to as deep nets, meaning that they have multiple hidden layers. The hidden layers use the data passed in from the other layers. The error rate may be reduced significantly if the weights of every node are readjusted. A DNN is able to build a finer model for itself by training on the presented data.

II. LITERATURE REVIEW

Neural networks are a group of machine learning strategies that may be applied to applications requiring classification and regression. In a neural network, a layer is thought of as a collection of neurons stacked one on top of the other, and a layer may contain n nodes [3].

Abdel-Zaher and Eldeib [4] applied a deep belief network (DBN) to the data set and achieved an accuracy of 94.68%. The DBN is trained in an unsupervised manner. The system was built using a back-propagation neural network (BPNN) with the Levenberg-Marquardt learning function, where the DBN path is used to initialise the node weights. The system outperforms existing classifiers and gives satisfactory results. Deep learning substantially increases precision while reducing error rates to a minimum.

Jhajharia et al. [5] presented a prototype for classifying breast tumors that used a feed-forward neural network to categorize the data and principal component analysis (PCA) to extract

some characteristics from the dataset. The data is divided by a splitting percentage into training and testing data to obtain the desired outcome. They used a least squares support vector machine (LSSVM) to analyze the Wisconsin Diagnostic Breast Cancer (WDBC) dataset. This system used a validation approach to achieve an accuracy of 98.53%.

Ghosh et al. [6] proposed a technique for classifying breast cancer based on neuro-fuzzy analysis. These data sets were also used to assess the effectiveness of the procedure. A multi-layer perceptron is then used to categorize the data. The final output is obtained by defuzzification. The highest accuracy rate observed was 97.8%. This performance can be improved if feature selection and elimination are done more efficiently. In this study, we build models based on neural networks with feature reduction using the recursive feature elimination algorithm.

III. PROPOSED SYSTEM

The study uses various neural networks to create the model. Neural networks proved to be more efficient in the background study.

The prediction model is built using CNN, ANN and DNN, and feature engineering is done using the recursive feature elimination algorithm.

A. Dataset

The dataset is taken from Kaggle and consists of 30 feature variables and 569 instances. Since the class label contains only binary values (0 for benign and 1 for malignant), the dataset supports binary classification models.

Table 1: Count of data in the dataset

Types of breast cancer    No. of cases
Malignant                 212
Benign                    357
Total                     569

The features in the dataset are defined as follows and represent properties of the cell nuclei seen in a digitised image:

Radius: the average distance between each boundary point and the centre of the nucleus.

Texture: the variation in gray-level intensity across the image, measured as the standard deviation of the gray-scale values.

Perimeter: the total distance around the boundary of the cell nucleus.

Area: the number of pixels within the boundary plus one half of the pixels on the perimeter, which corrects for the digitization error.

Smoothness: the local variation in radius lengths, measured as the difference between the length of a radius and the mean length of the two radius lines surrounding it.

Compactness: a combination of the perimeter and area that indicates how compact the cell nuclei are.

Concavity: the degree to which the contour is concave; a strong concavity indicates that the border of the cell nucleus has indentations and is therefore more uneven than smooth.

Concave points: the number of concave regions along the contour of the cell nucleus.

Symmetry: The symmetry is established by first determining the longest line that passes through the centre of the nucleus from boundary point to boundary point, and then measuring the length differences between the lines that are perpendicular to that longest line in both directions. Because concavity exists in these nuclei, it is important to pay special attention to those where the longest line passes through the boundary.

Fractal dimension: The "coastline approximation" is used to estimate the fractal dimension. The perimeter of the nucleus can be estimated using measuring sticks of a certain length; as this length increases, the measured length of the "coastline" decreases because of a loss of precision.

Because each feature is recorded as a mean, a standard error (SE) and a worst value (for example MEAN radius, SE radius and WORST radius), the dataset has 30 features (in vector format) rather than 10.

None of the 569 instances has a missing value. Therefore, there is no need to discard instances to lower the system's error rate.

B. Algorithms used

1) Feature Elimination Using RFE

Recursive feature elimination (RFE) is one of the most common algorithms for selecting features in a dataset. It helps in fitting the model by removing the weak features from the dataset.

The model assigns a rank to each feature, and the algorithm removes a small number of features during each iteration. The workflow and pseudocode of RFE are shown in Figures 1 and 2.
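The RFE procedure described in this subsection can be sketched with scikit-learn. This is a minimal illustration under stated assumptions, not the authors' implementation: the copy of the WDBC data bundled with scikit-learn stands in for the Kaggle file, logistic regression is an assumed ranking model, and the number of features to keep (10) is a hypothetical choice that the paper does not specify.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

# Assumption: scikit-learn's bundled WDBC copy replaces the Kaggle CSV.
# Note that scikit-learn encodes 0 = malignant and 1 = benign,
# the reverse of the 0 = benign, 1 = malignant coding used in the paper.
data = load_breast_cancer()
X, y = data.data, data.target  # X has 569 rows and 30 feature columns

# Scale the features so that the coefficient magnitudes used for
# ranking are comparable across features.
X_scaled = StandardScaler().fit_transform(X)

# Assumptions: logistic regression as the ranking model, 10 features kept.
selector = RFE(
    estimator=LogisticRegression(max_iter=1000),
    n_features_to_select=10,
    step=1,  # eliminate one feature per iteration
)
selector.fit(X_scaled, y)

# support_ marks the kept features; ranking_ is 1 for kept features and
# grows for features that were eliminated earlier.
selected = [name for name, keep in zip(data.feature_names, selector.support_) if keep]
print(selected)
print(selector.ranking_)
```

Setting `step` to a larger integer removes several features per iteration, which matches the "small number of features during each iteration" wording more loosely at lower cost.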

Fig 1. Working of RFE

Fig 2. RFE pseudocode

2) Artificial Neural Network (ANN)

An artificial neural network (ANN) is made up of layers of neurons: an input layer, one, two, or three hidden layers, and an output layer. A common design is shown in Figure 3, with lines joining the neurons. The weight of each connection is a numerical value.

Artificial neural networks, a useful model for classification, pattern recognition, clustering, and prediction, have gained increasing popularity in recent years. The high-speed processing offered by ANNs has greatly raised the demand for development in this field, giving them a lot of promise. One of the factors contributing to the success, effectiveness, and efficiency of ANNs lies in certain aspects of data analysis.

Fig 3: Architecture of ANN

One of the key benefits of applying an ANN is that it can create models that are simple to use and more accurate for complex natural systems with large inputs. ANNs are sometimes compared to the human brain: their activities can be coordinated to carry out a task, much as the human brain does so effectively.

3) Convolutional Neural Network (CNN)

Convolutional neural networks have achieved groundbreaking results over the past 10 years in many pattern recognition areas such as voice, image, signal and video processing. The most notable property of CNNs is the reduction in the number of parameters compared with an ANN. The name is derived from the linear mathematical operation known as convolution, which operates on matrices.

This success has motivated researchers and developers to employ larger models to handle difficult problems that cannot be solved with conventional ANNs. Regarding the problems that a CNN solves, the most important assumption is that there should not be any spatially dependent features. Another important advantage of CNNs is the extraction of abstract features as the input propagates to deeper layers.

Fig 4. CNN Block Diagram

4) Deep Neural Network (DNN)

A deep neural network (DNN) is a neural network with numerous hidden layers. Compared with other neural networks, DNNs can represent complex non-linear relationships. The basic operation of a neural network is to take in a set of inputs, analyse those inputs using increasingly intricate computations, and then output the results to deal with practical problems such as classification.

Neural networks are frequently used to solve supervised learning and reinforcement learning problems. These networks are built on a strict hierarchy of layers that are connected to one another. In deep learning, there can be a large number of hidden layers, mostly non-linear, up to a thousand. DL models yield far better results than traditional ML models. The gradient descent approach is typically used to optimize the network and reduce the loss function.

Every learning type, including reinforcement, unsupervised, semi-supervised, and supervised, is supported. As a result, the system is not tied to a particular algorithm. The DNN builds an efficient model by training itself on the input data.
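To make the description of a DNN more concrete, the following NumPy sketch stacks several non-linear hidden layers and reduces a cross-entropy loss by plain gradient descent with backpropagation. It is only an illustration under assumptions: the synthetic data, the tanh and sigmoid activations, the learning rate and the number of steps are hypothetical choices that do not come from the paper. The layer sizes (4 inputs, hidden layers of 10, 20 and 10 nodes, 1 output) mirror the DNN architecture that the paper reports for its classifier.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def init_params(layer_sizes, rng):
    """Create one (weights, bias) pair per pair of consecutive layers."""
    return [(rng.normal(0.0, 0.5, size=(n_in, n_out)), np.zeros(n_out))
            for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])]

def forward(params, X):
    """Return the activations of every layer, the input included."""
    activations = [X]
    for i, (W, b) in enumerate(params):
        z = activations[-1] @ W + b
        is_output = i == len(params) - 1
        activations.append(sigmoid(z) if is_output else np.tanh(z))
    return activations

def loss(params, X, y):
    """Mean binary cross-entropy of the network output."""
    p = np.clip(forward(params, X)[-1].ravel(), 1e-12, 1 - 1e-12)
    return float(-np.mean(y * np.log(p) + (1 - y) * np.log(1 - p)))

def gradient_step(params, X, y, lr):
    """One full-batch gradient-descent update using backpropagation."""
    acts = forward(params, X)
    # Gradient of the mean cross-entropy w.r.t. the output pre-activation.
    delta = (acts[-1] - y.reshape(-1, 1)) / X.shape[0]
    updated = []
    for i in reversed(range(len(params))):
        W, b = params[i]
        grad_W = acts[i].T @ delta
        grad_b = delta.sum(axis=0)
        if i > 0:
            # Propagate the error back through the tanh hidden layer.
            delta = (delta @ W.T) * (1.0 - acts[i] ** 2)
        updated.append((W - lr * grad_W, b - lr * grad_b))
    return updated[::-1]

# Hypothetical toy data: the label depends on a non-linear interaction.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
y = (X[:, 0] * X[:, 1] > 0).astype(float)

params = init_params([4, 10, 20, 10, 1], rng)
initial_loss = loss(params, X, y)
for _ in range(300):
    params = gradient_step(params, X, y, lr=0.1)
final_loss = loss(params, X, y)
print(initial_loss, final_loss)
```

In practice a framework would compute the gradients automatically; writing them out here only shows how each hidden layer contributes to the update.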

Fig 5: DNN Architecture

C. Steps Followed

The procedure followed is stated as follows:
• Data loading
• Selecting the best features from the data set using a regression model
• Splitting the data into training data and testing data
• Standardizing the data
• Building the neural network model
• Classifying the data using ANN, CNN and DNN
• Performance analysis and comparison of the neural network models

Fig 6: Proposed Model

The breast cancer dataset has 569 instances. Because there are only two class labels, "benign" and "malignant", the dataset supports binary classification models. The features can be used to determine whether a tumor is benign or malignant, where benign denotes the absence of cancer and malignant denotes the presence of a cancerous tumor. None of the 569 cases has missing values.

The proposed model has 2 parts:

Pre-processing: Feature selection is an essential step for a machine learning model. It reduces the complexity of the data and removes noise from it. It also decreases the quantity of data, which makes model training quicker and easier, and it helps to prevent overfitting. The attributes of the model can identify the type of tumor. Average radius, texture, perimeter, area, smoothness, compactness, concavity, concave points, symmetry and fractal dimension are some of the features. This part also includes handling of missing data and feature selection using RFE.

Classification: For the purpose of the experiment, the system divides the dataset into train-test splits of 25, 50, and 75%. The dataset is split at random, without following any particular order. First, 75% of the data is used for training the model and the remaining 25% is used as testing data for checking the performance of the algorithm.

The DNN classifier contains four nodes in the input layer, 3 hidden layers with ten, twenty and ten nodes, and one node in the output layer. Since this network includes multiple layers and numerous inner nodes, the outcomes are more likely to be realized after training the model, which reduces computational costs. The count, mean, standard deviation, lowest value, 25th, 50th, and 75th percentiles, as well as the highest value, are the statistical measures used here.

IV. DISCUSSION AND RESULT

The dataset, taken from Kaggle, consists of 30 feature variables and 569 instances, with a binary class label (0 for benign and 1 for malignant).

It is good practice to check the correlations between the attributes. In the correlation plot, the red cells around the diagonal indicate strong correlation between attributes, while the yellow and green regions indicate intermediate correlation.

Performance Analysis

A confusion matrix helps in assessing the overall performance of a model. It also gives the correctly classified and misclassified rates of the system. The effectiveness of a model can be measured by calculating the accuracy as in (1).

$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \tag{1}$$

where TP stands for true positive, TN for true negative, FP for false positive, and FN for false negative.
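The split, standardization, classification and accuracy steps can be sketched end to end with scikit-learn. This is an illustration under assumptions rather than the authors' code: `MLPClassifier` stands in for the DNN, all 30 features are used instead of an RFE-selected subset, the bundled WDBC copy replaces the Kaggle file, and `random_state` and `max_iter` are hypothetical settings. The hidden layer sizes (10, 20, 10) and the 75/25 split follow the text.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)
# scikit-learn uses 0 = malignant, 1 = benign; flip the labels so that
# 1 = malignant and 0 = benign, as in the paper.
y = 1 - y

# Random 75% / 25% train-test split.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Standardize using statistics from the training data only.
scaler = StandardScaler().fit(X_train)
X_train_scaled = scaler.transform(X_train)
X_test_scaled = scaler.transform(X_test)

# Assumption: a multi-layer perceptron with hidden layers of 10, 20 and
# 10 nodes stands in for the paper's DNN.
clf = MLPClassifier(hidden_layer_sizes=(10, 20, 10), max_iter=1000, random_state=0)
clf.fit(X_train_scaled, y_train)
y_pred = clf.predict(X_test_scaled)

# With labels=[0, 1] the flattened confusion matrix is TN, FP, FN, TP.
tn, fp, fn, tp = confusion_matrix(y_test, y_pred, labels=[0, 1]).ravel()
accuracy = (tp + tn) / (tp + tn + fp + fn)  # equation (1)
print(f"TP={tp} TN={tn} FP={fp} FN={fn} accuracy={accuracy:.3f}")
```

The accuracy obtained this way will not necessarily match the values reported in the paper, since the model, feature subset and random split all differ.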

Fig 7: CNN Performance (a) Accuracy and (b) Loss

Fig 8: ANN Performance: Accuracy and Loss

Fig 9: DNN Performance (a) Accuracy and (b) Loss

Figures 7, 8 and 9 show the performance of CNN, ANN and DNN respectively. The accuracy of DNN, CNN and ANN was measured with and without feature elimination using RFE.

Table 2. Accuracy (%) of NNs with and without using RFE

Algorithm    Before using RFE    After using RFE
DNN          92                  97
CNN          85                  88
ANN          79                  85

The result of the performance comparison of CNN, ANN and DNN is shown in Fig 10.

Fig 10: Performance Comparison

It was observed that DNN has 97% accuracy, followed by CNN with 88% and ANN with 85% accuracy. According to this study, DNN is the best classification model for breast cancer classification.

V. CONCLUSION

In the present era, a great many people are affected by modern-day illnesses. Breast cancer is among the most common and deadliest diseases among women, and its incidence has been rising over the years across countries. Insufficient awareness and late detection of the disease can become the primary cause of higher death rates.

Computer-aided diagnosis can help people of all kinds obtain accurate diagnoses. This technique is not a replacement for expert medical doctors, but it can assist practitioners considerably in making good decisions when studying patient reports. Practitioners may occasionally make mistakes because of a lack of experience or a poor evaluation of reports, so such a system can act as a useful aid in the modern medical environment.

This system used ANN, CNN and DNN models for classification. DNN is the best classification model for breast cancer classification, with 97% accuracy.

REFERENCES

[1] “Breast cancer.” Accessed: Nov. 06, 2022. [Online]. Available: https://www.who.int/news-room/fact-sheets/detail/breast-cancer
[2] K. S. Priyanka, “A Review Paper on Breast Cancer Detection Using Deep Learning,” IOP Conf. Ser. Mater. Sci. Eng., vol. 1022, no. 1, p. 012071, Jan. 2021, doi: 10.1088/1757-899X/1022/1/012071.
[3] K. Sekaran, S. P. Ramalingam, and C. M. P.V.S.S.R., “Breast Cancer Classification Using Deep Neural Networks,” in Knowledge Computing and Its Applications: Knowledge Manipulation and Processing Techniques: Volume 1, 2018, pp. 227–241. doi: 10.1007/978-981-10-6680-1_12.
[4] A. M. Abdel-Zaher and A. M. Eldeib, “Breast cancer classification using deep belief networks,” Expert Syst. Appl., vol. 46, pp. 139–144, 2016, doi: https://doi.org/10.1016/j.eswa.2015.10.015.
[5] S. Jhajharia et al., “A neural network based breast cancer prognosis model with PCA processed features,” in 2016 International Conference on Advances in Computing, Communications and Informatics (ICACCI), 2016, pp. 1896-1901.
[6] S. Ghosh, S. Biswas, D. C. Sarkar, and P. P. Sarkar, “Breast cancer detection using a neuro-fuzzy based classification method,” Indian Journal of Science and Technology, vol. 9, no. 14, 2016.
[7] A. P. Nirmala and S. More, “Role of artificial intelligence in fighting against covid-19,” in Proceedings of 2020 IEEE International Conference on Advances and Developments in Electrical and Electronics Engineering, ICADEE 2020, 2020, 9368956.
[8] M. Duraipandian and R. Vinothkanna, “Smart Digital Mammographic Screening System for Bulk Image Processing,” Journal of Electrical Engineering and Automation, vol. 2, no. 4, pp. 156-161, 2021.
[9] N. Krishnamoorthy, D. R. Suresh, D. Mohanapriya, D. A. Prasad, D. R. Krishnamoorthy, and D. R. Thiagarajan, “Utilisation of Deep Learning to Exploit Locust Outbreaks in Agricultural Harvesting,” vol. 20, no. 10, p. 8, 2022.
[10] B. Nithya, “Predictive Analytics in Health Care Using Machine Learning Tools and Techniques,” in IEEE International Conference on Intelligent Computing and Control Systems (ICICCS 2017), 2017, pp. 492-499.
[11] Ebru Aydindag Bayrak, Pinar Kirci, and Tolga Ensari, “Comparison of machine learning methods for breast cancer diagnosis,” in 2019 Scientific Meeting on Electrical-Electronics & Biomedical Engineering and Computer Science (EBBT), 2019, pp. 1-3.
[12] Ch. Shravya, K. Pravalika, and Shaik Subhani, “Prediction of breast cancer using supervised machine learning techniques,” International Journal of Innovative Technology and Exploring Engineering, vol. 8, no. 6, pp. 1106-1110, 2019.
[13] V Sansya Vijayan and Lekshmy P L, “Deep learning based prediction of breast cancer in histopathological images,” International Journal of Engineering Research & Technology, vol. 8, no. 07, pp. 148-152, 2019.
[14] AB Tobsun, “Graph Run Length Matrices For Histopathological Images,” IEEE Transactions on Medical Imaging, vol. 30, no. 3, Mar. 2011.
[15] mriti H. Bhandari, “A Bag-Of-Features Approach For Malignancy Detection In Breast Histopathology Images,” in IEEE International Conference on Image Processing, Sep. 2015.
