BERT-Based Cyberbullying Detection
e ISSN: 2584-2137
Vol. 03 Issue: 04 April 2025
Page No: 2007-2015
https://irjaeh.com
https://doi.org/10.47392/IRJAEH.2025.0293
Abstract
Cyberbullying on social media has been on the rise, and with it the serious psychological effects it leaves in its wake: anxiety and sadness. That is why early detection and intervention are so crucial. Traditional methods of tackling online abuse often fall short when it comes to slang and ever-changing language, because they cannot pick up on the intent behind the words. Our project tackles that problem by combining text and visual elements in a way that deep learning can interpret. We use a fine-tuned BERT model to put language in context, Demoji to decipher the meanings of emojis, and Pytesseract to extract text from images. This hybrid approach ensures that even hidden or indirect bullying messages are identified. We deliver the analysis, and the tools to visualize it, in real time through a mobile app, so that non-technical users such as parents, teachers, and moderators can easily spot and stop cyberbullying. By harnessing the latest AI technologies to safeguard vulnerable people, we create a safer online environment.
Keywords: Cyberbullying, Real-time Analysis, BERT Model, Demoji, Pytesseract.
1. Introduction
The world has evolved in many dimensions through the internet, in fields such as education, sports, and entertainment. But just as life has its ups and downs, the internet has its own downsides, and the biggest problem in this digital world is cyberbullying. Recent surveys show that about 36.5% of respondents have experienced cyberbullying in the form of harassment through digital media. The increase in internet usage has led to an increase in cyberbullying at an alarming rate: 87% of young social media users acknowledge having been through this kind of online harassment. Cyberbullying is a tough problem to deal with because it can take many forms, such as toxic comments, photographs, and videos. Advanced technologies have been used to detect and remove cyberbullying activity on social platforms, but detection remains a very hard task. Sometimes a normal conversation between friends may sound like bullying, yet on closer examination turn out to be nothing of the sort. Many studies of cyberbullying detection have applied both traditional machine learning models and advanced deep learning models for better accuracy. The deep learning models used GloVe and SSWE as different word embedding techniques, and the results indicate that deep learning models consistently work better than those built with traditional machine learning. A newer approach to cyberbullying detection uses a pre-trained BERT model, which outperforms earlier methods on numerous NLP tasks. Such models capture the contextual meanings of words and phrases and can thereby dig deeper into the complexities of online communication. Research shows that deep learning models beat traditional models at cyberbullying detection tasks. Building on this advancement came BERT-based detection; BERT, developed by Google AI, marks an enormous leap in natural language processing. BERT can also be fine-tuned for specific tasks, which makes it well suited to identifying harmful social media content. Although cyberbullying is a serious and growing problem, the development of deep learning models, specifically BERT, brings hope for addressing it effectively.
User interactions with the system take place via a web or mobile application, built with React Native, that allows the user to submit text, images, or emojis for processing. The Processing Layer uses a fine-tuned BERT model to perform deep processing of the text, Pytesseract extracts text from any images the user submits, and Demoji interprets emojis, allowing bullying communicated in all three formats to be detected. The Backend & Database Layer uses Python with Flask to process user requests and stores the information captured from users in a structured SQL/NoSQL database for further processing. The proposed architecture allows user data to be processed efficiently, securely, and accurately, supporting a reliable and efficient method for detecting instances of cyberbullying, as shown in Figure 1. The use case diagram presents the proposed overall structure of the plan, showing the interactions of the actors and the subcomponents of the use case. It highlights how user interaction triggers the actions needed to generate proactive or intervention strategies against cyberbullying, while contributing to inclusive and safe online experiences, as shown in Figure 2.
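As a rough sketch of how such a Backend Layer could accept a submission, the endpoint below wires the three components together with Flask; the model path "./bert-cyberbully" and the route name are illustrative assumptions, not details from the paper.

```python
# Sketch of a backend analyze endpoint (illustrative; model path is assumed).
import io

import demoji
import pytesseract
from flask import Flask, request, jsonify
from PIL import Image
from transformers import pipeline

app = Flask(__name__)
classifier = pipeline("text-classification", model="./bert-cyberbully")

@app.route("/analyze", methods=["POST"])
def analyze():
    # The React Native client posts text and, optionally, an image file.
    text = request.form.get("text", "")
    image = request.files.get("image")
    if image is not None:
        # OCR the uploaded screenshot or meme and append its text.
        text += " " + pytesseract.image_to_string(Image.open(io.BytesIO(image.read())))
    # Convert emojis to text descriptions so the text model sees their meaning.
    text = demoji.replace_with_desc(text, sep=" ")
    result = classifier(text)[0]  # e.g. {"label": "bullying", "score": 0.97}
    return jsonify({"label": result["label"], "confidence": result["score"]})
```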
5. Sequence Diagram
The sequence begins when the user submits content, which first undergoes pre-processing. After pre-processing, the system sends the cleaned text to a BERT model for feature extraction, which generates numeric values representing the content's meaning in context. From these BERT features, the model is trained to classify the user submission as bullying or non-bullying. In the end, the system returns the classification to the user, along with alerts or actions if the content contained any harmful material. Overall, this sequence highlights the structure and flow of data and processing within the system, and shows how the process flow aligns with user engagement, as shown in Figure 3.
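As a concrete illustration of the feature-extraction step, the sketch below pulls the [CLS] embedding from a pre-trained BERT model as the "numeric values" the sequence refers to; the model name (bert-base-uncased) and this pooling choice are assumptions rather than details given in the paper.

```python
# Sketch: turning a message into numeric BERT features (the [CLS] embedding).
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

def bert_features(text: str) -> torch.Tensor:
    # Tokenize and run the text through BERT without gradient tracking.
    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=128)
    with torch.no_grad():
        outputs = model(**inputs)
    # The [CLS] token's final hidden state serves as a 768-dim sentence vector.
    return outputs.last_hidden_state[:, 0, :].squeeze(0)

vec = bert_features("nobody likes you, just log off")
print(vec.shape)  # torch.Size([768])
```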
6. Architecture Diagram of the BERT Model
Figure 4 BERT Architecture Diagram
The BERT (Bidirectional Encoder Representations from Transformers) model is based on the transformer architecture, which is specifically constructed for language understanding. The model is composed of numerous transformer layers that use self-attention to identify the relevance of, and relationships between, the words in a sentence, irrespective of their location in it. Within this self-attention framework, context is provided bidirectionally, from both the left and right side of each token in a sentence, to produce a more natural understanding of language. When fine-tuned for a specific task such as bullying language detection, BERT can apply its wide-ranging contextual knowledge to make predictions from textual input. In summary, the architecture diagram depicts the interconnected components, how data flows between them, and the various layers of the architecture, all designed to support the understanding and generative capabilities of the BERT decision framework, shown in Figure 4.
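A hedged sketch of the fine-tuning step this section describes, using the Hugging Face Trainer; the dataset file, column names, and hyperparameters are illustrative assumptions.

```python
# Sketch: fine-tuning BERT for bullying vs. non-bullying classification.
# Dataset path, column names, and hyperparameters are illustrative assumptions.
from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2)  # 0 = non-bullying, 1 = bullying

# Hypothetical CSV with "text" and "label" columns.
data = load_dataset("csv", data_files="cyberbullying.csv")["train"]
data = data.train_test_split(test_size=0.2)

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True,
                     padding="max_length", max_length=128)

data = data.map(tokenize, batched=True)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="bert-cyberbully",
                           num_train_epochs=3,
                           per_device_train_batch_size=16),
    train_dataset=data["train"],
    eval_dataset=data["test"],
)
trainer.train()
```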
7. Proposed System
The cyberbullying detection system is built using deep learning and natural language processing (NLP) methods to evaluate text, emojis, and images. It takes advantage of a fine-tuned BERT model, which improves language understanding by determining the contextual meaning of a whole sentence rather than of single words. Pytesseract is used to extract text that may be hidden in an image, extending detection to images that are often used to bully others, such as memes and screenshots. Demoji is used to interpret the meaning of emojis so that any meaning or sarcasm conveyed through symbols is not ignored. Together, these technologies improve the system's ability to detect cyberbullying more accurately while decreasing the risk of false negatives and improving overall reliability [6-8].
7.1 Data Collection
Data collection is a vital stage in building a precise cyberbullying detection system. It involves gathering datasets from a variety of digital spaces, including text, images, and video. The goal is to create a holistic dataset that represents multiple forms of online bullying, so that the model can recognize harmful interactions across different contexts. The datasets we collect contain chat conversations, comments, Twitter posts, and images drawn from the platforms where cyberbullying typically occurs. Multimodal content samples are also collected: cyberbullying occurs primarily in text but often includes images, memes, screenshots, and emojis, which a text-only approach would inadvertently miss. Multimodal data allows us to identify indirect, covert, or sarcastic forms of bullying that are not identifiable with typical text-based approaches. All personally identifiable information (PII) is removed or anonymized for compliance with data protection frameworks like GDPR and CCPA. The identified samples are first
auto-filtered and then manually filtered to remove irrelevant or biased samples, with the goal of a balanced dataset that represents real-world cyberbullying incidents. This gives the detection system the ability to recognize different types of online harassment and improves detection, intervention, and prevention strategies.
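The PII-scrubbing step could look like the following regex-based sketch; the patterns shown (emails, phone numbers, @-handles) are illustrative assumptions rather than the paper's actual anonymization procedure.

```python
# Sketch: removing common PII patterns before storing collected samples.
# These patterns are illustrative, not an exhaustive GDPR/CCPA solution.
import re

PII_PATTERNS = [
    (re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"), "<EMAIL>"),  # email addresses
    (re.compile(r"\+?\d[\d\s().-]{7,}\d"), "<PHONE>"),    # phone numbers
    (re.compile(r"@\w{2,}"), "<USER>"),                   # @-handles
]

def anonymize(text: str) -> str:
    for pattern, token in PII_PATTERNS:
        text = pattern.sub(token, text)
    return text

print(anonymize("contact me at jo@example.com or +1 555-123-4567, @jo99"))
```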
7.2 Pre-processing
Pre-processing prepares raw data for analysis by standardizing and cleaning it. For text data, pre-processing entails removing stopwords, punctuation, special characters, and other irrelevant elements that add nothing substantive to the analysis. It also involves cleaning up variations of words: converting words to lowercase and stemming or lemmatizing (normalizing) them so that variants share the same general representation. This clean-up is particularly relevant for slang, abbreviations, and other informal expressions that occur very frequently in online communication. For image data, pre-processing improves the accuracy of Optical Character Recognition (OCR) by resizing and adjusting the contrast of images that contain text (e.g., memes or screenshots); these changes improve OCR's efficiency at extracting the embedded text and make it easier for machine learning models to analyze. Pre-processing also covers emojis, which can be converted to text descriptions using Demoji, preserving the emotional context with which emojis alter the meaning of the surrounding words. Consequently, pre-processing yields a structured, standardized form of the data from which the cyberbullying detection model can learn.
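A hedged sketch of the text-cleaning steps just described, using NLTK for stopwords and lemmatization; the exact pipeline is not specified in the paper, so treat this as one plausible implementation.

```python
# Sketch: lowercase, strip punctuation, remove stopwords, lemmatize (NLTK).
import re

import nltk
from nltk.corpus import stopwords
from nltk.stem import WordNetLemmatizer

nltk.download("stopwords", quiet=True)
nltk.download("wordnet", quiet=True)

STOPWORDS = set(stopwords.words("english"))
lemmatizer = WordNetLemmatizer()

def preprocess(text: str) -> str:
    text = text.lower()
    text = re.sub(r"[^a-z0-9\s]", " ", text)  # drop punctuation/special chars
    tokens = [t for t in text.split() if t not in STOPWORDS]
    return " ".join(lemmatizer.lemmatize(t) for t in tokens)

print(preprocess("You're SO 'smart', everyone laughs at you!!!"))
```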
7.3 Feature Extraction
Feature extraction detects patterns related to cyberbullying through the analysis of both textual and visual information. Text analysis draws on key features such as word frequency, n-grams (bigrams, trigrams), sentiment scores, and TF-IDF (Term Frequency-Inverse Document Frequency). These features help the model identify hostile language, insults, and threats of violence. Sentiment analysis captures negative emotions such as anger, hatred, or fear, all of which are often indicators of cyberbullying. For image-based detection, feature extraction pulls text and patterns out of images (using OCR) that may suggest malicious intent. Integrating linguistic and image-based information improves classification accuracy and enriches context, increasing the ability to detect subtle, indirect, or multimodal forms of cyberbullying.
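As a sketch of the textual features mentioned (n-grams with TF-IDF weighting), scikit-learn's vectorizer produces them directly; the corpus and settings here are illustrative.

```python
# Sketch: TF-IDF features over unigrams, bigrams, and trigrams (scikit-learn).
from sklearn.feature_extraction.text import TfidfVectorizer

corpus = [
    "nobody likes you just leave",          # illustrative samples
    "great game last night, well played",
]
vectorizer = TfidfVectorizer(ngram_range=(1, 3), max_features=5000)
features = vectorizer.fit_transform(corpus)
print(features.shape)  # (2, number_of_ngram_features)
```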
7.4 OCR Module
Pytesseract is an OCR tool that can pull text from images such as screenshots, memes, and social media posts. Cyberbullying is not always communicated through plain text, so analyzing image content is important for complete detection. Pytesseract converts the text in an image to a machine-readable format, allowing the system to analyze and process visual media alongside ordinary text-based information. After extraction, the text is processed with NLP and sentiment analysis so that threats, insults, and harmful messaging can be classified correctly. Using OCR for cyberbullying detection gives the system genuinely multimodal analysis, which may improve the detection of abusive content regardless of format and provide a safer, more equitable online space across diverse digital formats.
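A minimal sketch of the OCR step, including the resize and contrast adjustments mentioned under pre-processing; the scale factor and contrast value are assumptions.

```python
# Sketch: OCR with Pytesseract after simple image pre-processing.
# The resize factor and contrast value are illustrative assumptions.
import pytesseract
from PIL import Image, ImageEnhance

def image_to_text(path: str) -> str:
    img = Image.open(path).convert("L")                # grayscale
    img = img.resize((img.width * 2, img.height * 2))  # upscale small text
    img = ImageEnhance.Contrast(img).enhance(2.0)      # boost contrast
    return pytesseract.image_to_string(img)

print(image_to_text("meme_screenshot.png"))
```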
7.5 Emoji Transcription
In cyberbullying, emojis can heighten and reinforce negative intent, or disguise an abusive message in a way that is difficult to detect with legacy text-based approaches. Demoji transcribes emojis into text descriptions so the system can interpret meaning and intent accurately. For example, an insult followed by a laughing emoji typically signals mockery of the target, whereas an otherwise "stable" message combined with an angry emoji may signal aggression. Systematically converting emojis to text lets the system analyze sentiment, intent, and emotional signals in a conversational exchange with greater precision.
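The emoji-to-text step can be nearly a one-liner with the demoji package; `replace_with_desc` is the library call that substitutes each emoji with its description.

```python
# Sketch: transcribing emojis to text descriptions with demoji.
import demoji

msg = "great job, genius \U0001F602\U0001F602"  # laughing-crying emojis
print(demoji.replace_with_desc(msg, sep=" "))
# -> great job, genius  face with tears of joy  face with tears of joy
```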
7.6 Text Analysis
A fine-tuned BERT model improves cyberbullying detection by evaluating text bidirectionally, in contrast to typical unidirectional models: it evaluates each word based on both the words that come before it and the words that come after it. Because it is bidirectional, the model can weigh context, nuance, and sentiment, so it can detect more complex instances of bullying, such as indirect, sarcastic, or ambiguous forms. A unidirectional model evaluates words and their relationships one at a time, whereas BERT evaluates relationships within the sentence at a deeper level. This enables detection of examples like "Great job, genius!": one could easily read "Great job!" as positive, but "genius!" gives the impression that the writer is being sarcastic, which is apparent from how the sentence is constructed. By fine-tuning BERT on a dataset of cyberbullying content, the model becomes highly accurate and identifies new patterns of abuse with very good recall.
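A sketch of running the fine-tuned classifier on this section's example; the saved-model path and label semantics are assumptions carried over from the earlier fine-tuning sketch.

```python
# Sketch: classifying the section's example with the fine-tuned model.
# The path "bert-cyberbully" matches the earlier sketch (an assumption).
from transformers import pipeline

classifier = pipeline("text-classification", model="bert-cyberbully")
for msg in ["Great job!", "Great job, genius!"]:
    print(msg, "->", classifier(msg)[0])
# A well-trained model should score the sarcastic second message
# substantially higher for the bullying label than the first.
```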
7.7 Test Data
Test data is a distinct set of data used to evaluate the performance of a machine learning model and its ability to generalize. While training data is what the model learns from, test data evaluates whether the model can perform on inputs it has not seen. Test data underpins performance metrics such as accuracy, precision, recall, and F1 score, ensuring the model identifies instances of cyberbullying as reliably as possible while minimizing false positives and false negatives. Testing against varied real-world examples also allows evaluation of bias, weaknesses, and overfitting, which improves the model both before deployment and over time. Ongoing evaluation on new test data lets the model adapt to changing patterns of cyberbullying behavior and maintain its relevance and usefulness into the future.
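A hedged sketch of holding out test data with scikit-learn's splitter; the 80/20 ratio and stratification are common defaults rather than choices stated in the paper.

```python
# Sketch: holding out a stratified test set for final evaluation.
# The 80/20 split is a common default, not a choice stated in the paper.
from sklearn.model_selection import train_test_split

texts = ["you are pathetic", "see you at practice tomorrow"] * 50  # toy data
labels = [1, 0] * 50                                               # 1 = bullying

X_train, X_test, y_train, y_test = train_test_split(
    texts, labels, test_size=0.2, stratify=labels, random_state=42)
print(len(X_train), "train /", len(X_test), "test")
```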
7.8 Prediction
In this stage, the previously trained model is applied to new data to predict possible instances of cyberbullying. The system uses deep learning techniques to examine text, images, and emojis in a single multimodal monitoring approach. In contrast to traditional methods that analyze only the text, this integrated system allows a more encompassing assessment of online interactions. More specifically, the BERT model identifies the text's context, Pytesseract recognizes text within images, and Demoji translates emojis, together surfacing potential bullying intent. By combining these models, the aim is to ensure higher detection accuracy even when the data contains sarcasm, slang, or multimedia. Real-time prediction allows moderators, educators, and other relevant actors to be proactive and take action, aiming to create a safer online space.
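Putting the pieces together, the prediction stage might merge all three modalities into one normalized string before classification; this composition, and the model path, follow the earlier sketches rather than the paper's own code.

```python
# Sketch: one multimodal prediction call, composing the earlier sketches.
import demoji
import pytesseract
from PIL import Image
from transformers import pipeline

classifier = pipeline("text-classification", model="bert-cyberbully")

def predict(text, image_path=None):
    if image_path:
        # Fold any text hidden in the image (meme, screenshot) into the input.
        text += " " + pytesseract.image_to_string(Image.open(image_path))
    # Make emoji meaning visible to the text model.
    text = demoji.replace_with_desc(text, sep=" ")
    return classifier(text)[0]

print(predict("nice one \U0001F621"))  # angry-face emoji appended
```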
8. Result and Discussion
The cyberbullying detection system achieves a superior level of accuracy and reliability thanks to deep learning and natural language processing (NLP) tools. While existing text-based methods have relied on support tools like Benenson's six-form text categorization system, which depends on algorithms and keywords, this system is multimodal and can analyze information across formats. The ability to analyze images, emojis, and text improves the detection of subtle and indirect forms of cyberbullying, because cyberbullying is not always straightforward and direct. The fine-tuned BERT model considers the contextual meaning of language, vastly improving comprehension and supporting the detection of sarcasm, slang, and indirect threats, which reduces false negatives in text-based detection. The OCR program Pytesseract extracts text from memes and screenshots, ensuring that bullying language hidden in text-as-image form can be analyzed. The system uses Demoji to expose hidden meanings associated with emojis, such as mockery, sarcasm, or emotional cues. This demonstrates the project's potential to promote a safer digital environment through improved detection of cyberbullying in real-world online communication.
8.1 Precision
In the context of detecting cyberbullying with our BERT model, precision evaluates how well the model detected incidents of abusive language or behavior in online content. Since the BERT model labels text as bullying vs. non-bullying, precision is calculated as the ratio of correctly detected bullying incidents (true positives, TP) to all detected bullying, that is, true positives plus the non-bullying content incorrectly flagged as bullying (false positives, FP):
Precision = TP / (TP + FP)
A high precision score indicates that the model correctly identifies language indicating cyberbullying; in other words, most of the content it flags as bullying is in fact bullying. This reduces false positives, since most of what is flagged is genuinely harmful. A strong precision score reinforces the integrity of the detection system, giving confidence that it is effectively identifying harmful interactions in online environments.
8.2 Recall
Recall is a key metric for measuring model performance: it tells us whether the model identified every applicable incident of abusive language or behavior in online content. It is calculated as true positives (TP) divided by true positives plus false negatives (FN), where a false negative is a case in which bullying was present but was not identified by the model:
Recall = TP / (TP + FN)
For instance, if BERT is used to classify text messages or social media posts, a high recall score means the model identified most of the actual abusive comments, a critical factor for prompt response and preventing further abuse. Recall is therefore analysed in conjunction with precision to understand model performance: high recall means the cases that matter are included, while high precision means that flagged conversations really are abusive and false positives are minimized.
8.3 F1 Score
The F1 score evaluates a BERT model's performance in cyberbullying detection by consolidating precision and recall into a single evaluation metric. In this setting, precision measures how many of the instances the model flagged as bullying were actually abusive, while recall measures how many of the actual cyberbullying cases the model detected. The F1 score is their harmonic mean:
F1 = 2 x (Precision x Recall) / (Precision + Recall)
It provides a balanced evaluation of performance, especially when there is a trade-off between precision and recall. A model can achieve high precision by flagging only a few clear-cut cases of cyberbullying while missing many others entirely; this reflects a very low recall rate and yields a lower F1 score, implying that the model is precise but contributes little to the overall detection of cyberbullying. Conversely, a model with high recall might flag too many items, many of them irrelevant or benign commentary; this yields lower precision and, again, a lower F1 score. Ultimately, a high F1 score in cyberbullying detection indicates that the BERT model captures a strong proportion of true bullying cases while filtering out non-relevant material, Figure 5.
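As a worked sketch of these three metrics, scikit-learn computes them directly from predicted and true labels; the label vectors below are toy values.

```python
# Sketch: computing precision, recall, and F1 on toy predictions.
from sklearn.metrics import precision_score, recall_score, f1_score

y_true = [1, 1, 1, 0, 0, 0, 1, 0]   # 1 = bullying, 0 = non-bullying
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]   # TP = 3, FP = 1, FN = 1

print("Precision:", precision_score(y_true, y_pred))  # TP/(TP+FP) = 0.75
print("Recall:   ", recall_score(y_true, y_pred))     # TP/(TP+FN) = 0.75
print("F1:       ", f1_score(y_true, y_pred))         # harmonic mean = 0.75
```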