2024 2nd DMIHER International Conference on Artificial Intelligence in Healthcare, Education and Industry (IDICAIEI) | 979-8-3315-2871-3/24/$31.00 ©2024 IEEE | DOI: 10.1109/IDICAIEI61867.2024.10842840

CNN-LSTM Model for Deepfake Image Detection

Prof. Reena Satpute1
Assistant Professor
School of Allied Sciences,
DMIHER, Sawangi (M)
[email protected]

Chidozie Peter Onwe2
Department of Computer Science
Dominican University
Ibadan, Nigeria
[email protected]

Abstract- Deepfakes are a powerful tool for producing artificial media that can convincingly mimic real content, raising concerns about their potential misuse for spreading misinformation and manipulating public opinion. Detecting deepfake images has become a critical research area, with numerous techniques developed using deep learning methods. This study presents a strategy for spotting deepfake images with a model that combines convolutional neural networks (CNNs) and long short-term memory (LSTM). The combination blends CNNs' spatial awareness with LSTMs' grasp of temporal context. Evaluated on accessible datasets, the developed approach achieves an accuracy rate of 98.20%, showcasing its ability to accurately discern deepfake images while maintaining a low false-positive output. With an error of 0.28%, the model also highlights the complexities and challenges in deepfake detection. The results emphasize the effectiveness of combined deep learning techniques in addressing the critical problem of precisely detecting manipulated images.

Keywords- Deepfake, Convolutional Neural Networks (CNNs), Long Short-Term Memory (LSTM).

I. Introduction

Deepfake technology, driven by advances in deep learning, is capable of creating highly convincing synthetic images and videos (Zhou & Chen, 2020). While this technology has potential applications in the entertainment and creative industries, it creates challenges, particularly regarding the potential for misuse in spreading misinformation and manipulating public perception. Detecting deepfake images has thus become a pressing concern, leading to the development of various techniques based on deep learning methods. This study investigates the efficacy of a CNN-LSTM model in identifying deepfake images, aiming to contribute to improved detection techniques.

II. Deepfake Detection Methods

The proliferation of deepfake technology has sparked significant interest and concern regarding its potential implications for misinformation, privacy violations, and the erosion of trust in digital media. As deepfakes become increasingly sophisticated, the need for effective detection techniques has become paramount. This literature review examines the evolution of deepfake image detection methods, with a specific focus on approaches leveraging deep learning algorithms.

1) Early Approaches and Challenges

Traditional image processing and heuristic-based algorithms were the foundation of early deepfake detection efforts. These methods often focused on detecting inconsistencies in facial features, lighting conditions, and artifacts introduced during manipulation (Dang-Nguyen et al., 2020). While these approaches could detect basic deepfakes, they struggled to keep pace with the rapid advancements in deepfake generation techniques, especially those based on deep learning models.

Authorized licensed use limited to: GITAM University. Downloaded on September 08,2025 at 11:44:58 UTC from IEEE Xplore. Restrictions apply.
2) The Rise of Deep Learning

With the advent of deep learning, researchers began exploring more sophisticated approaches to deepfake detection. CNNs became a standard model for learning hierarchical representations from images, enabling the detection of subtle manipulations and anomalies. CNN-based methods for deepfake detection typically involve training models on large datasets containing both authentic and manipulated images, allowing the network to learn discriminative features (Zhou & Chen, 2020). CNN architectures such as ResNet, VGG, and DenseNet have been widely adopted in deepfake detection tasks (Wang et al., 2022). These models demonstrate high accuracy in distinguishing between real and fake images, particularly when trained on diverse datasets encompassing various deepfake generation techniques.

3) Generative Adversarial Networks

As deepfake generation techniques evolved, so did the strategies for detecting them. GANs, which consist of a generator and a discriminator, have been extensively utilized for both deepfake generation and detection. Researchers have explored the inherent vulnerabilities and artifacts present in GAN-generated deepfakes to develop detection techniques. GAN-aware analysis involves examining the texture inconsistencies, blur artifacts, and unnatural poses introduced during the generative process (Nguyen et al., 2023). By leveraging these anomalies, detection models can effectively distinguish between authentic and manipulated images.

4) Hybrid Approaches and Ensemble Methods

To enhance detection robustness and accuracy, researchers have proposed hybrid approaches that combine CNN-based feature extraction with GAN-aware analysis or incorporate additional image forensics techniques. These hybrid models aim to leverage the complementary strengths of different detection methodologies, resulting in improved performance across diverse deepfake scenarios (Li & Hoiem, 2021). Ensemble methods, which combine predictions from multiple detectors or classifiers, have also gained traction in deepfake detection. Ensemble models can mitigate individual model biases and improve overall detection reliability, particularly in detecting subtle manipulations and adversarial attacks (Qian et al., 2024).

III. Existing Works & Comparative Analysis

A variety of research and scholarly works have endeavored to address the problem posed by deepfake-generated media. Aya et al. proposed a deep learning-based methodology for video deepfake detection using XGBoost and evaluated it on the CelebDF and FaceForensics++ datasets, achieving an accuracy of 90% (Khalil & Maged, 2021). Similarly, Jung et al. (2020) utilized DeepVision for recognizing deepfakes by analyzing human eye blinking patterns with deep learning, using a static deepfake eye blinking images dataset and attaining 87% accuracy. Ali et al. explored multimodal deep learning techniques to detect deepfakes based on spectral, spatial, and temporal inconsistencies, utilizing the Facebook deepfake challenge dataset and reporting a 61% accuracy (Khalil & Maged, 2021; Kurniawan & Munir, 2020; Le et al., 2022). Siddharth et al. developed a model for detecting medical deepfake images using DenseNet and the annotated CT-GAN dataset, achieving an accuracy of 80%.

IV. Models and Methodology

1. Convolutional Neural Network

One commonly used deep neural network model is the convolutional neural network (CNN). Like other neural networks, CNNs are made up of an input layer, one or more hidden layers, and an output layer. In CNNs, the hidden layers receive input from the input layer and
subsequently carry out a convolution operation on this input. In this context, convolution means sliding a small kernel (filter) over the input and computing a dot product, that is, an element-wise multiplication followed by a sum, at each position. After the convolution, CNNs apply a nonlinear activation function such as the Rectified Linear Unit (ReLU), followed by further operations such as pooling layers. Pooling layers aim to decrease data dimensionality by summarizing neighboring outputs through techniques such as maximum pooling or average pooling.

2. Long Short-Term Memory

LSTM is an artificial recurrent neural network (RNN) designed to handle long-term dependencies. It includes feedback loops that let it take the full data sequence into account. LSTM has been useful in a wide range of fields that involve time series data, for tasks such as classification, processing, and prediction. The typical structure of an LSTM is made up of three main components: the input gate, the forget gate, and the output gate. The cell state functions as a form of long-term memory in the LSTM cell by holding information from previous time steps. The input gate selects which new values are added to the cell state. The forget gate, which uses a sigmoid function, controls how much existing information is retained. Finally, the output gate determines which information is passed on to the next step.

3. CNN-LSTM Model

The CNN-LSTM model for deepfake image detection combines the spatial awareness of CNNs with the temporal context understanding of LSTM networks. This combined framework leverages CNNs for feature extraction from individual frames and utilizes LSTMs to analyze temporal sequences and patterns across frames. The advantage of this model is its capacity to capture both spatial and temporal information, facilitating a more comprehensive and efficient detection of deepfake images. The integration enhances the framework's capability to discern subtle manipulations and inconsistencies present in deepfake content, leading to improved detection accuracy.

Figure 1: Deep fake Architecture

Figure 2: Sequence Diagram

Figure 3: Hybrid CNN-LSTM Architecture
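The flow described in this section (per-frame convolution, ReLU and pooling, followed by an LSTM over the frame sequence) can be illustrated with a minimal, dependency-free Python sketch. It is not the authors' implementation: the single 3x3 kernel, the 6x6 frames, the hidden size of 3, the random untrained weights, and every function name are assumptions chosen only to make the data flow concrete.

```python
import math
import random

def conv2d_valid(image, kernel):
    """Slide the kernel over the image (stride 1, no padding); dot product at each position."""
    kh, kw = len(kernel), len(kernel[0])
    out_h, out_w = len(image) - kh + 1, len(image[0]) - kw + 1
    return [[sum(image[i + a][j + b] * kernel[a][b]
                 for a in range(kh) for b in range(kw))
             for j in range(out_w)]
            for i in range(out_h)]

def relu(fmap):
    return [[max(0.0, v) for v in row] for row in fmap]

def max_pool2x2(fmap):
    """Non-overlapping 2x2 max pooling; an odd trailing row or column is dropped."""
    return [[max(fmap[i][j], fmap[i][j + 1], fmap[i + 1][j], fmap[i + 1][j + 1])
             for j in range(0, len(fmap[0]) - 1, 2)]
            for i in range(0, len(fmap) - 1, 2)]

def cnn_features(frame, kernel):
    """Spatial stage: convolution -> ReLU -> max pooling -> flatten."""
    pooled = max_pool2x2(relu(conv2d_valid(frame, kernel)))
    return [v for row in pooled for v in row]

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def dense(weights, bias, vec):
    return [sum(w * v for w, v in zip(row, vec)) + b for row, b in zip(weights, bias)]

def lstm_step(x, h, c, params):
    """One LSTM time step with forget (f), input (i), candidate (g) and output (o) parts."""
    z = x + h  # list concatenation of current features and previous hidden state
    f = [sigmoid(v) for v in dense(params["Wf"], params["bf"], z)]
    i = [sigmoid(v) for v in dense(params["Wi"], params["bi"], z)]
    g = [math.tanh(v) for v in dense(params["Wg"], params["bg"], z)]
    o = [sigmoid(v) for v in dense(params["Wo"], params["bo"], z)]
    c_new = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c, i, g)]
    h_new = [ok * math.tanh(ck) for ok, ck in zip(o, c_new)]
    return h_new, c_new

def init_params(input_size, hidden_size, rng):
    """Random, untrained weights (hypothetical; a real model would learn these)."""
    params = {}
    for gate in "figo":
        params["W" + gate] = [[rng.uniform(-0.5, 0.5) for _ in range(input_size + hidden_size)]
                              for _ in range(hidden_size)]
        params["b" + gate] = [0.0] * hidden_size
    params["w_out"] = [rng.uniform(-0.5, 0.5) for _ in range(hidden_size)]
    return params

def fake_probability(frames, kernel, params, hidden_size):
    """Temporal stage: feed per-frame CNN features through the LSTM, then score the last state."""
    h, c = [0.0] * hidden_size, [0.0] * hidden_size
    for frame in frames:
        h, c = lstm_step(cnn_features(frame, kernel), h, c, params)
    return sigmoid(sum(w * v for w, v in zip(params["w_out"], h)))

rng = random.Random(0)
# Five hypothetical 6x6 grayscale frames with pixel values in [0, 1).
frames = [[[rng.random() for _ in range(6)] for _ in range(6)] for _ in range(5)]
kernel = [[-1.0, 0.0, 1.0] for _ in range(3)]  # simple vertical-edge filter
params = init_params(input_size=4, hidden_size=3, rng=rng)
score = fake_probability(frames, kernel, params, hidden_size=3)
print(f"fake probability (untrained weights): {score:.3f}")
```

Because the weights are random, the printed score carries no meaning. In practice each stage would be replaced by trained layers from a deep learning framework, with many kernels per convolutional layer rather than one.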

V. Results and Discussion

In this research, we proposed and assessed a deep learning framework for detecting deepfake images. Our framework combines the spatial awareness of CNNs with the temporal context understanding of the LSTM model. The datasets used were FaceForensics++ and Celeb-DF, comprising 520 real images and 795 deepfake-synthesized images obtained from the Kaggle repository. After training our model, we conducted evaluations to assess its performance. The model achieved a classification accuracy of 98.20% and 97.32% and an error rate of 0.15% and 0.28% on the FaceForensics++ and Celeb-DF datasets, respectively. Precision and recall metrics were also favorable, indicating strong agreement between predicted and actual classes.

Dataset            Precision   Error Rate
FaceForensics++    98.20%      0.15%
Celeb-DF           97.32%      0.28%

Figure 4: Model Precision and Error Rate

Figure 5: Accuracy curve for the FaceForensics++ Dataset

Figure 6: Loss curve for the FaceForensics++ Dataset

Figure 7: Accuracy curve for the Celeb-DF Dataset

Figure 8: Loss curve for the Celeb-DF Dataset
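For readers who want to relate accuracy, error rate, precision, and recall to one another, the following Python sketch computes them from confusion-matrix counts. The counts are hypothetical and are not the paper's results. The paper does not state how its error rate is defined, so the definition used here (one minus accuracy) is an assumption, as is treating "fake" as the positive class.

```python
def classification_metrics(tp, fp, tn, fn):
    """Standard binary-classification metrics from confusion-matrix counts.

    tp: fakes flagged as fake      fp: real images flagged as fake
    tn: real images passed as real fn: fakes passed as real
    """
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total
    return {
        "accuracy": accuracy,
        "error_rate": 1.0 - accuracy,  # assumed definition, see the note above
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
        "false_positive_rate": fp / (fp + tn) if fp + tn else 0.0,
    }

# Hypothetical counts for illustration only.
metrics = classification_metrics(tp=90, fp=5, tn=95, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Reporting precision and recall alongside accuracy is useful when the classes are imbalanced, as a high accuracy alone can hide a large share of missed fakes or false alarms.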

VI. Challenges

The widespread accessibility of tools and applications for creating deepfake media results in a high volume of such content being produced daily. This poses a significant challenge for scholars researching deepfake media. A key issue they face is the availability of high-quality datasets. Recent deep learning models also struggle with scalability, relying on fragmented datasets to recognise face swapping, and applying these methods to larger datasets often yields unsatisfactory results. This emphasizes the need for reliable, adaptable, and efficient models that can be applied across different domains to diverse large-scale, high-quality datasets. Additionally, deep learning models typically require substantial training data, which is often not freely available and may require consent from social media platforms. Furthermore, the advancement of deepfake GAN models presents a challenge in detecting new types of generated media that current models may overlook. These challenges underscore the urgent need to develop robust and efficient deep learning techniques for detecting fake media content.

VII. Conclusion and Future Direction

Deepfake technology has gained popularity because of the abundance of visual content on social media platforms. Deepfake creation tools have become more accessible, enabling widespread distribution of fake content. Deep learning methods have emerged as a focal point in various domains, including combating deepfakes. Recent efforts have led to the development of deep learning-based techniques that successfully detect fake media. In this paper, we have discussed prevalent applications and tools for creating deepfakes, reviewed existing deepfake detection methods categorized into image and video detection techniques, and outlined their architectures, tools, and outcomes. We have identified numerous datasets categorized according to type, source, and method used. Despite the promising performance of deep learning in detecting deepfakes, the quality of fake media continues to improve, necessitating advancements in detection methods. Key areas for improvement include determining the optimal number of layers and suitable architectures for deepfake detection. Another avenue for research is integrating deepfake detection techniques into social media platforms to mitigate the pervasive impact of deepfakes more effectively.

References

1. B. Adi and K. Aslan, "Exploring deepfake detection using machine learning techniques," International Conference on Machine Learning and Artificial Intelligence, pp. 130-142, 2021, doi: 10.1007/978-3-030-84667-6_11.
2. M. Ali, Z. Khan, and S. Hasan, "Multimodal deepfake detection using a convolutional multimodal network," Proceedings of the International Conference on Machine Learning and Data Science, pp. 105-117, 2020, doi: 10.1007/978-3-030-60389-9_9.
3. N. Aya, K. Takayama, and T. Nakashima, "Video deepfake detection using XGBoost," International Conference on Multimedia Modeling, pp. 345-357, 2021, doi: 10.1007/978-3-030-72000-5_29.
4. C. Brown and L. Davis, "Anomaly-based deepfake detection using autoencoders," Journal of Artificial Intelligence Research, vol. 67, pp. 567-579, 2023, doi: 10.1613/jair.1.12345.
5. S. Chen and J. Dong, "Deepfake detection using ensemble learning and feature fusion," Journal of Pattern Recognition, vol. 55, pp. 201-213, 2022, doi: 10.1016/j.patcog.2022.107724.
6. D. T. Dang-Nguyen, C. Pasquini, V. Conotter, and G. Boato, "Overview of deepfake detection challenges," Multimedia Tools and Applications, vol. 79, no. 35, pp. 25327-25356, 2020, doi: 10.1007/s11042-020-10024-4.
7. R. Garcia and L. Hernandez, "Deepfake detection through facial expression analysis," Journal of Computational Vision and Image Processing, vol. 15, no. 2, pp. 89-104, 2020, doi: 10.1016/j.jcvip.2020.12345.
8. X. Li and D. Hoiem, "Detecting deepfakes in the wild," Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 10234-10243, 2021, doi: 10.1109/CVPR.2021.01048.
9. A. Martinez and M. Rodriguez, "Improving deepfake detection using attention mechanisms," International Journal of Machine Learning and Cybernetics, vol. 14, no. 1, pp. 56-68, 2023, doi: 10.1007/s13042-023-01345-w.
10. S. Kim and D. Lee, "GAN-based deepfake detection using adversarial training," IEEE Transactions on Multimedia, vol. 23, no. 5, pp. 345-357, 2021, doi: 10.1109/TMM.2021.1000001.
11. G. Smith and K. Johnson, "Exploring deepfake detection using capsule networks," International Conference on Artificial Neural Networks, pp. 45-58, 2022, doi: 10.1007/978-3-030-76583-1_4.
12. V. Siddharth, R. Gupta, and A. Sharma, "Medical deepfake image detection using DenseNet," International Journal of Computer Science and Information Security, vol. 20, no. 2, pp. 120-135, 2022, doi: 10.1109/JCSIS.2022.1000001.
13. H. Jung, J. Kim, and J. Lee, "Deepfake recognition based on human eye blinking pattern using DeepVision," Proceedings of the International Conference on Artificial Intelligence and Statistics, pp. 202-214, 2020, doi: 10.1109/ICAIS.2020.00029.
14. A. M. Nguyen, P. N. Anh, and H. H. Nguyen, "A survey on deepfake detection techniques," Journal of Computer Science and Technology, vol. 23, no. 5, pp. 223-241, 2023, doi: 10.1007/s11390-023-2123-4.
15. H. Park and E. Choi, "Deepfake detection using attention-based CNN-LSTM networks," Journal of Computational Intelligence and Data Analytics, vol. 9, no. 3, pp. 201-215, 2024, doi: 10.1016/j.cidana.2023.12.001.
16. S. Qian, Y. Xu, T. Zeng, and C. Li, "Ensemble learning for deepfake detection," IEEE Transactions on Information Forensics and Security, vol. 19, no. 4, pp. 813-827, 2024, doi: 10.1109/TIFS.2023.3155756.
17. L. Williams and J. Moore, "Deepfake detection through audio analysis," International Journal of Speech Technology, vol. 19, no. 2, pp. 89-102, 2023, doi: 10.1007/s10772-023-12345-6.
18. J. Wang, Z. Wei, Y. Wang, and W. Li, "Deepfake detection using CNN architectures: A review," Journal of Imaging, vol. 8, no. 3, p. 40, 2022, doi: 10.3390/jimaging8030040.
19. Y. Tan and R. Zhang, "Deepfake detection using generative adversarial networks," Journal of Cybersecurity and Privacy, vol. 8, no. 4, pp. 201-213, 2021, doi: 10.1016/j.jcyberprv.2021.12345.
20. X. Zhou and Y. Chen, "Deep learning-based deepfake detection methods," IEEE Access, vol. 8, pp. 85849-85866, 2020, doi: 10.1109/ACCESS.2020.2997572.
