Deepfake Video Face Detection using Deep Learning
Prasuk Jain¹, Prayas Chaudhary², Rajnish Kumar Bhardwaj³, Vasu Tyagi⁴, Mehrbaan Ali⁵
¹²³⁴Student, ⁵Professor
¹²³⁴⁵Meerut Institute of Engineering and Technology, Meerut
Abstract: The proliferation of deepfake technology, which uses artificial intelligence to create highly realistic synthetic videos and images, poses significant threats to privacy, security, and trust in digital media. Traditional methods for detecting these manipulations often fall short because of the sophistication of deepfake algorithms. This paper proposes a novel approach for deepfake face detection using Deep Learning (DL) suited to sequential data analysis. Our method leverages the temporal dependencies and patterns inherent in video sequences to identify subtle inconsistencies and artifacts introduced by deepfake generation processes. By analyzing frames in a sequence rather than in isolation, the DL model captures dynamic facial features and movements that are challenging to replicate accurately in deepfakes. The proposed model is trained on a comprehensive dataset of real and deepfake videos, incorporating various scenarios and manipulation levels. Experimental results demonstrate that our DL-based approach achieves superior accuracy and robustness compared to state-of-the-art deepfake detection techniques, particularly in challenging cases with high-quality deepfakes. Furthermore, the model exhibits strong generalization capabilities across different datasets and deepfake generation methods. This research highlights the potential of DL for enhancing deepfake content detection, contributing to the development of more secure and trustworthy digital media platforms.

Keywords: Deepfake Video Detection, Convolutional Neural Network (CNN), Long Short-Term Memory (LSTM).

INTRODUCTION
The aim of this project is to develop a robust system for detecting deepfake faces in video using deep learning. By leveraging the temporal dependencies in video sequences, the project seeks to distinguish deepfake videos from genuine ones, and thereby to enhance the security and trustworthiness of digital media platforms by providing a reliable tool for detecting and mitigating the spread of deepfake content. Advanced deep neural networks and the abundance of data have made manipulated images and videos hard to distinguish from genuine ones. This study promotes research in the field of deep learning and deepfake detection.

The scope of this project covers the critical aspects of developing, evaluating, and deploying an effective deepfake face detection system using DL. The key elements of the scope include the collection of a comprehensive dataset consisting of real and deepfake videos from various sources, and the preprocessing of video frames to ensure uniformity in size, format, and quality for the LSTM model.

The objective of this project is to develop an advanced deepfake detection system using DL. The specific objectives are as follows. Develop an LSTM-Based Detection Model: create a DL architecture that can effectively analyze the temporal patterns in video sequences to identify deepfake manipulations, and gather and preprocess a large dataset of both real and deepfake videos to train and evaluate the LSTM model, ensuring it performs well across various scenarios and manipulation techniques. Temporal Analysis of Video Sequences: use LSTM networks to analyze the sequential frames in videos, capturing dynamic facial features and movements that are challenging to replicate accurately in deepfakes, and improve the accuracy and robustness of deepfake detection methods by leveraging the capabilities of DL in identifying subtle inconsistencies and artifacts introduced by deepfake generation processes.

The work also covers the design and implementation of an LSTM network architecture tailored for temporal analysis of video sequences; the integration of additional neural network layers, if necessary, to enhance feature extraction and improve detection performance; training of the LSTM model on the preprocessed dataset, with a focus on optimizing hyperparameters to achieve the best possible detection accuracy; and the use of techniques such as data augmentation and regularization to prevent overfitting and improve generalization. Evaluation and Validation: rigorous testing of the trained LSTM model on a separate validation set to assess its accuracy, robustness, and generalization capabilities; comparison of the LSTM-based approach with existing state-of-the-art deepfake detection methods to highlight its advantages and limitations; definition and use of relevant performance metrics (e.g., accuracy, precision, recall, F1-score) to quantify the effectiveness of the detection system; and analysis of the model's performance across different types of deepfake techniques and varying levels of video quality. Ethical considerations of deepfake detection, such as privacy concerns and the potential impact on freedom of expression, are also addressed. By addressing these elements, the project aims to develop a comprehensive and effective solution for deepfake face detection using LSTM networks.

Deep learning is now the standard in machine learning, leading to advancements like deepfakes, which threaten privacy and national security. Deepfakes manipulate media by replacing a person's likeness using AI algorithms. While often used for humor or politics, they raise ethical concerns about privacy and misinformation, necessitating regulatory measures and public awareness.

LITERATURE SURVEY
[1] Nguyen, T.T., Nguyen, Q.V.H., Nguyen, D.T., Nguyen, D.T., Huynh-The, T., Nahavandi, S., Nguyen, T.T., Pham, Q.V. and Nguyen, C.M., 2022. Deep learning for deepfakes creation and detection: A survey. Computer Vision and Image Understanding, 223, p.103525. Deep learning has been successfully applied to solve complex problems ranging from big data analytics to computer vision and human-level control. Advances in deep learning, however, have also been employed to create software that can threaten privacy, democracy and national security. One of those deep learning-powered applications to emerge recently is the deepfake. The proposal of technologies that can automatically detect and assess the integrity of digital visual media is therefore indispensable. The survey covers the algorithms used to create deepfakes and, more importantly, the methods proposed to detect deepfakes in the literature to date. It presents extensive discussions on challenges, research trends and directions related to deepfake technologies. By reviewing the background of deepfakes and state-of-the-art deepfake detection methods, it provides a comprehensive overview of deepfake techniques and facilitates the development of new and more robust methods to deal with increasingly challenging deepfakes. In a narrow definition, deepfakes are created by techniques that can superimpose face images of a target person onto a video of a source person to make a video of the target person doing or saying things the source person does. This constitutes a category of deepfakes, namely face swap. In a broader definition, deepfakes are artificial intelligence-synthesized content that can also fall into two other categories, i.e., lip-sync and puppet-master. Lip-sync deepfakes refer to videos that are modified to make the mouth movements consistent with an
audio recording. Puppet-master deepfakes include videos of a target person (puppet) who is animated following the facial expressions, eye and head movements of another person sitting in front of a camera [1]. While some deepfakes can be created by traditional visual effects or computer-graphics approaches, the recent common underlying mechanism for deepfake creation is deep learning models such as autoencoders and generative adversarial networks (GANs).

[2] Westerlund, M., 2019. The emergence of deepfake technology: A review. Technology Innovation Management Review, 9(11). Novel digital technologies make it increasingly difficult to distinguish between real and fake media. One of the most recent developments contributing to the problem is the emergence of deepfakes, which are hyper-realistic videos that apply artificial intelligence to depict someone saying and doing things that never happened. Coupled with the reach and speed of social media, convincing deepfakes can quickly reach millions of people and have negative impacts on our society. While scholarly research on the topic is sparse, the study analyzes 84 publicly available online news articles to examine what deepfakes are and who produces them, what the benefits and threats of deepfake technology are, what examples of deepfakes there are, and how to combat deepfakes. The results suggest that while deepfakes are a significant threat to our society, political system and business, they can be combatted via legislation and regulation, corporate policies and voluntary action, education and training, as well as the development of technology for deepfake detection, content authentication, and deepfake prevention. The study provides a comprehensive review of deepfakes and points cybersecurity and AI entrepreneurs to business opportunities in fighting against media forgeries and fake news. In recent years, fake news has become an issue that is a threat to public discourse, human society, and democracy (Borges et al., 2018; Qayyum et al., 2019). Fake news refers to fictitious news-style content that is fabricated to deceive the public (Aldwairi & Alwahedi, 2018; Jang & Kim, 2018). False information spreads quickly through social media, where it can impact millions of users (Figueira & Oliveira, 2017). Presently, one out of five Internet users get their news via YouTube, second only to Facebook (Anderson, 2018). This rise in popularity of video highlights the need for tools to confirm media and news content authenticity, as novel technologies allow convincing manipulation of video (Anderson, 2018). Given the ease of obtaining and spreading misinformation through social media platforms, it is increasingly hard to know what to trust, which results in harmful consequences for informed decision making, among other things (Borges et al., 2018; Britt et al., 2019). Indeed, today we live in what some have called a “post-truth” era, which is characterized by digital disinformation and information warfare led by malevolent actors running false information campaigns to manipulate public opinion (Anderson, 2018; Qayyum et al., 2019; Zannettou et al., 2019).

PROPOSED SYSTEM
There are many tools available for creating deepfakes (DF), but hardly any tool is available for DF detection. Our approach to detecting DF will be a great contribution to avoiding the percolation of DF over the World Wide Web. We will provide a web-based platform where the user can upload a video and classify it as fake or real. This project can be scaled up from a web-based platform to a browser plugin for automatic DF detection. Even big applications like WhatsApp and Facebook can integrate this project with their applications for easy pre-detection of DF before a video is sent to another user. One of the important objectives is to evaluate its performance and acceptability in terms of security, user-friendliness, accuracy and reliability. Our method focuses on detecting all types of DF, such as replacement DF, retrenchment DF and interpersonal DF. Fig1 represents the simple system architecture of the proposed system:

Fig1

A. Dataset: We use a mixed dataset consisting of an equal number of videos from different sources such as YouTube, FaceForensics++ [14], and the Deepfake Detection Challenge dataset [13]. Our newly prepared dataset contains 50% original videos and 50% manipulated deepfake videos. To train the above model we used the Deepfake faces dataset from the Kaggle repository, which contains more than 95000 images. The dataset is split into an 80% train set and a 20% test set.

B. Preprocessing: Dataset preprocessing includes splitting each video into frames, followed by face detection and cropping each frame to the detected face. To keep the number of frames uniform, the mean frame count of the dataset videos is calculated, and the new processed face-cropped dataset is created with each video containing a number of frames equal to that mean. Frames that do not contain a face are ignored during preprocessing.

C. Model: The model consists of resnext50_32x4d followed by one LSTM layer. The Data Loader loads the preprocessed face-cropped videos and splits them into a train set and a test set. The frames from the processed videos are then passed to the model for training and testing in mini-batches.

D. ResNext CNN for Feature Extraction: Instead of writing the classifier from scratch, we propose to use the ResNext CNN classifier to extract features and accurately detect frame-level features. We then fine-tune the network by adding the extra layers required and selecting a proper learning rate so that the gradient descent of the model converges properly. The 2048-dimensional feature vectors after the last pooling layers are then used as the sequential LSTM input.

E. LSTM for Sequence Processing: Let us assume a sequence of ResNext CNN feature vectors of input frames as input and a 2-node neural network with the probabilities of the sequence being part of a deepfake video or an untampered video. The key challenge that we need to address is the design of a model to recursively process a sequence in a meaningful manner. For this problem, we propose the use of an LSTM layer with 2048 units and a dropout probability of 0.4, which is capable of achieving our objective. The LSTM processes the frames sequentially so that temporal analysis of the video can be made by comparing the frame at second ‘t’ with the frame at second ‘t-n’, where n can be any number of frames before t.

F. Implementation: To implement this project, we have designed the following modules:
1) Upload Deepfake Faces Dataset: this module uploads the dataset images to the application. The application then reads all images, resizes them to equal sizes, and creates the X and Y training arrays.
2) Train DL Model: this module shuffles and normalizes the images and then splits them into train and test sets in an 80:20 ratio. The 80% of images are input to the DL algorithm to train a model, and this model is applied to the 20% test data to calculate prediction accuracy.
3) Video Based Deepfake Detection: this module uploads a test video. The DL model then analyses the faces frame by frame and predicts the video as Real or Deepfake. After prediction, the video is played back with the result shown as fake or real.

G. Output screens: Figures 2 to 4 represent the output screens.

Fig2

Fig3

Fig4

RESULTS
In this project a DL algorithm is used to detect deepfake faces from video input. To train the above model we used the Deepfake faces dataset from the Kaggle repository, which contains more than 95000 images. This dataset can be downloaded from the URL below:
https://www.kaggle.com/datasets/dagnelies/deepfake-faces
The above dataset contains two class labels, Fake and Real. To test the detection of deepfake faces we downloaded videos from the URL below:
https://github.com/aaronchong888/DeepFake-Detect/blob/master/train_sample_videos/metadata.json
The file at the above URL lists the fake and real videos that we downloaded and tested with our model.

To run the project, double-click the run.bat file to get the screen in Fig2. In the screen of Fig2, click the ‘Upload Deepfake Faces Dataset’ button to load the dataset and get the page below.

Fig5

In the above screen, select and upload the dataset annotation file and then click the ‘Open’ button to get Fig3. The Fig3 screen shows that the dataset is loaded, along with the class labels and the number of images found in the dataset. Now click the ‘Train DL Model’ button to train the algorithm and get the page below.

Fig6

The above screen shows the train and test dataset sizes, the 99% accuracy achieved by DL, and other metrics such as precision, recall and F-score. Now click the ‘Video Based Deepfake Detection’ button to upload a test video and get the page below.

Fig7

In the above screen, select and upload the 11.mp4 video and then click the ‘Open’ button to start analyzing the video and get the page below.

Fig8

In the above screen, the blue text shows that video analysis has started. After thorough analysis by DL, the output screen appears as shown in Fig4 and Fig9.
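The frame-count step of the preprocessing stage (section B of the proposed system) can be illustrated with a minimal sketch. It is not the project's code: the function name uniform_face_frames, the convention that None marks a frame with no detected face, and the choice to truncate each video to the mean (rather than pad shorter videos) are all assumptions made for illustration.

```python
from statistics import mean


def uniform_face_frames(videos):
    """Trim every video to the dataset's mean frame count.

    videos: dict mapping a video name to its list of per-frame face crops,
    where None marks a frame in which no face was detected (assumed convention).
    """
    # Mean number of frames per video across the dataset, rounded down.
    target = int(mean(len(frames) for frames in videos.values()))
    processed = {}
    for name, frames in videos.items():
        faces = [f for f in frames if f is not None]  # ignore frames without a face
        processed[name] = faces[:target]              # keep at most `target` frames
    return target, processed


# Toy data: strings stand in for cropped face images.
demo = {
    "real_01": ["f0", "f1", None, "f3", "f4", "f5"],
    "fake_01": ["f0", "f1"],
    "fake_02": ["f0", None, "f2", "f3"],
}
target, processed = uniform_face_frames(demo)
print(target, {name: len(frames) for name, frames in processed.items()})
```

Videos with fewer face frames than the mean stay shorter in this sketch; how such videos are handled in the actual pipeline is not specified in the text.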
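The accuracy, precision, recall and F-score reported on the training screen follow from the counts of correct and incorrect predictions on the test set. The sketch below shows how they are computed for a binary Fake/Real task. The function name binary_metrics and the toy labels are illustrative only; they are not the experimental data, and the encoding of Fake as 1 is an assumption.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall and F1 for a binary classifier."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Toy labels (1 = Fake, 0 = Real, assumed encoding); not the paper's results.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
scores = binary_metrics(y_true, y_pred)
print(scores)
```

Reporting precision and recall alongside accuracy matters here because a detector can reach high accuracy on a balanced set while still missing a particular class of manipulations.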
LIMITATIONS
Our method does not consider audio, so it cannot detect audio deepfakes. We propose to address the detection of audio deepfakes in future work.

CONCLUSION
Deepfake face detection from videos using deep learning (DL) depends on leveraging advanced neural networks to detect subtle facial manipulations. Deep learning models, such as CNNs and RNNs, have shown promising results in accurately identifying fake faces by analyzing temporal inconsistencies, texture anomalies, and facial landmarks across frames. These approaches help mitigate the spread of misinformation, improve security on digital platforms, and enable automated real-time detection, which is essential in combating the evolving threats posed by deepfakes.

REFERENCES
[1] Nguyen, T.T., Nguyen, Q.V.H., Nguyen, D.T., Nguyen, D.T., Huynh-The, T., Nahavandi, S., Nguyen, T.T., Pham, Q.V. and Nguyen, C.M., 2022. Deep learning for deepfakes creation and detection: A survey. Computer Vision and Image Understanding, 223, p.103525.
[2] Westerlund, M., 2019. The emergence of deepfake technology: A review. Technology Innovation Management Review, 9(11).
[3] Thippanna, G., Priya, M.D. and Srinivas, T.A.S., An Effective Analysis of Image Processing with Deep Learning Algorithms. International Journal of Computer Applications, 975, p.8887.
[4] Indolia, S., Goswami, A.K., Mishra, S.P. and Asopa, P., 2018. Conceptual understanding of convolutional neural network-a deep learning approach. Procedia Computer Science, 132, pp.679-688.
[5] Staudemeyer, R.C., 2019. Understanding LSTM: a tutorial into Long Short-Term Memory Recurrent Neural Networks. arXiv:1909.09586v1 [cs.NE], 12 Sep 2019.
[6] Güera, D. and Delp, E.J., 2018, November. Deepfake video detection using recurrent neural networks. In 2018 15th IEEE international conference on advanced video and signal-based surveillance (AVSS) (pp. 1-6). IEEE.
[7] Mallet, J., Dave, R., Seliya, N. and Vanamala, M., 2022, November. Using deep learning to detecting deepfakes. In 2022 9th International Conference on Soft Computing & Machine Intelligence (ISCMI) (pp. 1-5). IEEE.
[8] Abir, W.H., Khanam, F.R., Alam, K.N., Hadjouni, M., Elmannai, H., Bourouis, S., Dey, R. and Khan, M.M., 2023. Detecting Deepfake Images Using Deep Learning Techniques and Explainable AI Methods. Intelligent Automation & Soft Computing, pp.2151-2169.
[9] Gong, D., Kumar, Y.J., Goh, O.S., Ye, Z. and Chi, W., 2021. DeepfakeNet, an efficient deepfake detection method. International Journal of Advanced Computer Science and Applications, 12(6).
[10] Rossler, A., Cozzolino, D., Verdoliva, L., Riess, C., Thies, J. and Nießner, M., 2019. Faceforensics++: Learning to detect manipulated facial images. In Proceedings of the IEEE/CVF international conference on computer vision (pp. 1-11).
[11] DFDC data from Kaggle: https://www.kaggle.com/competitions/deepfake-detection-challenge (Accessed on 13/09/2023)
[12] Li, Y., Yang, X., Sun, P., Qi, H. and Lyu, S., 2020. Celeb-df: A large-scale challenging dataset for deepfake forensics. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition (pp. 3207-3216).
[13] https://discuss.pytorch.org/t/confused-about-the-imagepreprocessing-in-classification/3965
[14] https://www.kaggle.com/c/deepfake-detectionchallenge/data
[15] https://github.com/ondyari/FaceForensics