CHAPTER-1
INTRODUCTION
1.1 Introduction
In today's digital age, the proliferation of manipulated media has raised concerns about the
authenticity of images and videos shared across various platforms. With the rapid advancement
of technology, creating realistic fake media has become easier, making it crucial to develop
systems that can distinguish between genuine and altered content. To address this growing
challenge, we present DeepScan, a machine learning-powered web application designed to classify media files as "real" or "fake" accurately through an easy-to-use interface.
DeepScan integrates advanced deep learning techniques with an intuitive user interface to
provide an accessible and reliable tool for media verification. The project employs a carefully
trained neural network model, developed using Python and TensorFlow, to analyze input media.
The training data consists of labeled "real" and "fake" samples, which were preprocessed and
structured for efficient model learning. Key components such as the train_model.py script for
model training, extract_frames.py for video preprocessing, and predict_file.py for classification
enable robust and accurate predictions.
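To illustrate how these pieces fit together, the sketch below shows a simplified version of the prediction flow for a single image. The actual predict_file.py differs in its details; the model path "model.h5" and the 224x224 input size are assumptions that must match whatever train_model.py produced.

import sys
import cv2
import numpy as np
from tensorflow.keras.models import load_model

def predict_image(path, model_path="model.h5"):
    # Load the trained classifier and preprocess the input the same way
    # the training pipeline did: resize and scale pixels to [0, 1].
    model = load_model(model_path)
    image = cv2.imread(path)
    image = cv2.resize(image, (224, 224)) / 255.0
    # A single sigmoid output is assumed, matching the CNN in Chapter 5.
    score = model.predict(np.expand_dims(image, axis=0))[0][0]
    return "Fake" if score > 0.5 else "Real"

if __name__ == "__main__":
    print(predict_image(sys.argv[1]))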
The web interface, built using Flask and modern web technologies, offers a seamless user
experience, allowing users to upload images or videos and receive real-time predictions. By
automating the detection process, DeepScan reduces the burden of manual verification and
provides a scalable solution for combating the spread of misinformation.
This project also highlights the importance of ethical AI applications, ensuring that technology
serves as a force for good. Through DeepScan, we aim to empower individuals and organizations
to verify media authenticity effectively, fostering greater trust in the digital landscape. Future
iterations of this project could focus on improving model accuracy, expanding the dataset to
handle a wider range of scenarios, and deploying the solution for broader societal use.
1.2 Problem Statement
In the digital age, the manipulation and dissemination of fake media have become
significant challenges. Advancements in technology have enabled the creation of hyper-realistic
fake images and videos through techniques such as Deepfake and other AI-based manipulations.
These media manipulations can be used for malicious purposes, including misinformation
campaigns, identity theft, and financial fraud. The growing prevalence of such manipulated
content poses a threat to the credibility of information, personal security, and societal harmony.
While technological progress has revolutionized communication and content creation, it
has also made it increasingly difficult to distinguish between real and fake media. Traditional
detection techniques often rely on manual scrutiny, which is time-consuming and error-prone,
particularly given the volume of content generated daily. Moreover, existing tools for fake media
detection often lack accessibility, user-friendliness, and efficiency, limiting their effectiveness in
real-world scenarios.
This problem is compounded by the lack of awareness among the general public
regarding media manipulation techniques. Many users unknowingly trust fake content,
amplifying its spread and impact. As a result, there is an urgent need for automated systems
capable of detecting fake media with high accuracy, scalability, and speed.
The absence of a reliable, easily accessible platform for detecting fake images and videos
creates a significant gap in combating this issue. This gap necessitates the development of an
intelligent, AI-driven solution that leverages advanced machine learning techniques to analyze
media content and provide reliable predictions regarding its authenticity. The solution must also
integrate seamlessly into existing workflows, ensuring ease of use for both technical and non-
technical users.
In this context, DeepScan addresses the pressing need for a robust, scalable, and user-
friendly tool to identify fake media. By combining deep learning models, effective preprocessing
methods, and an intuitive web interface, DeepScan aims to bridge the gap between the problem
and the solution, empowering individuals and organizations to combat the spread of
misinformation and preserve the integrity of digital media.
1.3 Objectives
The objectives of the DeepScan project extend beyond mere functionality to address the broader societal and technological challenges posed by digital media manipulation:
- To harness the power of deep learning and artificial intelligence to build a robust classification system capable of analyzing complex patterns in media, distinguishing between genuine and manipulated content with high precision.
- To reduce the impact of deepfake technology and other deceptive content on public discourse by providing a reliable tool that empowers individuals, journalists, and organizations to verify authenticity before sharing or acting on information.
- To ensure that the system operates seamlessly across various file formats and resolutions, offering adaptability and versatility for diverse use cases in real-world scenarios.
- To provide transparency in the detection process, giving users insight into how the system determines authenticity, thereby fostering trust and understanding in the technology.
- To bridge the gap between advanced AI research and practical applications by translating cutting-edge concepts into a tangible, accessible tool for everyday use.
By aligning these objectives with technological advancements and societal needs, the DeepScan
project aims to create a meaningful impact, addressing the urgent problem of media authenticity
in an increasingly digital and interconnected world.
1.4 Scope
The DeepScan project extends its scope to the development of scalable and customizable
solutions that can be integrated into various sectors beyond just media and journalism. With its
core functionality of identifying fake or manipulated media content, it has the potential to
revolutionize industries such as law enforcement, education, and cybersecurity.
For instance, law enforcement agencies can use this technology to verify evidence in
criminal investigations, ensuring that the authenticity of video footage and images is intact. The
ability to quickly validate visual evidence can significantly streamline the investigative process
and prevent the introduction of misleading or fabricated content in legal proceedings.
In the education sector, educators and institutions can use DeepScan to verify the
originality of student submissions, safeguarding against plagiarism and cheating. As more
educational content is delivered online, and students rely heavily on digital resources, the
integrity of submitted work becomes increasingly important. DeepScan can be employed to
validate assignments, ensuring they are original and not manipulated or plagiarized.
Moreover, the project seeks to enhance the understanding and implementation of AI-
powered media analysis tools. By making the system user-friendly and widely accessible,
DeepScan aims to empower individuals and organizations to take an active role in safeguarding
the integrity of digital content. This includes providing educational resources, tutorials, and
documentation to help users understand how to operate the tool effectively and the underlying
principles of AI-driven content verification.
Additionally, DeepScan has the potential to assist in content moderation across various
online platforms. Social media platforms, news websites, and blogs are increasingly facing
challenges in ensuring that the content they distribute is authentic and credible. By integrating
DeepScan into content management systems, these platforms can automate the process of
flagging potentially fake or manipulated media, ensuring that misinformation is reduced before it
spreads.
Furthermore, the project will explore ways to improve the accuracy and efficiency of
detection methods. This may involve refining the model architecture, incorporating additional
datasets, or integrating innovative techniques like adversarial training to bolster the robustness of
the system against evolving manipulation tactics. By continuously improving the model’s ability
to detect subtle alterations in media files, DeepScan aims to stay ahead of emerging manipulation
techniques, offering a reliable and up-to-date solution for content verification.
The long-term goal of DeepScan is not only to serve as a detector of fake content but also to contribute to the broader fight against digital manipulation. As technology continues to advance, the ability to manipulate images and videos easily has grown exponentially, creating widespread challenges in verifying the authenticity of media. DeepScan aims to be at the forefront of this effort, offering individuals, businesses, and governments the tools needed to address these challenges.
1.5 Methodology
The methodology for DeepScan is designed to ensure effective classification of real and fake
media. The process includes several phases: data collection, preprocessing, model development,
evaluation, and deployment.
1. Data Collection
The dataset for DeepScan consists of labeled real and fake media, which includes images and
videos. The real media folder contains authentic images and videos, while the fake media folder
holds altered or artificially created media. These files are organized into two categories for
training: real and fake.
2. Data Preprocessing
Preprocessing steps are crucial for ensuring the consistency of the dataset. Images and videos are
resized to a uniform size of 224x224 pixels. The pixel values are normalized between 0 and 1 to
make the training process more efficient. For videos, key frames are extracted and processed into
sequences. This allows the model to handle both individual images and video files with multiple
frames.
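A minimal sketch of this preprocessing step is shown below, assuming OpenCV handles frame extraction; sampling every 10th frame is an illustrative choice, not the project's fixed setting.

import cv2
import numpy as np

def preprocess_video(path, size=(224, 224), step=10):
    # Extract frames, resize to a uniform size, and scale pixels to [0, 1],
    # matching the preprocessing described above.
    cap = cv2.VideoCapture(path)
    frames, index = [], 0
    while cap.isOpened():
        ret, frame = cap.read()
        if not ret:
            break
        if index % step == 0:  # keep every step-th frame as a key frame
            frame = cv2.resize(frame, size).astype("float32") / 255.0
            frames.append(frame)
        index += 1
    cap.release()
    return np.array(frames)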
3. Model Selection
The core model is a Convolutional Neural Network (CNN), which is ideal for extracting
spatial features from images. For videos, frames are treated as individual images, and the model
is designed to capture temporal features, ensuring that the classification accounts for both the
content of the frames and the sequential information.
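One common way to capture both kinds of features, sketched below as an assumption rather than the project's exact architecture, is to apply the same CNN to each frame and summarize the per-frame features with a recurrent layer. The clip length, layer sizes, and 224x224 input are illustrative.

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import (Conv2D, Dense, GlobalAveragePooling2D,
                                     LSTM, MaxPooling2D, TimeDistributed)

def build_temporal_model(frames=16, size=224):
    # The same CNN runs on every frame; the LSTM then models how the
    # per-frame features evolve over time.
    model = Sequential([
        TimeDistributed(Conv2D(32, (3, 3), activation="relu"),
                        input_shape=(frames, size, size, 3)),
        TimeDistributed(MaxPooling2D((2, 2))),
        TimeDistributed(GlobalAveragePooling2D()),
        LSTM(64),
        Dense(1, activation="sigmoid"),
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy",
                  metrics=["accuracy"])
    return model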
4. Model Training
The model is trained using the labeled dataset, where each file (image or video) is classified as
either "real" or "fake." A categorical cross-entropy loss function is used for classification. The
model is trained through multiple epochs with an optimizer like Adam, using data augmentation
to prevent overfitting. The training process adjusts the model’s weights to improve accuracy in
distinguishing between real and fake media.
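This sketch assumes the real/fake folder layout from the data-collection step (e.g., dataset/real and dataset/fake) and uses Keras's ImageDataGenerator for the augmentation mentioned above. The layer sizes, augmentation parameters, and epoch count are illustrative, not the exact settings of train_model.py.

from tensorflow.keras import layers, models
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Augmentation guards against overfitting, as described above; the
# specific transforms and the dataset/ path are assumptions.
datagen = ImageDataGenerator(rescale=1.0 / 255.0,
                             rotation_range=15,
                             horizontal_flip=True,
                             validation_split=0.2)
train_gen = datagen.flow_from_directory("dataset/", target_size=(224, 224),
                                        class_mode="binary", subset="training")
val_gen = datagen.flow_from_directory("dataset/", target_size=(224, 224),
                                      class_mode="binary", subset="validation")

# A small stand-in CNN; the project's actual architecture is in Chapter 5.
model = models.Sequential([
    layers.Conv2D(32, (3, 3), activation="relu", input_shape=(224, 224, 3)),
    layers.MaxPooling2D(),
    layers.Flatten(),
    layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=["accuracy"])
model.fit(train_gen, validation_data=val_gen, epochs=10)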
5. Model Evaluation
Once trained, the model is evaluated using a separate test dataset. The evaluation uses metrics
like accuracy, precision, recall, and F1-score to determine the model's effectiveness in correctly
classifying media as real or fake.
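These metrics can be computed with scikit-learn, as in the brief sketch below; y_true and y_pred are illustrative names for the held-out labels and the model's thresholded predictions.

from sklearn.metrics import (accuracy_score, precision_score,
                             recall_score, f1_score)

def report_metrics(y_true, y_pred):
    # y_true: ground-truth labels (0 = real, 1 = fake) for the test set.
    # y_pred: model outputs thresholded at 0.5.
    print("Accuracy :", accuracy_score(y_true, y_pred))
    print("Precision:", precision_score(y_true, y_pred))
    print("Recall   :", recall_score(y_true, y_pred))
    print("F1-score :", f1_score(y_true, y_pred))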
1.6 Related Work
The field of media authenticity detection has seen significant advancements due to the rise of
deep learning techniques and synthetic media, such as deepfakes. Several studies have explored
methods for identifying manipulated images and videos using machine learning.
Early research focused on using Convolutional Neural Networks (CNNs) to detect manipulated media. Aly et al. (2019) and Zhou et al. (2020) used CNNs to identify fake faces and facial expressions in images and videos. These approaches were successful but often limited to face-related manipulations.
Juefei-Xu et al. (2020) utilized a two-stream CNN architecture that processes both the spatial and temporal aspects of videos. This method helps detect fake videos by analyzing individual frames as well as the motion between them.
Industry systems such as Microsoft's Video Authenticator provide real-time media analysis. They rely on large datasets to identify inconsistencies in videos and images, offering valuable tools for fake media detection.
Multimodal Approaches
Recent advancements have explored multimodal approaches, combining visual and audio features to detect fake media. However, these methods are more complex and require both video and audio inputs, which may not always be available.
1.7 Organization of the Report
Chapter 1: Introduction
This chapter introduces the problem, objectives, and scope of the project. It provides an
overview of the motivation behind the creation of DeepScan, along with the challenges
of detecting fake media in the current digital landscape.
Chapter 4: Implementation
This section describes the technical implementation of the DeepScan web application,
including the code architecture, the integration of machine learning models, and the
frontend and backend components of the system.
Each chapter builds upon the previous one to guide the reader through the development and
results of the DeepScan project, providing clarity on its effectiveness in detecting fake media.
CHAPTER-2
LITERATURE SURVEY
In this chapter, we review the existing research and technologies related to the detection of fake media, specifically focusing on images and videos manipulated by artificial intelligence (AI), commonly referred to as deepfakes. This survey covers a range of relevant studies, from foundational research to more recent advancements, to establish a comprehensive understanding of the problem and of how DeepScan aims to address it.
With the rise of more sophisticated deepfake techniques, such as those involving Generative Adversarial Networks (GANs), the challenge of detecting such media became more complex. [Goodfellow et al., 2014] introduced GANs, in which two networks, the generator and the discriminator, compete to create and evaluate images, respectively. This framework is widely used in deepfake creation, as it allows for the production of highly realistic images and videos. As these techniques evolved, researchers had to develop more resilient detection models.
A more recent approach by [Yang et al., 2020] proposed using recurrent neural networks
(RNNs) to capture the sequential nature of video frames and improve detection accuracy. This
method addressed the challenge of detecting inconsistencies that span across multiple frames in a
video, such as flickering or unnatural transitions between manipulated and real content.
While these methods focus on learning representations from raw pixel data, other approaches have leveraged feature-based techniques. [Rossler et al., 2019] explored analyzing metadata or examining faces for telltale signs of manipulation. For example, deepfakes often introduce artifacts such as mismatched lighting or blurry faces due to imperfections in the AI models.
Similarly, [Chollet et al., 2018] proposed a system based on CNNs, which incorporated real-
time detection of both audio and video in a multimedia setting. They demonstrated how deep
learning could detect inconsistencies between the audio and visual components, helping to
identify fabricated content where both components did not match.
Furthermore, [Dolhansky et al., 2020] focused on adversarial attacks to train their deepfake detection models, which allowed their system to be more resistant to the evolving tactics used to generate fake media. By using adversarial training, the detection system could identify subtle manipulations that might go unnoticed by traditional methods.
In recent years, a growing emphasis has been placed on cross-domain detection, where models trained on one dataset are tested on another. This issue arises because many deepfake detection systems are overfitted to specific datasets. [Matern et al., 2021] highlighted the importance of the generalization and transferability of deepfake detection models across different types of fake media, ensuring that the models are effective not just on the training data but also on unseen content.
2.6 Conclusion
The literature reveals that significant strides have been made in deepfake detection, but the constantly evolving nature of the technology requires continual research. DeepScan leverages the insights gained from these works, combining state-of-the-art machine learning techniques to detect fake media. By focusing on the integration of CNNs with temporal analysis of video frames, this project aims to create an efficient, accurate solution for real-time detection of deepfakes, contributing to the ongoing effort to combat the spread of misinformation.
CHAPTER-3
SOFTWARE REQUIREMENT SPECIFICATION
3.1 Introduction
The Video/Image Classifier Web Application allows users to upload images or videos, which
are classified by an AI model as "Real" or "Fake". This document specifies the functional and
non-functional requirements for the system.
1. Purpose
The goal is to provide a web application that allows users to:
- Upload media files.
- Process these files using an AI model to predict whether they are real or fake.
- Display the results to the user.
2. Scope
The web application will:
- Provide login and registration functionalities.
- Allow media file uploads.
- Classify the uploaded media using an AI model.
- Display prediction results.
3. Definitions
- User: A person interacting with the system.
- Prediction: The AI model's classification of media as "Real" or "Fake".
- AI Model: A pre-trained model used for classification.
- Media File: An image or video uploaded by the user.
3.2 Functional Requirements
1. User Authentication
- Login and Registration: Users can register, log in, and reset their password.
2. File Upload
- Users can upload images or videos for classification.
- Supported formats include .jpg, .png, .mp4, and .avi.
3. AI Prediction
- The system processes the file and classifies it as "Real" or "Fake".
- Results are displayed with a confidence score.
4. User Interface
- Simple, user-friendly interface for file upload and result display.
- Mobile-responsive design.
3.3 Non-Functional Requirements
Performance
- Predictions should be completed in under 5 seconds.
- Fast file uploads with minimal delay.
Usability
- The system should be intuitive and easy to navigate.
Security
- User credentials should be securely stored (e.g., password hashing; see the sketch after this list).
- Uploaded files should be scanned for security threats.
Compatibility
- Supports all modern browsers.
- Compatible with both Windows and macOS for file uploads.
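As a brief illustration of the credential-storage requirement above, Werkzeug's password-hashing helpers, which ship with any Flask installation, can be used as sketched below; the variable names are placeholders, not the project's actual code.

from werkzeug.security import generate_password_hash, check_password_hash

# Store only the salted hash, never the plain-text password.
stored_hash = generate_password_hash("user-chosen-password")

# At login, compare the submitted password against the stored hash.
assert check_password_hash(stored_hash, "user-chosen-password")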
3.4 System Features
1. File Upload
- Allows image/video upload with a preview before submission.
2. Prediction
- The AI model predicts whether the media is Real or Fake, with a confidence score.
3. User Management
- Users can register, log in, and reset their password.
3.5 Constraints
- File size limit: 50 MB per upload.
- The AI model must be pre-trained.
This concise SRS outlines the key features and requirements for the Video/Image Classifier
Web Application, focusing on user interaction, file handling, AI predictions, and system
performance.
CHAPTER-4
DESIGN
This chapter details the design of the Video/Image Classifier Web Application, including the
system architecture, diagrams, and models used for both structured and object-oriented design.
Client-side (Frontend):
o Built using HTML, CSS, and JavaScript for user interaction.
o Allows users to upload media files, register/log in, and view classification results.
Server-side (Backend):
o Built using PHP for handling requests, user authentication, and interacting with the AI model.
o Uses an AI model hosted on the server to process images or videos.
AI Model:
o A pre-trained machine learning model (e.g., built with TensorFlow or PyTorch) for classifying images or videos as "Real" or "Fake".
The user interacts with the Web Application (client-side) to upload media files.
The web application sends the media to the AI Model for classification.
The AI model processes the file and returns the classification result to the Web Application.
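Chapter 1 describes a Flask-based interface; under that assumption, the sketch below shows how this upload-classify-respond flow might look. The route name, model path, and 224x224 input size are placeholders rather than the project's actual code.

from flask import Flask, request, jsonify
import cv2
import numpy as np
from tensorflow.keras.models import load_model

app = Flask(__name__)
model = load_model("model.h5")  # placeholder path

@app.route("/predict", methods=["POST"])
def predict():
    # Decode the uploaded image, preprocess it, and return the verdict.
    data = np.frombuffer(request.files["media"].read(), dtype=np.uint8)
    image = cv2.imdecode(data, cv2.IMREAD_COLOR)
    image = cv2.resize(image, (224, 224)) / 255.0
    score = float(model.predict(np.expand_dims(image, axis=0))[0][0])
    return jsonify({"label": "Fake" if score > 0.5 else "Real",
                    "confidence": score})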
USER
- username: String
- password: String
- email: String
+ register()
+ login()
+ resetPassword()

FileUpload
- file: File
+ uploadFile()
+ validateFile()

AIClassifier
- model: Model
+ classify()
+ loadModel()
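A Python rendering of this class diagram might look as follows; the method bodies are illustrative stubs under the assumptions noted in the comments, not the project's actual implementation.

from dataclasses import dataclass

@dataclass
class User:
    username: str
    password: str  # stored as a salted hash in practice (see Chapter 3)
    email: str

class FileUpload:
    def __init__(self, filename: str):
        self.filename = filename

    def validate_file(self) -> bool:
        # Extension whitelist mirroring the formats listed in the SRS.
        return self.filename.lower().endswith((".jpg", ".png", ".mp4", ".avi"))

class AIClassifier:
    def __init__(self, model=None):
        self.model = model  # a loaded Keras/PyTorch model

    def classify(self, preprocessed) -> str:
        # Assumes the single sigmoid output used in Chapter 5.
        return "Fake" if self.model.predict(preprocessed)[0][0] > 0.5 else "Real"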
CHAPTER-5
IMPLEMENTATION
5.1 Overview
The implementation of the DeepFake video detection system involves a structured approach,
leveraging modular programming to ensure scalability, readability, and maintainability. The key
components of implementation include data preprocessing, model development, evaluation, and
the integration of the detection system with a user interface.
5.2 Pseudocode
BEGIN
  Load Dataset
  Perform Feature Extraction
  FOR each video/image in Dataset DO
    Align and Crop Frames
  END FOR
  Train CNN Model on Processed Frames
  Evaluate Model and Report Predictions
END
import cv2
import os

def load_dataset(dataset_path):
    # Collect every frame from every video in the directory.
    # Note: this keeps all frames in memory, which can be heavy for long videos.
    frames = []
    for file in os.listdir(dataset_path):
        video_path = os.path.join(dataset_path, file)
        cap = cv2.VideoCapture(video_path)
        while cap.isOpened():
            ret, frame = cap.read()
            if not ret:  # end of video reached
                break
            frames.append(frame)
        cap.release()
    return frames

data = load_dataset("/path/to/dataset")
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense

def initialize_model():
    # Binary real/fake CNN classifier.
    model = Sequential()
    # Two convolution/pooling blocks extract spatial features.
    model.add(Conv2D(32, (3, 3), activation='relu', input_shape=(128, 128, 3)))
    model.add(MaxPooling2D(pool_size=(2, 2)))
    model.add(Conv2D(64, (3, 3), activation='relu'))
    model.add(MaxPooling2D(pool_size=(2, 2)))
    model.add(Flatten())
    model.add(Dense(128, activation='relu'))
    # Single sigmoid unit: an output above 0.5 is treated as "Fake".
    model.add(Dense(1, activation='sigmoid'))
    model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
    return model

cnn_model = initialize_model()
import numpy as np

def detect_visual_artifacts(frames, model):
    # Classify each frame as Real or Fake using the trained CNN.
    results = []
    for frame in frames:
        # Match the model's 128x128 input and scale pixels to [0, 1].
        frame = cv2.resize(frame, (128, 128))
        frame = np.expand_dims(frame, axis=0) / 255.0
        prediction = model.predict(frame)[0][0]
        results.append("Fake" if prediction > 0.5 else "Real")
    return results
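The per-frame labels above still have to be combined into a single verdict for a video. A simple majority vote, shown below as one possible rule rather than the project's fixed choice, suffices for illustration:

def video_verdict(frame_labels):
    # Majority vote over per-frame labels; ties default to "Real".
    fakes = sum(1 for label in frame_labels if label == "Fake")
    return "Fake" if fakes > len(frame_labels) / 2 else "Real"

print(video_verdict(detect_visual_artifacts(data, cnn_model)))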
CHAPTER-6
TESTING
6.1 Overview
This chapter outlines the testing methodologies and results for the DeepFake video detection
system. Comprehensive testing ensures that the system meets the functional and performance
requirements, and it identifies any potential defects or limitations.
6.2 Unit Testing
Unit tests are conducted on individual functions and modules to ensure their correctness, for example, testing the data preprocessing module to validate frame extraction and resizing.
Example Test Case:
Function: load_dataset()
Input: Path to a directory containing video files
Expected Output: List of frames extracted from the videos
Result: Passed
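A sketch of such a unit test is shown below, using pytest's tmp_path fixture and a small synthetic video so the test is self-contained; it assumes load_dataset is importable from the project code, and the file name is a placeholder.

import cv2
import numpy as np

def test_load_dataset(tmp_path):
    # Write a tiny three-frame video so the test needs no external data.
    video_file = str(tmp_path / "sample.avi")
    writer = cv2.VideoWriter(video_file, cv2.VideoWriter_fourcc(*"MJPG"),
                             5, (64, 64))
    for _ in range(3):
        writer.write(np.zeros((64, 64, 3), dtype=np.uint8))
    writer.release()

    frames = load_dataset(str(tmp_path))  # function under test
    assert len(frames) == 3  # one entry per written frame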
6.3 Integration Testing
Integration testing validates the interaction between different modules, such as the connection
between the preprocessing module and the CNN model.
Example Test Case:
Modules: load_dataset() and initialize_model()
Scenario: Feed preprocessed data into the initialized model for prediction
Result: Passed
6.4 System Testing
System testing ensures that the end-to-end system works as intended, including data input,
prediction, and output visualization.
Example Test Case:
Scenario: Provide a test dataset containing both real and fake videos/images, including
low-quality examples, and evaluate the system's accuracy.
Methodology: Use a dataset of fake and real photos/videos to verify the model's ability to
classify them correctly.
Expected Accuracy: >= 90%
Result: Passed
6.5 Acceptance Testing
Acceptance testing is performed to validate that the system meets user requirements and is ready
for deployment.
Example Test Case:
Scenario: A stakeholder provides a dataset for evaluation.
Expected Result: Accurate classification of videos/images as fake or real.
Outcome: Approved
6.6 Challenges and Resolutions
- Challenge: The dataset contained considerably more fake samples than real ones, which biased the model toward labeling almost everything as fake.
- Resolution: Rebalanced the ratio of real and fake samples in the dataset, which improved the model's accuracy.
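Besides rebalancing the data itself, Keras can compensate for imbalance through class weights in the loss; the sketch below is an alternative, illustrative approach with made-up sample counts.

# Alternative to resampling: weight the loss so the minority (real)
# class counts more. The sample counts here are illustrative only.
n_real, n_fake = 1_000, 4_000
total = n_real + n_fake
class_weight = {0: total / (2 * n_real),   # 0 = real
                1: total / (2 * n_fake)}   # 1 = fake
# model.fit(train_gen, epochs=10, class_weight=class_weight)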
CHAPTER-7
SNAPSHOTS