
MINOR PROJECT SYNOPSIS

Deep Learning-Based Detection of AI-Generated Videos

Submitted by:

Basharat Kaif (22BCS020)

Arsalaan Ahmed (22BCS012)

7th Semester, B.Tech

Department of Computer Science and Engineering

Jamia Millia Islamia

Under the guidance of Mr. Hannan Mansoor


Deep Learning-Based Detection of AI-Generated Videos

1. Title
Deep Learning-Based Detection of AI-Generated Videos

2. Abstract
The proliferation of sophisticated AI-based video generation and manipulation techniques,
commonly known as deepfakes, presents a significant threat to information integrity,
personal security, and public trust. While numerous deep learning-based detection methods
have been proposed, they often struggle with generalization to unseen manipulation
techniques and "in-the-wild" videos. Our method integrates spatiotemporal analysis with a
generative adversarial probing mechanism. It employs a multi-stream architecture,
including a Convolutional Neural Network (CNN) for high-frequency artifact detection, a
Vision Transformer (ViT) for global contextual inconsistencies, a Temporal Convolutional
Network (TCN) for identifying inter-frame anomalies, and a novel component that utilizes a
diffusion model to "probe" the authenticity of the video content. An attention-based fusion
module will intelligently combine the outputs from these streams to produce a final, robust
classification. This multi-faceted approach is designed to be more resilient to the evolving
nature of AI-generated media and to improve upon the generalization capabilities of current
state-of-the-art (SOTA) models.

3. Introduction
Recent advances in generative artificial intelligence (AI)—driven by powerful models such
as GANs and diffusion networks—have made it possible to synthesize photorealistic videos
that are almost indistinguishable from those captured by real cameras. These synthetic
videos, or deepfakes, are being weaponized in the spread of misinformation, impersonation
fraud, and digital manipulation. As a result, the demand for robust, scalable video
authenticity detection frameworks has become increasingly urgent.

Although early detection systems relied heavily on Convolutional Neural Networks (CNNs)
to detect spatial anomalies, their effectiveness is often limited to the specific types of
manipulations and datasets on which they are trained. More recent developments involve
the use of Vision Transformers (ViTs), which offer improved global context modeling and
have demonstrated superior performance in architectures like Swin-Fake and DFDT. Hybrid
CNN-ViT solutions have also been proposed to combine localized and contextual signals
more effectively.

Temporal modeling techniques such as LSTMs and Temporal Convolutional Networks
(TCNs) have further improved performance by capturing inter-frame inconsistencies like
flickering or unnatural motion. In parallel, frequency-domain features, extracted via
Discrete Cosine Transform (DCT), have been leveraged to identify GAN-specific generation
artifacts and enhance model robustness against adversarial attacks.

A new and promising direction in deepfake detection is the use of generative frameworks
themselves as part of the detection pipeline. For example, DiffusionFake employs a diffusion
model to reverse the generative process and expose subtle manipulation traces. Visual-
language models (VLMs) are also being explored to reformulate detection as a reasoning
problem, enabling capabilities such as zero-shot classification.

4. Proposed Method / Algorithm


We propose a multi-stream, hybrid deep learning framework for AI video detection. The
architecture, as illustrated in the conceptual flowchart below, is designed to capture a wide
range of artifacts and inconsistencies.

Figure 1: Conceptual Flowchart of the Proposed Hybrid Detection Framework.


The video input is processed through four parallel streams:
1. Spatial Inconsistency Stream: This stream uses a hybrid CNN-ViT model. A CNN
backbone (e.g., EfficientNet) will extract low-level features and potential
compression artifacts. These features will then be fed into a Vision Transformer to
model long-range dependencies and global inconsistencies in the frame.
2. Temporal Anomaly Stream: A Temporal Convolutional Network (TCN) will be used
to analyze the sequence of frames. This stream will focus on detecting temporal
artifacts such as flickering, unnatural facial movements, and inconsistent heart rate
signals that can be inferred from video.
3. Frequency Domain Stream: This stream will apply a Discrete Cosine Transform
(DCT) to the video frames to extract frequency domain features. This is particularly
effective for detecting certain types of GAN-based artifacts and can improve
robustness against adversarial attacks, as shown by Hooda et al. [8] (a minimal sketch of
this stream follows the list).
4. Generative Adversarial Probing Stream: This novel stream is inspired by recent
work on using generative models for detection, particularly the DiffusionFake
framework [9]. We will use a pre-trained latent diffusion model. The video frames
will be passed through the diffusion model's encoder, and we will analyze the
reconstruction error and the latent space representation. The intuition is that
authentic videos will have lower reconstruction error and a more "natural" latent
representation compared to manipulated videos (see the sketch after this list).
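To make streams 3 and 4 more concrete, the following is a minimal sketch under stated assumptions rather than the final implementation: the frequency features are computed with SciPy's 2D DCT, and the diffusion probe is approximated by the reconstruction error of a pre-trained latent-diffusion VAE loaded through the diffusers library (e.g. stabilityai/sd-vae-ft-mse). Function names, input sizes, and the choice of VAE are illustrative and not fixed by this synopsis.

```python
# Illustrative sketch of streams 3 and 4; names and settings are assumptions,
# not part of the synopsis itself.
import cv2
import numpy as np
import torch
from scipy.fftpack import dct

def dct_features(frame_bgr: np.ndarray, size: int = 224) -> np.ndarray:
    """Stream 3: log-magnitude of the 2D DCT of a grayscale, resized frame."""
    gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)
    gray = cv2.resize(gray, (size, size)).astype(np.float32) / 255.0
    # 2D DCT = 1D DCT along rows, then along columns
    coeffs = dct(dct(gray, axis=0, norm="ortho"), axis=1, norm="ortho")
    return np.log1p(np.abs(coeffs))  # compress the dynamic range of the spectrum

@torch.no_grad()
def reconstruction_error(frames: torch.Tensor, vae) -> torch.Tensor:
    """Stream 4 (simplified): per-frame reconstruction error under the VAE of a
    pre-trained latent diffusion model. `frames` is (N, 3, H, W) scaled to [-1, 1];
    `vae` could be diffusers.AutoencoderKL.from_pretrained(
    "stabilityai/sd-vae-ft-mse").eval() -- an assumed stand-in for the full probe."""
    latents = vae.encode(frames).latent_dist.mode()   # latent representation
    recon = vae.decode(latents).sample                # decoded reconstruction
    return torch.mean((frames - recon) ** 2, dim=(1, 2, 3))
```

Both outputs would still need to be embedded (for instance, a small CNN over the DCT map and an MLP over per-frame error statistics) before entering the fusion module described next.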
The outputs from these four streams, which represent different "views" of the video's
authenticity, will be fed into an Attention-based Fusion Module. This module will learn to
dynamically weigh the importance of each stream's output for a given video, allowing the
model to adapt to different types of manipulations. The final output will be a classification
score indicating the probability of the video being real or AI-generated/manipulated.
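As a rough sketch of this step, one simple realization of the attention-based fusion is a learned softmax weighting over the four per-stream embeddings; the embedding size and layer widths below are illustrative assumptions, not design decisions from the synopsis.

```python
# Minimal sketch of an attention-based fusion module (dimensions are assumptions).
import torch
import torch.nn as nn

class AttentionFusion(nn.Module):
    """Learns a per-video weighting over the four stream embeddings, then
    classifies the weighted combination."""
    def __init__(self, embed_dim: int = 256, num_streams: int = 4):
        super().__init__()
        # One scalar "relevance" score per stream embedding
        self.score = nn.Sequential(nn.Linear(embed_dim, 64), nn.ReLU(), nn.Linear(64, 1))
        self.classifier = nn.Linear(embed_dim, 1)  # real vs. AI-generated logit

    def forward(self, stream_embs: torch.Tensor) -> torch.Tensor:
        # stream_embs: (batch, num_streams, embed_dim), one embedding per stream
        weights = torch.softmax(self.score(stream_embs), dim=1)   # (B, S, 1)
        fused = (weights * stream_embs).sum(dim=1)                # (B, embed_dim)
        return self.classifier(fused).squeeze(-1)                 # (B,) logits

# Example: four 256-d embeddings (spatial, temporal, frequency, probing) per video
# logits = AttentionFusion()(torch.randn(8, 4, 256))
```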

5. Programming Environment & Tools Used


- Framework: PyTorch 2.1
- Backbones: ResNet-50 (spatial), Video Swin Transformer (temporal)
- Optical Flow Estimation: RAFT (Recurrent All-Pairs Field Transforms)
- Dataset Format: MP4 videos, preprocessed to extract 16–32 frames
- Training Tools: Google Colab, Weights & Biases for experiment tracking
- Libraries: OpenCV, TorchVision, timm, NumPy, Matplotlib
- Evaluation: Accuracy, AUC, F1-score, Confusion Matrix on validation/test splits
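A minimal sketch of the preprocessing and evaluation steps listed above, assuming OpenCV for uniform frame sampling and scikit-learn (not listed above, but a common choice) for the metrics; the frame count and decision threshold are illustrative.

```python
# Sketch only: assumes each MP4 opens cleanly and yields at least one frame.
import cv2
import numpy as np
from sklearn.metrics import accuracy_score, roc_auc_score, f1_score, confusion_matrix

def sample_frames(video_path: str, num_frames: int = 16, size: int = 224) -> np.ndarray:
    """Uniformly sample `num_frames` RGB frames from an MP4 and resize them."""
    cap = cv2.VideoCapture(video_path)
    total = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
    indices = np.linspace(0, max(total - 1, 0), num_frames).astype(int)
    frames = []
    for idx in indices:
        cap.set(cv2.CAP_PROP_POS_FRAMES, int(idx))
        ok, frame = cap.read()
        if not ok:
            break
        frame = cv2.cvtColor(cv2.resize(frame, (size, size)), cv2.COLOR_BGR2RGB)
        frames.append(frame)
    cap.release()
    return np.stack(frames)  # (num_frames, size, size, 3), uint8

def report(y_true, y_prob, threshold: float = 0.5) -> dict:
    """Validation metrics named in the synopsis: accuracy, AUC, F1, confusion matrix."""
    y_pred = (np.asarray(y_prob) >= threshold).astype(int)
    return {
        "accuracy": accuracy_score(y_true, y_pred),
        "auc": roc_auc_score(y_true, y_prob),
        "f1": f1_score(y_true, y_pred),
        "confusion_matrix": confusion_matrix(y_true, y_pred),
    }
```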

6. References
[1] G. Tsaloli, E. Bampis, B. Moser, and A. C. Bovik, "A Multi-Modal In-the-Wild Benchmark of
Deepfakes Circulated in 2024," arXiv preprint arXiv:2503.02857, 2025.

[2] T. Le and S. Woo, "iFakeDetector: Real Time Integrated Web-based Deepfake Detection
System," in Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI),
2024.

[3] R. Pandey and A. K. S. Kushwaha, "Detecting deepfake videos: an enhanced hybrid deep
learning model," Multimedia Tools and Applications, 2024.

[4] K. Mishra, "Can You Spot the AI? A Journey into Video Detection Challenges," Medium, May
26, 2025.

[5] Y. Wang et al., "Swin-Fake: A Consistency Learning Transformer-Based Deepfake Video
Detector," Electronics, vol. 13, no. 15, p. 3045, 2024.

[6] A. Kumar et al., "Lightweight and hybrid transformer-based solution for quick and reliable
deepfake detection," Scientific Reports, vol. 15, no. 1, 2025.

[7] H. Heo et al., "DFDT: An End-to-End DeepFake Detection Framework Using Vision
Transformer," IEEE Transactions on Circuits and Systems for Video Technology, 2022.
[8] R. Hooda, M. Gupta, and N. Chand, "D4: Detection of Adversarial Diffusion Deepfakes Using
Disjoint Ensembles," in Proceedings of the IEEE/CVF Winter Conference on Applications of
Computer Vision (WACV), 2024, pp. 642-651.

[9] S. Kim, J. Lee, and J. Kim, "DiffusionFake: Enhancing Generalization in Deepfake Detection via
Guided Stable Diffusion," arXiv preprint arXiv:2410.04372, 2024.

[10] Z. Zhang et al., "Visual Language Models as Zero-Shot Deepfake Detectors," arXiv preprint
arXiv:2507.22469, 2025.

[11] S. Li et al., "DeepfakeBench: A Comprehensive Benchmark of Deepfake Detection," GitHub
repository, 2022.

[12] R. Sha et al., "An Analysis of Recent Advances in Deepfake Image Detection in an Evolving
Threat Landscape," arXiv preprint arXiv:2404.16212, 2024.
