A PROJECT REPORT SUBMITTED ON
“Fake Social Media Account Detection”
MAHARASHTRA STATE BOARD OF TECHNICAL
EDUCATION, MUMBAI
FOR THE AWARD OF THE DEGREE OF
DIPLOMA IN ENGINEERING
(COMPUTER ENGINEERING)
SUBMITTED BY
Mr. Aryan Sopan Surve (2115230021)
Mr. Soham Shivaji Shirgire (2115230017)
Mr. Suraj Dipak Javir (2215230198)
Under the guidance of
Ms. Sugandhi P. S.
DEPARTMENT OF COMPUTER ENGINEERING
New Satara College of Engineering & Management
(Poly.), Korti-Pandharpur 413304
Academic year 2023-2024
DEPARTMENT OF COMPUTER ENGINEERING
NEW SATARA COLLEGE OF ENGINEERING & MANAGEMENT
(POLY.), KORTI-PANDHARPUR 413304
CERTIFICATE
This is to certify that the project work entitled
“Fake Social Media Account Detection”
Submitted by
Mr. Aryan Sopan Surve (2115230021)
Mr. Soham Shivaji Shirgire (2115230017)
Mr. Suraj Dipak Javir (2215230198)
is a bona fide work carried out by them under the guidance of Ms. Sugandhi P. S., and it is
submitted towards the partial fulfilment of the requirements of the Maharashtra State Board
of Technical Education, Mumbai, for the award of the degree of Diploma in Computer
Engineering.
Place: Korti
Date:
Ms. Sugandhi P. S. Mr. Puri S. B. Prof. Londhe V. H.
(Project Guide) (H.O.D) (Principal)
DECLARATION
We hereby declare that the project report entitled “Fake Social Media Account
Detection”, completed and written by us for the award of the Diploma in Computer Engineering
of the Maharashtra State Board of Technical Education, Mumbai, has not previously formed the
basis for the award of any diploma, degree or similar title of this or any other university or
examining body.
Place: -Korti
Date: -
Student Name Sign.
Mr. Aryan Sopan Surve
Mr. Soham Shivaji Shirgire
Mr. Suraj Dipak Javir
ACKNOWLEDGEMENT
It gives us great pleasure to submit our project report on “Fake Social Media
Account Detection”. We thank all those who helped us in this work and provided the
facilities to develop this application.
We are very thankful to our project guide, Ms. Sugandhi P. S., and our project
coordinator, Mr. Puri S. B., for the encouragement, technical guidance and valuable
assistance they rendered to us.
We are also thankful to all the faculty of the Computer Department for their valuable
guidance, advice and assistance in our project right from the initial stages.
We also express sincere thanks to all the faculty members of our college. Last but not
least, we would like to thank all our friends, fellow students and our parents for their whole-
hearted support.
Thanking You.
INDEX
1. Abstract
2. Chapter 1: Introduction and Motivation (Purpose of the Problem Statement and
Societal Benefit)
3. Chapter 2: Review of Existing Methods and their Limitations
4. Chapter 3: Proposed Method with System Architecture / Flow Diagram
5. Chapter 4: Modules Description
6. Chapter 5: Implementation Requirements
7. Chapter 6: Output Screenshots
8. Conclusion
9. References
10. Appendix A – Source Code
ABSTRACT
With the advent of the Internet and social media, while hundreds of people have benefitted
from the vast sources of information available, there has also been an enormous rise in
cyber-crime. According to a 2019 report in the Economic Times, India witnessed a 457%
rise in cybercrime in the five-year span between 2011 and 2016. Most speculate that this is
due to the impact of social media platforms such as Instagram on our daily lives. While
these platforms certainly help in creating a sound social network, creating a user account on
them usually requires just an email ID. A real person can create multiple fake IDs, and
hence impostor accounts can easily be made. Unlike the real-world scenario, where multiple
rules and regulations are imposed to identify a person uniquely (for example, while issuing a
passport or driver’s licence), admission to the virtual world of social media does not require
any such checks. In this project, we study Instagram accounts in particular and try to assess
whether an account is fake or real.
INTRODUCTION & MOTIVATION
Having the ability to check the authenticity of a user’s following is crucial for brands looking
to work with influencers. Social media is one of the most important platforms, especially for
youth, to express themselves to the world.
They can use these platforms to interact with like-minded people of their own age
group, or to present their views. However, the use of this technology also comes with various
implications: people can misuse it to cause harm and spread hatred via the very same social
media platforms.
Keeping this in mind, we have attempted a basic solution to this problem by
implementing a deep learning algorithm over a dataset of Instagram account attributes, to
check whether a neural network can actually help predict if a user profile is fake or real.
Proposed Method with Flow Diagram
An artificial neural network (ANN) is a computing system designed to simulate how the
human brain analyses and processes information. It is a foundation of artificial intelligence
(AI) and can solve problems that would prove impossible or difficult by human or statistical
standards.
Artificial neural networks are primarily designed to mimic and simulate the functioning
of the human brain: using a mathematical structure, an ANN is constructed to replicate
biological neurons.
The concept of an ANN follows the same process as that of a natural neural network. The
objective of an ANN is to make a machine or system understand and imitate how a human
brain makes a decision and then ultimately takes action. Inspired by the human brain, a
neural network is built from interconnected neurons, or nodes.
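As a purely illustrative sketch (not part of the project code), the snippet below shows how a
single artificial neuron computes a weighted sum of its inputs and passes it through an
activation function; the input values, weights and bias here are arbitrary example numbers.

import numpy as np

def neuron(inputs, weights, bias):
    # Weighted sum of the inputs plus a bias, passed through a sigmoid activation
    z = np.dot(inputs, weights) + bias
    return 1.0 / (1.0 + np.exp(-z))

# Arbitrary example values, chosen only for illustration
x = np.array([0.5, 0.2, 0.9])    # three input features
w = np.array([0.4, -0.6, 0.1])   # one weight per input
print(neuron(x, w, bias=0.05))   # output lies between 0 and 1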
MODULES OF THE PROJECT
▪ Module I - Initial Data Exploration: This is the initial step in data analysis, in
which we use data visualization and statistical techniques to describe dataset
characteristics, such as size, quantity and accuracy, in order to better
understand the nature of the data.
▪ Module II - Data Wrangling: In this step, messy and complex data sets are
cleaned and unified for easy access and analysis. With the amount of data and
the number of data sources growing rapidly, it is increasingly essential for the
available data to be organized before analysis.
▪ Module III - Data Insights: Basic statistical and visual analysis of the scraped
dataset, which provides an overview of how the data needs to be cleaned or
further processed before the core neural network development.
▪ Module IV - Core Neural Network Development: This module comprises the
core neural network development – a basic artificial neural network (ANN)
that takes the basic independent features of the dataset as input and tries to
predict the target feature – fake or not.
▪ Module V – Evaluation: After the neural network is developed, this module is
implemented to check how the model performs during training and on unseen
test data, in terms of model accuracy and loss.
▪ Module VI - Testing and Inference: Once the desired, tuned model is obtained,
this module is implemented to test the model (which is saved and later loaded
for future use) on random unseen data attributes to determine whether a user
is fake or not. A minimal end-to-end sketch of this pipeline is given after this
list.
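The following is a minimal sketch, not the project's exact code, of how the six modules could
chain together. It assumes the train/test CSV paths and the 'fake' target column used in
Appendix A; the layer sizes and the saved file name are illustrative choices.

import pandas as pd
import tensorflow as tf
from sklearn.preprocessing import StandardScaler

# Modules I & II: load and inspect the data
train = pd.read_csv('datasets/Insta_Fake_Profile_Detection/train.csv')
test = pd.read_csv('datasets/Insta_Fake_Profile_Detection/test.csv')
print(train.info())

# Module III: visual/statistical insights are produced with seaborn (see Appendix A)

# Module IV: split features and target, scale, and build a small ANN
X_train, y_train = train.drop(columns=['fake']), train['fake']
X_test, y_test = test.drop(columns=['fake']), test['fake']
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(X_train.shape[1],)),
    tf.keras.layers.Dense(32, activation='relu'),
    tf.keras.layers.Dense(1, activation='sigmoid'),  # single sigmoid output; Appendix A uses two softmax outputs instead
])
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
model.fit(X_train, y_train, epochs=20, validation_split=0.1)

# Module V: evaluate on unseen test data
loss, acc = model.evaluate(X_test, y_test)

# Module VI: save the tuned model for later inference
model.save('fake_account_ann.h5')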
IMPLEMENTATION REQUIREMENTS
1) Initial packages – Pandas, NumPy, Matplotlib, Seaborn – for basic statistical
analysis and mathematical insights.
2) TensorFlow – a free and open-source software library for machine learning and
artificial intelligence. It can be used across a range of tasks but has a particular
focus on training and inference of deep neural networks.
3) Scikit-learn – a free machine learning library for the Python programming
language.
4) Python – the programming language used to run and execute the application.
5) Google Colab – a free Jupyter notebook environment that runs entirely in the
cloud; a cloud-based instance that helps set up a virtual Python environment and
run machine learning or deep learning models. (A brief environment check is
sketched after this list.)
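As mentioned above, the short snippet below is a sketch for checking that the required
packages are available in a Colab or local Python environment; the printed version numbers
will depend on the installation.

# Verify that the required packages are installed and print their versions
import pandas as pd
import numpy as np
import matplotlib
import seaborn as sns
import sklearn
import tensorflow as tf

for name, module in [('pandas', pd), ('numpy', np), ('matplotlib', matplotlib),
                     ('seaborn', sns), ('scikit-learn', sklearn), ('tensorflow', tf)]:
    print(name, module.__version__)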
OUTPUT SCREENSHOTS
Load Data (Pre-processing)
Bar Plot – Visualization (Data Insights)
KDE Plot (Data Insights)
Heat Map – Correlation Check (Data Insights)
Model Training (Sequential Training)
Training Progress - Loss (Training)
Training Progress - Accuracy (Training)
Classification Report (Evaluation)
Confusion Matrix (Evaluation)
CONCLUSION
The proposed project mainly focuses on how deep learning algorithms, in
particular Artificial Neural Networks (ANNs), can be leveraged for better insight
exploration over a well-distributed dataset. The proposed framework shows how
different attributes of a user’s activity can be learned and analysed by machine
learning or deep learning algorithms to predict suspicious activity and give the
probability of a specific account being fake or genuine. Furthermore, the approach
can be improved by scraping more metadata, such as visual features (images, posts,
captions) and time spent on the platform, and heavier deep learning models, such as
multimodal deep learning ensembles, can be applied for even better results.
REFERENCES
1. Instagram Fake Spammer Dataset – Kaggle
2. Easy ways to analyse if account is fake or not – WikiBlog
3. TensorFlow – Basic Code Base
4. Instagram Fake and Automated Account Detection – Fatih Cagatay Akyon,
M. Esat Kalfaoglu
APPENDIX A - Source Code
# Initial Data Exploration and Data Wrangling
# Core libraries for data handling, visualization, modelling and evaluation
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.layers import Dense, Activation, Dropout
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.metrics import Accuracy
from sklearn import metrics
from sklearn.preprocessing import LabelEncoder
from sklearn.metrics import classification_report, accuracy_score, roc_curve, confusion_matrix
# Initial (exploratory) dataset paths
train_data_path = 'datasets/Fake-Instagram-Profile-Detection-main/insta_train.csv'
test_data_path = 'datasets/Fake-Instagram-Profile-Detection-main/insta_test.csv'
pd.read_csv(test_data_path)

# Quick check (likely the total number of train + test samples)
576 + 120

# Final dataset paths used for the rest of the notebook
train_data_path = 'datasets/Insta_Fake_Profile_Detection/train.csv'
test_data_path = 'datasets/Insta_Fake_Profile_Detection/test.csv'
pd.read_csv(train_data_path)
# Load the training dataset
instagram_df_train=pd.read_csv(train_data_path)
instagram_df_train
# Load the testing data
instagram_df_test=pd.read_csv(test_data_path)
instagram_df_test
instagram_df_train.head()
instagram_df_train.tail()
instagram_df_test.head()
instagram_df_test.tail()
# Getting dataframe info
instagram_df_train.info()
# Get the statistical summary of the dataframe
instagram_df_train.describe()
# Checking if null values exist
instagram_df_train.isnull().sum()
# Get the number of unique values in the "profile pic" feature
instagram_df_train['profile pic'].value_counts()
# Get the number of unique values in "fake" (Target column)
instagram_df_train['fake'].value_counts()
instagram_df_test.info()
instagram_df_test.describe()
instagram_df_test.isnull().sum()
instagram_df_test['fake'].value_counts()
# Perform data visualizations
# Visualize the target column 'fake'
sns.countplot(x=instagram_df_train['fake'])
plt.show()
# Visualize the 'private' column data
sns.countplot(x=instagram_df_train['private'])
plt.show()
# Visualize the 'profile pic' column data
sns.countplot(x=instagram_df_train['profile pic'])
plt.show()
# Visualize the distribution of 'nums/length username'
# (distplot is deprecated in recent seaborn releases; histplot/displot are the modern equivalents)
plt.figure(figsize = (20, 10))
sns.distplot(instagram_df_train['nums/length username'])
plt.show()
# Correlation plot
plt.figure(figsize=(20, 20))
cm = instagram_df_train.corr()
ax = plt.subplot()
sns.heatmap(cm, annot = True, ax = ax)
plt.show()
sns.countplot(x=instagram_df_test['fake'])
sns.countplot(x=instagram_df_test['private'])
sns.countplot(x=instagram_df_test['profile pic'])
# Preparing Data to Train the Model
# Training and testing dataset (inputs)
X_train = instagram_df_train.drop(columns = ['fake'])
X_test = instagram_df_test.drop(columns = ['fake'])
X_train
X_test
# Training and testing dataset (Outputs)
y_train = instagram_df_train['fake']
y_test = instagram_df_test['fake']
y_train
y_test
# Scale the data before training the model
from sklearn.preprocessing import StandardScaler, MinMaxScaler
scaler_x = StandardScaler()
X_train = scaler_x.fit_transform(X_train)
X_test = scaler_x.transform(X_test)
y_train = tf.keras.utils.to_categorical(y_train, num_classes = 2)
y_test = tf.keras.utils.to_categorical(y_test, num_classes = 2)
y_train
y_test
# print the shapes of training and testing datasets
X_train.shape, X_test.shape, y_train.shape, y_test.shape
Training_data = len(X_train) / (len(X_test) + len(X_train)) * 100
Training_data
Testing_data = len(X_test)/( len(X_test) + len(X_train) ) * 100
Testing_data
# Building and Training the Deep Learning Model
import tensorflow.keras
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout
model = Sequential()
model.add(Dense(50, input_dim=11, activation='relu'))
model.add(Dense(150, activation='relu'))
model.add(Dropout(0.3))
model.add(Dense(150, activation='relu'))
model.add(Dropout(0.3))
model.add(Dense(25, activation='relu'))
model.add(Dropout(0.3))
model.add(Dense(2,activation='softmax'))
model.summary()
model.compile(optimizer = 'adam', loss = 'categorical_crossentropy', metrics = ['accuracy'])
epochs_hist = model.fit(X_train, y_train, epochs = 50, verbose = 1, validation_split = 0.1)
# Assess the performance of the model
print(epochs_hist.history.keys())
plt.plot(epochs_hist.history['loss'])
plt.plot(epochs_hist.history['val_loss'])
plt.title('Model Loss Progression During Training/Validation')
plt.ylabel('Training and Validation Losses')
plt.xlabel('Epoch Number')
plt.legend(['Training Loss', 'Validation Loss'])
plt.show()
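# The output screenshots also include an accuracy curve; the original accuracy-plot cell is
# not reproduced above, so the lines below are a sketch of how it was presumably generated
# from the same training history object.
plt.plot(epochs_hist.history['accuracy'])
plt.plot(epochs_hist.history['val_accuracy'])
plt.title('Model Accuracy Progression During Training/Validation')
plt.ylabel('Training and Validation Accuracy')
plt.xlabel('Epoch Number')
plt.legend(['Training Accuracy', 'Validation Accuracy'])
plt.show()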
predicted = model.predict(X_test)
predicted_value = []
test = []
for i in predicted:
    predicted_value.append(np.argmax(i))
for i in y_test:
    test.append(np.argmax(i))
print(classification_report(test, predicted_value))
plt.figure(figsize=(10, 10))
cm=confusion_matrix(test, predicted_value)
sns.heatmap(cm, annot=True)
plt.show()
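# Module VI (Testing and Inference) describes saving the trained model and later loading it
# to classify unseen profiles. That cell is not included above, so the lines below are only a
# sketch of how it might look; the file name and the sample feature values are illustrative.
model.save('fake_account_ann.h5')

loaded_model = tf.keras.models.load_model('fake_account_ann.h5')

# One hypothetical profile with the same 11 input features as the training data,
# scaled with the scaler fitted earlier
sample = np.array([[1, 0.27, 0, 0.0, 0, 53, 0, 0, 32, 1000, 955]])
sample_scaled = scaler_x.transform(sample)
prediction = loaded_model.predict(sample_scaled)
print('fake' if np.argmax(prediction) == 1 else 'genuine')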