Naive Bayes Classifier CSV Guide

The document describes building a Naive Bayes classifier to predict user click behavior for social network ads using a CSV dataset. It covers data preprocessing, training a Gaussian Naive Bayes model, and evaluating the model's performance which achieved a 100% accuracy score on the test data.

Uploaded by

21EE076 NIDHIN

Ex. No: 4
DATE:

NAIVE BAYESIAN CLASSIFIER FOR A SAMPLE TRAINING DATA SET STORED AS A CSV FILE

AIM:

The aim of this experiment is to study the Naive Bayes classifier and its
application to classification tasks, and to implement the Naive Bayes algorithm
using a sample training dataset stored in a CSV file.

HARDWARE SPECIFICATION:

Processor : AMD Ryzen 5 3450U with Radeon Vega Mobile Gfx 2.10GHz
Installed RAM : 8.00 GB (5.89 GB usable)
Device ID : A20FB902-090D-43B7-B646-A365E8586922
Product ID : 00356-24550-53284-AAOEM
System Type : 64-bit operating system, x64-based processor

SOFTWARE SPECIFICATION:

Python IDLE (3.12.1, 64-bit)

LIBRARIES:
NumPy
Pandas
scikit-learn (sklearn)

ALGORITHM:

1. Data Acquisition and Preprocessing

This initial phase focuses on gathering and preparing the data for model
development.

• 1.1 Library Imports: The libraries needed for data manipulation and
machine learning are imported at the outset: pandas for data handling and
scikit-learn for the machine learning algorithms.

• 1.2 Data Loading: The social network ad dataset, stored as a
comma-separated values (CSV) file, is loaded into a pandas DataFrame,
which makes the data easy to explore and manipulate.

• 1.3 Feature Separation: The independent variables (features) are
selected from the DataFrame and stored in a variable named 'x'. These
features are attributes that may influence user click behavior (e.g., age,
gender, income, interests).

• Target Variable Isolation: The dependent variable, i.e. the target
outcome (whether a user clicked on an ad), is stored in a separate
variable named 'y'.

• Categorical Data Encoding (if applicable): If any feature contains
categorical data (e.g., gender with the categories "Male" and "Female"),
the categories are encoded as numerical values. The program uses
LabelEncoder from scikit-learn, which maps each category to an integer.
This is adequate for a two-valued feature such as gender, although
scikit-learn intends LabelEncoder for target labels and provides
OrdinalEncoder and OneHotEncoder for encoding features.

• Train-Test Split: The dataset is split into training and testing sets
using the train_test_split function from scikit-learn (the program
reserves 20% of the rows for testing). The training data is used to fit
the model, while the testing data is used to assess the model's
performance on unseen data.

• Feature Scaling: The features are standardized with StandardScaler from
scikit-learn. The scaler is fitted on the training set only, and the same
transformation is then applied to the test set, so no information from the
test data enters training. Standardization puts all features on a
comparable scale; this matters for many models, while Gaussian Naive
Bayes, which models each feature separately, is affected only slightly.
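The preprocessing steps above can be sketched on a small synthetic table. The column names (Gender, Age, EstimatedSalary, Purchased) and all values are assumptions made for illustration and are not taken from the original CSV file. The sketch selects features by name rather than by position, which makes it harder to include the target column in the features by accident, and it uses OrdinalEncoder as one possible way to encode a categorical feature.

```python
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import OrdinalEncoder, StandardScaler

# Synthetic stand-in for the ad dataset (hypothetical column names and values).
df = pd.DataFrame({
    "Gender": ["Male", "Female", "Female", "Male", "Female",
               "Male", "Male", "Female", "Female", "Male"],
    "Age": [19, 35, 26, 27, 47, 45, 32, 25, 48, 52],
    "EstimatedSalary": [19000, 20000, 43000, 57000, 25000,
                        26000, 150000, 33000, 141000, 90000],
    "Purchased": [0, 0, 0, 0, 1, 1, 1, 0, 1, 1],
})

# Select features by name so the target column cannot slip into X.
feature_cols = ["Gender", "Age", "EstimatedSalary"]
X = df[feature_cols].copy()
y = df["Purchased"]

# Encode the categorical feature as numbers (categories are sorted alphabetically).
X["Gender"] = OrdinalEncoder().fit_transform(X[["Gender"]]).ravel()

# 80/20 split; stratify keeps the class proportions similar in both parts.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y
)

# Fit the scaler on the training rows only, then reuse it on the test rows.
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

print(X_train_scaled.shape, X_test_scaled.shape)
```

Fitting the scaler only on the training rows mirrors the fit_transform / transform pair used in the program of this experiment.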

2. Model Development and Evaluation

This phase focuses on building and evaluating the Gaussian Naive Bayes model
that predicts whether a user clicks on an ad.

• Model Instantiation: A Gaussian Naive Bayes classifier (GaussianNB) is
instantiated from the scikit-learn library. This probabilistic model
assumes that, within each class, every feature follows a normal
distribution and that the features are conditionally independent given
the class.

• Model Training: The prepared training data (features 'x_train' and
target 'y_train') is passed to the Gaussian Naive Bayes classifier. During
training, the model estimates the prior probability of each class and the
mean and variance of each feature within each class.

• Prediction: Once trained, the model is used to make predictions on the
unseen test data. Each prediction is a class label (click or no click) for
a user in the test set, based on the patterns learned from the training
data.

• Evaluation Metrics: The model's performance is evaluated using two
metrics:
  - Confusion Matrix: A table that counts the correct and incorrect
    predictions made by the model on the test data. It shows the true
    positives (correctly predicted clicks), true negatives (correctly
    predicted non-clicks), false positives (incorrectly predicted clicks),
    and false negatives (incorrectly predicted non-clicks).
  - Accuracy Score: The proportion of correct predictions made on the
    test data. A higher accuracy suggests that the model generalizes
    better to unseen data, although accuracy alone can be misleading when
    the classes are imbalanced.
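To make the training and prediction steps concrete, the sketch below computes by hand what a Gaussian Naive Bayes model estimates (class priors plus a per-class mean and variance for each feature) and compares its predictions with scikit-learn's GaussianNB. The data is synthetic and the helper name gaussian_nb_predict is hypothetical. The manual version omits the small variance-smoothing term that scikit-learn adds, so it approximates the library's behaviour rather than reproducing it exactly.

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB

rng = np.random.default_rng(0)
# Synthetic two-class data (assumption): two features, class means far apart.
X0 = rng.normal(loc=[0.0, 0.0], scale=1.0, size=(100, 2))
X1 = rng.normal(loc=[3.0, 3.0], scale=1.0, size=(100, 2))
X = np.vstack([X0, X1])
y = np.array([0] * 100 + [1] * 100)


def gaussian_nb_predict(X_train, y_train, X_new):
    """Hypothetical helper: predict labels from per-class Gaussian likelihoods."""
    classes = np.unique(y_train)
    log_posteriors = []
    for c in classes:
        Xc = X_train[y_train == c]
        prior = Xc.shape[0] / X_train.shape[0]   # P(class)
        mean = Xc.mean(axis=0)                   # per-feature mean
        var = Xc.var(axis=0)                     # per-feature variance, no smoothing
        # Sum of per-feature Gaussian log-densities (the "naive" independence step).
        log_lik = -0.5 * np.sum(
            np.log(2 * np.pi * var) + (X_new - mean) ** 2 / var, axis=1
        )
        log_posteriors.append(np.log(prior) + log_lik)
    # Pick the class with the largest (unnormalized) log-posterior.
    return classes[np.argmax(np.column_stack(log_posteriors), axis=1)]


manual = gaussian_nb_predict(X, y, X)
library = GaussianNB().fit(X, y).predict(X)
agreement = np.mean(manual == library)
print("fraction of identical predictions:", agreement)
```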

3. Result Display (Optional)

• The confusion matrix and the accuracy score are printed, giving a clear
picture of the model's performance in predicting whether users click on
social network advertisements. This information helps in assessing the
model's suitability for real-world deployment.
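The link between the two metrics can be checked by hand: accuracy is the sum of the diagonal of the confusion matrix divided by the total number of test samples. The short sketch below uses the confusion matrix reported in the OUTPUT section of this experiment; the variable names are illustrative.

```python
import numpy as np

# Confusion matrix from this experiment's output: rows = actual, columns = predicted.
cm = np.array([[58, 0],
               [0, 22]])

# For binary labels 0/1, scikit-learn orders the cells as TN, FP, FN, TP.
tn, fp, fn, tp = cm.ravel()

# Accuracy = correct predictions / all predictions.
accuracy = (tp + tn) / cm.sum()

print(f"TN={tn} FP={fp} FN={fn} TP={tp}")
print(f"accuracy = {accuracy}")
```

For this matrix, (58 + 22) / 80 gives 1.0, which matches the reported accuracy score.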

4. Additional Considerations (Optional)

This section highlights some optional steps that can be incorporated to further
enhance the model's performance and understanding.

Hyperparameter Tuning:

o Techniques like GridSearchCV from scikit-learn can be employed to
optimize the hyperparameters of the Gaussian Naive Bayes model.
Hyperparameters are settings chosen before training that influence the
model's behavior. GaussianNB has few of them, chiefly var_smoothing (a
small value added to the variances for numerical stability), so the
gains from tuning are usually modest.
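A minimal sketch of such a search is shown below. It tunes var_smoothing with 5-fold cross-validation on synthetic data generated by make_classification, which stands in for the ad dataset; the grid values and the dataset parameters are assumptions chosen only for illustration.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.naive_bayes import GaussianNB

# Synthetic data standing in for the ad dataset (assumption for illustration).
X, y = make_classification(
    n_samples=400, n_features=4, n_informative=2, n_redundant=0, random_state=0
)

# Candidate values for var_smoothing, spaced evenly on a log scale.
param_grid = {"var_smoothing": np.logspace(-12, -3, 10)}

# 5-fold cross-validated grid search, scored by accuracy.
search = GridSearchCV(GaussianNB(), param_grid, cv=5, scoring="accuracy")
search.fit(X, y)

print("best parameters:", search.best_params_)
print("best cross-validated accuracy:", round(search.best_score_, 3))
```

In practice the search would be run on the training split only, so that the test set remains unseen until the final evaluation.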

PROGRAM:

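# Import the necessary libraries
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd

# Load the dataset and separate the features (x) from the target (y)
dataset = pd.read_csv("Social_Network_Ads.csv")
# NOTE: columns 1 and 4 are used as features. If the CSV has only five
# columns, index 4 is also the last column used for y below, so the target
# would leak into x. Verify the column layout before trusting the accuracy.
x = dataset.iloc[:, [1, 4]].values
y = dataset.iloc[:, -1].values
print("X values:")
print(x)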

# Encoding categorical data (the Gender column)

from sklearn.preprocessing import LabelEncoder
le = LabelEncoder()
x[:, 0] = le.fit_transform(x[:, 0])

# Train-test splitting
from sklearn.model_selection import train_test_split
x_train, x_test, y_train, y_test = train_test_split(
    x, y, test_size=0.20, random_state=0
)

# Feature scaling
from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
x_train = sc.fit_transform(x_train)
x_test = sc.transform(x_test)

# Training the naive bayes model on the training set

from sklearn.naive_bayes import GaussianNB
classifier = GaussianNB()

classifier.fit(x_train, y_train)
y_pred = classifier.predict(x_test)

# Printing the actual and predicted values
print("\nActual y_test values:")
print(y_test)
print("\nPredicted y_pred values:")
print(y_pred)

# Calculating and printing confusion matrix and accuracy score

from sklearn.metrics import confusion_matrix, accuracy_score
cm = confusion_matrix(y_test, y_pred)
ac = accuracy_score(y_test, y_pred)
print("\nConfusion matrix:")
print(cm)
print("\nAccuracy score:")
print(ac)

OUTPUT:

Actual y_test values:
[0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 1
 0 0 1 0 1 0 1 0 0 0 0 0 1 1 0 0 0 0
 0 0 1 0 0 0 0 1 0 0 1 0 1 1 0 0 0 1 1
 0 0 1 0 0 1 0 1 0 1 0 0 0 0 1 0 0 1
 0 0 0 0 1 1]

Predicted y_pred values:
[0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 1
 0 0 1 0 1 0 1 0 0 0 0 0 1 1 0 0 0 0
 0 0 1 0 0 0 0 1 0 0 1 0 1 1 0 0 0 1 1
 0 0 1 0 0 1 0 1 0 1 0 0 0 0 1 0 0 1
 0 0 0 0 1 1]

Confusion matrix:
[[58  0]
 [ 0 22]]

Accuracy score:
1.0

INFERENCE:

This program uses a Gaussian Naive Bayes model to predict whether a user
clicks on a social network advertisement. It does so through data
preprocessing, model training, and performance evaluation. Such a model can
help advertisers identify user traits that correlate with ad clicks.

RUBRICS:

RESULT:

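Thus, the program built and trained a Gaussian Naive Bayes model to classify
user clicks on social network ads. It achieved an accuracy score of 1.0 on the
test set, i.e. perfect prediction for this specific data split. Because the
feature selection in the program (columns 1 and 4) may include the target
column, this score should be confirmed after checking the column layout of the
CSV file.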
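A perfect score can arise when the target column is included among the features. The sketch below illustrates this effect on synthetic data (an assumption, not the original CSV): the same classifier is evaluated once with a single weakly informative feature and once with the target copied into the feature matrix. The helper name holdout_accuracy is hypothetical.

```python
import numpy as np
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

rng = np.random.default_rng(0)
n = 400
# Synthetic data (assumption): a binary target and one weakly informative feature.
y = rng.integers(0, 2, size=n)
feature = rng.normal(loc=y * 0.5, scale=1.0)


def holdout_accuracy(X, y):
    """Hypothetical helper: 80/20 split, fit GaussianNB, return test accuracy."""
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=0.20, random_state=0
    )
    model = GaussianNB().fit(X_tr, y_tr)
    return accuracy_score(y_te, model.predict(X_te))


# Features only: the model has to infer the label from a noisy signal.
honest = holdout_accuracy(feature.reshape(-1, 1), y)

# Target copied into the feature matrix: the model can simply read the answer.
leaky = holdout_accuracy(np.column_stack([feature, y]), y)

print("accuracy without leakage:", honest)
print("accuracy with the target as a feature:", leaky)
```

If rerunning the experiment with only genuine feature columns gives a lower score, the lower score is the more trustworthy estimate of the model's performance.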
