Data Science Record - 05

The document outlines experiments for data exploration, preprocessing, linear regression, logistic regression, and Naive Bayes classification using Python. It includes algorithms, code implementations, and results for each experiment, demonstrating data handling and model evaluation techniques. The experiments aim to provide practical applications of machine learning methods on datasets.

Uploaded by

Deepak Sathis

EXP.NO: 01

DATE: 23.01.2025 PERFORM DATA EXPLORATION AND PREPROCESSING

AIM:
To write Python code that performs data exploration and preprocessing on the uploaded dataset.
ALGORITHM:
Step 1: Start the program
Step 2: Import the necessary Python libraries
Step 3: Load the dataset from the current file directory
Step 4: Perform data exploration and data preprocessing on the loaded dataset
Step 5: Display the output
Step 6: Stop the program
CODE:
import pandas as pd

# Show every row and column when printing DataFrames
pd.set_option("display.max_rows", None)
pd.set_option("display.max_columns", None)
pd.set_option("display.width", None)

file_path = '/content/traffic_accidects.csv'
df = pd.read_csv(file_path)

# Data exploration
print("First few rows of the dataset:")
print(df.head())
print("\nSummary Statistics:")
print(df.describe(include="all"))
print("\nMissing Values:")
print(df.isnull().sum())

# Impute missing numeric values
if 'Age' in df.columns:
    df['Age'] = df['Age'].fillna(df['Age'].median())
if 'Salary' in df.columns:
    df['Salary'] = df['Salary'].fillna(df['Salary'].mean())

# Parse dates; missing or unparseable values become NaT
if 'AccidentDate' in df.columns:
    df['AccidentDate'] = pd.to_datetime(df['AccidentDate'], errors='coerce')

# Encode Gender as a number
if 'Gender' in df.columns:
    df['Gender'] = df['Gender'].map({'Male': 0, 'Female': 1})

# Drop rows that have no severity score
if 'SeverityScore' in df.columns:
    df = df.dropna(subset=['SeverityScore']).copy()

# Derived feature: years elapsed since the accident
if 'AccidentDate' in df.columns:
    current_year = pd.Timestamp.now().year
    df['YearsSinceAccident'] = current_year - df['AccidentDate'].dt.year

# Remove Salary outliers using the 1.5 * IQR rule
if 'Salary' in df.columns:
    Q1 = df['Salary'].quantile(0.25)
    Q3 = df['Salary'].quantile(0.75)
    IQR = Q3 - Q1
    lower_bound = Q1 - 1.5 * IQR
    upper_bound = Q3 + 1.5 * IQR
    df = df[(df['Salary'] >= lower_bound) & (df['Salary'] <= upper_bound)]

print("\nCleaned Dataset:")
print(df.head().to_string())
OUTPUT:
Particulars             Marks Allotted    Marks Awarded
Program / Simulation    40
Program Execution       30
Result                  20
Viva Voce               10
Total                   100

RESULT:
Thus, a program for data exploration and preprocessing has been successfully
executed.
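
The outlier step in the code above relies on the 1.5 × IQR rule. A minimal sketch of that rule on made-up numbers follows. The `salaries` list is hypothetical and not taken from the dataset. It uses only the standard library; `statistics.quantiles` with `method="inclusive"` interpolates linearly between data points, which should correspond to the default behaviour of the pandas `quantile` call used in the experiment.

```python
import statistics

# Hypothetical salary values (in thousands); 300 is an obvious outlier
salaries = [30, 32, 35, 36, 38, 40, 41, 45, 300]

# Quartiles with linear interpolation between data points
q1, _median, q3 = statistics.quantiles(salaries, n=4, method="inclusive")
iqr = q3 - q1

# Anything outside [Q1 - 1.5*IQR, Q3 + 1.5*IQR] is treated as an outlier
lower_bound = q1 - 1.5 * iqr
upper_bound = q3 + 1.5 * iqr

kept = [s for s in salaries if lower_bound <= s <= upper_bound]

print(f"Q1={q1}, Q3={q3}, IQR={iqr}")
print(f"Bounds: [{lower_bound}, {upper_bound}]")
print("Kept:", kept)
```

Only the value far above the upper bound is dropped; the remaining rows are untouched, which is the same effect the boolean mask has on the `Salary` column.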

EXP.NO: 02 (a)

DATE: 30.01.2025 Implement linear and logistic regression

1). Linear regression:

a). Simple linear regression:
AIM:
To write Python code implementing simple linear regression, which fits a straight line to the data points.
ALGORITHM:
Step 1: Start the program
Step 2: Import the necessary Python libraries
Step 3: Load the dataset and build the simple linear regression code using the formula
Step 4: Display the output
Step 5: Stop the program
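
As an illustrative aside, the formula mentioned in Step 3 can be sketched in plain Python. This is a minimal example on hypothetical data points (the `x` and `y` lists are invented for illustration and are not the house price dataset); it uses the standard least-squares estimates for the slope and intercept of a single-feature line.

```python
# Hypothetical data points (not from the house price dataset)
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.9, 5.1, 7.0, 9.2, 10.8]

n = len(x)
x_mean = sum(x) / n
y_mean = sum(y) / n

# Least-squares estimates:
#   slope     = sum((x - x_mean) * (y - y_mean)) / sum((x - x_mean) ** 2)
#   intercept = y_mean - slope * x_mean
s_xy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
s_xx = sum((xi - x_mean) ** 2 for xi in x)
slope = s_xy / s_xx
intercept = y_mean - slope * x_mean


def predict(x_new):
    """Return the fitted line's value at x_new."""
    return intercept + slope * x_new


print(f"slope={slope:.3f}, intercept={intercept:.3f}")
print(f"prediction at x=6: {predict(6.0):.3f}")
```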
CODE:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.linear_model import Ridge, Lasso
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.preprocessing import StandardScaler

file_path = "/content/datasets/house_prices.csv"
df = pd.read_csv(file_path)
df = df.drop(columns=['id', 'date'])

# Fill missing numeric values with the median, categorical ones with the mode
for col in ['bedrooms', 'bathrooms', 'sqft_living', 'sqft_lot']:
    df[col] = df[col].fillna(df[col].median())
for col in ['waterfront', 'view', 'condition', 'grade']:
    df[col] = df[col].fillna(df[col].mode()[0])

# Feature engineering
df['age_of_house'] = 2025 - df['yr_built']
# Houses that were never renovated (yr_renovated == 0) get 0
df['time_since_renovation'] = (2025 - df['yr_renovated']).where(df['yr_renovated'] != 0, 0)
df['total_sqft'] = df['sqft_living'] + df['sqft_basement']
df['bedrooms_sqft'] = df['bedrooms'] * df['sqft_living']

# One-hot encode the categorical columns
df = pd.get_dummies(df, columns=['waterfront', 'view', 'condition', 'zipcode'], drop_first=True)

X = df.drop(columns=['price'])
y = df['price']

# Split first, then fit the scaler on the training data only,
# so that no information from the test set leaks into training
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

# Fit, evaluate and plot each regularized linear model
models = {"Ridge": (Ridge(alpha=1.0), 'blue'), "Lasso": (Lasso(alpha=0.1), 'green')}
for name, (model, color) in models.items():
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)

    mse = mean_squared_error(y_test, y_pred)
    print(f"\n{name} Regression Model Evaluation:")
    print(f"MSE: {mse:.2f}")
    print(f"RMSE: {np.sqrt(mse):.2f}")
    print(f"R-squared: {r2_score(y_test, y_pred):.2f}")

    # Actual vs predicted prices; the dashed line marks a perfect prediction
    plt.figure(figsize=(10, 6))
    plt.scatter(y_test, y_pred, color=color, alpha=0.6, label=name)
    plt.plot([y_test.min(), y_test.max()], [y_test.min(), y_test.max()], 'r--', linewidth=2)
    plt.title(f'Actual vs Predicted Housing Prices ({name} Regression)', fontsize=16)
    plt.xlabel('Actual Housing Price', fontsize=14)
    plt.ylabel('Predicted Housing Price', fontsize=14)
    plt.legend()
    plt.grid()
    plt.show()

OUTPUT:

b). Multiple linear regression:

AIM:
To write Python code implementing multiple linear regression, which fits a linear model with several input features to the data points.
ALGORITHM:
Step 1: Start the program
Step 2: Import the necessary Python libraries
Step 3: Load the dataset and build the multiple linear regression code using the formula
Step 4: Display the output
Step 5: Stop the program
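
As an illustrative aside, the formula-based fit named in Step 3 can be sketched with NumPy's least-squares solver. The data below is hypothetical, generated exactly from y = 1 + 2·x1 + 3·x2, so the recovered coefficients can be checked by eye; the variable names are assumptions made for this example only.

```python
import numpy as np

# Hypothetical data generated exactly from y = 1 + 2*x1 + 3*x2
x1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
x2 = np.array([2.0, 1.0, 4.0, 3.0, 6.0])
y = 1.0 + 2.0 * x1 + 3.0 * x2

# Design matrix with a leading column of ones for the intercept
X = np.column_stack([np.ones_like(x1), x1, x2])

# Least-squares solution of X @ beta ~= y
beta, residuals, rank, _ = np.linalg.lstsq(X, y, rcond=None)

print("intercept, coef_x1, coef_x2:", np.round(beta, 6))
```

With real, noisy data the coefficients would only approximate the underlying relationship, and the fit would normally be judged on held-out data.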
CODE:
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, confusion_matrix

file_path = "/content/house_prices.csv"
df = pd.read_csv(file_path)

# Binary target: 1 if the price exceeds the threshold, else 0
price_threshold = 500000
df['price_above_threshold'] = (df['price'] > price_threshold).astype(int)

# One-hot encode the categorical columns
categorical_columns = ['waterfront', 'view', 'condition', 'grade', 'zipcode']
df = pd.get_dummies(df, columns=categorical_columns, drop_first=True)

# Numeric features plus every dummy column created above
features = ['bedrooms', 'bathrooms', 'sqft_living', 'sqft_lot', 'floors', 'sqft_above',
            'sqft_basement', 'yr_built', 'yr_renovated', 'lat', 'long', 'sqft_living15',
            'sqft_lot15'] + [col for col in df.columns
                             if col.startswith(tuple(categorical_columns))]
X = df[features]
y = df['price_above_threshold']

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Fit the scaler on the training data only
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

model = LogisticRegression(max_iter=1000)
model.fit(X_train_scaled, y_train)
y_pred = model.predict(X_test_scaled)

accuracy = accuracy_score(y_test, y_pred)
conf_matrix = confusion_matrix(y_test, y_pred)

plt.figure(figsize=(6, 4))
sns.heatmap(conf_matrix, annot=True, fmt='d', cmap='Blues',
            xticklabels=["Below", "Above"], yticklabels=["Below", "Above"])
plt.xlabel("Predicted Label")
plt.ylabel("True Label")
plt.title(f'Confusion Matrix (Accuracy: {accuracy:.2f})')
plt.show()

# Rank the features by their fitted coefficient
coefficients = model.coef_[0]
intercept = model.intercept_[0]
coeff_df = pd.DataFrame({'Feature': features, 'Coefficient': coefficients}).sort_values(
    by='Coefficient', ascending=False)

print(f"Accuracy: {accuracy:.2f}")
print("Confusion matrix:")
print(conf_matrix)
print("Top 10 coefficients:")
print(coeff_df.head(10))
print(f"Intercept: {intercept:.4f}")

OUTPUT:

EXP.NO: 02 (b)

DATE: 30.01.2025 Implement linear and logistic regression

2). Logistic regression:

AIM:
To write Python code implementing logistic regression, which fits a sigmoid curve to the data points.
ALGORITHM:
Step 1: Start the program
Step 2: Import the necessary Python libraries
Step 3: Load the dataset and build the logistic regression code using the formula
Step 4: Display the output
Step 5: Stop the program
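
As an illustrative aside, the formula in Step 3 amounts to passing a linear score through the sigmoid and lowering the cross-entropy loss by gradient descent. The sketch below shows this on a tiny hypothetical dataset (the `x` and `y` arrays, the learning rate and the iteration count are all assumptions chosen for the example, not values from the experiment).

```python
import numpy as np


def sigmoid(z):
    """Map any real value to the interval (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))


def log_loss(X, y, theta):
    """Mean binary cross-entropy of the model on (X, y)."""
    p = sigmoid(X @ theta)
    return -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))


# Hypothetical one-feature data: negative x -> class 0, positive x -> class 1
x = np.array([-2.0, -1.0, -0.5, 0.5, 1.0, 2.0])
y = np.array([0, 0, 0, 1, 1, 1])
X = np.column_stack([np.ones_like(x), x])  # intercept column + feature

theta = np.zeros(X.shape[1])
initial_loss = log_loss(X, y, theta)

# Batch gradient descent on the cross-entropy loss
learning_rate = 0.1
for _ in range(500):
    gradient = X.T @ (sigmoid(X @ theta) - y) / len(y)
    theta = theta - learning_rate * gradient

final_loss = log_loss(X, y, theta)
predictions = (sigmoid(X @ theta) >= 0.5).astype(int)

print(f"loss: {initial_loss:.3f} -> {final_loss:.3f}")
print("predictions:", predictions)
```

Tracking the loss before and after training is a simple check that the chosen learning rate is actually making progress.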

CODE:
import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import PolynomialFeatures

# Use sepal length only; the target is 1 for Setosa and 0 otherwise
iris = load_iris()
X = iris.data[:, 0].reshape(-1, 1)
y = (iris.target == 0).astype(int)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Degree-2 polynomial features: [1, x, x^2]
poly = PolynomialFeatures(degree=2)
X_train_poly = poly.fit_transform(X_train)
X_test_poly = poly.transform(X_test)


def sigmoid(z):
    return 1 / (1 + np.exp(-z))


def gradient_descent(X, y, theta, lr=0.01, iters=1000):
    # Batch gradient descent on the logistic (cross-entropy) loss
    for _ in range(iters):
        theta -= lr * (X.T @ (sigmoid(X @ theta) - y)) / len(y)
    return theta


theta = np.zeros(X_train_poly.shape[1])
theta_optimal = gradient_descent(X_train_poly, y_train, theta)

# Classify as Setosa when the predicted probability is at least 0.5
predictions = sigmoid(X_test_poly @ theta_optimal) >= 0.5
accuracy = np.mean(predictions == y_test)
print(f"Accuracy: {accuracy * 100:.2f}%")

# Predicted class over the observed range of sepal lengths
x_values = np.linspace(X_train.min(), X_train.max(), 100).reshape(-1, 1)
x_poly = poly.transform(x_values)
y_values = (sigmoid(x_poly @ theta_optimal) >= 0.5).astype(int)

plt.scatter(X_train, y_train, color='blue', label='Training data')
plt.plot(x_values, y_values, color='red', label='Decision Boundary')
plt.xlabel('Sepal Length (cm)')
plt.ylabel('Setosa (1) vs Not Setosa (0)')
plt.title('Logistic Regression with Curved Decision Boundary')
plt.legend()
plt.grid(True)
plt.show()

OUTPUT:
Particulars             Marks Allotted    Marks Awarded
Program / Simulation    40
Program Execution       30
Result                  20
Viva Voce               10
Total                   100

RESULT:
Thus, a program for linear and logistic regression has been successfully executed.

EXP.NO: 03

DATE: 06.02.2025 Naive Bayes classifier

AIM:
To write a Python code for the implementation of the Naive Bayes Classifier for classifying
data based on probability distributions.
ALGORITHM:
Step 1: Start the program
Step 2: Import the necessary Python libraries
Step 3: Load the dataset and preprocess the data
Step 4: Compute the prior probabilities and likelihood using Bayes' theorem
Step 5: Build the Naïve Bayes classifier and train it on the dataset
Step 6: Use the trained model to make predictions
Step 7: Display the output
Step 8: Stop the program
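
As an illustrative aside, Step 4 can be sketched on a tiny one-feature example using only the standard library. The `samples` dictionary is hypothetical data invented for this sketch; each class gets a prior, a mean and a variance, and a new point is assigned to the class with the highest log posterior under a Gaussian likelihood.

```python
import math
from statistics import fmean, pvariance

# Hypothetical one-feature training data grouped by class label
samples = {
    0: [1.0, 1.2, 0.8, 1.0],
    1: [3.0, 3.4],
}
total = sum(len(values) for values in samples.values())

# Per-class prior, mean and variance (Step 4 of the algorithm)
priors = {c: len(v) / total for c, v in samples.items()}
means = {c: fmean(v) for c, v in samples.items()}
variances = {c: pvariance(v) + 1e-9 for c, v in samples.items()}


def log_posterior(x, c):
    """Unnormalised log posterior: log prior + Gaussian log-likelihood."""
    log_likelihood = (-0.5 * math.log(2 * math.pi * variances[c])
                      - (x - means[c]) ** 2 / (2 * variances[c]))
    return math.log(priors[c]) + log_likelihood


def predict(x):
    """Return the class with the highest log posterior."""
    return max(samples, key=lambda c: log_posterior(x, c))


print("priors:", priors)
print("predict(1.1) ->", predict(1.1))
print("predict(2.9) ->", predict(2.9))
```

Working in log space keeps the comparison stable when many feature likelihoods are combined, since the per-feature terms are added rather than multiplied.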
CODE:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split


class NaiveBayesClassifier:
    def __init__(self):
        self.class_priors = {}
        self.means = {}
        self.variances = {}
        self.classes = None

    def fit(self, X, y):
        # Estimate the prior, per-feature mean and per-feature variance of each class
        self.classes = np.unique(y)
        for c in self.classes:
            X_c = X[y == c]
            self.class_priors[c] = len(X_c) / len(X)
            self.means[c] = np.mean(X_c, axis=0)
            self.variances[c] = np.var(X_c, axis=0) + 1e-9  # avoid division by zero

    def gaussian_log_pdf(self, x, mean, variance):
        # Log of the Gaussian density, computed directly so that very small
        # densities do not underflow to zero before the logarithm is taken
        return -0.5 * np.log(2 * np.pi * variance) - ((x - mean) ** 2) / (2 * variance)

    def predict(self, X):
        predictions = []
        for x in X:
            posteriors = {}
            for c in self.classes:
                prior = np.log(self.class_priors[c])
                likelihood = np.sum(self.gaussian_log_pdf(x, self.means[c], self.variances[c]))
                posteriors[c] = prior + likelihood
            # Choose the class with the highest log posterior
            predictions.append(max(posteriors, key=posteriors.get))
        return np.array(predictions)


df = pd.read_csv('/content/house_prices2.csv')

# Encode text columns as integer codes
for col in df.columns:
    if df[col].dtype == 'object':
        df[col] = pd.factorize(df[col])[0]

# The last column is the label; all other columns are features
X = df.iloc[:, :-1].values
y = df.iloc[:, -1].values
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

nb = NaiveBayesClassifier()
nb.fit(X_train, y_train)
y_pred = nb.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
print(f"Accuracy: {accuracy:.2f}")

# Fix the label order so the matrix rows and columns match the tick labels
labels = np.unique(y)
conf_matrix = confusion_matrix(y_test, y_pred, labels=labels)
plt.figure(figsize=(6, 5))
sns.heatmap(conf_matrix, annot=True, fmt='d', cmap='Blues', xticklabels=labels,
            yticklabels=labels)
plt.xlabel("Predicted Label")
plt.ylabel("True Label")
plt.title("Confusion Matrix for Naïve Bayes Classifier")
plt.show()

# Accuracy as a function of the test-set fraction
test_sizes = np.linspace(0.1, 0.5, 5)
accuracies = []
for test_size in test_sizes:
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=test_size,
                                                        random_state=42)
    nb.fit(X_train, y_train)
    y_pred = nb.predict(X_test)
    accuracies.append(accuracy_score(y_test, y_pred))

plt.figure(figsize=(7, 5))
plt.plot(test_sizes, accuracies, marker='o', linestyle='-', color='m',
         label="Naïve Bayes Accuracy")
plt.xlabel("Test Size")
plt.ylabel("Accuracy")
plt.title("Naïve Bayes Accuracy vs. Test Size")
plt.legend()
plt.grid()
plt.show()
OUTPUT:
Particulars             Marks Allotted    Marks Awarded
Program / Simulation    40
Program Execution       30
Result                  20
Viva Voce               10
Total                   100

RESULT:

Thus, the required Naïve Bayes model has been executed successfully.

EXP.NO: 04

DATE: 13.03.2025 POWER BI

AIM:
To make an analytical dashboard for E-commerce.

ALGORITHM:
STEP 1: Load Data - Import dataset into Power BI using Get Data and load it into a table.
STEP 2: Preprocess Data - Handle missing values, encode Gender, compute Experience.
STEP 3: Define Variables - Create binary target variable SalaryAbove50K, set X and y.
STEP 4: Split Data - Divide features and target into training and testing sets.
STEP 5: Train Model - Train Naïve Bayes model using Power BI AI Insights.
STEP 6: Evaluate and Visualize - Compute accuracy, generate confusion matrix, plot ROC curve.

OUTPUT:
MARK ALLOCATION:

Particulars             Marks Allotted    Marks Awarded
Program / Simulation    40
Program Execution       30
Result                  20
Viva Voce               10
Total                   100

RESULT:
Thus, the Zomato sales dataset has been successfully visualized using a Power BI dashboard.
