Lab Assignment
Name: Ch. Abhiram
Reg. No.: 23BCE7199
Slot: L41+L42
Faculty: Swanth Boppudi
1. Decision Tree for Weather Dataset:
Code:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from sklearn.tree import DecisionTreeClassifier, plot_tree
from sklearn.preprocessing import LabelEncoder
from sklearn.model_selection import train_test_split
# Load the weather dataset
filename = "weather.csv" # Ensure this path is correct
df = pd.read_csv(filename)
print(df)
# Remove the 'Day' feature if present
df = df.drop(columns=['Day'], errors='ignore')
# Display the first few rows of the dataset
print(df.head())
# Encode categorical features using LabelEncoder
label_encoders = {}
for column in df.columns:
    if df[column].dtype == 'object':  # Apply encoding only to categorical columns
        le = LabelEncoder()
        df[column] = le.fit_transform(df[column])
        label_encoders[column] = le
print(" After fit and transform
") print(df)
# Define features and target
X = df.iloc[:, :-1] # All columns except the last as features
y = df.iloc[:, -1] # Last column as target
# Split the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Build the decision tree classifier using the entropy criterion
model = DecisionTreeClassifier(criterion='entropy', random_state=42)
model.fit(X_train, y_train)
# Visualize the decision tree
plt.figure(figsize=(10, 6))
plot_tree(
    model,
    feature_names=X.columns,
    class_names=label_encoders[df.columns[-1]].classes_ if df.columns[-1] in label_encoders else None,
    filled=True,
    rounded=True,
    fontsize=10
)
plt.title("Simple ID3 Decision Tree for Weather Dataset")
plt.show()
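Note: the train/test split above reserves test rows that the code never scores. A minimal sketch of evaluating the tree on that held-out data (reusing model, X_test, and y_test from above, with sklearn's accuracy_score) could be:
from sklearn.metrics import accuracy_score
# Score the tree on the held-out 20% split
y_pred = model.predict(X_test)
print("Test accuracy:", accuracy_score(y_test, y_pred))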
Output:
2. Linear Regression:
Code:
# Import required libraries
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from sklearn.preprocessing import LabelEncoder, StandardScaler
# Load dataset
data = pd.read_csv('India_Crop_Production (1).csv')
# Display basic info
print(data.head())
print(data.info())
# Handle missing values (example: drop rows with missing values)
data = data.dropna()
data = data[data['Production'] != '=']
# Verify the rows are removed
print(data[data['Production'] == '='])
# Convert the target to numeric (the '=' placeholders made this column text)
data['Production'] = pd.to_numeric(data['Production'])
# Encode categorical features
categorical_cols = ['State_Name', 'District_Name', 'Crop', 'Season']
label_encoders = {}
for col in categorical_cols:
    le = LabelEncoder()
    data[col] = le.fit_transform(data[col])
    label_encoders[col] = le
# Define features and target variable
X = data[['Area', 'Season', 'Crop', 'Crop_Year']] # Example features
y = data['Production']
# Split the dataset
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Scale the features
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)
# Train the model
model = LinearRegression()
model.fit(X_train, y_train)
# Predict on test data
y_pred = model.predict(X_test)
# Evaluate the model
mae = mean_absolute_error(y_test, y_pred)
mse = mean_squared_error(y_test, y_pred)
r2 = r2_score(y_test, y_pred)
print(f"Mean Absolute Error:
{mae}") print(f"Mean Squared Error:
{mse}") print(f"R-squared: {r2}")
Output:
3. Logistic Regression:
Code:
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, confusion_matrix
# Read the dataset using pandas (replace 'study_hours.csv' with your actual file path)
data = pd.read_csv('study_hours.csv')
print(data)
# Assuming the target column is 'status' and all other columns are features
X = data.drop(columns=['status'])
y = data['status'] # Target variable
# Split the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.4, random_state=20)
# Initialize the Logistic Regression model
model = LogisticRegression()
# Train the model
model.fit(X_train, y_train)
# Make predictions on the test data
y_pred = model.predict(X_test)
# Evaluate the model
accuracy = accuracy_score(y_test, y_pred)
conf_matrix = confusion_matrix(y_test, y_pred)
# Print results
print("Accuracy:", accuracy)
print("Confusion Matrix:")
print(conf_matrix)
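Note: besides hard labels, LogisticRegression exposes per-class probabilities. A minimal sketch (reusing model and X_test from above) could be:
# Probabilities for the first three test rows; column order follows model.classes_
print("Classes:", model.classes_)
print(model.predict_proba(X_test[:3]))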
Output:
4. Titanic Dataset:
Code:
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, confusion_matrix, classification_report
from sklearn.preprocessing import LabelEncoder
# Load the Titanic dataset
file_path = 'Titanic-Dataset.csv' # Replace with your Titanic dataset file path
data = pd.read_csv(file_path)
# Display the first few rows of the dataset
print("Dataset Preview:")
print(data.head())
# Drop columns not relevant for the model
data = data.drop(['PassengerId', 'Name', 'Ticket', 'Cabin'], axis=1, errors='ignore')
# Fill missing values
data['Age'] = data['Age'].fillna(data['Age'].median())
data['Embarked'] = data['Embarked'].fillna(data['Embarked'].mode()[0])
# Encode categorical features
categorical_cols = ['Sex', 'Embarked']
label_encoders = {}
for col in categorical_cols:
    le = LabelEncoder()
    data[col] = le.fit_transform(data[col])
    label_encoders[col] = le
# Define features and target variable
X = data.drop(['Survived'], axis=1)
y = data['Survived']
# Split the dataset into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Initialize the Random Forest Classifier
model = RandomForestClassifier(n_estimators=100, random_state=42)
# Train the model
model.fit(X_train, y_train)
# Make predictions
y_pred = model.predict(X_test)
# Evaluate the model
accuracy = accuracy_score(y_test, y_pred)
conf_matrix = confusion_matrix(y_test, y_pred)
class_report = classification_report(y_test, y_pred)
# Display results
print("\nModel Evaluation:")
print(f"Accuracy: {accuracy:.2f}")
print("\nConfusion Matrix:")
print(conf_matrix)
print("\nClassification Report:")
print(class_report)
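Note: a random forest also records how much each feature contributed to its splits. A short sketch ranking them (reusing model, X, and pd from above) could be:
# Rank features by impurity-based importance
importances = pd.Series(model.feature_importances_, index=X.columns)
print(importances.sort_values(ascending=False))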
Output:
5. Clustering:
Code:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import silhouette_score, davies_bouldin_score
# Load dataset from CSV file
df = pd.read_csv('clustering.csv') # Ensure the file exists
# Selecting relevant features
marks = df[['Subject1', 'Subject2']].values
# Standardizing the data
scaler = StandardScaler()
marks_scaled = scaler.fit_transform(marks)
# Applying K-Means Clustering
k = 2 # Number of clusters
kmeans = KMeans(n_clusters=k, random_state=42, n_init=10)
df['Cluster'] = kmeans.fit_predict(marks_scaled)
# Get centroids
centroids = kmeans.cluster_centers_
# Assign cluster names based on performance
cluster_names = {0: 'High Performers', 1: 'Low Performers'} # Modify as needed
df['Cluster Name'] = df['Cluster'].map(cluster_names)
# Save clustered data to CSV
df.to_csv('student_marks_clustered.csv', index=False)
# Performance Metrics
inertia = kmeans.inertia_ # SSE
silhouette_avg = silhouette_score(marks_scaled, df['Cluster'])
db_index = davies_bouldin_score(marks_scaled, df['Cluster'])
print(f"Inertia (SSE): {inertia:.2f}")
print(f"Silhouette Score: {silhouette_avg:.2f}")
print(f"Davies-Bouldin Index: {db_index:.2f}")
# Display cluster-wise information
print("\nCluster Information:")
print(df.groupby('Cluster Name')[['Subject1', 'Subject2']].mean())
# Plot the clusters
plt.figure(figsize=(8, 6))
plt.scatter(marks_scaled[:, 0], marks_scaled[:, 1], c=df['Cluster'], cmap='viridis', marker='o',
edgecolors='k', label='Students')
plt.scatter(centroids[:, 0], centroids[:, 1], s=200, c='red', marker='X', label='Centroids')
plt.xlabel('Subject 1 (Scaled)')
plt.ylabel('Subject 2 (Scaled)')
plt.title('K-Means Clustering of Student Marks')
plt.legend()
plt.show()
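Note: k = 2 is fixed by hand above. A common sanity check is the elbow method, sketched here on the same scaled marks over a hypothetical range of cluster counts:
# Elbow method: inertia (SSE) for k = 1..6 on the scaled marks
sse = []
for k in range(1, 7):
    km = KMeans(n_clusters=k, random_state=42, n_init=10)
    km.fit(marks_scaled)
    sse.append(km.inertia_)
plt.plot(range(1, 7), sse, marker='o')
plt.xlabel('Number of clusters k')
plt.ylabel('Inertia (SSE)')
plt.title('Elbow Method for Choosing k')
plt.show()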
Output: