
Jayalakshmi Institute of Technology

NH7, Salem Main Rd, T. Kanigarahalli, Thoppur, Dharmapuri, Tamil Nadu 636352.


(Approved by AICTE - New Delhi, Affiliated to Anna University - Chennai)

CS3491 - ARTIFICIAL INTELLIGENCE AND MACHINE LEARNING LAB MANUAL

(FOR III B.E. ELECTRONICS AND COMMUNICATION ENGINEERING STUDENTS)

NAME :
REGISTER NO :
YEAR/SEMESTER :
ACADEMIC YEAR :

AS PER ANNA UNIVERSITY (CHENNAI) SYLLABUS


2021 REGULATION
ABOUT OBSERVATION NOTES & PREPARATION OF RECORD

1. Students are advised to come to the laboratory at least 5 minutes before the starting time; those who arrive more than 5 minutes late will not be allowed into the lab.
2. Students should enter the laboratory with:
a. Laboratory observation notes with all the details (Problem statement, Aim, Algorithm, Procedure,
Program, Expected Output, etc.,) filled in for the lab session.
b. Laboratory Record updated up to the last session experiments and other utensils (if any) needed in the lab.
c. Proper Dress code and Identity card.
d. Sign in the laboratory login register, write the TIME-IN, and occupy the computer system allotted to you
by the faculty.
3. Execute your task in the laboratory, record the results / output in the lab observation notebook, and get it certified by the concerned faculty.
4. All students should be polite and cooperative with the laboratory staff and must maintain discipline and decency in the laboratory.
5. Computer labs are established with sophisticated and high-end branded systems, which should be utilized
properly.
6. Students / Faculty must keep their mobile phones in SWITCHED OFF mode during the lab sessions; violation of this rule will attract severe punishment.
7. Students must take the permission of the faculty in case of any urgency to go out; anybody found loitering outside the lab / class without permission during working hours will be dealt with seriously and punished appropriately.
8. Students should LOG OFF / SHUT DOWN the computer system before leaving the lab after completing the task (experiment) in all aspects, and must ensure the system / seat is left in proper condition.
9. This Observation contains the basic diagrams of the circuits enlisted in the syllabus of the CS3491
ARTIFICIAL INTELLIGENCE AND MACHINE LEARNING course, along with the design of various
components of the circuit and controller.
10. The aim of the experiment is also given at the beginning of each experiment. Once the student can design the
circuit as per the circuit diagram, he/she is supposed to go through the instructions carefully and do the
experiments step by step.
11. They should get their observations verified and signed by the staff within two days and prepare & submit the
record of the experiment when they come to the laboratory in the subsequent week.
12. The record should contain experiment No., Date, Aim, Apparatus required, Theory, Procedure, and result on
one side (i.e., Right-hand side, where rulings are provided) and Circuit diagram, Design, Model Graphs,
Tabulations, Calculations, Pre-Lab and Post-Lab questions on the other side (i.e., Left-hand side, where blank
space is provided).
13. The students are directed to discuss & clarify their doubts with the staff members as and when required. They
are also directed to follow strictly the guidelines specified.
Jayalakshmi Institute of Technology
Thoppur, Dharmapuri, Tamil Nadu 636352.

BONAFIDE CERTIFICATE

Name: ………………………………………………………………………..
Academic Year:………………….. Semester:…………… Branch:……………….

Register No.

Certified that this is the bonafide record of work done by the above student in the

…......................................................................................... Laboratory during the year

202 - 202.

Signature of Faculty in-charge

Submitted for the Practical Examination held on …………………………………….

Internal Examiner External Examiner



Jayalakshmi Institute of Technology


NH7, Salem Main Rd, T. Kanigarahalli, Thoppur, Dharmapuri, Tamil Nadu 636352.
(Approved by AICTE - New Delhi, Affiliated to Anna University - Chennai)

Department of Artificial Intelligence and Data Science


Department Vision

To build a conducive academic and research environment in the stream of Artificial Intelligence and Data Science for enabling global education, research and entrepreneurship.

Department Mission

❖ Empower students with knowledge through experiential learning.
❖ Empower faculty of artificial intelligence and data science through continuous professional development, research support and global collaborations to deliver quality education aligned with international standards.
❖ Establish a student-centric learning environment that empowers future innovators in the dynamic field of artificial intelligence through rigorous academic programs, cutting-edge research opportunities, and industry-aligned projects.
❖ Conduct outreach activities for society that involve the use of artificial intelligence, data science and machine learning concepts to deal with societal issues.
❖ To develop professionals who are skilled in the area of Artificial Intelligence and Data Science.
❖ To impart quality and value-based education and contribute towards innovation in computing, expert systems and Data Science to raise the satisfaction level of all stakeholders.
❖ Our effort is to apply new advancements in high-performance computing hardware and software.

Programs offered

• The Department offers an Under Graduate program in B.E. (Artificial Intelligence & Data Science) with an intake of 120 students. At the Post Graduate level, the Department offers a specialization in M.Tech. (Artificial Intelligence & Data Science) with an intake of 18 students.



Program Educational Objectives (PEOs)

Graduates of AI & DS will be able to:

PEO1: To provide graduates with the proficiency to utilize the fundamental knowledge of basic sciences, mathematics, Artificial Intelligence, data science and statistics to build systems that require management and analysis of large volumes of data.

PEO2: To demonstrate excellence in cutting-edge technologies of Artificial Intelligence and Data Science and solve problems in society.

PEO3: To enable graduates to think logically, pursue lifelong learning and collaborate with an ethical attitude in a multidisciplinary team.

Program Specific Outcomes (PSOs)

After successful completion of the program, students will be able to:

PSO1: Graduates should be able to arrive at actionable foresight, insight and hindsight from data for solving business and engineering problems.

PSO2: Apply the skills in the areas of Health Care, Education, Agriculture, Intelligent Transport, Environment, Smart Systems & in the multi-disciplinary area of Artificial Intelligence and Data Science.

PSO3: Graduates should be able to create, select and apply the theoretical knowledge of AI and Data Analytics along with practical industrial tools and techniques to manage and solve wicked societal problems.

Programme Outcomes:

PO1: Engineering Knowledge: Apply the knowledge of mathematics, science, engineering fundamentals, and an engineering specialization to the solution of complex engineering problems.

PO2: Problem Analysis: Identify, formulate, review research literature, and analyse complex engineering problems reaching substantiated conclusions using first principles of mathematics, natural sciences, and engineering sciences.

PO3: Design/Development of Solutions: Design solutions for complex engineering problems and design system components or processes that meet the specified needs with appropriate consideration for public health and safety, and cultural, societal, and environmental considerations.

PO4: Conduct Investigations of Complex Problems: Use research-based knowledge and research methods including design of experiments, analysis and interpretation of data, and synthesis of the information to provide valid conclusions.

PO5: Modern Tool Usage: Create, select, and apply appropriate techniques, resources, and modern engineering and IT tools including prediction and modelling to complex engineering activities with an understanding of the limitations.

PO6: The Engineer and Society: Apply reasoning informed by the contextual knowledge to assess societal, health, safety, legal and cultural issues and the consequent responsibilities relevant to professional engineering practice.

PO7: Environment and Sustainability: Understand the impact of professional engineering solutions in societal and environmental contexts, and demonstrate the knowledge of, and need for, sustainable development.

PO8: Ethics: Apply ethical principles and commit to professional ethics and responsibilities and norms of engineering practice.

PO9: Individual and Team Work: Function effectively as an individual, and as a member or leader in diverse teams, and in multidisciplinary settings.

PO10: Communication: Communicate effectively on complex engineering activities with the engineering community and with society at large, such as being able to comprehend and write effective reports and design documentation, make effective presentations, and give and receive clear instructions.

PO11: Project Management and Finance: Demonstrate knowledge and understanding of the engineering and management principles and apply these to one's own work, as a member and leader in a team, to manage projects and in multidisciplinary environments.

PO12: Life-long Learning: Recognize the need for, and have the preparation and ability to engage in, independent and life-long learning in the broadest context of technological change.

CS3491 – ARTIFICIAL INTELLIGENCE AND MACHINE LEARNING

COURSE OBJECTIVES:

The main objectives of this course are to:

Study about uninformed and Heuristic search techniques.

Learn techniques for reasoning under uncertainty

Introduce Machine Learning and supervised learning algorithms

Study about ensembling and unsupervised learning algorithms

Learn the basics of deep learning using neural networks

LIST OF EXPERIMENTS

PRACTICAL EXERCISES:

1. Implementation of Uninformed search algorithms (BFS, DFS)


2. Implementation of Informed search algorithms (A*, memory-bounded A*)
3. Implement naïve Bayes models
4. Implement Bayesian Networks
5. Build Regression models
6. Build decision trees and random forests
7. Build SVM models
8. Implement ensembling techniques
9. Implement clustering algorithms
10. Implement EM for Bayesian networks
11. Build simple NN models
12. Build deep learning NN models

TOTAL: 30 PERIODS
COURSE OUTCOMES:

At the end of this course, the students will be able to:

CO1: Use appropriate search algorithms for problem solving

CO2: Apply reasoning under uncertainty

CO3: Build supervised learning models

CO4: Build ensembling and unsupervised models

CO5: Build deep learning neural network models



EXP NO: 1 IMPLEMENTING BREADTH-FIRST SEARCH (BFS) AND DEPTH-FIRST SEARCH (DFS)
DATE:

Aim:
The aim of implementing the Breadth-First Search (BFS) and Depth-First Search (DFS)
algorithms is to traverse a tree or graph data structure in a systematic way, visiting all nodes
and edges in the structure in a particular order, without revisiting any node twice.

Algorithm:
Breadth-First Search (BFS) algorithm:
1. Create an empty queue and enqueue the starting node.
2. Mark the starting node as visited.
3. While the queue is not empty, dequeue a node from the queue and visit it.
4. Enqueue all of its neighbours that have not been visited yet, and mark them as visited.
5. Repeat steps 3-4 until the queue is empty.

Depth-First Search (DFS) algorithm:
1. Mark the starting node as visited and print it.
2. For each adjacent node of the current node that has not been visited, repeat step 1.
3. If all adjacent nodes have been visited, backtrack to the previous node and repeat step 2.
4. Repeat steps 2-3 until all nodes have been visited.



Program:
# BFS Implementation
graph = {
    '5': ['3', '7'],
    '3': ['2', '4'],
    '7': ['8'],
    '2': [],
    '4': ['8'],
    '8': []
}
visited = []  # List for visited nodes
queue = []    # Initialize a queue

def bfs(visited, graph, node):  # Function for BFS
    visited.append(node)
    queue.append(node)
    while queue:  # Creating loop to visit each node
        m = queue.pop(0)
        print(m, end=" ")
        for neighbour in graph[m]:
            if neighbour not in visited:
                visited.append(neighbour)
                queue.append(neighbour)

# Driver Code
print("Following is the Breadth-First Search")
bfs(visited, graph, '5')

# DFS Implementation
graph = {
    '5': ['3', '7'],
    '3': ['2', '4'],
    '7': ['8'],
    '2': [],
    '4': ['8'],
    '8': []
}
visited = set()  # Set to keep track of visited nodes

def dfs(visited, graph, node):  # Function for DFS
    if node not in visited:
        print(node, end=" ")
        visited.add(node)
        for neighbour in graph[node]:
            dfs(visited, graph, neighbour)

# Driver Code
print("\nFollowing is the Depth-First Search")
dfs(visited, graph, '5')
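
Tracing the two functions by hand on the graph defined above gives the following expected output, which can be used for the "Expected Output" entry in the observation note:

Following is the Breadth-First Search
5 3 7 2 4 8
Following is the Depth-First Search
5 3 2 4 8 7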

Previous Lab Questions:


1. What input format will you use for the graph in your program (adjacency list, adjacency matrix,
etc.)?
2. What are the expected outputs of BFS and DFS when run on a sample graph?
3. If a graph has multiple components (disconnected subgraphs), how would BFS and DFS behave?
4. What modifications would you need to make to handle weighted graphs?
5. How would you test your implementation to ensure correctness?

Post-Lab Questions:

1. Explain the main differences between BFS and DFS in terms of their approach to searching a graph.
2. What data structures are typically used to implement BFS and DFS?
3. In which scenarios would BFS be preferred over DFS? Provide a real-world example.
4. In which scenarios would DFS be preferred over BFS? Provide a real-world example.
5. How does the space complexity of BFS compare to that of DFS?

MARKS ALLOCATION

Details                     Marks Allotted   Marks Awarded
Pre Lab Questions                 10
Aim & Procedure                   30
Coding                            30
Execution & Output                20
Post Lab Questions (Viva)         10
Total                            100

Result:
Thus the program for BFS and DFS is executed successfully and output is verified.

EXP NO: 2 IMPLEMENTING INFORMED SEARCH ALGORITHMS LIKE A* AND MEMORY-BOUNDED A*
DATE:

Aim:
The aim of implementing informed search algorithms like A* and memory-bounded A* is to
efficiently find the shortest path between two points in a graph or network. The A* algorithm
is a heuristic-based search algorithm that finds the shortest path between two points by
evaluating the cost function of each possible path. The memory-bounded A* algorithm is a
variant of A* that uses a limited amount of memory and is suitable for large search spaces.

Algorithm:
Algorithm for A*:
1. Initialize the starting node with a cost of zero and add it to an open list.
2. While the open list is not empty:
   a. Find the node with the lowest cost in the open list and remove it.
   b. If this node is the goal node, return the path to this node.
   c. Generate all successor nodes of the current node.
   d. For each successor node, calculate its cost and add it to the open list.
3. If the open list is empty and the goal node has not been found, then there is no path from
the start node to the goal node.

Algorithm for memory-bounded A*:
1. Initialize the starting node with a cost of zero and add it to an open list and a closed list.
2. While the open list is not empty:
   a. Find the node with the lowest cost in the open list and remove it.
   b. If this node is the goal node, return the path to this node.
   c. Generate all successor nodes of the current node.
   d. For each successor node, calculate its cost and add it to the open list if it is not in the closed list.
   e. If the open list is too large, remove the node with the highest cost from the open list and add it to the closed list.
   f. Add the current node to the closed list.
3. If the open list is empty and the goal node has not been found, then there is no path from
the start node to the goal node.
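
The program below illustrates a lowest-cost (best-first) traversal and a memory-bounded variant. For reference, a minimal sketch of A* itself, which orders nodes by f(n) = g(n) + h(n), is given here; the small example graph and heuristic values are illustrative assumptions, not part of the prescribed exercise.

import heapq

def a_star(graph, h, start, goal):
    # graph: dict mapping node -> list of (neighbour, edge_cost) pairs
    # h: dict mapping node -> heuristic estimate of remaining cost (assumed admissible)
    open_list = [(h[start], 0, start, [start])]  # entries are (f, g, node, path)
    closed = set()
    while open_list:
        f, g, node, path = heapq.heappop(open_list)
        if node == goal:
            return path, g  # path found with total cost g
        if node in closed:
            continue
        closed.add(node)
        for neighbour, cost in graph.get(node, []):
            if neighbour not in closed:
                g_new = g + cost
                heapq.heappush(open_list, (g_new + h[neighbour], g_new, neighbour, path + [neighbour]))
    return None, float('inf')  # no path exists

# Illustrative example (assumed values): expected result is (['A', 'B', 'C', 'D'], 4)
example_graph = {'A': [('B', 1), ('C', 4)], 'B': [('C', 2), ('D', 5)], 'C': [('D', 1)], 'D': []}
example_h = {'A': 3, 'B': 2, 'C': 1, 'D': 0}
print(a_star(example_graph, example_h, 'A', 'D'))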

Program:
from queue import PriorityQueue
import numpy as np
import heapq

# Number of vertices
v = 14
graph = [[] for i in range(v)]

# Function for Best-First Search (lowest-cost path)
def best_first_search(actual_Src, target, n):
    visited = [False] * n
    pq = PriorityQueue()
    pq.put((0, actual_Src))
    visited[actual_Src] = True
    while not pq.empty():
        u = pq.get()[1]
        print(u, end=" ")
        if u == target:
            break
        for v, c in graph[u]:
            if not visited[v]:
                visited[v] = True
                pq.put((c, v))
    print()

# Function for adding edges to the graph
def add_edge(x, y, cost):
    graph[x].append((y, cost))
    graph[y].append((x, cost))

# Adding edges
add_edge(0, 1, 3)
add_edge(0, 2, 6)
add_edge(0, 3, 5)
add_edge(1, 4, 9)
add_edge(1, 5, 8)
add_edge(2, 6, 12)
add_edge(2, 7, 14)
add_edge(3, 8, 7)
add_edge(8, 9, 5)
add_edge(8, 10, 6)
add_edge(9, 11, 1)
add_edge(9, 12, 10)
add_edge(9, 13, 2)

source = 0
target = 9
print("Following is the Best-First Search path:")
best_first_search(source, target, v)

# Memory-Bounded A* Algorithm
class Graph:
    def __init__(self, adjacency_matrix):
        self.adjacency_matrix = adjacency_matrix
        self.num_nodes = len(adjacency_matrix)

    def get_neighbors(self, node):
        return [neighbor for neighbor, is_connected in enumerate(self.adjacency_matrix[node]) if is_connected]

def memory_bounded_a_star(graph, start, goal, memory_limit):
    visited = set()
    priority_queue = [(0, start)]
    memory_usage = 0
    while priority_queue:
        memory_usage = max(memory_usage, len(visited))
        if memory_usage > memory_limit:
            print("Memory limit exceeded!")
            return None
        cost, current_node = heapq.heappop(priority_queue)
        if current_node == goal:
            print("Goal found!")
            return cost
        if current_node not in visited:
            visited.add(current_node)
            for neighbor in graph.get_neighbors(current_node):
                heapq.heappush(priority_queue, (cost + 1, neighbor))
    print("Goal not reachable!")
    return None

if __name__ == "__main__":
    # Define the adjacency matrix of the graph
    adjacency_matrix = np.array([
        [0, 1, 1, 0, 0, 0, 0],
        [1, 0, 0, 1, 1, 0, 0],
        [1, 0, 0, 0, 0, 1, 0],
        [0, 1, 0, 0, 0, 1, 1],
        [0, 1, 0, 0, 0, 0, 1],
        [0, 0, 1, 1, 0, 0, 1],
        [0, 0, 0, 1, 1, 1, 0]
    ])
    graph = Graph(adjacency_matrix)
    start_node = 0
    goal_node = 6
    memory_limit = 10  # Set the memory limit
    result = memory_bounded_a_star(graph, start_node, goal_node, memory_limit)
    print("Shortest path cost:", result)

Previous Lab Questions:

1. What are the key differences between uninformed and informed search algorithms?
2. How does A* search differ from BFS and DFS in terms of exploration strategy?
3. Explain the role of the heuristic function in A*. What are the properties of a good heuristic?
4. What is the significance of the cost function f(n) = g(n) + h(n) in A* search?
5. How do admissible and consistent heuristics affect the performance of A*?

Post Lab Questions :

1. What challenges did you encounter while implementing A*? How did you resolve them?
2. How did your choice of heuristic (e.g., Manhattan distance, Euclidean distance) affect A*’s performance?
3. Why does A* guarantee finding the shortest path if the heuristic is admissible?
4. Compare the execution time of A* when using different heuristics. Which performed better and why?
5. How does A* handle ties when multiple nodes have the same f(n) value?
6. What is the impact of using a poor heuristic on the performance of A*?

MARKS ALLOCATION

Details                     Marks Allotted   Marks Awarded
Pre Lab Questions                 10
Aim & Procedure                   30
Coding                            30
Execution & Output                20
Post Lab Questions (Viva)         10
Total                            100

Result:
Thus the program for implementing informed search algorithms like A* and memory-bounded A* has been executed successfully and the output is verified.

EXP NO: 3 IMPLEMENT NAÏVE BAYES MODELS

DATE:

Aim:
The aim of the Naïve Bayes algorithm is to classify a set of data points into
different classes based on the probability of each data point belonging to a particular class.
The algorithm is based on Bayes' theorem, which states that the probability of an event
occurring, given prior knowledge of another event, can be calculated using conditional
probability.

Algorithm:
1. Collect the dataset: The first step in using Naïve Bayes is to collect a dataset that contains
a set of data points and their corresponding classes.
2. Prepare the data: The next step is to preprocess the data and prepare it for the Naïve
Bayes algorithm. This involves removing any unnecessary features or attributes and
normalizing the data.
3. Compute the prior probabilities: The prior probabilities of each class can be computed by
calculating the number of data points belonging to each class and dividing it by the total
number of data points.
4. Compute the likelihoods: The likelihoods of each feature for each class can be computed
by calculating the conditional probability of the feature given the class. This involves
counting the number of data points in each class that have the feature and dividing it by the
total number of data points in that class.
5. Compute the posterior probabilities: The posterior probabilities of each class can be
computed by multiplying the prior probability of the class with the product of the likelihoods
of each feature for that class.
6. Make predictions: Once the posterior probabilities have been computed for each class, the
Naïve Bayes algorithm can be used to make predictions by selecting the class with the
highest probability.
7. Evaluate the model: The final step is to evaluate the performance of the Naïve Bayes
model. This can be done by computing various performance metrics such as accuracy,
precision, recall, and F1-score.

Program:
# Import required libraries
import math
import random
import csv

# Encode categorical class names to numeric values (e.g., 'yes' -> 1, 'no' -> 0)
def encode_class(mydata):
    classes = []
    for i in range(len(mydata)):
        if mydata[i][-1] not in classes:
            classes.append(mydata[i][-1])
    for i in range(len(classes)):
        for j in range(len(mydata)):
            if mydata[j][-1] == classes[i]:
                mydata[j][-1] = i
    return mydata

# Splitting the dataset into training and testing sets
def splitting(mydata, ratio):
    train_num = int(len(mydata) * ratio)
    train = []
    test = list(mydata)  # Initially, test set has all the data
    while len(train) < train_num:
        index = random.randrange(len(test))  # Generate random index
        train.append(test.pop(index))        # Pop data from test set and move to train set
    return train, test

# Group data rows under each class (e.g., dict['yes'], dict['no'])
def group_under_class(mydata):
    grouped_data = {}
    for i in range(len(mydata)):
        class_label = mydata[i][-1]
        if class_label not in grouped_data:
            grouped_data[class_label] = []
        grouped_data[class_label].append(mydata[i])
    return grouped_data

# Calculate Mean
def mean(numbers):
    return sum(numbers) / float(len(numbers))

# Calculate Standard Deviation
def std_dev(numbers):
    avg = mean(numbers)
    variance = sum([(x - avg) ** 2 for x in numbers]) / float(len(numbers) - 1)
    return math.sqrt(variance)

# Compute mean and standard deviation for each feature
def mean_and_std_dev(mydata):
    info = [(mean(attribute), std_dev(attribute)) for attribute in zip(*mydata)]
    del info[-1]  # Remove the last element (class label)
    return info

# Find mean and standard deviation under each class
def mean_and_std_dev_for_class(mydata):
    info = {}
    grouped_data = group_under_class(mydata)
    for class_value, instances in grouped_data.items():
        info[class_value] = mean_and_std_dev(instances)
    return info

# Calculate Gaussian Probability Density Function
def calculate_gaussian_probability(x, mean, stdev):
    exponent = math.exp(-(math.pow(x - mean, 2) / (2 * math.pow(stdev, 2))))
    return (1 / (math.sqrt(2 * math.pi) * stdev)) * exponent

# Calculate Class Probabilities
def calculate_class_probabilities(info, test):
    probabilities = {}
    for class_value, class_summaries in info.items():
        probabilities[class_value] = 1
        for i in range(len(class_summaries)):
            mean, std_dev = class_summaries[i]
            x = test[i]
            probabilities[class_value] *= calculate_gaussian_probability(x, mean, std_dev)
    return probabilities

# Predict the class with the highest probability
def predict(info, test):
    probabilities = calculate_class_probabilities(info, test)
    best_label, best_prob = None, -1
    for class_value, probability in probabilities.items():
        if best_label is None or probability > best_prob:
            best_prob = probability
            best_label = class_value
    return best_label

# Generate predictions for a set of test examples
def get_predictions(info, test):
    predictions = []
    for i in range(len(test)):
        result = predict(info, test[i])
        predictions.append(result)
    return predictions

# Calculate accuracy of the model
def accuracy_rate(test, predictions):
    correct = 0
    for i in range(len(test)):
        if test[i][-1] == predictions[i]:
            correct += 1
    return (correct / float(len(test))) * 100.0

# ---------------- Driver Code ----------------
# Load the dataset
filename = r'E:\user\MACHINELEARNING\machinelearningalgos\Naive bayes\filedata.csv'
mydata = csv.reader(open(filename, "rt"))
mydata = list(mydata)

# Encode categorical class labels
mydata = encode_class(mydata)

# Convert data to float
for i in range(len(mydata)):
    mydata[i] = [float(x) for x in mydata[i]]

# Split data (70% train, 30% test)
ratio = 0.7
train_data, test_data = splitting(mydata, ratio)
print('Total number of examples:', len(mydata))
print('Training examples:', len(train_data))
print("Test examples:", len(test_data))

# Prepare model
info = mean_and_std_dev_for_class(train_data)

# Test model
predictions = get_predictions(info, test_data)
accuracy = accuracy_rate(test_data, predictions)
print("Accuracy of your model:", accuracy)

Post Lab Questions :

1. What assumptions does the Naïve Bayes algorithm make about the data?
2. How does Naïve Bayes calculate the probability of a class given a set of features?
3. What are the key differences between Gaussian, Multinomial, and Bernoulli Naïve Bayes classifiers?
4. Why is Naïve Bayes considered a "naïve" algorithm?
5. In what scenarios does Naïve Bayes perform well? When does it struggle?

Previous Lab Questions :

1. What is the Naïve Bayes algorithm, and why is it called "naïve"?


2. What is Bayes’ Theorem, and how is it applied in Naïve Bayes classification?
3. What are the different types of Naïve Bayes classifiers, and how do they differ (Gaussian, Multinomial,
Bernoulli)?
4. What are the key assumptions made by the Naïve Bayes classifier?
5. How does Naïve Bayes handle categorical vs. continuous data?
6. What are some real-world applications of Naïve Bayes?

MARKS ALLOCATION

Details                     Marks Allotted   Marks Awarded
Pre Lab Questions                 10
Aim & Procedure                   30
Coding                            30
Execution & Output                20
Post Lab Questions (Viva)         10
Total                            100

Result:
Thus the program for Naïve Bayes is executed successfully and the output is verified.

EXP NO: 4 IMPLEMENT BAYESIAN NETWORKS

DATE:

Aim:
The aim of implementing Bayesian Networks is to model the probabilistic
relationships between a set of variables. A Bayesian Network is a graphical model that
represents the conditional dependencies between different variables in a probabilistic manner.
It is a powerful tool for reasoning under uncertainty and can be used for a wide range of
applications, including decision making, risk analysis, and prediction.

Algorithm:

1. Define the variables: The first step in implementing a Bayesian Network is to define the
variables that will be used in the model. Each variable should be clearly defined and its
possible states should be enumerated.
2. Determine the relationships between variables: The next step is to determine the
probabilistic relationships between the variables. This can be done by identifying the causal
relationships between the variables or by using data to estimate the conditional probabilities
of each variable given its parents.
3. Construct the Bayesian Network: The Bayesian Network can be constructed by
representing the variables as nodes in a directed acyclic graph (DAG). The edges between the
nodes represent the conditional dependencies between the variables.
4. Assign probabilities to the variables: Once the structure of the Bayesian Network has
been defined, the probabilities of each variable must be assigned. This can be done by using
expert knowledge, data, or a combination of both.
5. Inference: Inference refers to the process of using the Bayesian Network to make
predictions or draw conclusions. This can be done by using various inference algorithms,
such as variable elimination or belief propagation.
6. Learning: Learning refers to the process of updating the probabilities in the Bayesian
Network based on new data. This can be done using various learning algorithms, such as
maximum likelihood or Bayesian learning.
7. Evaluation: The final step in implementing a Bayesian Network is to evaluate its
performance. This can be done by comparing the predictions of the model to actual data and
computing various performance metrics, such as accuracy or precision.

Program:
import numpy as np
import pandas as pd
from pgmpy.models import BayesianNetwork
from pgmpy.estimators import MaximumLikelihoodEstimator
from pgmpy.inference import VariableElimination
# Read Cleveland Heart Disease dataset
heart_disease = pd.read_csv('heart.csv')
heart_disease = heart_disease.replace('?', np.nan)
# Display the dataset
print('Few examples from the dataset are given below:')
print(heart_disease.head())
# Define the Bayesian Network structure
model = BayesianNetwork([
('age', 'trestbps'),
('age', 'fbs'),
('sex', 'trestbps'),
('exang', 'trestbps'),
('trestbps', 'heartdisease'),
('fbs', 'heartdisease'),
('heartdisease', 'restecg'),
('heartdisease', 'thalach'),
('heartdisease', 'chol')
])
# Learning CPDs using Maximum Likelihood Estimator
print('\nLearning CPD using Maximum Likelihood Estimator...')
model.fit(heart_disease, estimator=MaximumLikelihoodEstimator)
# Inferencing with Bayesian Network
print('\nInferencing with Bayesian Network...')
heart_disease_infer = VariableElimination(model)
# Computing the probability of Heart Disease given Age
print('\n1. Probability of Heart Disease given Age=30')
q1 = heart_disease_infer.query(variables=['heartdisease'], evidence={'age': 30})
print(q1)
# Computing the probability of Heart Disease given cholesterol
print('\n2. Probability of Heart Disease given Cholesterol=100')
q2 = heart_disease_infer.query(variables=['heartdisease'], evidence={'chol': 100})
print(q2)
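
As an optional check (not part of the prescribed program), the conditional probability distributions learned by model.fit() can be inspected before running queries, for example:

# Optional check: print one of the learned conditional probability distributions
print(model.get_cpds('heartdisease'))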

Pre-Lab Questions :

1. What is a Bayesian Network, and how does it differ from a Naïve Bayes classifier?
2. What are the key components of a Bayesian Network? (e.g., nodes, edges, conditional probability tables)
3. How does a Bayesian Network represent dependencies and independencies between variables?
4. What is a Directed Acyclic Graph (DAG), and why is it important in Bayesian Networks?
5. What is conditional probability, and how is it used in Bayesian Networks?
6. Explain Bayes' Theorem and how it applies to Bayesian Networks.

Post-Lab Questions :

1. What challenges did you face while implementing Bayesian Networks? How did you overcome them?
2. How did you construct the graph structure of your Bayesian Network? Was it predefined, or did you learn it
from data?
3. How does a Bayesian Network handle uncertainty in decision-making?
4. How did you compute the Conditional Probability Tables (CPTs) for your network?
5. What assumptions did you make about conditional independence in your Bayesian Network?
6. How does your Bayesian Network compare to simpler models like Naïve Bayes?

MARKS ALLOCATION

Details                     Marks Allotted   Marks Awarded
Pre Lab Questions                 10
Aim & Procedure                   30
Coding                            30
Execution & Output                20
Post Lab Questions (Viva)         10
Total                            100

Result:
Thus the program is executed successfully and output is verified.

EXP NO: 5 BUILD REGRESSION MODEL

DATE:

Aim:
The aim of building a regression model is to predict a continuous numerical outcome
variable based on one or more input variables. Several algorithms can be used to build
regression models, including linear regression, polynomial regression, decision trees,
random forests, and neural networks.

Algorithm:
1. Collecting and cleaning the data: The first step in building a regression model is to gather
the data needed for analysis and ensure that it is clean and consistent. This may involve
removing missing values, outliers, and other errors.
2. Exploring the data: Once the data is cleaned, it is important to explore it to gain an
understanding of the relationships between the input and outcome variables. This may
involve calculating summary statistics, creating visualizations, and testing for correlations.
3. Choosing the algorithm: Based on the nature of the problem and the characteristics of the
data, an appropriate regression algorithm is chosen.
4. Preprocessing the data: Before applying the regression algorithm, it may be necessary to
preprocess the data to ensure that it is in a suitable format. This may involve standardizing or
normalizing the data, encoding categorical variables, or applying feature engineering
techniques.
5. Training the model: The regression model is trained on a subset of the data, using an
optimization algorithm to find the values of the model parameters that minimize the
difference between the predicted and actual values.
6. Evaluating the model: Once the model is trained, it is evaluated using a separate test
dataset to determine its accuracy and generalization performance. Metrics such as mean
squared error, R-squared, or root mean squared error can be used to assess the model's
performance.
7. Improving the model: Based on the evaluation results, the model can be refined by
adjusting the model parameters or using different algorithms.
8. Deploying the model: Finally, the model can be deployed to make predictions on new
data.

Program:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error, r2_score
# Load the dataset
df = pd.read_csv('dataset.csv')
# Split the dataset into training and testing sets
X = df[['feature1', 'feature2']] # Adjust feature names based on your dataset
y = df['target']
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Train the regression model
reg = LinearRegression()
reg.fit(X_train, y_train)
# Make predictions on the test set
y_pred = reg.predict(X_test)
# Evaluate the model
print('Mean Squared Error: %.2f' % mean_squared_error(y_test, y_pred))
print('Coefficient of Determination (R² Score): %.2f' % r2_score(y_test, y_pred))
# Plot the results
plt.scatter(X_test['feature1'], y_test, color='black', label="Actual Data")
plt.plot(X_test['feature1'], y_pred, color='blue', linewidth=3, label="Predicted Regression Line")
plt.xlabel('Feature 1')
plt.ylabel('Target')
plt.title('Linear Regression Model')
plt.legend()
plt.show()
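
If the relationship between the features and the target is non-linear, a polynomial regression can be fitted with the same workflow. The sketch below reuses the train/test split and imports from the program above; the degree value is an illustrative assumption.

from sklearn.preprocessing import PolynomialFeatures
from sklearn.pipeline import make_pipeline

# Reuses X_train, X_test, y_train, y_test, LinearRegression and r2_score from the program above
poly_model = make_pipeline(PolynomialFeatures(degree=2), LinearRegression())
poly_model.fit(X_train, y_train)
print('Polynomial Regression R² Score: %.2f' % r2_score(y_test, poly_model.predict(X_test)))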

Post Lab Questions:

1. What type of regression model did you implement (Linear, Multiple, Polynomial, Ridge, Lasso, etc.), and
why?
2. What assumptions does your regression model make about the data?
3. How do you interpret the coefficients in your regression model?
4. How did you handle multicollinearity in your model?
5. What is the difference between R-squared and Adjusted R-squared, and how did they help evaluate your
model?
6. What does the p-value of a coefficient indicate in regression analysis?

Previous Lab Questions:

1. What is regression analysis, and why is it used in predictive modeling?


2. What are the differences between Linear Regression, Multiple Regression, and Polynomial Regression?
3. What assumptions are made in Linear Regression?
4. How do you interpret the equation of a regression model (y = mx + b for simple regression)?
5. What is the difference between correlation and regression?
6. How does multicollinearity affect a regression model?

MARKS ALLOCATION

Details                     Marks Allotted   Marks Awarded
Pre Lab Questions                 10
Aim & Procedure                   30
Coding                            30
Execution & Output                20
Post Lab Questions (Viva)         10
Total                            100

Result:
Thus the program for build regression models is executed successfully and output is verified.

EXP NO: 6 BUILD DECISION TREES AND RANDOM FORESTS

DATE:

Aim:
The aim of building decision trees and random forests is to create models that can be
used to predict a target variable based on a set of input features. Decision trees and random
forests are both popular machine learning algorithms for building predictive models.

Algorithm:
Decision Trees.
1. Select the feature that best splits the data: The first step is to select the feature that
best separates the data into groups with different target values.
2. Recursively split the data: For each group created in step 1, repeat the process of
selecting the best feature to split the data until a stopping criterion is met. The stopping
criterion may be a maximum tree depth, a minimum number of samples in a leaf node, or
another condition.
3. Assign a prediction value to each leaf node: Once the tree is built, assign a prediction
value to each leaf node. This value may be the mean or median target value of the samples in
the leaf node.

Random Forest
1. Randomly select a subset of features: Before building each decision tree, randomly
select a subset of features to consider for splitting the data.
2. Build multiple decision trees: Build multiple decision trees using the process
described above, each with a different subset of features.
3. Aggregate the predictions: When making predictions on new data, aggregate the
predictions from all decision trees to obtain a final prediction value. This can be done by
taking the average or majority vote of the predictions.

Program:
import pandas as pd
from sklearn.tree import DecisionTreeRegressor
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error
# Load data
data = pd.read_csv('data.csv')
# Split data into training and testing sets
X = data.drop(['target'], axis=1)
y = data['target']
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Build Decision Tree model
dt = DecisionTreeRegressor()
dt.fit(X_train, y_train)
# Predict on test set
y_pred_dt = dt.predict(X_test)
# Evaluate Decision Tree performance
mse_dt = mean_squared_error(y_test, y_pred_dt)
print(f"Decision Tree Mean Squared Error: {mse_dt:.4f}")
# Build Random Forest model
rf = RandomForestRegressor()
rf.fit(X_train, y_train)
# Predict on test set
y_pred_rf = rf.predict(X_test)
# Evaluate Random Forest performance
mse_rf = mean_squared_error(y_test, y_pred_rf)
print(f"Random Forest Mean Squared Error: {mse_rf:.4f}")

Post Lab Questions:

1. What are the main differences between Decision Trees and Random Forests?
2. How does entropy and Gini impurity affect the splitting process in Decision Trees?
3. How did your Decision Tree model handle categorical and numerical data?
4. What hyperparameters did you tune in your Decision Tree model, and how did they impact performance?
5. How does a Random Forest improve over a single Decision Tree?
6. What are the advantages and disadvantages of using Random Forests?

Previous Lab Questions:


1. What dataset will you use for building your Decision Tree and Random Forest models?
2. How will you handle missing values in your dataset?
3. What Python libraries will you use to implement Decision Trees and Random Forests? (e.g., scikit-
learn, pandas, matplotlib)
4. How will you split your dataset into training and testing sets?
5. What hyperparameters will you tune in your Decision Tree model?
6. How will you evaluate the performance of your model? (e.g., Accuracy, Precision, Recall, F1-score,
AUC-ROC)

MARKS ALLOCATION

Details                     Marks Allotted   Marks Awarded
Pre Lab Questions                 10
Aim & Procedure                   30
Coding                            30
Execution & Output                20
Post Lab Questions (Viva)         10
Total                            100

Result:
Thus the program for decision trees and random forests is executed successfully and the output is verified.

EXP NO: 7 BUILD SVM MODELS

DATE:

Aim:
The aim of this Python code is to demonstrate how to use the scikit-learn library to
train support vector machine (SVM) models for classification tasks.

Algorithm:
1. Load a dataset using the pandas library.
2. Split the dataset into training and testing sets using the train_test_split function from scikit-learn.
3. Train three SVM models with different kernels (linear, polynomial, and RBF) using the SVC class from scikit-learn.
4. Predict the test set labels using the trained models.
5. Evaluate the accuracy of the models using the accuracy_score function from scikit-learn.
6. Print the accuracy of each model.

Program:
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score
# Load the dataset
data = pd.read_csv('data.csv')
# Split the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(data.drop('target', axis=1), data['target'], test_size=0.3,
random_state=42)
# Train an SVM model with a linear kernel
svm_linear = SVC(kernel='linear')
svm_linear.fit(X_train, y_train)
# Predict the test set labels
y_pred_linear = svm_linear.predict(X_test)
# Evaluate the model's accuracy
accuracy_linear = accuracy_score(y_test, y_pred_linear)
print(f'Linear SVM Accuracy: {accuracy_linear:.2f}')
# Train an SVM model with a polynomial kernel
svm_poly = SVC(kernel='poly', degree=3)
svm_poly.fit(X_train, y_train)
# Predict the test set labels
y_pred_poly = svm_poly.predict(X_test)
# Evaluate the model's accuracy
accuracy_poly = accuracy_score(y_test, y_pred_poly)
print(f'Polynomial SVM Accuracy: {accuracy_poly:.2f}')
# Train an SVM model with an RBF kernel
svm_rbf = SVC(kernel='rbf')
svm_rbf.fit(X_train, y_train)
# Predict the test set labels
y_pred_rbf = svm_rbf.predict(X_test)
# Evaluate the model's accuracy
accuracy_rbf = accuracy_score(y_test, y_pred_rbf)
print(f'RBF SVM Accuracy: {accuracy_rbf:.2f}')
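
Since the pre-lab questions mention Grid Search, a minimal hyperparameter-tuning sketch is shown below; it reuses the training split from the program above, and the parameter grid values are illustrative assumptions rather than recommended settings.

from sklearn.model_selection import GridSearchCV

# Illustrative parameter grid; reuses X_train, y_train and SVC from the program above
param_grid = {'C': [0.1, 1, 10], 'gamma': ['scale', 0.1, 1], 'kernel': ['rbf']}
grid = GridSearchCV(SVC(), param_grid, cv=5)
grid.fit(X_train, y_train)
print("Best parameters:", grid.best_params_)
print("Best cross-validation accuracy:", grid.best_score_)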

Post Lab Questions:

1. What kernel function did you use in your SVM model (Linear, Polynomial, RBF, Sigmoid), and why?
2. How does the choice of kernel affect model performance and computational efficiency?
3. What is the significance of support vectors in SVM?
4. How does the C (regularization) parameter influence the decision boundary of an SVM model?
5. What impact does the gamma parameter have in SVM with an RBF kernel?
6. How did you handle imbalanced data while training your SVM model?

Previous Lab Questions:


1. What dataset did you use, and how did you preprocess it before training the SVM model?
2. How did you handle missing values and outliers in your dataset?
3. How did you determine the best hyperparameters for SVM? (e.g., Grid Search, Randomized
Search)
4. What performance metrics did you use to evaluate your SVM model? (Accuracy, Precision, Recall,
F1-score, AUC-ROC)
5. How did you split your dataset into training and testing sets, and what ratio did you use?
6. How did you visualize the decision boundary for your SVM model?

MARKS ALLOCATION

Details                     Marks Allotted   Marks Awarded
Pre Lab Questions                 10
Aim & Procedure                   30
Coding                            30
Execution & Output                20
Post Lab Questions (Viva)         10
Total                            100

Result:
Thus the program for Build SVM Model has been executed successfully and output is
verified.

EXP NO: 8 IMPLEMENT ENSEMBLING TECHNIQUES

DATE:

Aim:
The aim of ensembling is to combine the predictions of multiple individual models,
known as base models, in order to produce a final prediction that is more accurate and
reliable than any individual model (e.g., via voting, bagging, or boosting).

Algorithm:
1. Load the dataset and split it into training and testing sets.
2. Choose the base models to be included in the ensemble.
3. Train each base model on the training set.
4. Combine the predictions of the base models using the chosen ensembling technique (voting, bagging, boosting, etc.).
5. Evaluate the performance of the ensemble model on the testing set.
6. If the performance is satisfactory, deploy the ensemble model for making predictions on new data.

Program:
# Import required libraries
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.svm import SVC
from sklearn.linear_model import LogisticRegression
# Load sample dataset
iris = datasets.load_iris()
# Split dataset into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(iris.data, iris.target, test_size=0.3, random_state=42)
# Build individual models
svc_model = SVC(kernel='linear', probability=True)
rf_model = RandomForestClassifier(n_estimators=10, random_state=42)
lr_model = LogisticRegression(max_iter=200)
# Create ensemble model
ensemble = VotingClassifier(estimators=[
('svc', svc_model),
('rf', rf_model),
('lr', lr_model)
], voting='soft')
# Train ensemble model
ensemble.fit(X_train, y_train)
# Make predictions on test set
y_pred = ensemble.predict(X_test)
# Print ensemble model accuracy
print("Ensemble Accuracy:", ensemble.score(X_test, y_test))

Post Lab Questions:

1. What is ensemble learning, and why is it used in machine learning?


2. What are the main types of ensemble learning techniques? (Bagging, Boosting, Stacking)
3. What is the difference between Bagging and Boosting?
4. How does Bootstrap Aggregating (Bagging) improve model performance?
5. What is the purpose of Random Forest, and how does it use Bagging?

Previous Lab Questions:


1. How did your ensemble model perform relative to individual models (e.g., Decision Trees, SVM)?
2. How did Bagging and Boosting influence the accuracy and robustness of your model?
3. What challenges did you face while implementing ensemble methods, and how did you overcome
them?
4. How did you decide between using Bagging, Boosting, or Stacking for this particular problem?
5. What improvements would you make to your ensemble model if given more time (e.g., tuning
additional hyperparameters, using different base models)?

MARKS ALLOCATION

Details                     Marks Allotted   Marks Awarded
Pre Lab Questions                 10
Aim & Procedure                   30
Coding                            30
Execution & Output                20
Post Lab Questions (Viva)         10
Total                            100

Result:
Thus the program for implementing ensembling techniques is executed successfully and the output is verified.

EXP NO: 9 IMPLEMENT CLUSTERING ALGORITHMS

DATE:

Aim:
The aim of clustering is to find patterns and structure in data that may not be
immediately apparent, and to discover relationships and associations between data points.

Algorithm:
1. Data preparation: The first step is to prepare the data that we want to cluster. This may involve data cleaning, normalization, and feature extraction, depending on the type and quality of the data.
2. Choosing a distance metric: The next step is to choose a distance metric or similarity measure that will be used to determine the similarity between data points. Common distance metrics include Euclidean distance, Manhattan distance, and cosine similarity.
3. Choosing a clustering algorithm: There are many clustering algorithms available, each with its own strengths and weaknesses. Some popular clustering algorithms include K-Means, Hierarchical clustering, and DBSCAN.
4. Choosing the number of clusters: Depending on the clustering algorithm chosen, we may need to specify the number of clusters we want to form. This can be done using domain knowledge or by using techniques such as the elbow method or silhouette analysis (a short sketch of both follows the program below).
5. Cluster assignment: Once the clusters have been formed, we need to assign each data point to its nearest cluster based on the chosen distance metric.
6. Interpretation and evaluation: Finally, we need to interpret and evaluate the results of the clustering algorithm to determine if the clustering has produced meaningful and useful insights.

Program:
# Import required libraries
from sklearn.datasets import make_blobs
from sklearn.cluster import KMeans, AgglomerativeClustering
import matplotlib.pyplot as plt
# Generate a random dataset with 100 samples and 4 clusters
X, y = make_blobs(n_samples=100, centers=4, random_state=42)
# Create a K-Means clustering object with 4 clusters
kmeans = KMeans(n_clusters=4, random_state=42, n_init=10)
# Fit the K-Means model to the dataset
kmeans.fit(X)
# Create a scatter plot of the data colored by K-Means cluster assignment
plt.scatter(X[:, 0], X[:, 1], c=kmeans.labels_, cmap='viridis')
plt.title("K-Means Clustering")
plt.show()
# Create a Hierarchical clustering object with 4 clusters
hierarchical = AgglomerativeClustering(n_clusters=4)
# Fit the Hierarchical model to the dataset
hierarchical.fit(X)
# Create a scatter plot of the data colored by Hierarchical cluster assignment
plt.scatter(X[:, 0], X[:, 1], c=hierarchical.labels_, cmap='viridis')
plt.title("Hierarchical Clustering")
plt.show()
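
Step 4 of the algorithm mentions the elbow method and silhouette analysis for choosing the number of clusters; a short sketch applying both to the same data X is given below (the range of k values is an illustrative assumption).

from sklearn.metrics import silhouette_score

# Inertia (for the elbow method) and silhouette score for several candidate values of k
for k in range(2, 8):
    km = KMeans(n_clusters=k, random_state=42, n_init=10).fit(X)
    print(f"k={k}  inertia={km.inertia_:.1f}  silhouette={silhouette_score(X, km.labels_):.3f}")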

Post Lab Questions:

1. What is the main goal of clustering in machine learning?


2. How does unsupervised learning differ from supervised learning, and how does clustering fit into this
category?
3. What are the differences between K-Means and Hierarchical Clustering in terms of approach and use cases?
4. How does the choice of distance metric (e.g., Euclidean, Manhattan) impact clustering results?
5. What is the curse of dimensionality, and how does it affect clustering algorithms?
6. What are the strengths and weaknesses of the K-Means algorithm?

Previous Lab Questions:

1. How do DBSCAN and K-Means compare in terms of handling clusters of different shapes and
densities?
2. How does silhouette scoring help assess the quality of clusters?
3. How does agglomerative hierarchical clustering differ from divisive hierarchical clustering?
4. What are some real-world applications where clustering is particularly useful?

MARKS ALLOCATION

Details                     Marks Allotted   Marks Awarded
Pre Lab Questions                 10
Aim & Procedure                   30
Coding                            30
Execution & Output                20
Post Lab Questions (Viva)         10
Total                            100

Result:
Thus the program is executed successfully and output is verified.

EXP NO: 10 IMPLEMENT THE EXPECTATION-MAXIMIZATION (EM) ALGORITHM FOR BAYESIAN NETWORKS

DATE:

Aim:
The aim of implementing EM for Bayesian networks is to learn the parameters of the
network from incomplete or noisy data. This involves estimating the conditional probability
distributions (CPDs) for each node in the network given the observed data. The EM algorithm
is particularly useful when some of the variables are hidden or unobserved, as it can estimate
the likelihood of the hidden variables based on the observed data.

Algorithm:
1. Initialize the parameters: Start by initializing the parameters of the Bayesian network, such as the CPDs for each node. These can be initialized randomly or using some prior knowledge.
2. E-step: In the E-step, we estimate the expected sufficient statistics for the unobserved variables in the network, given the observed data and the current parameter estimates. This involves computing the posterior probability distribution over the hidden variables, given the observed data and the current parameter estimates.
3. M-step: In the M-step, we maximize the expected log-likelihood of the observed data with respect to the parameters. This involves updating the parameter estimates using the expected sufficient statistics computed in the E-step.
4. Repeat steps 2 and 3 until convergence: Iterate between the E-step and M-step until the parameter estimates converge, or some other stopping criterion is met.

Program:
# Import required libraries
import numpy as np
import pandas as pd
from pgmpy.models import BayesianNetwork
from pgmpy.estimators import MaximumLikelihoodEstimator
from pgmpy.inference import VariableElimination
from pgmpy.factors.discrete import TabularCPD

# Define the structure of the Bayesian network
model = BayesianNetwork([('C', 'S'), ('D', 'S')])

# Define the conditional probability distributions (CPDs)
cpd_c = TabularCPD(variable='C', variable_card=2, values=[[0.5], [0.5]])
cpd_d = TabularCPD(variable='D', variable_card=2, values=[[0.5], [0.5]])
cpd_s = TabularCPD(variable='S', variable_card=2,
                   values=[[0.8, 0.6, 0.6, 0.2],
                           [0.2, 0.4, 0.4, 0.8]],
                   evidence=['C', 'D'], evidence_card=[2, 2])

# Add the CPDs to the model
model.add_cpds(cpd_c, cpd_d, cpd_s)

# Verify the model
assert model.check_model(), "Model is inconsistent!"

# Generate some random data (5000 samples for the variables C, D and S)
data = pd.DataFrame(np.random.randint(low=0, high=2, size=(5000, 3)),
                    columns=['C', 'D', 'S'])

# Re-estimate the CPDs from the data using Maximum Likelihood Estimation
model.fit(data, estimator=MaximumLikelihoodEstimator)

# Create a VariableElimination object to perform inference
infer = VariableElimination(model)

# Perform inference on some observed evidence
query = infer.query(variables=['S'], evidence={'C': 1})
print(query)
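
The post-lab questions refer to the EM algorithm as used by Gaussian Mixture Models; the sketch below demonstrates that setting with scikit-learn on synthetic data. The data and parameter values are illustrative assumptions, separate from the pgmpy program above.

import numpy as np
from sklearn.mixture import GaussianMixture

# Two synthetic, well-separated clusters (illustrative data only)
rng = np.random.default_rng(42)
X = np.vstack([rng.normal(0, 1, (200, 2)), rng.normal(5, 1, (200, 2))])

# fit() alternates the E-step and M-step until the log-likelihood converges
gmm = GaussianMixture(n_components=2, max_iter=100, random_state=42)
gmm.fit(X)
print("Estimated means:\n", gmm.means_)
print("Converged:", gmm.converged_, "after", gmm.n_iter_, "iterations")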

Post Lab Questions:

1. What is the Expectation-Maximization (EM) algorithm, and what type of problems does it solve?
2. How does the EM algorithm differ from other clustering algorithms like K-Means or DBSCAN?
3. In the context of the EM algorithm, what do the E-step and M-step represent, and how do they work together?
4. What are the assumptions made by the Gaussian Mixture Model (GMM), which is commonly used in
conjunction with EM?
5. How does the likelihood function play a role in the EM algorithm, and how is it optimized?

Previous Lab Questions:


1. What dataset did you use to implement the EM algorithm, and how did you determine that it was
suitable for this method?
2. How did you preprocess the data before applying the EM algorithm?
3. What method did you use to initialize the parameters (e.g., mean, covariance, weight) for the
Gaussian Mixture Model?
4. How did you implement the E-step and M-step in the EM algorithm? What steps did you follow?
5. How did you handle missing data during the application of the EM algorithm?

MARKS ALLOCATION

Details                     Marks Allotted   Marks Awarded
Pre Lab Questions                 10
Aim & Procedure                   30
Coding                            30
Execution & Output                20
Post Lab Questions (Viva)         10
Total                            100

Result:
Thus the program is executed successfully and the output is verified.

EXP NO: 11 BUILD SIMPLE NN MODELS

DATE:

Aim:

The aim of building simple neural network (NN) models is to create a basic architecture that can learn patterns from data and make predictions based on the input. This can involve defining the structure of the NN, selecting appropriate activation functions, and tuning the hyperparameters to optimize the performance of the model.

Algorithm:
1. Data preparation: Preprocess the data to make it suitable for training the NN. This
may involve normalizing the input data, splitting the data into training and validation sets,
and encoding the output variables if necessary.
2. Define the architecture: Choose the number of layers and neurons in the NN, and
define the activation functions for each layer. The input layer should have one neuron per
input feature, and the output layer should have one neuron per output variable.
3. Initialize the weights: Initialize the weights of the NN randomly, using a small value
to avoid saturating the activation functions.
4. Forward propagation: Feed the input data forward through the NN, applying the
activation functions at each layer, and compute the output of the NN.
5. Compute the loss: Calculate the error between the predicted output and the true
output, using a suitable loss function such as mean squared error or cross-entropy.
6. Backward propagation: Compute the gradient of the loss with respect to the weights using the chain rule, and backpropagate the error through the NN to adjust the weights.
7. Update the weights: Adjust the weights using an optimization algorithm such as stochastic gradient descent or Adam, and repeat steps 4-7 for a fixed number of epochs or until the performance on the validation set stops improving (a minimal worked sketch of steps 3-7 follows this list).
8. Evaluate the model: Test the performance of the model on a held-out test set and
report the accuracy or other performance metrics.
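
To make steps 3-7 concrete, the following minimal NumPy sketch trains a tiny one-hidden-layer network on a toy XOR-style dataset using a squared-error loss. The dataset, layer sizes, learning rate, and epoch count are assumptions chosen purely for illustration; the prescribed Program below uses Keras and the MNIST dataset instead.

# Illustrative sketch of manual forward/backward propagation (toy data, assumed values)
import numpy as np

rng = np.random.default_rng(42)

# Toy XOR dataset - an assumption for illustration only
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

# Step 3: initialize the weights with small random values
W1 = rng.normal(scale=0.5, size=(2, 8))
b1 = np.zeros((1, 8))
W2 = rng.normal(scale=0.5, size=(8, 1))
b2 = np.zeros((1, 1))

sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))
lr = 0.5  # learning rate (assumed value)

for epoch in range(5000):
    # Step 4: forward propagation through hidden and output layers
    h = sigmoid(X @ W1 + b1)
    y_hat = sigmoid(h @ W2 + b2)

    # Step 5: compute the squared-error loss
    loss = 0.5 * np.sum((y_hat - y) ** 2)

    # Step 6: backward propagation (chain rule)
    d_out = (y_hat - y) * y_hat * (1 - y_hat)
    d_h = (d_out @ W2.T) * h * (1 - h)

    # Step 7: update the weights with gradient descent
    W2 -= lr * (h.T @ d_out)
    b2 -= lr * d_out.sum(axis=0, keepdims=True)
    W1 -= lr * (X.T @ d_h)
    b1 -= lr * d_h.sum(axis=0, keepdims=True)

print("Final loss:", loss)
print("Predictions:", y_hat.round(3).ravel())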

Program:
# Import required libraries
import tensorflow as tf
from tensorflow import keras

# Load the MNIST dataset
(x_train, y_train), (x_test, y_test) = keras.datasets.mnist.load_data()

# Normalize the input data
x_train = x_train / 255.0
x_test = x_test / 255.0

# Define the model architecture
model = keras.Sequential([
    keras.layers.Flatten(input_shape=(28, 28)),
    keras.layers.Dense(128, activation='relu'),
    keras.layers.Dense(10, activation='softmax')
])

# Compile the model
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

# Train the model
model.fit(x_train, y_train, epochs=10, validation_data=(x_test, y_test))

# Evaluate the model
test_loss, test_acc = model.evaluate(x_test, y_test, verbose=2)
print('Test accuracy:', test_acc)
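
After training, the model can also be used to classify unseen images. The short sketch below is a usage example only; the five-image slice is an arbitrary choice for illustration.

import numpy as np

# Predict class probabilities for the first five test images
probs = model.predict(x_test[:5])
predicted_labels = np.argmax(probs, axis=1)
print("Predicted digits:", predicted_labels)
print("Actual digits:   ", y_test[:5])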

Post Lab Questions:

1. What is the purpose of the hidden layers in a neural network, and how do they contribute to learning complex
patterns?
2. How does backpropagation work, and how is it used to adjust the weights in a neural network?
3. What are the differences between shallow and deep neural networks, and when might you choose one over the
other?
4. How does gradient descent work in the context of neural networks, and what are the differences between
batch, stochastic, and mini-batch gradient descent?
5. What is overfitting, and how does it manifest in a neural network model?

Previous Lab Questions:


1. How did your model perform in terms of generalization to new data, and how do you evaluate
whether a model generalizes well?
2. How did changing the number of hidden layers or neurons affect the training and performance of
your neural network?
3. What challenges did you face in terms of training time or model convergence, and how did you
address them?
4. Did you face issues related to vanishing gradients or exploding gradients during training? If so,
how did you handle them?
5. How did the learning rate impact the speed of convergence during training? What would you do if
your model was taking too long to converge?

MARKS ALLOCATION

Details                        Marks Allotted    Marks Awarded
Pre Lab Questions                    10
Aim & Procedure                      30
Coding                               30
Execution & Output                   20
Post Lab Questions (Viva)            10
Total                               100

Result:
Thus the program is executed successfully and the output is verified.

EXP NO: 12 BUILD DEEP LEARNING NN MODELS

DATE:

Aim:
The aim of building deep learning neural network (NN) models is to create a more complex
architecture that can learn hierarchical representations of data, allowing for more accurate predictions
and better generalization to new data. Deep learning models are typically characterized by having many
layers and a large number of parameters.

Algorithm:
1. Data preparation: Preprocess the data to make it suitable for training the NN. This
may involve normalizing the input data, splitting the data into training and validation sets,
and encoding the output variables if necessary.
2. Define the architecture: Choose the number of layers and neurons in the NN, and
define the activation functions for each layer. Deep learning models typically use activation
functions such as ReLU or variants thereof, and often incorporate dropout or other
regularization techniques to prevent overfitting.
3. Initialize the weights: Initialize the weights of the NN randomly, using a small value
to avoid saturating the activation functions.
4. Forward propagation: Feed the input data forward through the NN, applying the
activation functions at each layer, and compute the output of the NN.
5. Compute the loss: Calculate the error between the predicted output and the true
output, using a suitable loss function such as mean squared error or cross-entropy.
6. Backward propagation: Compute the gradient of the loss with respect to the weights using the chain rule, and backpropagate the error through the NN to adjust the weights.
7. Update the weights: Adjust the weights using an optimization algorithm such as stochastic gradient descent or Adam, and repeat steps 4-7 for a fixed number of epochs or until the performance on the validation set stops improving.
8. Evaluate the model: Test the performance of the model on a held-out test set and
report the accuracy or other performance metrics.
9. Fine-tune the model: If necessary, fine-tune the model by adjusting the hyperparameters or experimenting with different architectures (a sketch of a deeper, regularized model with early stopping follows this list).
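
As a hedged illustration of steps 2 and 9, the sketch below builds a somewhat deeper MNIST model with dropout regularization and stops training early when the validation loss stops improving. The layer sizes, dropout rate, and patience value are assumptions chosen for demonstration; the prescribed Program that follows uses a simpler architecture.

# Illustrative sketch only: deeper architecture with dropout and early stopping
# (layer sizes, dropout rate, and patience are assumed values, not prescribed ones)
from tensorflow import keras

# Load and normalize MNIST, matching the preprocessing used in the Program below
(x_train, y_train), (x_test, y_test) = keras.datasets.mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

deeper_model = keras.Sequential([
    keras.layers.Flatten(input_shape=(28, 28)),
    keras.layers.Dense(256, activation='relu'),
    keras.layers.Dropout(0.3),
    keras.layers.Dense(128, activation='relu'),
    keras.layers.Dropout(0.3),
    keras.layers.Dense(10, activation='softmax')
])

deeper_model.compile(optimizer='adam',
                     loss='sparse_categorical_crossentropy',
                     metrics=['accuracy'])

# Stop training once the validation loss has not improved for 3 epochs
early_stop = keras.callbacks.EarlyStopping(monitor='val_loss', patience=3,
                                           restore_best_weights=True)
deeper_model.fit(x_train, y_train, epochs=50,
                 validation_data=(x_test, y_test),
                 callbacks=[early_stop])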

Program:
# Import required libraries
import tensorflow as tf
from tensorflow import keras

# Load the MNIST dataset
(x_train, y_train), (x_test, y_test) = keras.datasets.mnist.load_data()

# Normalize the input data
x_train = x_train / 255.0
x_test = x_test / 255.0

# Define the model architecture
model = keras.Sequential([
    keras.layers.Flatten(input_shape=(28, 28)),
    keras.layers.Dense(128, activation='relu'),
    keras.layers.Dropout(0.2),
    keras.layers.Dense(10)
])

# Compile the model
model.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])

# Train the model
model.fit(x_train, y_train, epochs=10, validation_data=(x_test, y_test))

# Evaluate the model
test_loss, test_acc = model.evaluate(x_test, y_test, verbose=2)
print('Test accuracy:', test_acc)
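
Because the loss is configured with from_logits=True, the final Dense layer produces raw logits rather than probabilities. A common follow-up, sketched below for illustration only, is to append a Softmax layer at inference time; the five-sample slice is an arbitrary choice.

import numpy as np

# Wrap the trained model with a Softmax layer to obtain class probabilities
probability_model = keras.Sequential([model, keras.layers.Softmax()])
probs = probability_model.predict(x_test[:5])
print("Predicted digits:", np.argmax(probs, axis=1))
print("Actual digits:   ", y_test[:5])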

Post Lab Questions:

1. What distinguishes a deep neural network (DNN) from a shallow neural network?
2. How do deep learning models handle complex data representations compared to traditional machine learning
models?
3. Why are multiple hidden layers beneficial in deep neural networks? How do they contribute to learning hierarchical features?

4. Explain how backpropagation works in deep neural networks and why it is important for updating weights in
multiple layers.

Previous Lab Questions:


1. How did the depth of your model (i.e., the number of hidden layers) impact the training time
and model performance?
2. How did you assess whether your deep learning model was overfitting to the training data?
What strategies did you use to prevent overfitting?
3. What was the impact of using a higher learning rate or a lower learning rate on the
convergence speed and final performance of your model?
4. How did the choice of optimizer (e.g., Adam vs SGD) influence the model’s convergence rate
and overall training efficiency?

MARKS ALLOCATION

Details                        Marks Allotted    Marks Awarded
Pre Lab Questions                    10
Aim & Procedure                      30
Coding                               30
Execution & Output                   20
Post Lab Questions (Viva)            10
Total                               100

Result:
Thus the program is executed successfully and the output is verified.
