Harsh Kumar MLP Assignment 1

Uploaded by

lionnov23

Affiliated to Dr. A. P. J. Abdul Kalam Technical University, Lucknow, Uttar Pradesh

ASSIGNMENT-1

PROGRAM: -MBA (BUSINESS ANALYTICS)

SEMESTER-3

ACADEMIC YEAR: - 2024-2025

SUBJECT: - MACHINE LEARNING USING PYTHON

SUBJECT CODE: - KMBA-352

SUBMITTED BY: - HARSH KUMAR (ROLL NO.: 2302022)

SUBMITTED TO: - DR. SMITA AGARWAL (Professor)

INDEX

S.No. Module Signature

1. Write a Python program to load the iris data from
a given csv file into a data frame and print the
shape of the data, type of the data and first 3 rows.
a) Importing the required libraries
b) Loading the data into the data frame
c) Print the shape of the data
d) Print the datatype of "data"
e) Print the first 3 rows using head()
2. Write a Python program using Scikit-learn to
print the keys, number of rows and columns, feature
names, and the description of the Iris data
a) Print Keys of the data
b) Print the data type of each feature
c) Print the column names, total data, and data
types of the iris data set
d) Print the total number of data for each species.
e) Statistical Exploratory Data Analysis
f) Find the unique values of the species
3. Write a Python program to split the iris dataset
into its attributes (X) and labels (y).
a) Importing the required libraries
b) Loading the data into the data frame
c) Drop the columns that are not required
d) Split the Columns
4. Write a Python program to draw a scatterplot,
then add a joint density estimate to describe
individual distributions on the same plot between
Sepal length and Sepal width.
a) Importing the required libraries
b) Loading the data into data frame
c) Scatter Plot
d) Kernel Density Plots in a Joint Plot
e) Joint Plot with KDE
5. Write a Python program using Scikit-learn to split
the iris dataset into 70% train data and 30% test
data. Out of a total of 150 records, the training set
contains 120 records and the test contains 30
records. Print both datasets.
a) Importing the required libraries
b) Loading the data into the data frame
c) Drop the columns that are not required
d) Split the columns
e) Split arrays or matrices into random train and test
subsets
6. Implement and demonstrate any suitable
algorithm for finding the most specific hypothesis
based on a given set of training data samples.
Read the training data from a .CSV file.
a) Importing the required libraries
b) Loading the data into the data frame
c) Dropping the result column
d) Initializing the hypothesis array with the data in
the first row
e) Iterating the row and replacing the value
7. For a given set of training data examples stored in
a .CSV file, implement and demonstrate the
Candidate-Elimination algorithm to output a
description of the set of all hypotheses consistent
with the training examples.
a) Importing the required libraries
b) Loading the data into the data frame
c) Separate Input data(concept) and the
Output(target)
MODULE 1

Write a Python program to load the iris data from a given csv file into a data
frame and print the shape of the data, type of the data and first 3 rows.
Dataset Source

Dataset: Iris Dataset

Download from:

Github: https://gist.github.com/netj/8836201

Kaggle : https://www.kaggle.com/datasets/arshid/iris-flower-dataset

a) Importing the required libraries

Code:

b) Loading the data into the data frame

Code:

c) Print the shape of the data

Code:

Output:

d) Print the datatype of “data”

Code:

Output:

e) Print the first 3 rows using head()

Code:

Output:
f) Print the last 3 rows using tail()
Code:

Output:
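
Steps (a) to (f) above can be sketched as follows. This is a minimal illustration, not the submitted code: to stay self-contained it reads a five-row stand-in (`SAMPLE_CSV`, a hypothetical name) laid out like the GitHub gist file instead of the downloaded CSV, and the header names are an assumption that should be checked against your copy of the file.

```python
import io

import pandas as pd  # a) import the required library

# Hypothetical stand-in for the downloaded file: a few Iris rows in the layout
# assumed for the GitHub gist CSV (header names are an assumption).
SAMPLE_CSV = '''"sepal.length","sepal.width","petal.length","petal.width","variety"
5.1,3.5,1.4,0.2,"Setosa"
4.9,3.0,1.4,0.2,"Setosa"
4.7,3.2,1.3,0.2,"Setosa"
7.0,3.2,4.7,1.4,"Versicolor"
6.3,3.3,6.0,2.5,"Virginica"
'''

# b) load the data into a data frame
#    (with the real file this would be: data = pd.read_csv("iris.csv"))
data = pd.read_csv(io.StringIO(SAMPLE_CSV))

# c) shape of the data as (rows, columns)
print(data.shape)

# d) type of the object named "data"
print(type(data))

# e) first 3 rows
print(data.head(3))

# f) last 3 rows
print(data.tail(3))
```

With the full dataset the shape would be 150 rows by 5 columns; the stand-in above only has 5 rows.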

MODULE 2
Write a Python program using Scikit-learn to print the keys, number of rows and
columns, feature names, and the description of the Iris data

a) Print keys of the data

Code:

Output:

b) Print the data type of each feature

Code:

Output:

c) Print the column names, total data, and data types of the iris data set
Code:

Output:

d) Print the total number of data for each species

Code:

Output:

e) Statistical Exploratory Data Analysis

Code:

Output:

f) Find the unique values of the species

Code:

Output:
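
Steps (a) to (f) above can be sketched as follows, assuming the copy of Iris bundled with Scikit-learn (`load_iris`) is used. The `species` column name and the conversion to a pandas data frame are choices made for this illustration, not taken from the original.

```python
import pandas as pd
from sklearn.datasets import load_iris

iris = load_iris()

# a) keys of the Bunch object, plus shape, feature names and description
print(iris.keys())
print(iris.data.shape)       # number of rows and columns
print(iris.feature_names)
print(iris.DESCR[:300])      # first part of the dataset description

# Build a data frame so the remaining steps can use pandas
df = pd.DataFrame(iris.data, columns=iris.feature_names)
df["species"] = iris.target_names[iris.target]  # assumed column name

# b) data type of each feature
print(df.dtypes)

# c) column names, non-null counts and data types
df.info()

# d) number of records for each species
print(df["species"].value_counts())

# e) statistical summary of the numeric columns
print(df.describe())

# f) unique values of the species
print(df["species"].unique())
```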
MODULE 3

Write a Python program to split the iris dataset into its attributes (X) and
labels (y).

Dataset: Iris Dataset

Download from:
Github: https://gist.github.com/netj/8836201
Kaggle : https://www.kaggle.com/datasets/arshid/iris-flower-dataset

a) Importing the required libraries

Code:

b) Loading the data into the data frame

Code:

Output:

c) Drop the columns that are not required

Code:

Output:
d) Split the Columns
Code:

Output:

Code:

Output:
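
A minimal sketch of steps (a) to (d), assuming the Scikit-learn copy of Iris is used in place of the downloaded CSV so that the example is self-contained. The `Id` column is an assumption: some CSV versions of the dataset carry such an index column, and `errors="ignore"` makes the drop a no-op when it is absent.

```python
from sklearn.datasets import load_iris

# b) load the data into a data frame (four feature columns plus "target")
df = load_iris(as_frame=True).frame
print(df.head())

# c) drop columns that are not required; "Id" is a hypothetical example
df = df.drop(columns=["Id"], errors="ignore")

# d) split the columns: every column except the last is an attribute,
#    the last column holds the label
X = df.iloc[:, :-1].values
y = df.iloc[:, -1].values

print(X[:5])
print(y[:5])
```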
MODULE 4
Write a Python program to draw a scatterplot, then add a joint density
estimate to describe individual distributions on the same plot between Sepal
length and Sepal width.

Theory: A joint plot shows the relationship between two variables together with the
distribution of each variable on its own (distribution plots). It consists of three separate
plots: the central figure shows the relationship between x and y, that is, their joint
distribution, while the remaining two areas show the marginal distributions along the x-axis
and the y-axis.
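Syntax (recent seaborn releases; the older stat_func and annot_kws arguments are no longer accepted):
1. seaborn.jointplot(data=None, *, x=None, y=None, hue=None, kind='scatter', height=6,
ratio=5, space=0.2, dropna=False, xlim=None, ylim=None, color=None, palette=None,
hue_order=None, hue_norm=None, marginal_ticks=False, joint_kws=None,
marginal_kws=None, **kwargs)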
Parameters
• x, y: The variables plotted on the x-axis and y-axis.
• data: The input dataset.
• kind: The kind of plot to draw, for example 'scatter', 'kde', 'hist', 'hex', 'reg' or 'resid'.
• color: The color used for the plot elements.
• space: The space between the joint axes and the marginal axes.
• xlim, ylim: The limits of the x-axis and y-axis.
A joint plot consists of three separate plots:
• The central plot is the bivariate graph, showing how the dependent variable (Y) varies
with the independent variable (X). It shows the relationship between the two variables
and the strength of that relationship.
• The plot placed horizontally at the top of the bivariate graph shows the univariate
distribution of X.
• The plot placed vertically on the right shows the univariate distribution of Y.
A univariate plot focuses on one variable, describing, summarising and showing any patterns
in it. By default, the jointplot() function in the Seaborn library draws a scatter plot with
two histograms on the top and right margins of the graph.

When the parameter 'kind' is set to 'kde', the joint plot displays a bivariate density estimate
on the main plot and univariate density curves on the margins.

Plot overlay
plot_joint(): This method is used to customize the joint plot (the central area where two
variables are plotted). By default, a jointplot creates a scatter plot in the joint area, but
.plot_joint() allows you to overlay a different type of plot, such as a KDE plot, histogram, or
regression line.
a) Importing the required libraries
Code:

b) Loading the data into the data frame

Code:

Output:

c) Scatter Plot
Code:

Output:

d) Kernel Density Plots in a Joint Plot

Code:
Output:

e) Joint Plot with KDE

Code:

Output:

MODULE 5

Write a Python program using Scikit-learn to split the iris dataset into 70%
train data and 30% test data. Out of a total of 150 records, the training set
contains 120 records and the test contains 30 records. Print both datasets.
Dataset: Iris Dataset

Download from :
Github: https://gist.github.com/netj/8836201
Kaggle : https://www.kaggle.com/datasets/arshid/iris-flower-dataset

a) Importing the required libraries

Code:

b) Loading the data into the data frame

Code:

Output:

c) Drop the columns that are not required

Code:

Output:

d) Split the columns

Code:
Output:

e) Split arrays or matrices into random train and test subsets

Code:

Output:
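
A minimal sketch of steps (a) to (e), assuming Scikit-learn's bundled Iris copy and an arbitrary `random_state` chosen here for reproducibility. Note that a 70%/30% split of 150 records gives 105 training and 45 test records; the 120/30 split mentioned in the task would correspond to `test_size=0.2`.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

# b) load the data into a data frame
df = load_iris(as_frame=True).frame

# c) drop columns that are not required ("Id" is a hypothetical example)
df = df.drop(columns=["Id"], errors="ignore")

# d) split the columns into attributes (X) and labels (y)
X = df.drop(columns=["target"])
y = df["target"]

# e) split into random train and test subsets (70% / 30%)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)

print(X_train.shape, X_test.shape)
print(X_train.head())
print(X_test.head())
```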
MODULE 6

Implement and demonstrate any suitable algorithm for finding the most
specific hypothesis based on a given set of training data samples. Read the
training data from a .CSV file.
Theory: Find-S Algorithm

Algorithm:

• Initialize h to the most specific hypothesis in H

• For each positive training instance x

  • For each attribute constraint a_i in h

    • If the constraint a_i is satisfied by x, then do nothing
    • Else replace a_i in h by the next more general constraint that is satisfied by x

• Output hypothesis h

a) Importing the required libraries

Code:
b) Loading the data into the data frame
Code:

Output:

c) Dropping the result column

Code:

Output:

d) Initializing the hypothesis array with the data in the first row
Code:

Output:
e) Iterating the row and replacing the value
Code:

Output:

Code:

Output:

Code:

Output:

Code:

Output:
Code:

Output:
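
The Find-S steps above can be sketched as follows. The training data is an assumption: the original file is not shown, so the sketch uses the four-row EnjoySport example from Mitchell's textbook, embedded as text with hypothetical column names. The function `find_s` is likewise a name chosen for this illustration.

```python
import io

import pandas as pd

# Assumed training data (EnjoySport); with a real file use pd.read_csv("<file>.csv")
CSV_TEXT = """sky,air_temp,humidity,wind,water,forecast,enjoy_sport
Sunny,Warm,Normal,Strong,Warm,Same,Yes
Sunny,Warm,High,Strong,Warm,Same,Yes
Rainy,Cold,High,Strong,Warm,Change,No
Sunny,Warm,High,Strong,Cool,Change,Yes
"""


def find_s(df, target_col, positive_label="Yes"):
    """Return the most specific hypothesis consistent with the positive rows."""
    # c) drop the result column and keep only the positive examples
    attributes = df.drop(columns=[target_col])
    positives = attributes[df[target_col] == positive_label]
    if positives.empty:
        return None
    # d) initialise the hypothesis with the first positive row
    h = positives.iloc[0].tolist()
    # e) generalise every attribute that disagrees with a later positive row
    for _, row in positives.iloc[1:].iterrows():
        for i, value in enumerate(row):
            if h[i] != value:
                h[i] = "?"
    return h


data = pd.read_csv(io.StringIO(CSV_TEXT))
hypothesis = find_s(data, "enjoy_sport")
print(hypothesis)
```

Negative examples are ignored by Find-S, which is why the third row has no effect on the result.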

MODULE 7

For a given set of training data examples stored in a .CSV file, implement and
demonstrate the Candidate-Elimination algorithm to output a description of
the set of all hypotheses consistent with the training examples.

Theory: Terms used

• Concept learning: The learning task in which the machine infers a general concept
from labelled training data.
• General Hypothesis: Places no constraint on any feature.
• G = {'?', '?', '?', '?', ...}: One '?' per attribute.
• Specific Hypothesis: Constrains the features to specific values.
• S = {'ϕ', 'ϕ', 'ϕ', ...}: One 'ϕ' per attribute.
• Version Space: The set of hypotheses lying between the general hypothesis and the
specific hypothesis. It is not a single hypothesis but the set of all hypotheses
consistent with the training data set.

Algorithm

• Step1: Load the data set

• Step2: Initialize the General Hypothesis and the Specific Hypothesis.
• Step3: For each training example
• Step4: If the example is a positive example
- if attribute_value == hypothesis_value: do nothing
- else: replace the attribute value with '?' (that is, generalize it)
• Step5: If the example is a negative example
- make the general hypothesis more specific

a) Importing the required libraries

Code:

b) Loading the data into the data frame

Code:
Output:

c) Separate Input data (concept) and the Output(target)

Code:

Output:

d) Candidate Elimination Algorithm

Code:
Output:
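
Steps (a) to (d) can be sketched as follows. This is a simplified variant of Candidate-Elimination of the kind often used in lab exercises, not the full algorithm: it assumes conjunctive hypotheses, assumes the first training example is positive, and keeps one candidate general hypothesis per attribute. The training data (Mitchell's EnjoySport example), the column names and the function name `candidate_elimination` are assumptions made for this illustration.

```python
import io

import pandas as pd

# Assumed training data (EnjoySport); with a real file use pd.read_csv("<file>.csv")
CSV_TEXT = """sky,air_temp,humidity,wind,water,forecast,enjoy_sport
Sunny,Warm,Normal,Strong,Warm,Same,Yes
Sunny,Warm,High,Strong,Warm,Same,Yes
Rainy,Cold,High,Strong,Warm,Change,No
Sunny,Warm,High,Strong,Cool,Change,Yes
"""


def candidate_elimination(concepts, target, positive_label="Yes"):
    """Return (specific_h, general_h) for a list of examples and their labels."""
    n = len(concepts[0])
    specific_h = list(concepts[0])  # assumes the first example is positive
    general_h = [["?"] * n for _ in range(n)]
    for example, label in zip(concepts, target):
        if label == positive_label:
            # Positive example: generalise S where it disagrees, relax G to match
            for i in range(n):
                if example[i] != specific_h[i]:
                    specific_h[i] = "?"
                    general_h[i][i] = "?"
        else:
            # Negative example: specialise G on attributes that exclude the example
            for i in range(n):
                if example[i] != specific_h[i]:
                    general_h[i][i] = specific_h[i]
                else:
                    general_h[i][i] = "?"
    # Drop candidates that ended up fully general
    general_h = [g for g in general_h if any(v != "?" for v in g)]
    return specific_h, general_h


# b) load the data, c) separate the input data (concept) and the output (target)
data = pd.read_csv(io.StringIO(CSV_TEXT))
concepts = data.iloc[:, :-1].values.tolist()
target = data.iloc[:, -1].tolist()

# d) run the algorithm
specific_h, general_h = candidate_elimination(concepts, target)
print("Specific boundary:", specific_h)
print("General boundary:", general_h)
```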
