PRACTICAL FILE
Department of CSE(AI&ML)
ARTIFICIAL PROGRAMMING LAB - III
NAME: Piyush Mudgal
Joyal Biju
BRANCH: CSE(AIML)
SEM :8th
ROLL NO: 24248
24939
CERTIFICATE
Certified that this Practical entitled “Artificial Programming Lab (LC-AI-444G)” submitted by
Piyush Mudgal and Joyal Biju (Roll Nos. 24248 and 24939), students of the Computer Science
Engineering (Artificial Intelligence & Machine Learning) Department, Dronacharya College of
Engineering, Gurgaon, in partial fulfillment of the requirement for the award of the Bachelor of
Technology (Computer Science Engineering - Artificial Intelligence & Machine Learning) degree of
Maharshi Dayanand University, Rohtak, is a record of the students' own study carried out under my
supervision and guidance.
Prof Praveen Kumari, Assistant Professor, CSE (AI&ML) Department
Dr Ritu Pahwa, HOD, CSE (AI&ML) Department
INDEX
S.No. Experiment Signature
1. W.A.P on Python using Real World Weather Dataset. Show its
implementation.
2. W.A.P on Python using Real World Covid-19 Dataset. Show its
implementation.
3. W.A.P on Python using Real World Census Dataset. Show its
implementation.
4. W.A.P on Python using Real World London Housing Dataset.
Show its implementation.
5. W.A.P on Python using Real World Udemy Courses Dataset.
Show its implementation.
6. W.A.P on Python using Real World Netflix Dataset. Show its
implementation.
7. W.A.P on Python using Real World Cars Dataset. Show its
implementation.
8. W.A.P using brainsize and weight. Implement it
9. W.A.P Naive Bayes for SMS spam classification.
10. W.A.P to load the breast cancer dataset and use SVM to predict
whether a cancer is benign or malignant. Historical data about
patients diagnosed with cancer enables doctors to differentiate
malignant cases from benign ones, given independent attributes.
EXPERIMENT – 1
Aim:W.A.P on Python using Real World Weather Dataset. Show its implementation.
Program:
EXPERIMENT – 2
Aim :W.A.P on Python using Real World Covid-19 Dataset. Show its implementation.
Program :
EXPERIMENT – 3
Aim :W.A.P on Python using Real World Census Dataset. Show its implementation.
Program :
EXPERIMENT – 4
Aim :W.A.P on Python using Real World London Housing Dataset. Show its
implementation.
Program :
EXPERIMENT – 5
Aim :W.A.P on Python using Real World Udemy Courses Dataset. Show its
implementation.
Program :
EXPERIMENT – 6
Aim: W.A.P on Python using Real World Netflix Dataset. Show its implementation.
Program :
EXPERIMENT – 7
Aim :W.A.P on Python using Real World Cars Dataset. Show its implementation.
Program :
EXPERIMENT – 8
Aim :W.A.P using brainsize and weight. Implement it
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
import matplotlib.pyplot as plt
# Load the dataset
data = pd.read_csv('brain_size.csv') # Replace 'brain_size.csv' with your dataset file path
print("Sample of the dataset:")
print(data.head())
# Selecting feature and target
X = data[['Head Size(cm^3)']] # Feature
y = data['Brain Weight(grams)'] # Target
# Splitting the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Creating and training the model
model = LinearRegression()
model.fit(X_train, y_train)
# Making predictions
y_pred = model.predict(X_test)
# Plotting predictions against actual values
plt.scatter(X_test, y_test, color='blue', label='Actual')
plt.plot(X_test, y_pred, color='red', label='Predicted')
plt.xlabel('Head Size (cm^3)')
plt.ylabel('Brain Weight (grams)')
plt.title('Brain Size Prediction using Linear Regression')
plt.legend()
plt.show()
# Calculating the coefficient of determination (R^2 score)
r_squared = model.score(X_test, y_test)
print("R^2 Score:", r_squared)
Output
Sample of the dataset:
   Gender  Age Range  Head Size(cm^3)  Brain Weight(grams)
0       1          1             4512                 1530
1       1          1             3738                 1297
2       1          1             4261                 1335
3       1          1             3777                 1282
4       1          1             4177                 1590
R^2 Score: 0.6295952261276744
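The R^2 score reported above can be read as one minus the ratio of the residual sum of squares to the total sum of squares. The sketch below is only an illustration of that formula in plain Python: the helper names fit_line and r_squared are hypothetical, and it fits and scores on the five sample rows printed above rather than on the held-out test split used by the program, so its value is not expected to match the reported score.

```python
def fit_line(xs, ys):
    """Ordinary least squares for a single feature: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def r_squared(xs, ys, slope, intercept):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


# The five sample rows shown in the dataset preview (illustration only).
head_size = [4512, 3738, 4261, 3777, 4177]
brain_weight = [1530, 1297, 1335, 1282, 1590]

slope, intercept = fit_line(head_size, brain_weight)
r2 = r_squared(head_size, brain_weight, slope, intercept)
print("slope:", slope)
print("intercept:", intercept)
print("R^2 on the five sample rows:", r2)
```

A value close to 1 means the fitted line explains most of the variation in brain weight, while a value near 0 means it explains little more than predicting the mean would.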
EXPERIMENT – 9
Aim :W.A.P Naive Bayes for SMS spam classification.
# Import necessary libraries
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.metrics import accuracy_score, classification_report
# Load the dataset
df = pd.read_csv('spam.csv', encoding='latin-1')
# Drop unnecessary columns and rename columns
df = df[['v1', 'v2']]
df.columns = ['label', 'text']
# Split the dataset into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(df['text'], df['label'], test_size=0.2, random_state=42)
# Vectorize the text data
vectorizer = CountVectorizer()
X_train_vectorized = vectorizer.fit_transform(X_train)
X_test_vectorized = vectorizer.transform(X_test)
# Train the Naive Bayes classifier
nb_classifier = MultinomialNB()
nb_classifier.fit(X_train_vectorized, y_train)
# Make predictions on the testing set
y_pred = nb_classifier.predict(X_test_vectorized)
# Evaluate the classifier
accuracy = accuracy_score(y_test, y_pred)
print("Accuracy:", accuracy)
report = classification_report(y_test, y_pred)
print("Classification Report:")
print(report)
Make sure to replace 'spam.csv' with the path to your dataset file. This code assumes that your
dataset file contains two columns: one for the label (spam or ham) and one for the text of the SMS
messages. The dataset should be in CSV format.
This code performs the following steps:
1. Loads the dataset.
2. Preprocesses the dataset by renaming columns.
3. Splits the dataset into training and testing sets.
4. Vectorizes the text data using CountVectorizer.
5. Trains a Multinomial Naive Bayes classifier.
6. Makes predictions on the testing set.
7. Evaluates the classifier's performance using accuracy and a classification report.
Ensure that you have the necessary libraries installed (pandas, scikit-learn) to run this code.
Additionally, you may need to preprocess your dataset further based on its format and content before
using it with this code.
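To make steps 4 to 6 more concrete, the following is a minimal plain-Python sketch of what word counting followed by Multinomial Naive Bayes with add-one (Laplace) smoothing computes. It is not the scikit-learn implementation: the whitespace tokenizer, the function names train_nb and predict_nb, and the four toy messages are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Simplified tokenizer (assumption): lowercase and split on whitespace.
    return text.lower().split()


def train_nb(messages, labels, alpha=1.0):
    """Count documents per class and word occurrences per class."""
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in zip(messages, labels):
        tokens = tokenize(text)
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_counts, word_counts, vocab, alpha


def predict_nb(model, text):
    """Return the class with the highest log prior plus summed log likelihoods."""
    class_counts, word_counts, vocab, alpha = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in class_counts.items():
        score = math.log(n_docs / total_docs)  # log prior of the class
        total_words = sum(word_counts[label].values())
        for token in tokenize(text):
            if token not in vocab:
                continue  # words never seen in training are ignored
            # Smoothed likelihood of the word given the class.
            likelihood = (word_counts[label][token] + alpha) / (
                total_words + alpha * len(vocab)
            )
            score += math.log(likelihood)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy messages, not taken from spam.csv.
messages = [
    "win a free prize now",
    "free entry win cash",
    "are we meeting for lunch",
    "call me when you get home",
]
labels = ["spam", "spam", "ham", "ham"]

model = train_nb(messages, labels)
print(predict_nb(model, "free cash prize"))
print(predict_nb(model, "meeting for lunch"))
```

Working in log space avoids numerical underflow when many small probabilities are multiplied, and the smoothing term keeps a word that never appeared in one class from forcing that class's probability to zero.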
Results:
Accuracy: 0.9874439461883409
Classification Report:
              precision    recall  f1-score   support
         ham       0.99      1.00      0.99       966
        spam       0.98      0.93      0.96       149
    accuracy                           0.99      1115
   macro avg       0.98      0.96      0.98      1115
weighted avg       0.99      0.99      0.99      1115
The accuracy of the Naive Bayes classifier on the testing set is approximately 98.74%. This
means that the model predicted the correct class (spam or ham) for that share of the 1115
SMS messages in the testing set.
The classification report provides detailed metrics for each class (ham and spam), including
precision, recall, and F1-score. These metrics give insights into the classifier's performance
for each class.
For the 'ham' class (non-spam messages), the precision, recall, and F1-score are all very high,
indicating that the classifier performs well in identifying non-spam messages.
For the 'spam' class, the precision (0.98) is slightly lower than for the 'ham' class but still high.
The recall (0.93) and F1-score (0.96) are lower: the classifier misses about 7% of the spam
messages in the testing set, so it is slightly less effective at identifying spam than non-spam.
Overall, the classifier demonstrates excellent performance, with high accuracy and strong metrics for
both classes.
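The precision, recall, and F1-score discussed above follow directly from counts of true positives, false positives, and false negatives. The sketch below shows those formulas in plain Python; the function name binary_metrics and the eight example labels are hypothetical and are not the predictions from the experiment.

```python
def binary_metrics(y_true, y_pred, positive="spam"):
    """Compute accuracy, precision, recall, and F1 for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    correct = sum(t == p for t, p in pairs)

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if precision + recall
        else 0.0
    )
    return {
        "accuracy": correct / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical labels for illustration: one missed spam, one false alarm.
y_true = ["spam", "spam", "spam", "ham", "ham", "ham", "ham", "ham"]
y_pred = ["spam", "spam", "ham", "ham", "ham", "ham", "ham", "spam"]

metrics = binary_metrics(y_true, y_pred)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

Because ham messages heavily outnumber spam messages in the testing set, accuracy alone can look high even when spam recall is weaker, which is why the per-class metrics are worth reading alongside it.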
EXPERIMENT – 10
Aim: W.A.P to load the breast cancer dataset and use SVM to predict whether a cancer
is benign or malignant. Historical data about patients diagnosed with cancer enables
doctors to differentiate malignant cases from benign ones, given independent
attributes.
SVM implementation in Python
Steps
Load the breast cancer dataset from sklearn.datasets.
Separate the input features and the target variable.
Build and train the SVM classifier using the RBF kernel.
Plot the decision boundary.
Plot the scatter plot of the input features.
# Load the important packages
from sklearn.datasets import load_breast_cancer
import matplotlib.pyplot as plt
from sklearn.inspection import DecisionBoundaryDisplay
from sklearn.svm import SVC
# Load the dataset
cancer = load_breast_cancer()
X = cancer.data[:, :2]  # Keep only the first two features so the boundary can be drawn in 2D
y = cancer.target
# Build the model
svm = SVC(kernel="rbf", gamma=0.5, C=1.0)
# Train the model
svm.fit(X, y)
# Plot the decision boundary
DecisionBoundaryDisplay.from_estimator(
    svm,
    X,
    response_method="predict",
    cmap=plt.cm.Spectral,
    alpha=0.8,
    xlabel=cancer.feature_names[0],
    ylabel=cancer.feature_names[1],
)
# Scatter plot of the input features, colored by class
plt.scatter(X[:, 0], X[:, 1],
            c=y,
            s=20, edgecolors="k")
plt.show()
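The gamma argument passed to SVC controls how quickly the RBF kernel's similarity between two points decays with their distance. As a rough illustration, the sketch below evaluates the standard RBF formula exp(-gamma * ||a - b||^2) in plain Python with gamma=0.5 as in the program; the function name rbf_kernel and the sample points are hypothetical.

```python
import math


def rbf_kernel(a, b, gamma=0.5):
    """RBF similarity exp(-gamma * squared Euclidean distance) between two points."""
    sq_dist = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-gamma * sq_dist)


# Hypothetical 2D points, standing in for (feature 0, feature 1) pairs.
origin = [0.0, 0.0]
near = [1.0, 1.0]
far = [3.0, 3.0]

print("identical points:", rbf_kernel(origin, origin))
print("near point:", rbf_kernel(origin, near))
print("far point:", rbf_kernel(origin, far))
```

Since the similarity falls off exponentially with squared distance, points only a few units apart already contribute almost nothing to each other, so a larger gamma tends to give a more tightly curved decision boundary and a smaller gamma a smoother one.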