Data Mining Regression and Classification
Overview
In this assignment, you will revisit the pre-processed “Restaurant Orders” dataset from
Homework #1 and apply regression techniques to uncover relationships between different
variables (features) in the dataset. The assignment is divided into five main parts:
1. Single-Feature Linear Regression
2. Multiple-Feature Linear Regression
3. Polynomial Regression
4. Binary Classification
5. Multi-Class Classification
The dataset contains the following columns: TableNumber, WaiterID, OrderDateTime, ItemsOrdered, NumberOfGuests, BillAmount, PaymentMethod, DiscountUsed, WaitTime, Tip, and CustomerSatisfaction.
1. Single-Feature Linear Regression
Objective
• Select one dependent variable (output) and one independent variable (feature) from the
restaurant dataset.
• Train a simple linear regression model to predict the output from the single chosen feature.
Deliverables
• A short write-up explaining your training procedure, final parameters, final loss, and
observations.
Homework: Description
1. Single-Feature Linear Regression
Tasks/Steps
Data Selection:
Justify which single feature and which output you chose.
Implementation in PyTorch:
Initialize model parameters.
Forward pass and loss function (MSE).
Optimization algorithm (gradient descent).
Visualization:
Plot the data points (scatter plot).
Plot the best-fit line learned by your model on the same figure.
Analysis:
Summarize the training process.
Discuss any difficulties or anomalies you observed when fitting the line.
Interpret how well the linear model fits the data visually and numerically (final loss, etc.).
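The steps above can be sketched as follows in plain PyTorch. The data here is synthetic, standing in for one chosen feature and output (e.g., NumberOfGuests predicting BillAmount, purely as an assumed pairing); in your submission you would load the real columns from the CSV instead.

```python
import torch

torch.manual_seed(0)
# Synthetic stand-in data: one feature, one noisy linear output.
x = torch.rand(100, 1) * 10               # feature values in [0, 10)
y = 3.0 * x + 5.0 + torch.randn(100, 1)   # noisy linear target

# Initialize model parameters with gradients enabled.
w = torch.zeros(1, requires_grad=True)
b = torch.zeros(1, requires_grad=True)

lr = 0.01
for epoch in range(500):
    y_pred = x * w + b                    # forward pass
    loss = ((y_pred - y) ** 2).mean()     # MSE loss
    loss.backward()                       # compute gradients
    with torch.no_grad():                 # manual gradient-descent step
        w -= lr * w.grad
        b -= lr * b.grad
        w.grad.zero_()
        b.grad.zero_()

print(f"w={w.item():.2f}, b={b.item():.2f}, final loss={loss.item():.4f}")
```

For the required scatter plot, you can pass `x`, `y`, and the learned line `x * w + b` (detached) to matplotlib on the same figure.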
2. Multiple-Feature Linear Regression
Objective
• Select 3 features and 1 output from the restaurant dataset (may be the same output
variable as in Part 1 or a different one).
• Train a simple linear regression model to predict the output from the chosen features.
Deliverables
• If it is an achievable task, the deliverables should be similar to those for the Single-Feature
Linear Regression part; follow the tasks on the next page.
• If you think it is not an achievable task, provide an analysis explaining why, and ignore
the tasks on the next page.
2. Multiple-Feature Linear Regression
Tasks/Steps
Data Selection:
Justify which three features and which output you chose, and provide a brief rationale.
Implementation in PyTorch:
Build a multi-feature linear model.
Train and optimize (MSE; gradient descent).
Results:
Show the final loss (training error; MSE).
If the model successfully converges, report your final set of learned weights and bias; if the
model fails to converge or you encounter difficulties, analyze and explain potential reasons.
Interpretation:
Discuss whether the multi-feature regression model appears to be a better fit than a single-
feature model.
Reflect on any new challenges that arose when using multiple features.
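A minimal sketch of the multi-feature case, again on synthetic stand-in data (the choice of three features is an assumption; substitute your real columns). It uses PyTorch's own `nn.Linear` and `SGD`, which are allowed since the forward pass and gradient updates still run through PyTorch; you could equally keep the fully manual update style from Part 1.

```python
import torch

torch.manual_seed(0)
# Synthetic stand-in for 3 features predicting one output.
X = torch.rand(200, 3)
true_w = torch.tensor([[2.0], [-1.0], [0.5]])
y = X @ true_w + 0.3 + 0.05 * torch.randn(200, 1)

model = torch.nn.Linear(3, 1)             # multi-feature linear model
opt = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = torch.nn.MSELoss()

for epoch in range(1000):
    opt.zero_grad()
    loss = loss_fn(model(X), y)           # forward pass + MSE
    loss.backward()                       # gradients via autograd
    opt.step()                            # gradient-descent update

print("weights:", model.weight.data, "bias:", model.bias.data)
print(f"final MSE: {loss.item():.4f}")
```

Note that if your real features live on very different scales (e.g., a wait time in minutes vs. a bill in dollars), standardizing them first usually makes gradient descent converge far more reliably.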
3. Polynomial Regression
Objective
• Using the same 3 features and 1 output from Part 2, implement polynomial regression of at
least three different polynomial degrees (e.g., degree=2, degree=4, degree=6).
• Train a polynomial regression model to predict the output from the chosen features.
Deliverables
• A summary table or short discussion comparing performance for each chosen polynomial
degree.
• Plots or numeric results illustrating how well each polynomial model fits.
• A reflection on potential risks of higher-degree polynomials (e.g., overfitting).
3. Polynomial Regression
Tasks/Steps
Feature Transformation:
Explain how you generated polynomial terms (e.g., by manually expanding each feature or
using a PyTorch mechanism for polynomial features).
Decide how you handle interactions (only single-feature powers vs. cross-terms).
Training & Model Comparison:
Train a polynomial regression model for each degree (≥ 3 degrees).
Compare the training losses across different degrees.
Analysis:
Discuss any overfitting or underfitting you observe.
Identify which polynomial degree produced the most favorable result based on loss or
other metrics.
Provide any insights into runtime or complexity differences.
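One way to sketch the feature transformation and the degree comparison, assuming the simpler choice of single-feature powers without cross-terms (handling interactions is left as your design decision). The data and target are synthetic stand-ins:

```python
import torch

def poly_expand(X, degree):
    """Expand each feature column to powers 1..degree (no cross-terms)."""
    return torch.cat([X ** d for d in range(1, degree + 1)], dim=1)

torch.manual_seed(0)
X = torch.rand(50, 3)                      # 3 synthetic features in [0, 1)
y = X[:, :1] ** 2 + 0.1 * torch.randn(50, 1)

for degree in (2, 4, 6):
    Xp = poly_expand(X, degree)            # shape: (50, 3 * degree)
    model = torch.nn.Linear(Xp.shape[1], 1)
    opt = torch.optim.SGD(model.parameters(), lr=0.05)
    for _ in range(2000):
        opt.zero_grad()
        loss = torch.nn.functional.mse_loss(model(Xp), y)
        loss.backward()
        opt.step()
    print(f"degree={degree}: final training MSE = {loss.item():.4f}")
```

With cross-terms the feature count grows much faster, which is one concrete source of the runtime and overfitting differences the analysis asks about.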
4. Binary Classification
Tasks/Steps
Choose a Binary Label:
Construct a binary classification label from the dataset, e.g., “Satisfied” vs. “Unsatisfied,”
or any other appropriate yes/no outcome such as “Satisfied” vs. “Not Satisfied
(Unsatisfied + Neutral).”
Implement Logistic Regression in PyTorch:
Loss function: Binary Cross-Entropy (BCE).
Data Splitting & Preprocessing:
Clearly split your data into training and testing (or validation) sets.
Model Training & Evaluation:
Train on the training set for a certain number of epochs or until convergence.
Report & Visualization
Summarize final training loss, test performance metrics, and any interesting findings.
(Optional) Provide a decision boundary plot if feasible (for a single or two-feature
scenario), or a confusion matrix heatmap to illustrate predictions vs. ground truth.
5. Multi-Class Classification
Tasks/Steps
Label Selection:
Identify a multi-class label from your dataset (≥ 3 classes); if your data does not inherently have three
or more distinct classes, you can derive one.
Strategy:
Implement one-vs-all (OvA) and one-vs-one (OvO) logistic regression.
OvA: Train a separate logistic regression classifier for each class vs. “all others.”
OvO: Train pairwise classifiers for each possible pair of classes.
Implementation Details:
In both cases, the core idea is to handle the multi-class scenario manually within PyTorch, rather
than relying on built-in high-level methods.
Training & Evaluation
Train each classifier on the training data.
On the test set, produce predictions by combining the outputs of your sub-classifiers (OvA and OvO
logic).
Analysis & Discussion
How does OvA compare to OvO (in terms of code complexity, training time, or performance)?
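The OvA strategy can be sketched as below on synthetic stand-in data with an assumed 3-class label (e.g., a binned CustomerSatisfaction): one binary logistic classifier per class, combined at prediction time by taking the most confident one. OvO is analogous, but trains one classifier per class pair and combines them by voting.

```python
import torch

torch.manual_seed(0)
# Synthetic stand-in: 300 points in 3 Gaussian clusters (3 classes).
centers = torch.tensor([[2.0, 0.0], [-2.0, 0.0], [0.0, 2.5]])
y = torch.randint(0, 3, (300,))
X = torch.randn(300, 2) + centers[y]

# One-vs-all: one binary logistic classifier per class.
classifiers = []
for c in range(3):
    clf = torch.nn.Linear(2, 1)
    opt = torch.optim.SGD(clf.parameters(), lr=0.1)
    target = (y == c).float().unsqueeze(1)    # class c vs. "all others"
    for _ in range(300):
        opt.zero_grad()
        loss = torch.nn.functional.binary_cross_entropy_with_logits(
            clf(X), target)
        loss.backward()
        opt.step()
    classifiers.append(clf)

# Combine sub-classifiers: pick the class with the highest logit.
with torch.no_grad():
    scores = torch.cat([clf(X) for clf in classifiers], dim=1)
    preds = scores.argmax(dim=1)
print("training accuracy:", (preds == y).float().mean().item())
```

For k classes, OvA needs k classifiers while OvO needs k(k-1)/2, which is one concrete axis for the code-complexity and training-time comparison the analysis asks for.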
Implementation Requirements
1. PyTorch Only
• You must implement the regression logic (forward pass, gradient updates, etc.) in
PyTorch.
• Do not use scikit-learn or other high-level ML libraries to handle the model training.
2. Data Handling
• You may use pandas or plain Python to load the dataset from CSV or other formats.
• Feel free to do any necessary feature engineering or transformations to handle missing
values, scaling, etc.
3. Plots & Visualization
• matplotlib or seaborn is recommended for plotting.
• Clearly label axes, legend, and titles for each figure.
4. Written Report
• Provide your observations, interpretations, and analysis for each part.
• Discuss any difficulties or additional experiments you performed.
Report & Grading
1. Organization (20%)
• Is your submission clearly structured? Are code, plots, and analysis sections logically
presented?
2. Correctness & Implementation (40%)
• Proper usage of PyTorch for linear and polynomial regression.
• Evidence of correct gradient-based training for each part.
3. Analysis & Interpretation (40%)
• Clarity in explaining results, including final losses, potential reasons for success or
failure.
• Depth of insight into overfitting, data distribution, or hyperparameter choices.
4. Extra Credit / Deep Thinking (up to +10%)
• If your report is well-organized, provides deeper insights or additional experiments
(e.g., trying different regularization, comparing different subsets of features,
exploring other polynomial expansions), you may receive extra points.