Auto ML Tool For Supervised Machine Learning Data

Uploaded by

dataprodcs

ABSTRACT

Auto-ML aims to bridge the different levels of expertise involved in building
machine learning pipelines or systems and to complete data science processes more
quickly. We present a general-purpose Auto-ML tool that works on cleaned datasets
and runs the ready-made algorithms provided by sklearn against regression and
classification datasets. We also use open-source Auto-ML libraries such as
auto-sklearn, hyperopt and TPOT, and found that TPOT is best suited to regression
datasets and auto-sklearn is best suited to classification datasets. The
auto-sklearn library is used for running the Auto-ML. Using the scikit-learn
machine learning library, we can train a wide range of machine learning algorithms
and easily find the best accuracy for a given dataset.

TABLE OF CONTENTS

ABSTRACT
LIST OF FIGURES

CHAPTER NO.   TITLE

1   INTRODUCTION
    1.1 OBJECTIVE OF THE PROJECT
        1.1.1 NECESSITY
        1.1.2 SOFTWARE DEVELOPMENT METHOD
        1.1.3 LAYOUT OF THE DOCUMENT
    1.2 OVERVIEW OF THE PROJECT

2   LITERATURE SURVEY
    2.1 LITERATURE SURVEY

3   METHODOLOGY
    3.1 PROJECT PROPOSAL
        3.1.1 MISSION
        3.1.2 GOAL
    3.2 SCOPE OF THE PROJECT
    3.3 OVERVIEW OF THE PROJECT
    3.4 FLOWCHART

4   SYSTEM DESIGN AND IMPLEMENTATION
    4.1 SYSTEM STUDY
        4.1.1 SYSTEM REQUIREMENT SPECIFICATIONS
    4.2 SYSTEM SPECIFICATIONS
        4.2.1 MACHINE LEARNING OVERVIEW
        4.2.2 FLASK OVERVIEW
    4.3 PYTHON LIBRARIES NEEDED
        4.3.1 NUMPY LIBRARY
        4.3.2 PANDAS LIBRARY
        4.3.3 MATPLOTLIB LIBRARY
        4.3.4 SEABORN LIBRARY
        4.3.5 SCIKIT LEARN LIBRARY
    4.4 DESCRIPTION ABOUT EACH LIBRARY
        4.4.1 HOW NUMPY CAN BE USED
        4.4.2 HOW PANDAS CAN BE USED
    4.5 MODULES
        4.5.1 DATA PRE-PROCESSING
        4.5.2 DATA VALIDATION / CLEANING / PREPARING PROCESS
        4.5.3 EXPLORATION DATA ANALYSIS OF VISUALIZATION
        4.5.4 COMPARING ALGORITHM WITH PREDICTION IN THE FORM OF BEST ACCURACY RESULT
        4.5.5 ALGORITHM AND TECHNIQUES
    4.6 DEPLOYMENT USING DJANGO
    4.7 DETAIL EXPLANATION OF AUTO ML

5   RESULTS AND DISCUSSION, PERFORMANCE ANALYSIS

6   SUMMARY AND CONCLUSION
    6.1 SUMMARY
    6.2 CONCLUSION

REFERENCES

APPENDIX
    A. SOURCE CODE
    B. SCREENSHOTS
    C. PUBLICATION

LIST OF FIGURES

FIGURE NO. FIGURE NAME PAGE NO.

3.1 System Architecture 6
3.2 Flow chart 6
4.2 Logistic Regression 20
4.3 Linear Regression 21
4.4 Random Forest 22
4.5 Decision Trees 24
4.6 Naïve Bayes 25
4.7 K Nearest Neighbor 26
4.8 Support Vector Classifier 27
4.9 Support Vector Regressor 28
4.10 Gradient Boosting 28
4.11 XG Boosting 29
4.12 Adaptive Boosting 30

CHAPTER-1
INTRODUCTION

1.1 OBJECTIVE OF THE PROJECT:

To build an Auto-ML tool that, for any cleaned dataset, applies machine learning
algorithms, reports the accuracy score of every applied algorithm, and gives the
top three results.
The recent substantial progress in machine learning has led to a growing demand for
hands-free ML systems that can support developers and ML novices in efficiently
creating new ML applications. Since different datasets require different ML pipelines,
this demand has given rise to the area of automated machine learning.

1.1.1 Necessity: This website helps users save time. The application is easy to use
and works accurately and smoothly in different scenarios. It reduces effort and
workload and increases efficiency, which makes it worthwhile in terms of time.
On this website, the user can use our Auto-ML tool to choose the best algorithm
for a given supervised machine learning dataset. The tool reports the accuracy
scores of all algorithms for classification or regression data and then displays the
three most accurate models for the uploaded dataset. In this way, the user can
easily find the most suitable model for the data.
Hence, even non-coders can easily do machine learning using our tool.
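
The ranking step described above can be sketched as follows. This is a minimal illustration and not the project's actual code: the synthetic dataset from make_classification, the particular set of candidate models, the 75/25 train/test split and the helper name rank_models are all assumptions made for the example.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier


def rank_models(X, y, models, top_n=3):
    """Hypothetical helper: fit every model on a training split and
    return the top_n (name, accuracy) pairs, best first."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, random_state=0
    )
    scores = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        scores[name] = accuracy_score(y_test, model.predict(X_test))
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return ranked[:top_n]


# Stand-in for an uploaded, already cleaned classification dataset (assumption).
X, y = make_classification(n_samples=300, n_features=8, random_state=0)

candidates = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "Random Forest": RandomForestClassifier(random_state=0),
    "Naive Bayes": GaussianNB(),
    "K Nearest Neighbor": KNeighborsClassifier(),
}

top_three = rank_models(X, y, candidates)
for name, acc in top_three:
    print(f"{name}: {acc:.3f}")
```

Scoring on a held-out split rather than on the training data keeps the ranking from simply rewarding the model that memorizes the training set best.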

1.1.2 Software development method:

Many software development models are followed in practice, such as the
Waterfall model, Iterative model, Spiral model, V-model and Big Bang model.
I used the Waterfall model for this application. I also tried to apply use-case
and test-case approaches.

1.1.3 Layout of the document:
This document starts with a formal introduction. After the introduction, the
analysis and design of the project are described. The analysis and design cover
several parts, such as the project proposal, mission, goal, target audience and
environment. After that, the design and table diagram are presented. Use cases and
test cases are in chapter 2 and chapter 3 respectively. Finally, the document ends
with the results and conclusion.

1.2 Overview of the Designed Project:

First, we take the dataset from our data source and perform data preprocessing
and visualization to clean and visualize the dataset, respectively. We then upload
the cleaned dataset and run all algorithms with a single button click to obtain
their accuracy scores. The tool then reports the three algorithms with the best
accuracy scores. Flask is used for the user interface.

CHAPTER-2
LITERATURE SURVEY
2.1 LITERATURE SURVEY:
General
A literature review is a body of text that aims to review the critical points of current
knowledge on, and/or methodological approaches to, a particular topic. It draws on
secondary sources and discusses published information in a particular subject area,
sometimes within a certain time period. Its ultimate goal is to bring the reader up to
date with the current literature on a topic. It forms the basis for another goal, such
as future research that may be needed in the area, and precedes a research proposal;
it may also be just a simple summary of sources. Usually, it has an organizational
pattern and combines both summary and synthesis.
A summary is a recap of important information about the source, whereas a synthesis
is a reorganization, or reshuffling, of information. It might give a new interpretation
of old material, combine new with old interpretations, or trace the intellectual
progression of the field, including major debates. Depending on the situation, the
literature review may evaluate the sources and advise the reader on the most pertinent
or relevant of them.

Review of Literature Survey

Title : Benchmarking Automatic Machine Learning Frameworks
Author: Adithya Balaji, Alexander Allen
Year : 2018

The authors test auto-sklearn, TPOT, auto_ml and H2O's AutoML solution against a
compiled set of regression and classification datasets sourced from OpenML and find
that TPOT is best suited to regression datasets and auto-sklearn is best suited to
classification datasets.

Title : Efficient and Robust Automated Machine Learning

Author: Matthias Feurer, Aaron Klein, Katharina Eggensperger
Year : 2015

The demand for machine learning has increased due to its success in a wide range
of applications. To be effective, automated machine learning systems need to
automatically choose the algorithm and data pre-processing steps for a new dataset
and set their respective hyper-parameters.

Title : Hyper-parameter Optimization of Machine Learning Algorithms

Author: Li Yang, Abdallah Shami
Year : 2020

Machine learning algorithms are now widely used in different applications and
areas. To fit a machine learning algorithm to a particular problem, its
hyper-parameters must be tuned. Selecting the best hyper-parameter configuration has
a direct impact on a model's performance. Doing so also requires deep knowledge of
ML algorithms and hyper-parameter optimization techniques.
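
The effect of a single hyper-parameter on performance can be illustrated with a short sketch. The example is an assumption-laden illustration rather than anything taken from the cited paper: it uses a synthetic dataset, a K Nearest Neighbor classifier, and an arbitrary list of n_neighbors values, and compares them by 5-fold cross-validated accuracy.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

# Synthetic stand-in dataset (assumption, not from the source).
X, y = make_classification(n_samples=300, n_features=10, random_state=0)

# Evaluate the same algorithm under different values of one hyper-parameter.
results = {}
for k in (1, 5, 15, 45):
    model = KNeighborsClassifier(n_neighbors=k)
    results[k] = cross_val_score(model, X, y, cv=5, scoring="accuracy").mean()

best_k = max(results, key=results.get)
for k, score in results.items():
    print(f"n_neighbors={k}: mean accuracy={score:.3f}")
print("best n_neighbors:", best_k)
```

The spread between the scores is what a tuning procedure tries to exploit; which value wins depends on the dataset.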

Title : Hyperopt-Sklearn: Automatic Hyperparameter Configuration for Scikit-learn

Author: Brent Komer, James Bergstra, Chris Eliasmith
Year : 2014

Hyperopt-sklearn is a project that provides automatic algorithm configuration for
the sklearn ML library. Following Auto-WEKA, the choice of pre-processing modules
and the choice of classifier are taken together to represent a single large
hyper-parameter optimization problem.
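
The idea of treating the pre-processing choice and the classifier choice as one search problem can be sketched with plain scikit-learn. Note the substitution: this sketch uses an exhaustive GridSearchCV over a Pipeline, not Hyperopt-sklearn's own API or its search algorithm, and the dataset, step names ("scale", "clf") and candidate values are assumptions chosen for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler, StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in dataset (assumption).
X, y = make_classification(n_samples=200, n_features=6, random_state=0)

# The pipeline steps are placeholders; the search replaces them.
pipe = Pipeline([("scale", StandardScaler()), ("clf", KNeighborsClassifier())])

# One search space covering the scaler, the classifier and its hyper-parameters.
search_space = [
    {
        "scale": [StandardScaler(), MinMaxScaler()],
        "clf": [KNeighborsClassifier()],
        "clf__n_neighbors": [3, 5, 7],
    },
    {
        "scale": ["passthrough"],  # no scaling for the tree
        "clf": [DecisionTreeClassifier(random_state=0)],
        "clf__max_depth": [3, 5, None],
    },
]

search = GridSearchCV(pipe, search_space, cv=3, scoring="accuracy")
search.fit(X, y)

print("configurations tried:", len(search.cv_results_["params"]))
print("best configuration:", search.best_params_)
print(f"best mean accuracy: {search.best_score_:.3f}")
```

Grid search tries every combination, which only scales to small spaces; libraries such as Hyperopt-sklearn exist precisely because larger joint spaces need a smarter search strategy.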

CHAPTER-3
METHODOLOGY

3.1 Project Proposal:

A project proposal is a set of documents that describes a project. It
contains all the plans for the project: how the software works, the steps needed to
complete the project, and the software requirements and analysis. In my project
proposal, I cover all of these steps, as well as the risks and rewards and other
project dependencies.

3.1.1 Mission:
Online, web-based machine learning applications are popular and widely
known. With our tool, users can train high-level custom machine learning models with
minimal effort and machine learning expertise. Hence, even non-coders can do
machine learning easily using our Auto-ML tool. This simple Auto-ML tool gives fast
and accurate results when choosing the best model for a given dataset.

3.1.2 Goal:
The goal is to build an Auto-ML tool for choosing the most accurate model
for any given cleaned dataset.

3.2 Scope of the Project:

The scope of this paper is to implement and investigate how different
supervised binary classification methods impact default prediction. The model evaluation
techniques used in this project are limited to precision, sensitivity and F1-score.
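
For reference, the three evaluation measures named above can be computed directly from the counts of a binary confusion matrix. The sketch below uses the standard definitions (sensitivity is the same quantity as recall); the function name and the example counts are made up for illustration and are not results from this project.

```python
def precision_sensitivity_f1(tp, fp, fn):
    """Compute precision, sensitivity (recall) and F1-score from
    true-positive, false-positive and false-negative counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    total = precision + sensitivity
    # F1 is the harmonic mean of precision and sensitivity.
    f1 = 2 * precision * sensitivity / total if total else 0.0
    return precision, sensitivity, f1


# Illustrative counts only: 40 true positives, 10 false positives,
# 20 false negatives.
precision, sensitivity, f1 = precision_sensitivity_f1(tp=40, fp=10, fn=20)
print(f"precision={precision:.3f} sensitivity={sensitivity:.3f} f1={f1:.3f}")
# prints: precision=0.800 sensitivity=0.667 f1=0.727
```

None of the three measures uses the true-negative count, which is why they are often preferred over plain accuracy when the positive class is rare.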

3.3 Overview of the Project:

The project aims to build an Auto-ML tool for choosing the most accurate
model for any given cleaned dataset. For any classification or regression project,
the tool can run predictions on the data and display the three most accurate
models.
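
For regression data, the same top-three selection needs a different score, since accuracy is not defined for continuous targets. The source does not state which regression metric the tool reports, so the sketch below assumes the coefficient of determination (R²), averaged over 5-fold cross-validation. The synthetic dataset and the set of regressors are likewise assumptions.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsRegressor
from sklearn.tree import DecisionTreeRegressor

# Stand-in for an uploaded, already cleaned regression dataset (assumption).
X, y = make_regression(n_samples=200, n_features=5, noise=10.0, random_state=0)

regressors = {
    "Linear Regression": LinearRegression(),
    "Decision Tree": DecisionTreeRegressor(random_state=0),
    "Random Forest": RandomForestRegressor(random_state=0),
    "K Nearest Neighbor": KNeighborsRegressor(),
}

# Mean cross-validated R^2 per model (assumed metric, see lead-in).
mean_r2 = {
    name: cross_val_score(model, X, y, cv=5, scoring="r2").mean()
    for name, model in regressors.items()
}

top_three = sorted(mean_r2.items(), key=lambda item: item[1], reverse=True)[:3]
for name, score in top_three:
    print(f"{name}: R^2={score:.3f}")
```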

3.4 Flow Chart:

Fig:3.1: Machine Learning workflow diagram

The above flow diagram represents the process flow from gathering data to
predicting results.

Fig:3.2: System Architecture

The raw data is cleaned, machine learning algorithms are then applied, and finally
the results are predicted.
