AI Project Cycle
Project Cycle is a step-by-step process for solving a problem using proven scientific
methods and drawing inferences from it.
The AI Project Cycle mainly has 5 stages:
1. Problem Scoping
● It is a fact that we are surrounded by problems. Identifying such a problem and having a
vision to solve it is what Problem Scoping is about.
● Definition : Problem Scoping refers to understanding a problem, finding out the various
factors which affect it, and defining the goal or aim of the project.
4Ws Problem Canvas
The 4Ws Problem canvas helps in identifying the key elements related to the problem.
1. Who? Who are the stakeholders?
Stakeholders are the people who face this problem and would
benefit from the solution.
2. What? What is the problem?
3. Where? In what context/situation do the stakeholders experience this
problem? Where is the problem located?
4. Why? What benefits would the stakeholders get? How will the solution
improve their situation?
2. Data Acquisition
● Data can be a piece of information or facts and statistics collected together
for reference or analysis.
● Data Acquisition is the process of collecting accurate and reliable data to work
with. Data can be in the form of text, video, images, audio and so on, and it
can be collected from various sources like the internet, journals, newspapers, etc.
Sources of Data :
There are various ways in which you can collect data. Some of them are:
Web Scraping
● Web Scraping means collecting data from the web using software tools or technologies.
● It is used, for example, for monitoring prices, news, etc.
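For illustration, here is a minimal Python sketch of web scraping using the requests and BeautifulSoup libraries; the URL and the "price" class name are assumptions made for the example, not part of any real site.

```python
# A minimal web-scraping sketch (assumed URL and CSS class, for illustration only).
import requests
from bs4 import BeautifulSoup

url = "https://example.com/products"           # hypothetical page listing product prices
response = requests.get(url, timeout=10)       # download the page's HTML
soup = BeautifulSoup(response.text, "html.parser")

# Collect the text of every element carrying the (assumed) class "price".
prices = [tag.get_text(strip=True) for tag in soup.find_all(class_="price")]
print(prices)
```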
Sensors
● Sensors are devices that collect data from the physical environment.
● Sensors are a part of the IoT (Internet of Things).
Cameras
● A camera captures visual information; that information, called an image, is then used as a
source of data.
Observations
● When we observe something carefully, we gather information. For example, scientists observe
creatures to study them.
API - Application Programming Interface.
● An API acts as a messenger which takes requests, passes them on to the system and returns the
system's response. Ex: Twitter API, Google Search API
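As a rough illustration, the sketch below calls a hypothetical web API with Python's requests library; the endpoint and the "q" parameter are invented for the example and do not belong to any real service.

```python
# A minimal API-call sketch (hypothetical endpoint, for illustration only).
import requests

response = requests.get(
    "https://api.example.com/search",          # hypothetical API endpoint
    params={"q": "artificial intelligence"},   # request parameters
    timeout=10,
)
if response.ok:
    print(response.json())                     # most web APIs return JSON data
else:
    print("Request failed with status", response.status_code)
```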
Surveys
● A survey is a method of gathering specific information from a sample of people.
Example : a census survey for analyzing the population
3. Data Exploration
● Data Exploration is the process of arranging the gathered data uniformly for a better
understanding. Data can be arranged in the form of a table, a chart or a database.
● To visualise data, we can use various types of visual representations.
● The tools used to visualize the acquired data are known as data visualization or
exploration tools. Ex. Google Charts, Tableau, FusionCharts, Highcharts
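For illustration, a simple chart can also be drawn in Python with the matplotlib library; the sketch below plots invented monthly sales figures as a bar chart.

```python
# A small data-visualization sketch with matplotlib (the numbers are invented).
import matplotlib.pyplot as plt

months = ["Jan", "Feb", "Mar", "Apr"]
sales = [120, 150, 90, 180]        # hypothetical monthly sales figures

plt.bar(months, sales)             # bar chart makes month-to-month patterns visible
plt.xlabel("Month")
plt.ylabel("Sales")
plt.title("Monthly Sales")
plt.show()
```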
Data Sets
A dataset is a collection of related sets of information that is composed of separate elements but can
be manipulated by a computer as a unit.
Training Data – A subset required to train the model
Testing Data – A subset required while testing the trained model
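For illustration, the split into training and testing data is often done with scikit-learn's train_test_split; the tiny feature and label lists below are invented for the example.

```python
# A minimal train/test split sketch with scikit-learn (data invented for illustration).
from sklearn.model_selection import train_test_split

X = [[1], [2], [3], [4], [5], [6], [7], [8]]   # hypothetical features
y = [0, 0, 0, 1, 1, 1, 1, 0]                   # hypothetical labels

# Keep 75% of the data for training and hold back 25% for testing.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42
)
print(len(X_train), "training samples,", len(X_test), "testing samples")
```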
Advantages of Data Visualization
❖ Provides better understanding of data
❖ Provides insights or patterns into data
❖ Helps in making decisions
❖ Reduces complexity of data
❖ Reveals the relationships and patterns contained within the data
4. Modelling
● AI Modelling refers to developing algorithms, also called models, which can be trained to
produce intelligent outputs. That is, writing code to make a machine artificially intelligent.
Definition : Modelling is the process in which different models are created on the basis of the
visualized data and then checked for their advantages and disadvantages.
Before jumping into modelling let us discuss the definitions of Artificial Intelligence (AI),
Machine Learning (ML) and Deep Learning (DL).
I. Artificial Intelligence, or AI, refers to any technique that enables computers to mimic human
intelligence. The AI-enabled machines think algorithmically and execute what they have been
asked for intelligently.
II. Machine Learning, or ML, enables machines to improve at tasks with experience. The
machine learns from its mistakes and takes them into consideration in the next execution. It
improves itself using its own experiences.
III. Deep Learning, or DL, enables software to train itself to perform tasks with vast amounts
of data. In deep learning, the machine is trained with huge amounts of data, which helps it
train itself around that data. Such machines are intelligent enough to develop algorithms for
themselves.
As you can see in the Venn diagram, Artificial
Intelligence is the umbrella term which covers
Machine Learning and Deep Learning under it.
NOTE: Deep Learning is the most advanced form of
Artificial Intelligence out of these three.
Generally, AI models can be classified as follows :
Rule Based Approach
● The Rule Based Approach refers to AI modelling where the relationships or patterns
in the data are defined by the developer.
● That means the machine works on the rules and information given by the
developer and performs the task accordingly.
Ex : You trained your model with 100 labelled images of apples and bananas. Now, if you test it by
showing it an apple, it will figure out and tell whether it is an apple or not. Here, labelled images
of apples and bananas were fed, due to which the model could detect the fruit.
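To make the idea concrete, here is a toy Python sketch of a rule-based approach: the rules (simple thresholds on colour and size) are written by the developer, not learned from data, and are invented purely for illustration.

```python
# A toy rule-based classifier: the rules are hand-written by the developer.
def classify_fruit(colour, length_cm):
    # hypothetical hand-written rules (not learned from data)
    if colour == "yellow" and length_cm > 10:
        return "banana"
    if colour == "red" and length_cm < 10:
        return "apple"
    return "unknown"

print(classify_fruit("yellow", 15))   # -> banana
print(classify_fruit("red", 7))       # -> apple
```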
Learning Based Approach
● AI modelling where the machine learns by itself.
● The Learning Based Approach is based on machine learning, in which the machine learns from the
data fed to it.
Machine learning is a subset of artificial intelligence (AI) that provides machines with the ability to
learn automatically and improve from experience without being explicitly programmed for it.
The learning-based approach or Machine Learning can further be divided into three parts :
1. Supervised Learning :-
Supervised learning is where a computer algorithm is trained on input data that has been
labeled for a particular output.
Ex : Classification, Regression
2. Unsupervised Learning:-
Unsupervised learning is one in which a system learns patterns from datasets on its own.
In this, the training data is not labeled.
Ex : Clustering
3. Reinforcement Learning
Learning through feedback or trial-and-error methods is called Reinforcement Learning.
It is sometimes loosely (though not strictly accurately) referred to as semi-supervised learning.
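As a rough illustration of learning through feedback, the toy sketch below lets a program try two actions, score each one by the reward it receives, and gradually prefer the better one; the reward probabilities are invented for the example.

```python
# A toy trial-and-error (feedback-driven) learning sketch; rewards are invented.
import random

success_rate = {"A": 0.2, "B": 0.8}   # hypothetical chance that each action pays off
score = {"A": 0.0, "B": 0.0}          # running average reward per action
count = {"A": 0, "B": 0}

for step in range(1000):
    # explore a random action 10% of the time, otherwise exploit the best so far
    if random.random() < 0.1:
        action = random.choice(["A", "B"])
    else:
        action = max(score, key=score.get)
    reward = 1 if random.random() < success_rate[action] else 0   # feedback
    count[action] += 1
    score[action] += (reward - score[action]) / count[action]     # update average

print(score)   # action "B" should end up with the higher average reward
```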
There are two types of Supervised Learning models :
Classification:
● Where the data is classified according to the labels.
● This model works on a discrete dataset which means
the data need not be continuous.
● For example, in the grading system, students are
classified on the basis of the grades they obtain with
respect to their marks in the examination.
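For illustration, the grading example can be sketched in Python with scikit-learn's DecisionTreeClassifier; the marks and grade labels below are invented.

```python
# A minimal classification sketch: marks (input) -> discrete grade labels (output).
from sklearn.tree import DecisionTreeClassifier

marks = [[35], [48], [62], [75], [88], [95]]   # hypothetical exam marks
grades = ["C", "C", "B", "B", "A", "A"]        # matching grade labels

model = DecisionTreeClassifier()
model.fit(marks, grades)                       # train on the labelled data
print(model.predict([[80]]))                   # predicted grade for 80 marks
```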
Regression:
● Here, the data which has been fed to the machine is
continuous.
● Such models work on continuous data.
● For example, if you wish to predict your next salary,
then you would put in the data of your previous
salary, any increments, etc., and would train the model.
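For illustration, the salary example can be sketched with scikit-learn's LinearRegression; the experience and salary figures below are invented.

```python
# A minimal regression sketch: years of experience (continuous input) -> salary.
from sklearn.linear_model import LinearRegression

experience = [[1], [2], [3], [4], [5]]           # hypothetical years of experience
salary = [30000, 35000, 41000, 46000, 52000]     # hypothetical salaries

model = LinearRegression()
model.fit(experience, salary)                    # fit a line through the data
print(model.predict([[6]]))                      # predicted salary after 6 years
```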
Unsupervised Learning
● An unsupervised learning model works on an unlabelled dataset.
● Unsupervised learning models are used to identify relationships, patterns and trends in
the data fed into them. This helps the user understand what the data is about and
what major features the machine has identified in it.
Clustering:
● Clustering refers to an unsupervised learning algorithm which
groups unlabelled data according to the
patterns or trends identified in it.
● The patterns observed might be ones already
known to the developer, or the algorithm might even come up
with some unique patterns of its own.
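For illustration, the sketch below uses scikit-learn's KMeans to group a few invented, unlabelled 2-D points into two clusters purely from the patterns in the data.

```python
# A minimal clustering sketch with KMeans (points invented for illustration).
from sklearn.cluster import KMeans

points = [[1, 2], [1, 4], [1, 0],
          [10, 2], [10, 4], [10, 0]]    # hypothetical unlabelled data

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
kmeans.fit(points)
print(kmeans.labels_)                   # cluster number assigned to each point
```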
5. EVALUATION :
Definition :
Moving towards deploying the model in the real world, we test it in as many ways
as possible. The stage of testing the models is known as EVALUATION.
OR
Evaluation is the process of understanding the reliability of an AI model by feeding
the test dataset into the model and comparing the model's outputs with the actual
answers.
What Are Evaluation Metrics?
Evaluation metrics are quantitative measures used to assess the performance and effectiveness of a
statistical or machine learning model. These metrics provide insights into how well the model is
performing and help in comparing different models.
Hence, the model is tested with the help of Testing Data (which was separated out of the
acquired dataset at the Data Acquisition stage), and the efficiency of the model is calculated on the
basis of evaluation metrics such as accuracy, precision, recall and F1 score.
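For illustration, one common metric, accuracy, can be computed with scikit-learn as sketched below; the actual and predicted label lists are invented for the example.

```python
# A minimal evaluation sketch: compare predictions with actual answers (invented data).
from sklearn.metrics import accuracy_score

actual    = [1, 0, 1, 1, 0, 1, 0, 0]    # true labels from the test dataset
predicted = [1, 0, 1, 0, 0, 1, 0, 1]    # labels predicted by the model

print("Accuracy:", accuracy_score(actual, predicted))   # fraction predicted correctly
```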
Neural Networks (Artificial Neural Networks)
● Neural networks are loosely modelled after how neurons in the human brain behave.
● The key advantage of neural networks is that they are able to extract data features
automatically without needing the input of the programmer.
● It is a fast and efficient way to solve problems for which the dataset is very large, such as in
images.
As seen in the figure given, the larger Neural Networks tend to perform better with larger amounts
of data whereas the traditional machine learning algorithms stop improving after a certain saturation
point.
● A Neural Network is divided into multiple layers and each layer is further divided into
several blocks called nodes. Each node has its own task to accomplish which is then passed
to the next layer.
● The first layer of a Neural Network is known as the input layer. The job of an input layer is
to acquire data and feed it to the Neural Network. No processing occurs at the input layer.
● Hidden layers are the layers in which the whole processing occurs. These layers are hidden
and are not visible to the user.
● Each node of these hidden layers has its own machine learning algorithm which it executes
on the data received from the input layer.
● There can be multiple hidden layers in a neural network system. The last hidden layer passes
the final processed data to the output layer, which then gives it to the user as the final output.
● Similar to the input layer, the output layer too does not process the data it acquires; it
simply presents the result to the user.
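To make the layer structure concrete, here is a small sketch using scikit-learn's MLPClassifier with one hidden layer; the tiny XOR-style dataset is invented, and real neural networks would need far more data than this.

```python
# A small neural-network sketch: input layer -> one hidden layer -> output layer.
from sklearn.neural_network import MLPClassifier

X = [[0, 0], [0, 1], [1, 0], [1, 1]]      # the input layer receives two features
y = [0, 1, 1, 0]                          # invented target labels (XOR pattern)

# one hidden layer with 8 nodes sits between the input and output layers
model = MLPClassifier(hidden_layer_sizes=(8,), max_iter=5000, random_state=1)
model.fit(X, y)
print(model.predict([[0, 1], [1, 1]]))    # outputs produced at the output layer
```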
Some of the features of a Neural Network are listed below:
Prepared by : Hitesh Pujari