DELHI PUBLIC SCHOOL, KANPUR
CLASS 12
SUBJECT - ARTIFICIAL INTELLIGENCE (843)
Chapter - Data Science Methodology : An Analytic Approach to Capstone Project
Questions / Answers
1. How many steps are there in Data Science Methodology? Name them in order.
Ans. Data Science Methodology is a prescribed sequence of iterative steps that data scientists
follow to approach a problem and find a solution. It provides a structured way to handle and comprehend the data.
Data Science Methodology consists of 10 steps:
1. Business Understanding
2. Analytic Approach
3. Data Requirements
4. Data Collection
5. Data Understanding
6. Data Preparation
7. Modelling
8. Evaluation
9. Deployment
10. Feedback
2. What are the different types of data analytics used in the Analytic Approach?
Ans. There are four main types of data analytics.
Descriptive Analytics:
This summarizes past data to understand what has happened. It is the first step undertaken in data analytics.
It describes trends and patterns using tools such as graphs and charts, and statistical measures such as mean,
median, and mode to understand central tendency. This method also examines the spread of the data using range,
variance, and standard deviation.
Diagnostic Analytics:
It helps to understand why something has happened. This is normally done by
analyzing past data using techniques like root cause analysis, hypothesis testing, correlation analysis, etc.
The main purpose is to identify the causes or factors that led to a certain outcome.
Predictive Analytics:
This uses past data to make predictions about future events or trends, using techniques like regression,
classification, clustering, etc. Its main purpose is to foresee future outcomes and support informed decisions.
Prescriptive Analytics:
This recommends the action to be taken to achieve the desired outcome, using techniques such as
optimization, simulation, decision analysis etc. Its purpose is to guide decisions by suggesting the best
course of action based on data analysis.
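A minimal Python sketch of descriptive analytics (the sales figures are made-up, illustrative values):

    import statistics

    # Illustrative daily sales figures
    sales = [120, 150, 150, 180, 200, 210, 650]

    # Central tendency
    print("Mean:  ", statistics.mean(sales))
    print("Median:", statistics.median(sales))
    print("Mode:  ", statistics.mode(sales))

    # Spread of the data
    print("Range:   ", max(sales) - min(sales))
    print("Variance:", statistics.variance(sales))   # sample variance
    print("Std dev: ", statistics.stdev(sales))      # sample standard deviation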
3. Data is collected from different sources. Explain the different types of sources with examples.
Ans. Data collection is a systematic process of gathering observations or measurements. In this phase, the
data requirements are revised and decisions are made as to whether the collection requires more or less data.
Today’s high-performance database analytics enable data scientists to utilize large datasets. There are
mainly two sources of data collection:
→ Primary Data Source -
A primary data source refers to the original source of data, where the data is collected firsthand
through direct observation, experimentation, surveys, interviews, or other methods.
This data is raw, unprocessed, and unbiased, providing the most accurate and reliable information for
research, analysis, or decision-making purposes.
Examples include marketing campaigns, feedback forms, IoT sensor data etc.
→ Secondary Data Source -
A secondary data source refers to the data which is already stored and ready for use.
Data given in books, journals, websites, internal transactional databases, etc. can be reused for data
analysis.
Some methods of collecting secondary data are social media data tracking, web scraping, and
satellite data tracking.
Some sources of online data are data.gov, World Bank Open Data, UNICEF, Open Data Network,
Kaggle, World Health Organization, Google, etc.
Smart forms are an easy way to procure data online.
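As a minimal illustration, a secondary dataset downloaded from one of these portals can be loaded for analysis with pandas (the filename below is a placeholder):

    import pandas as pd

    # "survey.csv" stands in for any dataset downloaded from an open
    # data portal such as data.gov or Kaggle
    df = pd.read_csv("survey.csv")

    print(df.shape)   # number of rows and columns
    print(df.head())  # first five records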
4. Write a short note on the steps done during Data Preparation.
Ans. The Data Preparation stage covers all the activities needed to build the dataset that will be used in the modelling step.
Data is transformed into a state where it is easier to work with. Data preparation includes (a short sketch follows this list) -
● cleaning data (dealing with invalid or missing values, removing duplicate values and assigning a suitable format)
● combining data from multiple sources (archives, tables and platforms)
● transforming data into meaningful input variables
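A minimal pandas sketch of these steps, assuming a small hypothetical dataset with age and city columns:

    import pandas as pd

    # Hypothetical raw data with missing and duplicate values
    df = pd.DataFrame({
        "age":  [25, None, 25, 40],
        "city": ["Kanpur", "Delhi", "Kanpur", None],
    })

    df = df.drop_duplicates()                          # remove duplicate rows
    df["age"] = df["age"].fillna(df["age"].median())   # fill missing ages
    df = df.dropna(subset=["city"])                    # drop rows with no city
    df["age"] = df["age"].astype(int)                  # assign a suitable format

    print(df)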
5. What do you mean by Feature Engineering?
Ans. Feature Engineering is a part of Data Preparation, which is the most time-consuming of the Data Science
stages. Feature engineering is the process of selecting, modifying, or creating
new features (variables) from raw data to improve the performance of machine learning models.
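A minimal sketch, assuming a hypothetical dataset with height_m and weight_kg columns: a new BMI feature is derived from the raw columns, and a categorical feature is encoded numerically:

    import pandas as pd

    df = pd.DataFrame({
        "height_m":  [1.60, 1.75, 1.82],
        "weight_kg": [55, 72, 90],
        "size":      ["S", "M", "L"],
    })

    # Create a new feature from existing raw columns
    df["bmi"] = df["weight_kg"] / df["height_m"] ** 2

    # Encode a categorical feature as numeric dummy variables
    df = pd.get_dummies(df, columns=["size"])

    print(df)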
6. Which step of Data Science Methodology is related to constructing the data set? Explain.
Ans. Data Understanding encompasses all activities related to constructing the dataset. In this stage, we check
whether the data collected actually represents the problem to be solved. The relevance, comprehensiveness,
and suitability of the data for addressing the specific problem or question at hand are evaluated. Techniques
such as descriptive statistics and visualization can be applied to the dataset to assess its content, its quality,
and initial insights about the data.
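A minimal pandas sketch of such checks ("dataset.csv" is a placeholder filename):

    import pandas as pd

    df = pd.read_csv("dataset.csv")

    print(df.describe())       # descriptive statistics for numeric columns
    print(df.isnull().sum())   # count of missing values per column
    print(df.dtypes)           # check that each column has a suitable type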
7. What do you mean by AI Modelling?
Ans. The modelling stage uses the initial version of the prepared dataset and focuses on developing models
according to the analytic approach previously defined. The modelling process is usually iterative, often leading
to adjustments in the preparation of the data. For a given technique, data scientists can test multiple
algorithms to identify the most suitable model for the Capstone Project.
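A minimal scikit-learn sketch of this process, using synthetic data as a stand-in for the capstone dataset and testing a few candidate algorithms:

    from sklearn.datasets import make_classification
    from sklearn.model_selection import cross_val_score
    from sklearn.linear_model import LogisticRegression
    from sklearn.tree import DecisionTreeClassifier
    from sklearn.neighbors import KNeighborsClassifier

    # Synthetic dataset standing in for the prepared capstone data
    X, y = make_classification(n_samples=200, random_state=42)

    candidates = {
        "Logistic Regression":  LogisticRegression(max_iter=1000),
        "Decision Tree":        DecisionTreeClassifier(random_state=42),
        "k-Nearest Neighbours": KNeighborsClassifier(),
    }

    # Test multiple algorithms to identify the most suitable model
    for name, model in candidates.items():
        scores = cross_val_score(model, X, y, cv=5)
        print(f"{name}: mean accuracy = {scores.mean():.3f}")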
8. Differentiate between descriptive modelling and predictive modelling.
Ans. Data Modelling focuses on developing models that are either descriptive or predictive.
1. Descriptive Modeling:
It is a concept in data science and statistics that focuses on summarizing and understanding the
characteristics of a dataset without making predictions or decisions.
The goal of descriptive modeling is to describe the data rather than predict or make decisions based
on it.
This includes summarizing the main characteristics, patterns, and trends that are present in the data.
Descriptive modeling is useful when you want to understand what is happening within your data and
how it behaves, but not necessarily why it happens.
Common Descriptive Techniques:
o Summary Statistics: This includes measures like:
Mean (average), Median, Mode
Standard deviation, Variance
Range (difference between the highest and lowest values)
Percentiles (e.g., quartiles)
o Visualizations: Graphs and charts to represent the data, such as:
Bar charts
Histograms
Pie charts
Box plots
Scatter plots
2. Predictive Modeling:
It involves using data and statistical algorithms to identify patterns and trends in order to predict
future outcomes or values.
It relies on historical data and uses it to create a model that can predict future behavior or trends or
forecast what might happen next.
It involves techniques like regression, classification, and time-series forecasting, and can be applied
in a variety of fields, from predicting exam scores to forecasting weather or stock prices.
While it is a powerful tool, students must also understand its limitations and the importance of good
data.
The data scientist will use a training set for predictive modeling. A training set is a set of historical
data in which the outcomes are already known. The training set acts like a gauge to determine if the
model needs to be calibrated.
In this stage, the data scientist experiments with different algorithms to ensure that the variables
selected are actually required.
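A minimal sketch of predictive modelling, assuming a tiny made-up training set of study hours and exam scores:

    import numpy as np
    from sklearn.linear_model import LinearRegression

    # Training set: historical data in which the outcomes are already known
    hours_studied = np.array([[1], [2], [3], [4], [5]])
    exam_scores   = np.array([35, 48, 60, 72, 85])

    model = LinearRegression()
    model.fit(hours_studied, exam_scores)   # learn from known outcomes

    # Predict a future outcome for an unseen input
    print(model.predict([[6]]))             # forecast score for 6 hours of study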
9. What do you understand by Evaluation? Write its phases also.
Ans. Evaluation in an AI project cycle is the process of assessing how well a model performs after training.
It involves using test data to measure metrics like accuracy, precision, recall, or F1 score. This helps
determine if the model is reliable and effective before deploying it in real-world situations.
Model evaluation can have two main phases.
First phase – Diagnostic measures
It is used to ensure the model is working as intended. If the model is a predictive model, a decision tree can
be used to evaluate whether the output of the model is aligned with the initial design or requires any
adjustments. If the model is a descriptive model, one in which relationships are being assessed, then a testing
set with known outcomes can be applied and the model refined as needed.
Second phase – Statistical significance test
This type of evaluation is applied to the model to verify that it accurately processes and interprets the
data. It is designed to avoid unnecessary second-guessing when the answer is revealed.
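As one illustrative choice of significance test (an assumption, not the only option), a paired t-test can compare the fold-by-fold cross-validation scores of two candidate models:

    from scipy.stats import ttest_rel
    from sklearn.datasets import make_classification
    from sklearn.model_selection import cross_val_score
    from sklearn.linear_model import LogisticRegression
    from sklearn.tree import DecisionTreeClassifier

    X, y = make_classification(n_samples=300, random_state=0)

    scores_a = cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=10)
    scores_b = cross_val_score(DecisionTreeClassifier(random_state=0), X, y, cv=10)

    # Paired t-test on the fold-by-fold scores of the two models
    t_stat, p_value = ttest_rel(scores_a, scores_b)
    print(f"t = {t_stat:.3f}, p = {p_value:.3f}")   # a small p suggests a real difference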
10. Is Feedback a necessary step in Data Science Methodology? Justify your answer.
Ans. Yes, Feedback is a necessary step in Data Science Methodology. This includes results collected from
the deployment of the model, feedback on the model’s performance from the users and clients, and
observations from how the model works in the deployed environment. Feedback from the users will help to
refine the model and assess it for performance and impact.
11. Why is model validation important?
Ans. Model Validation offers a systematic approach to measuring a model's accuracy and reliability, providing
insights into how well it generalizes to new, unseen data. It is the step conducted after Model Training,
wherein the effectiveness of the trained model is assessed using a testing dataset. Validating the machine
learning model during the training and development stages is crucial for ensuring accurate predictions.
The benefits of Model Validation include -
• Enhanced model quality
• Reduced risk of errors
• Prevention of overfitting and underfitting
12. Write a comparative study on train-test split and cross validation.
Ans.
Train-Test Split | Cross Validation
Normally applied on large datasets. | Normally applied on small datasets.
Divides the data into a training dataset and a testing dataset. | Divides the dataset into subsets (folds), trains the model on some folds, and evaluates its performance on the remaining data.
Clear demarcation between training data and testing data. | Every data point could, at some stage, be in either the testing or the training dataset.
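A minimal scikit-learn sketch contrasting the two approaches on the same synthetic dataset:

    from sklearn.datasets import make_classification
    from sklearn.model_selection import train_test_split, cross_val_score
    from sklearn.linear_model import LogisticRegression

    X, y = make_classification(n_samples=200, random_state=1)
    model = LogisticRegression(max_iter=1000)

    # Train-test split: one fixed partition of the data
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=1)
    model.fit(X_train, y_train)
    print("Hold-out accuracy:", model.score(X_test, y_test))

    # Cross validation: every point serves in both training and testing folds
    scores = cross_val_score(model, X, y, cv=5)
    print("5-fold mean accuracy:", scores.mean())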
13. Explain the different metrics used for evaluating Classification models.
Ans. Evaluation metrics help assess the performance of a trained model on a test dataset, providing insights
into its strengths and weaknesses. These metrics enable comparison of different models, including variations
of the same model, to select the best-performing one for a specific task. For classification models, the
commonly used metrics are Accuracy, Precision, Recall, and the F1-Score, all of which are derived from the
Confusion Matrix; each is explained in the questions that follow.
14. Explain Confusion Matrix.
Ans. A Confusion Matrix is a table used to evaluate the performance of a classification model. It
summarizes the predictions against the actual outcomes in an N x N matrix, where N is the number of classes
or categories to be predicted. For a binary classification problem, N = 2 (Yes/No), giving a 2 x 2 matrix with
four cells:
True Positives (TP): the model predicted Yes and the real output was also Yes.
True Negatives (TN): the model predicted No and the real output was also No.
False Positives (FP): the model predicted Yes but it was actually No.
False Negatives (FN): the model predicted No but it was actually Yes.
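A minimal scikit-learn sketch of a 2 x 2 confusion matrix (the label lists are illustrative):

    from sklearn.metrics import confusion_matrix

    # Illustrative actual vs predicted labels (1 = Yes, 0 = No)
    y_actual    = [1, 0, 1, 1, 0, 1, 0, 0]
    y_predicted = [1, 0, 0, 1, 0, 1, 1, 0]

    # Rows = actual class, columns = predicted class
    cm = confusion_matrix(y_actual, y_predicted)
    print(cm)
    # [[TN FP]
    #  [FN TP]]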
15. How do you calculate Accuracy, Precision, Recall and F1-Score?
Ans. Accuracy
Accuracy = Number of correct predictions / Total number of predictions
Accuracy = (TP+TN) / (TP+FP+FN+TN)
Precision and Recall
Precision measures “What proportion of predicted Positives is truly Positive?”
Precision = (TP) / (TP+FP).
Precision should be as high as possible.
Recall measures “What proportion of actual Positives is correctly classified?”
Recall = (TP) / (TP+FN)
F1 Score
The F1 score is the harmonic mean of Precision and Recall:
F1 = 2 * (Precision * Recall) / (Precision + Recall)
A good F1 score means that you have low false positives and low false negatives, so you are correctly
identifying real positives without being disturbed by false alarms. An F1 score of 1 is perfect, while a
score of 0 means the model is a total failure.
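A minimal sketch computing all four metrics from illustrative confusion-matrix counts:

    # Illustrative confusion-matrix counts
    TP, TN, FP, FN = 40, 45, 5, 10

    accuracy  = (TP + TN) / (TP + TN + FP + FN)
    precision = TP / (TP + FP)
    recall    = TP / (TP + FN)
    f1        = 2 * precision * recall / (precision + recall)

    print(f"Accuracy:  {accuracy:.2f}")    # 0.85
    print(f"Precision: {precision:.2f}")   # 0.89
    print(f"Recall:    {recall:.2f}")      # 0.80
    print(f"F1 score:  {f1:.2f}")          # 0.84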
16. Explain MAE, MSE and RMSE.
Ans. 1. MAE - Mean Absolute Error is the mean of the absolute differences between predictions and actual
values:
MAE = (1/n) * Σ |actual - predicted|
A value of 0 indicates no error, i.e. perfect predictions.
2. MSE - Mean Square Error is the most commonly used metric to evaluate the performance of a regression
model. It is the mean (average) of the squared differences between the target variable and the predicted
values:
MSE = (1/n) * Σ (actual - predicted)²
3. RMSE - Root Mean Square Error is the square root of MSE and corresponds to the standard deviation of
the residuals (prediction errors):
RMSE = √MSE
RMSE is often preferred over MSE because it is easier to interpret, since it is in the same units as the target
variable.
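A minimal numpy sketch of all three metrics, using illustrative actual and predicted values:

    import numpy as np

    # Illustrative actual and predicted values from a regression model
    actual    = np.array([3.0, 5.0, 7.5, 10.0])
    predicted = np.array([2.5, 5.0, 8.0, 11.0])

    errors = actual - predicted

    mae  = np.mean(np.abs(errors))   # Mean Absolute Error
    mse  = np.mean(errors ** 2)      # Mean Square Error
    rmse = np.sqrt(mse)              # Root Mean Square Error

    print(f"MAE:  {mae:.3f}")    # 0.500
    print(f"MSE:  {mse:.3f}")    # 0.375
    print(f"RMSE: {rmse:.3f}")   # 0.612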