EVALUATION - Class 10
Artificial Intelligence (417)
1. What is Evaluation?
Ans : Evaluation is the process of understanding the reliability of an AI model
by feeding a test dataset into the model and comparing its outputs with the
actual answers. Its purpose is to make judgments about a program, to improve
its effectiveness, and/or to inform programming decisions.
2. Why is Evaluation important? Explain.
Ans : Evaluation is a process that critically examines a program by collecting
and analyzing information about a program’s activities, characteristics and
outcomes. The advantages of Evaluation are as follows :
i. Evaluation ensures that the model is operating correctly and
optimally.
ii. Evaluation is an initiative to understand how well the model achieves its goals.
iii. Evaluation helps to determine what works well and what could be
improved in a program.
3. What is meant by Overfitting of Data?
Ans : Overfitting is "the production of an analysis that corresponds too closely or
exactly to a particular set of data, and may therefore fail to fit additional data
or predict future observations reliably".
OR
Models that are tested on the same dataset they were trained on will always
give correct outputs. This is known as overfitting.
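To see overfitting in practice, here is a small illustrative Python sketch (not part of these notes; the random data, the scikit-learn usage and the variable names are our own assumptions). A decision tree that memorises random training data scores almost perfectly on that data but only around chance level on unseen test data:

# Illustrative sketch (assumed random data): a model that memorises its
# training data looks perfect on that data but fails on new data.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))          # random features with no real pattern
y = rng.integers(0, 2, size=200)       # random Yes/No labels

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = DecisionTreeClassifier().fit(X_train, y_train)

print("Accuracy on training data:", model.score(X_train, y_train))   # close to 1.0
print("Accuracy on unseen test data:", model.score(X_test, y_test))  # close to 0.5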
4. What are Prediction & Reality in relation to Evaluation?
Ans : Prediction – It is the output given by the AI model using a Machine Learning
Algorithm.
Reality – It is the real scenario of the situation for which the prediction
has been made.
5. Differentiate between Prediction and Reality.
Ans :
The prediction is the output given by the machine (the AI model), while the reality
is the real scenario for which the prediction has been made. A prediction may or
may not match the reality, so the two terms cannot be used interchangeably;
comparing the prediction with the reality is the basis of evaluation.
6. Terminologies of Model Evaluation
The Scenario
Let’s imagine that we have an AI-based prediction model which has been
deployed to identify a Football or a soccer ball.
Now, the objective of the model is to predict whether the given/shown figure
is a football. Now, to understand the efficiency of this model, we need to
check whether the predictions it makes are correct or not. So we need to
compare the Prediction with the Reality.
Case 1 :
a) Prediction = YES
b) Reality = YES
The predicted value matches the actual value.
Here, the Prediction is positive and matches Reality. Hence, this
condition is termed as True Positive.
Case 2 :
a) Prediction = No
b) Reality = No
The predicted value matches the actual value.
Here, the Prediction is negative and matches Reality. Hence, this
condition is termed as True Negative.
Case 3 :
a) Prediction = Yes
b) Reality = No
The predicted value does not match the actual value.
Here, the Prediction is positive and does not match Reality. Hence, this
condition is termed as False Positive.
This is also known as Type 1 Error.
Case 4 :
a) Prediction = No
b) Reality = Yes
The predicted value does not match the actual value.
Here, the Prediction is negative and does not match Reality. Hence, this
condition is termed as False Negative.
This is also known as Type 2 Error.
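The four cases above can be counted directly by comparing a list of predictions with the corresponding realities. The following is a small illustrative Python sketch (the sample lists are assumed, not taken from these notes):

# Illustrative sketch: counting TP, TN, FP and FN for the football example.
# The prediction/reality lists below are assumed sample values.
predictions = ["Yes", "No", "Yes", "No", "Yes"]
reality     = ["Yes", "No", "No", "Yes", "Yes"]

TP = TN = FP = FN = 0
for p, r in zip(predictions, reality):
    if p == "Yes" and r == "Yes":
        TP += 1        # True Positive
    elif p == "No" and r == "No":
        TN += 1        # True Negative
    elif p == "Yes" and r == "No":
        FP += 1        # False Positive (Type 1 Error)
    else:
        FN += 1        # False Negative (Type 2 Error)

print("TP:", TP, "TN:", TN, "FP:", FP, "FN:", FN)    # TP: 2 TN: 1 FP: 1 FN: 1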
7. What is Confusion Matrix?
Ans : A Confusion Matrix is a tabular structure which helps in measuring the
performance of an AI model using the test data. The result of the comparison
between the prediction and the reality is recorded in the confusion matrix.
It is a record that helps in evaluation.
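If scikit-learn is available, the same counts can be arranged into the 2 x 2 matrix automatically. This is only an illustrative sketch with assumed sample data; the notes themselves do not prescribe any particular library:

# Illustrative sketch: building a confusion matrix with scikit-learn.
from sklearn.metrics import confusion_matrix

reality     = ["Yes", "No", "No", "Yes", "Yes"]     # actual values (assumed)
predictions = ["Yes", "No", "Yes", "No", "Yes"]     # model outputs (assumed)

# labels=["Yes", "No"] puts the positive class first:
# rows correspond to reality, columns to the prediction.
cm = confusion_matrix(reality, predictions, labels=["Yes", "No"])
print(cm)
# [[2 1]   -> TP = 2, FN = 1
#  [1 1]]  -> FP = 1, TN = 1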
8. Parameters to Evaluate a Model
Ans : The parameters used to evaluate an AI model are Accuracy, Precision,
Recall and F1 Score. Each of these is explained below.
9. What is Accuracy? Mention its formula.
Ans : Accuracy is defined as the percentage of correct predictions out of all
the observations. A prediction is said to be correct if it matches reality.
Here we have two conditions in which the Prediction matches with the
Reality, i.e., True Positive and True Negative. Therefore, Formula for
Accuracy is –
Accuracy = ((TP + TN) / (TP + TN + FP + FN)) * 100%
Where TP = True Positives, TN = True Negatives, FP = False Positives, and FN
= False Negatives.
10. What is Precision? Mention its formula.
Ans : Precision is defined as the percentage of true positive cases versus all
the cases where the prediction is true. That is, it takes into account the
True Positives and False Positives. Therefore, the formula for Precision is –
Precision = (TP / (TP + FP)) * 100%
11. What is Recall? Mention its formula.
Ans : Recall is defined as the fraction of positive cases that are correctly
identified out of all the cases that are positive in reality. That is, it takes
into account the True Positives and False Negatives. The formula for Recall is –
Recall = TP / (TP + FN)
12. How do you suggest which evaluation metric is more important for any
case ?
Ans :
The F1 score is the evaluation metric that matters most in such cases, because it
maintains a balance between the Precision and the Recall of the classifier. If the
Precision is low, the F1 score is low, and if the Recall is low, the F1 score is
again low. The F1 score is a number between 0 and 1 and is the harmonic mean of
Precision and Recall:
F1 Score = 2 * (Precision * Recall) / (Precision + Recall)
When both Precision and Recall are 1 (that is, 100%), the F1 score is also an ideal
1 (100%), which is known as the perfect value for the F1 score. As the values of
both Precision and Recall range from 0 to 1, the F1 score also ranges from 0 to 1.
A model is said to have a good performance if its F1 score is high.
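The four metrics from questions 9 to 12 can be computed together from the confusion-matrix counts. A minimal Python sketch (the function name evaluate is our own, not from these notes):

# Minimal sketch: the four evaluation metrics from the confusion-matrix counts.
def evaluate(TP, TN, FP, FN):
    accuracy  = (TP + TN) / (TP + TN + FP + FN)
    precision = TP / (TP + FP)
    recall    = TP / (TP + FN)
    f1_score  = 2 * precision * recall / (precision + recall)
    return accuracy, precision, recall, f1_score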
13.Give an example where High Accuracy is not usable.
Ans : SCENARIO: An expensive robotic chicken crosses a very busy road a
thousand times per day. An ML model evaluates traffic patterns and
predicts when this chicken can safely cross the street with an accuracy of
99.99%.
Explanation: A 99.99% accuracy value on a very busy road strongly suggests
that the ML model is far better than chance. In some settings, however, the
cost of making even a small number of mistakes is still too high. 99.99%
accuracy means that the expensive chicken will need to be replaced, on
average, every 10 days. (The chicken might also cause extensive damage to
cars that it hits.)
14.Give an example where High Precision is not usable.
Ans : Example: predicting a mail as "Spam" or "Not Spam".
False Positive: the mail is predicted as "spam" but it is "not spam".
False Negative: the mail is predicted as "not spam" but it is "spam".
A filter can have high Precision (very few important mails wrongly marked as spam)
and still let too many spam mails through as False Negatives, which makes the
filter ineffective. Hence high Precision alone is not usable here; Recall also
has to be considered.
15. Which evaluation metric would be crucial in the following cases? Justify.
In a case like Forest Fire, a False Negative can cost us a lot and is risky
too. Imagine no alert being given even when there is a Forest Fire. The whole
forest might burn down.
Another case where a False Negative can be dangerous is Viral Outbreak.
Imagine a deadly virus has started spreading and the model which is supposed
to predict a viral outbreak does not detect it. The virus might spread widely
and infect a lot of people.
On the other hand, there can be cases in which the False Positive
condition costs us more than False Negatives. One such case is Mining.
Imagine a model telling you that there exists treasure at a point and you keep
on digging there but it turns out that it is a false alarm. Here, the False
Positive case (predicting there is a treasure but there is no treasure) can be
very costly.
Similarly, let’s consider a model that predicts whether a mail is spam or
not. If the model wrongly predicts that an important mail is spam, the user would
not look at it and might eventually lose important information. Here also the
False Positive condition (predicting the mail as spam while the mail is not spam)
has a high cost.
16. Cases of High FN Cost
Forest Fire
Viral Outbreak
Cases of High FP Cost
Spam Filtering
Mining
17. Calculate Accuracy, Precision, Recall and F1 Score for the following
Confusion Matrix on Heart Attack Risk. Also suggest which metric would be a
good evaluation parameter here and why?
Where True Positive (TP) = 50, True Negative (TN) = 20, False Positive (FP) = 20 and
False Negative (FN) = 10.
Accuracy
=((50+20) / (50+20+20+10))*100%
= (70/100) * 100%
= 0.7 * 100% = 70%
Precision:
Precision is defined as the percentage of true positive cases versus all the cases
where the prediction is true.
= (50 / (50 + 20)) * 100%
= (50/70)*100%
= 0.714 *100% = 71.4%
Recall: It is defined as the fraction of positive cases that are correctly identified.
= 50 / (50 + 10)
= 50 / 60
= 0.833
F1 Score:
F1 score is defined as the measure of balance between precision and recall.
= 2 * (0.714 * 0.833) / (0.714 + 0.833)
= 1.190 / 1.547
= 0.769
Therefore,
Accuracy = 0.7, Precision = 0.714, Recall = 0.833, F1 Score = 0.769
Here there is a trade-off within the test, but Recall is the good evaluation metric
for this case, and it is the metric that most needs to improve.
Because,
False Positive (impacts Precision): a person is predicted as high risk but does not
have a heart attack.
False Negative (impacts Recall): a person is predicted as low risk but does have a
heart attack. False Negatives miss actual heart patients, hence the Recall metric
needs more improvement.
False Negatives are more dangerous here than False Positives.
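As a quick check, the evaluate() sketch given after question 12 reproduces these figures (rounded); the same call can be reused for questions 18 and 19 by changing the four counts:

# Heart-attack example: TP = 50, TN = 20, FP = 20, FN = 10.
accuracy, precision, recall, f1_score = evaluate(50, 20, 20, 10)
print(round(accuracy, 3), round(precision, 3), round(recall, 3), round(f1_score, 3))
# prints: 0.7 0.714 0.833 0.769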
18. Calculate Accuracy, Precision, Recall and F1 Score for the following Confusion
Matrix on Water Shortage in Schools: Also suggest which metric would not be
a good evaluation parameter here and why?
Where True Positive (TP) = 75, True Negative (TN) = 15, False Positive (FP) = 5 and
False Negative (FN) = 5.
Accuracy
Accuracy is defined as the percentage of correct predictions out of all the
observations
= ((75+15) / (75+15+5+5))*100%
= (90 / 100) *100%
=0.9 *100% = 90%
Precision:
Precision is defined as the percentage of true positive cases versus all the cases
where the prediction is true.
= (75 / (75+5))*100%
= (75 /80)*100%
= 0.9375 * 100% = 93.75%
Recall:
It is defined as the fraction of positive cases that are correctly identified.
= 75 / (75+5)
= 75 /80
= 0.9375
F1 Score:
F1 score is defined as the measure of balance between precision and recall.
= 2 * ((0.9375 * 0.9375) / (0.9375 + 0.9375))
= 2 * (0.8789 / 1.875)
= 2 * 0.46875 = 0.9375
Accuracy = 90%, Precision = 93.75%, Recall = 0.9375, F1 Score = 0.9375
Here Precision, Recall and the F1 Score are all the same (0.9375), because the
numbers of False Positives and False Negatives are equal; Accuracy is slightly
lower at 90%.
19. Calculate Accuracy, Precision, Recall and F1 Score for the following
Confusion Matrix on SPAM FILTERING: Also suggest which metric would not be a
good evaluation parameter here and why?
Where True Positive (TP) = 10, True Negative (TN) = 25, False Positive (FP) = 55 and
False Negative (FN) = 10.
Accuracy
Accuracy is defined as the percentage of correct predictions out of all the
observations.
= ((10 + 25) / (10 + 25 + 55 + 10)) * 100%
= (35 / 100) * 100%
= 0.35 * 100% = 35%
Precision:
Precision is defined as the percentage of true positive cases versus all the cases
where the prediction is true.
= (10 / (10 +55))*100%
= (10 /65) *100%
= 0.15 *100% = 15%
Recall:
It is defined as the fraction of positive cases that are correctly identified.
= 10 / (10 + 10)
= 10 / 20
= 0.5
F1 Score
F1 score is defined as the measure of balance between precision and recall.
= 2 * ((0.15 * 0.5) / (0.15 + 0.5))
= 2 * (0.075 / 0.65)
= 2 * 0.115
= 0.23
Accuracy = 35%, Precision = 15%, Recall = 0.5, F1 Score = 0.23
Here there is a trade-off within the test, and Precision (15%) is the metric that
is not good here and needs to improve the most.
Because,
False Positive (impacts Precision): a mail is predicted as "spam" but it is not spam.
False Negative (impacts Recall): a mail is predicted as "not spam" but it is spam.
Too many False Negatives will make the spam filter ineffective, but False
Positives may cause important mails to be missed. Hence, Precision is the more
important metric to improve.