DELHI PUBLIC SCHOOL
NACHARAM
Class: X Notes
Unit-3: Evaluating Models-Notes
1. In a medical test for a rare disease, out of 1000 people tested, 50 actually have the disease
while 950 do not. The test correctly identifies 40 out of the 50 people with the disease as
positive, but it also wrongly identifies 30 of the healthy individuals as positive.
What is the accuracy of the test?
A) 97% B) 90% C) 85% D) 70%
2. A student solved 90 out of 100 questions correctly in a multiple-choice exam. What is the error
rate of the student's answers?
A) 10% B) 9% C) 8% D) 11%
3. In a spam email detection system, out of 1000 emails received, 300 are spam. The system
correctly identifies 240 spam emails as spam, but it also marks 60 legitimate emails as spam.
What is the precision of the system?
A) 80% B) 70% C) 75% D) 90%
4. In a binary classification problem, a model predicts 70 instances as positive out of which 50
are actually positive. What is the recall of the model?
A) 50% B) 70% C) 80% D) 100%
5. In a sentiment analysis task, a model correctly predicts 120 positive sentiments out of 200
positive instances. However, it also incorrectly predicts 40 negative sentiments as positive. What
is the F1 score of the model?
A) 0.8 B) 0.75 C) 0.72 D) 0.82
6. A medical diagnostic test is designed to detect a certain disease. Out of 1000 people tested,
100 have the disease, and the test identifies 90 of them correctly. However, it also wrongly
identifies 50 healthy people as having the disease. What is the precision of the test?
A) 90% B) 80% C) 70% D) 60%
7. A teacher's marks prediction system predicts the marks of a student as 75, but the actual
marks obtained by the student are 80. What is the absolute error in the prediction?
A) 5 B) 10 C) 15 D) 20
8. The goal when evaluating an AI model is to:
A) Maximize error and minimize accuracy
B) Minimize error and maximize accuracy
C) Focus solely on the number of data points used
D) Prioritize the complexity of the model
9. A high F1 score generally suggests:
A) A significant imbalance between precision and recall
B) A good balance between precision and recall
C) A model that only performs well on specific data points
D) The need for more training data
10. How is the relationship between model performance and accuracy described?
A) Inversely proportional B) Not related
C) Directly proportional D) Randomly fluctuating
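Every numerical question above reduces to four standard formulas: accuracy, precision, recall, and F1 score. As a quick way to check your working, here is a minimal Python sketch (the function names are our own, for illustration) that computes each metric from confusion-matrix counts:

# tp = true positives, tn = true negatives,
# fp = false positives, fn = false negatives
def accuracy(tp, tn, fp, fn):
    return (tp + tn) / (tp + tn + fp + fn)

def precision(tp, fp):
    return tp / (tp + fp)

def recall(tp, fn):
    return tp / (tp + fn)

def f1_score(tp, fp, fn):
    p, r = precision(tp, fp), recall(tp, fn)
    return 2 * p * r / (p + r)

# Example: the spam-detection question (Q3) has tp = 240, fp = 60
print(precision(tp=240, fp=60))  # 0.8, i.e. 80%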
Answer the following questions
1. What will happen if you deploy an AI model without evaluating it with known test set
data?
Ans: Deploying an AI model without evaluating it on a known test set is risky and can
lead to several serious consequences:
● Poor performance in the real world: Without testing, we have no idea how well the
model generalizes to new, unseen data. It may have high accuracy on the training data but
fail in real-world scenarios.
● Overfitting risks: The model might simply be memorizing the training data rather than
learning generalizable patterns. Test data is crucial for identifying this issue.
● Bias and fairness issues: Unseen biases or unfair treatment of certain groups may go
undetected if the model isn't evaluated on a properly balanced test set.
2. Do you think evaluating an AI model is essential in an AI project cycle?
Ans: Yes. Evaluation helps you understand a model's strengths, weaknesses, and suitability
for the task at hand. This feedback loop is essential for building trustworthy and reliable AI
systems.
● Model evaluation is the process of using different evaluation metrics to
understand a machine learning model's performance
● An AI model gets better with constructive feedback
● We build a model, get feedback from metrics, make improvements, and continue
until we achieve a desirable accuracy
3. Explain train-test split with an example
Ans:
● The train-test split is a technique for evaluating the performance of a machine
learning algorithm
● It can be used for any supervised learning algorithm
● The procedure involves taking a dataset and dividing it into two subsets: The
training dataset and the testing dataset
● The train-test procedure is appropriate when there is a sufficiently large dataset
available
Need for the train-test split
● The train dataset is used to make the model learn
● The input elements of the test dataset are provided to the trained model. The
model makes predictions, and the predicted values are compared to the
expected values
● The objective is to estimate the performance of the machine learning model on
new data: data not used to train the model (see the sketch below)
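As a concrete example, here is a minimal sketch using scikit-learn's train_test_split (the toy dataset and the 80/20 split ratio are illustrative choices, not fixed rules):

from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

# Toy dataset: X holds the input features, y holds the labels
X = [[i] for i in range(100)]
y = [0] * 50 + [1] * 50

# Hold out 20% of the data as the test set;
# the model never sees it during training
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)

# Learn only from the training subset
model = DecisionTreeClassifier().fit(X_train, y_train)

# Predict on the held-out inputs and compare with the expected values
predictions = model.predict(X_test)
print(accuracy_score(y_test, predictions))

Comparing the predictions against the held-out labels gives an estimate of how the model will behave on new, unseen data.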
4. “Understanding both error and accuracy is crucial for effectively evaluating and
improving AI models.” Justify this statement.
Ans: Accuracy measures how often the AI model makes correct predictions.
It tells us the overall performance by showing the proportion of correct results among all
predictions made.
Error shows how often and how much the AI model is wrong.
It reflects the difference between the model’s predictions and the actual outcomes.
When developing AI models, it is important to maximize accuracy and minimize error
because:
A model with high accuracy is more reliable and trustworthy for real-world applications.
A model with low error ensures fewer wrong predictions, which is critical especially in
sensitive areas like medical diagnosis, banking, or autonomous driving.
Thus, understanding both error and accuracy helps us:
● Select the right machine learning model for a specific task.
● Evaluate the model's strengths and weaknesses properly.
● Improve the AI model by balancing accuracy and minimizing important errors based on
the task requirements (a worked sketch follows below).
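For a classification model, the two quantities are complementary: Error Rate = 1 - Accuracy. A small worked sketch, using the same numbers as MCQ 2 above:

# Out of 100 predictions, suppose 90 are correct (as in MCQ 2)
correct, total = 90, 100

accuracy = correct / total   # 0.9 -> 90%
error_rate = 1 - accuracy    # 0.1 -> 10%
print(accuracy, error_rate)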
5. What is classification accuracy? Can it be used all times for evaluating AI models?
Ans: Classification Accuracy is an evaluation metric that measures the percentage of
correct predictions made by a classification model out of all predictions made.
It is calculated as:
Classification Accuracy = (Number of Correct Predictions / Total Number of Predictions) × 100
● No, classification accuracy cannot always be used on its own to evaluate AI models,
especially in some special cases. Classification accuracy is suitable only when the
dataset is balanced (i.e., when the number of examples from each class is almost
equal).
If the dataset is imbalanced (one class has many more examples than another),
accuracy can give a false impression of model performance.
Example of the imbalance problem:
Suppose 95% of patients are healthy and 5% of patients are sick.
If a model predicts that everyone is healthy, it will still have 95% accuracy, but it completely fails
to detect sick patients, which is very dangerous in real life. In such cases, we should also use
other metrics like:
● Precision
● Recall
● F1-Score
These metrics give a better evaluation when the dataset is imbalanced, as the sketch below demonstrates.
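The healthy/sick example can be reproduced in a few lines. A minimal sketch (assuming scikit-learn is available) showing how a model that predicts "healthy" for everyone still scores 95% accuracy while its recall on the sick class is 0:

from sklearn.metrics import accuracy_score, recall_score

# 95 healthy patients (label 0) and 5 sick patients (label 1),
# as in the example above
y_true = [0] * 95 + [1] * 5

# A useless "model" that predicts healthy for every patient
y_pred = [0] * 100

print(accuracy_score(y_true, y_pred))             # 0.95 -> looks impressive
print(recall_score(y_true, y_pred, pos_label=1))  # 0.0  -> detects no sick patients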
Assertion and reasoning-based questions:
1. Assertion: Accuracy is an evaluation metric that allows you to measure the total number
of predictions a model gets right. Reasoning: The accuracy of the model and the
performance of the model are directly proportional; hence, the better the performance of
the model, the more accurate the predictions.
Choose the correct option:
(a) Both A and R are true and R is the correct explanation for A
(b) Both A and R are true and R is not the correct explanation for A
(c) A is True but R is False
(d) A is false but R is True
2. Assertion: The sum of the values in a confusion matrix's row represents the total number
of instances for a given actual class. Reasoning: This enables the calculation of
class-specific metrics such as precision and recall, which are essential for evaluating a
model's performance across different classes.
Choose the correct option:
(a) Both A and R are true and R is the correct explanation for A
(b) Both A and R are true and R is not the correct explanation for A
(c) A is True but R is False
(d) A is false but R is True
3. Identify which metric (Precision or Recall) is to be used in the following cases and why?
a) Email Spam Detection
b) Cancer Diagnosis
c) Legal Cases (Innocent until proven guilty)
d) Fraud Detection
e) Safe Content Filtering (like Kids YouTube)
Ans:
a) Email Spam Detection: Precision, because a legitimate email wrongly marked as spam (a
false positive) is costlier than letting an occasional spam email through.
b) Cancer Diagnosis: Recall, because failing to detect a patient who actually has the disease (a
false negative) is far more dangerous than a false alarm.
c) Legal Cases (Innocent until proven guilty): Precision, because wrongly convicting an
innocent person (a false positive) must be avoided.
d) Fraud Detection: Recall, because missing a fraudulent transaction (a false negative) is
costlier than flagging a genuine one for review.
e) Safe Content Filtering (Kids YouTube): Recall, because letting an unsafe video slip through
(a false negative) is worse than blocking some safe content.