Capstone Project Using Data Science Methodology
CLASS 12 - ARTIFICIAL INTELLIGENCE (CBSE)
This PDF contains:
- Detailed Notes on Data Science Methodology
- Important 3-Mark and 5-Mark Questions with Answers
- Examples and Tables for Exam Preparation
DETAILED NOTES
The Data Science Methodology (IBM/John Rollins) consists of 10 steps grouped into 5 stages:
1. Business Understanding - Define the real-life/business problem using 5W1H and Design Thinking.
2. Analytic Approach - Decide the analytics type: Descriptive, Diagnostic, Predictive, Prescriptive.
3. Data Requirements - Identify the type, format, and quality of data required.
4. Data Collection - Collect from primary (surveys, sensors) or secondary (public datasets) sources.
5. Data Understanding - Explore data using statistics and visualizations; identify anomalies.
6. Data Preparation - Clean, transform, and engineer features (handle missing data, normalize, encode).
7. Modeling - Choose algorithms (Classification, Regression, Clustering) and train models.
8. Evaluation - Test the model using metrics (Accuracy, Precision, Recall for classification; MSE, RMSE for regression).
9. Deployment - Integrate the model into a real-world system (app, website).
10. Feedback - Collect user feedback and retrain for improvement.
IMPORTANT 3-MARK QUESTIONS WITH ANSWERS
1. What are the five stages of Data Science Methodology?
Answer: Business Understanding & Analytic Approach, Data Requirements & Collection,
Data Understanding & Preparation, Modeling & Evaluation, Deployment & Feedback.
2. State the difference between Descriptive and Predictive Analytics with examples.
Answer: Descriptive - Summarizes past events (Example: Monthly sales report).
Predictive - Forecasts future events (Example: Predicting next month's sales).
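The contrast can be seen on one small dataset. The sketch below is only illustrative: the sales figures are made up, the function names are hypothetical, and the moving-average forecast is just one very simple predictive technique chosen for brevity.

```python
from statistics import mean

# Hypothetical monthly sales figures (illustrative values, not real data).
monthly_sales = [120, 135, 150, 165, 180, 195]

def describe(sales):
    """Descriptive analytics: summarize what has already happened."""
    return {
        "total": sum(sales),
        "average": mean(sales),
        "best_month": max(sales),
    }

def forecast_next(sales, window=3):
    """Predictive analytics (naive): estimate the next value as the
    average of the last `window` observations (moving-average forecast)."""
    return mean(sales[-window:])

summary = describe(monthly_sales)
prediction = forecast_next(monthly_sales)

print("Past sales summary:", summary)
print("Forecast for next month:", prediction)
```

The first function only reports on the past (a monthly sales report); the second uses the past to estimate a value that has not been observed yet.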
3. Write any three steps involved in Data Preparation.
Answer: Handling missing values, Removing duplicates/outliers, Transforming & encoding data.
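A minimal sketch of these steps in plain Python follows. The records, column names, and helper functions are hypothetical; it removes exact duplicates, fills missing values with the column mean, and label-encodes a categorical column. Outlier removal is left out to keep the example short.

```python
# Hypothetical raw records; None marks a missing value.
raw = [
    {"name": "Asha", "hours": 4.0, "grade": "B"},
    {"name": "Ravi", "hours": None, "grade": "A"},
    {"name": "Asha", "hours": 4.0, "grade": "B"},  # exact duplicate
    {"name": "Meena", "hours": 8.0, "grade": "C"},
]

def remove_duplicates(rows):
    """Keep the first occurrence of each identical record."""
    seen, unique = set(), []
    for row in rows:
        key = tuple(sorted(row.items()))
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique

def fill_missing(rows, column):
    """Replace missing values in `column` with the mean of the known values."""
    known = [r[column] for r in rows if r[column] is not None]
    fill = sum(known) / len(known)
    return [{**r, column: fill if r[column] is None else r[column]} for r in rows]

def encode(rows, column):
    """Label-encode a categorical column: each category maps to an integer."""
    codes = {cat: i for i, cat in enumerate(sorted({r[column] for r in rows}))}
    return [{**r, column: codes[r[column]]} for r in rows], codes

clean = remove_duplicates(raw)
clean = fill_missing(clean, "hours")
clean, grade_codes = encode(clean, "grade")

for row in clean:
    print(row)
print("Grade codes:", grade_codes)
```

Mean imputation and label encoding are assumptions made for illustration; other choices (median, one-hot encoding) may suit a given dataset better.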
4. What is the importance of Feature Engineering? Give one example.
Answer: Feature Engineering improves model performance by creating new, more informative features from existing data.
Example: Deriving Age from Date of Birth.
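The Age-from-Date-of-Birth example can be sketched as below. The function name and the sample dates are hypothetical; the idea is that a raw date is hard for a model to use directly, while age in completed years is a simple numeric feature.

```python
from datetime import date

def age_in_years(date_of_birth, on_date):
    """Return the number of completed years between date_of_birth and on_date."""
    years = on_date.year - date_of_birth.year
    # Subtract one if the birthday has not yet occurred in on_date's year.
    if (on_date.month, on_date.day) < (date_of_birth.month, date_of_birth.day):
        years -= 1
    return years

dob = date(2007, 5, 20)  # hypothetical student
print(age_in_years(dob, date(2025, 5, 19)))  # day before the birthday
print(age_in_years(dob, date(2025, 5, 20)))  # on the birthday
```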
5. Define Precision and Recall with formulas.
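Answer: Precision = TP / (TP + FP): the fraction of predicted positives that are actually positive.
Recall = TP / (TP + FN): the fraction of actual positives that are correctly identified.
Here TP = true positives, FP = false positives, FN = false negatives.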
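As a worked example of the two formulas, the sketch below counts TP, FP, and FN from a pair of label lists and applies the formulas directly. The labels are invented for illustration and the function names are hypothetical.

```python
def confusion_counts(actual, predicted, positive=1):
    """Count true positives, false positives, and false negatives."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == positive and p == positive)
    fp = sum(1 for a, p in pairs if a != positive and p == positive)
    fn = sum(1 for a, p in pairs if a == positive and p != positive)
    return tp, fp, fn

def precision(tp, fp):
    """Precision = TP / (TP + FP); defined as 0.0 if nothing was predicted positive."""
    return tp / (tp + fp) if (tp + fp) else 0.0

def recall(tp, fn):
    """Recall = TP / (TP + FN); defined as 0.0 if there are no actual positives."""
    return tp / (tp + fn) if (tp + fn) else 0.0

# Hypothetical labels: 1 = positive class, 0 = negative class.
actual    = [1, 1, 1, 0, 0, 0, 1, 0, 0, 0]
predicted = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]

tp, fp, fn = confusion_counts(actual, predicted)
print("TP, FP, FN:", tp, fp, fn)
print("Precision:", precision(tp, fp))
print("Recall:", recall(tp, fn))
```

Here 3 of the 5 predicted positives are correct (precision 3/5), and 3 of the 4 actual positives are found (recall 3/4).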
IMPORTANT 5-MARK QUESTIONS WITH ANSWERS
1. Explain the 10 steps of the Data Science Methodology with one example.
Answer: 1. Business Understanding, 2. Analytic Approach, 3. Data Requirements, 4. Data Collection,
5. Data Understanding, 6. Data Preparation, 7. Modeling, 8. Evaluation,
9. Deployment, 10. Feedback.
Example: Predicting student marks using study hours.
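The student-marks example could look roughly like the sketch below, which covers only the modeling and prediction steps. The data points are invented, the function names are hypothetical, and simple least-squares linear regression is an assumed choice of algorithm, not one prescribed by the methodology.

```python
# Hypothetical collected data: study hours per day and marks obtained.
hours = [1, 2, 3, 4, 5]
marks = [35, 45, 50, 60, 70]

def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def predict(x, slope, intercept):
    """Use the fitted line to estimate marks for a given number of study hours."""
    return slope * x + intercept

# Modeling: learn the relationship between hours and marks.
slope, intercept = fit_line(hours, marks)

# Deployment-style use: predict marks for a student who studies 6 hours.
estimate = predict(6, slope, intercept)
print("slope:", slope, "intercept:", intercept)
print("Predicted marks for 6 hours:", estimate)
```

In a full capstone project the earlier steps (defining the problem, collecting and cleaning data) and the later ones (evaluation on unseen data, feedback) would surround this core.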
2. Explain the role of Design Thinking in Capstone Projects.
Answer: Design Thinking has five stages: Empathize, Define, Ideate, Prototype, Test.
It keeps problem solving user-centric.
3. Differentiate between Classification, Regression, and Clustering with examples.
Answer: Classification - Predict categories (Spam or Not Spam).
Regression - Predict continuous values (House prices).
Clustering - Group similar data (Customer segmentation).
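Classification and regression learn from labelled examples, whereas clustering groups data without any labels. The sketch below illustrates the clustering case with a small one-dimensional k-means on made-up customer spending values. The data, the function name, and the deterministic choice of starting centroids are assumptions made to keep the example short and repeatable.

```python
def kmeans_1d(values, k=2, iterations=10):
    """Group numbers into k clusters using a basic k-means loop."""
    ordered = sorted(values)
    n = len(ordered)
    # Deterministic start: take the middle value of each of k equal chunks.
    centroids = [ordered[(2 * i + 1) * n // (2 * k)] for i in range(k)]
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        # Assignment step: each value joins its nearest centroid.
        clusters = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda c: abs(v - centroids[c]))
            clusters[nearest].append(v)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        centroids = [
            sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids, clusters

# Hypothetical monthly spending per customer (no labels provided).
spending = [200, 220, 250, 900, 950, 1000]

centroids, clusters = kmeans_1d(spending, k=2)
print("Low spenders: ", clusters[0])
print("High spenders:", clusters[1])
```

No customer was labelled "low" or "high" beforehand; the two segments emerge from the similarity of the values alone.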
4. Describe the Evaluation process in Data Science with metrics.
Answer: Classification - Accuracy, Precision, Recall, F1 Score.
Regression - MSE, RMSE, R² Score.
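The metrics named above can be computed directly from their standard definitions. The sketch below uses invented actual and predicted values and hypothetical function names: MSE is the mean of squared errors, RMSE is its square root, R² is 1 minus the ratio of residual to total sum of squares, and F1 is the harmonic mean of precision and recall.

```python
from math import sqrt

def mse(actual, predicted):
    """Mean Squared Error: average of squared differences."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    """Root Mean Squared Error: square root of MSE, in the target's own units."""
    return sqrt(mse(actual, predicted))

def r2_score(actual, predicted):
    """R squared: 1 - (residual sum of squares / total sum of squares)."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1 - ss_res / ss_tot

def f1_score(precision, recall):
    """F1 Score: harmonic mean of precision and recall (0.0 if both are 0)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Hypothetical regression results (e.g. actual vs. predicted marks).
actual_marks = [50, 60, 70, 80]
predicted_marks = [52, 58, 73, 77]

print("MSE: ", mse(actual_marks, predicted_marks))
print("RMSE:", rmse(actual_marks, predicted_marks))
print("R2:  ", r2_score(actual_marks, predicted_marks))

# Hypothetical classification results: precision 0.6, recall 0.75.
print("F1:  ", f1_score(0.6, 0.75))
```

Lower MSE and RMSE mean smaller errors, while R² and F1 are better when closer to 1.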