Viva Questions and Answers: Data Science
Q: What is Data Science?
A: Data Science is a multidisciplinary field that uses scientific methods, processes, algorithms, and
systems to extract knowledge and insights from structured and unstructured data.
Q: What are the key components of the Data Science process?
A: The key components include data collection, data cleaning, exploratory data analysis, model
building, evaluation, and deployment.
Q: Explain the difference between Data Science and Data Analytics.
A: Data Science is the broader field: it covers building predictive models and applying machine
learning, often to answer open-ended questions. Data Analytics is narrower and focuses on
processing existing datasets and analyzing them statistically to answer specific questions.
Q: What is Descriptive Statistics?
A: Descriptive statistics summarizes and organizes characteristics of a data set using measures
such as mean, median, mode, standard deviation, and graphical tools.
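As a minimal sketch, the measures named above can be computed with Python's standard statistics
module. The scores list is made-up illustrative data, not taken from any real dataset.

```python
import statistics

# Hypothetical exam scores, used only for illustration.
scores = [72, 85, 85, 90, 64, 78, 85, 91]

summary = {
    "mean": statistics.mean(scores),      # arithmetic average
    "median": statistics.median(scores),  # middle value of the sorted data
    "mode": statistics.mode(scores),      # most frequent value
    "stdev": statistics.stdev(scores),    # sample standard deviation
}

for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```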
Q: What is Inferential Statistics?
A: Inferential statistics allows us to make predictions or inferences about a population based on a
sample of data using methods such as hypothesis testing and confidence intervals.
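One way to illustrate a confidence interval is the sketch below, which estimates a population mean
from a sample. It assumes a normal approximation (a z-interval); for small samples a t-interval is
usually more appropriate. The function name mean_confidence_interval and the sample values are
hypothetical.

```python
import math
import statistics

def mean_confidence_interval(sample, confidence=0.95):
    """Return (low, high) for the population mean using a normal approximation."""
    n = len(sample)
    mean = statistics.mean(sample)
    # Standard error of the mean: sample standard deviation over sqrt(n).
    sem = statistics.stdev(sample) / math.sqrt(n)
    # Two-sided critical value, roughly 1.96 for 95% confidence.
    z = statistics.NormalDist().inv_cdf(0.5 + confidence / 2)
    return mean - z * sem, mean + z * sem

# Hypothetical measurements, used only for illustration.
sample = [4.8, 5.1, 5.0, 4.9, 5.2, 5.0, 4.7, 5.3]
low, high = mean_confidence_interval(sample)
print(f"95% CI for the mean: ({low:.3f}, {high:.3f})")
```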
Q: What is Cross Validation?
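A: Cross Validation is a model validation technique for assessing how well a model's results will
generalize to an independent dataset. The data is split repeatedly into training and test
portions, and the scores are averaged. Commonly used forms include k-fold cross validation, where
each of k folds serves once as the test set, and stratified k-fold, which also preserves the class
proportions in every fold.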
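The splitting step of k-fold cross validation can be sketched in plain Python as follows. This is
an illustrative helper (k_fold_indices is a hypothetical name), without the shuffling or
stratification that library implementations usually offer.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) pairs for k-fold cross validation."""
    if k < 2 or k > n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    # Spread the remainder so that fold sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, test_idx
        start += size

folds = list(k_fold_indices(10, 3))
for train_idx, test_idx in folds:
    print("train:", train_idx, "test:", test_idx)
```

Each sample appears in exactly one test fold, so every observation is used for both training and
evaluation across the k rounds.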
Q: List different types of data visualizations.
A: Common data visualizations include histograms, pie charts, bar charts, line charts, scatter plots,
box plots, and heatmaps.
Q: What are outliers and how can they be detected?
A: Outliers are data points that differ significantly from other observations. They can be detected
using statistical rules such as z-scores or the interquartile range (IQR), visualization tools such
as box plots and scatter plots, or machine learning methods like Isolation Forest or Local Outlier
Factor.
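One common statistical check is the interquartile range (IQR) rule, sketched below. The 1.5
multiplier is the conventional default, and the function name iqr_outliers and the readings are
hypothetical.

```python
import statistics

def iqr_outliers(data, factor=1.5):
    """Return values lying more than factor * IQR outside the quartiles."""
    q1, _, q3 = statistics.quantiles(data, n=4)  # the three quartile cut points
    iqr = q3 - q1
    lower, upper = q1 - factor * iqr, q3 + factor * iqr
    return [x for x in data if x < lower or x > upper]

# Hypothetical sensor readings with one obviously extreme value.
readings = [10, 12, 11, 13, 12, 11, 95, 12, 10, 13]
outliers = iqr_outliers(readings)
print(outliers)
```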
Q: What is Time Series Forecasting?
A: Time Series Forecasting involves using historical, time-ordered data to predict future values.
Techniques include ARIMA, exponential smoothing, and machine learning approaches.
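As a rough illustration of exponential smoothing, the sketch below implements the simplest
variant (no trend or seasonality) and returns a one-step-ahead forecast. The function name, the
smoothing factor alpha, and the sales figures are assumptions made for the example.

```python
def simple_exponential_smoothing(series, alpha):
    """Return a one-step-ahead forecast using simple exponential smoothing."""
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    level = series[0]  # initialise the level with the first observation
    for value in series[1:]:
        # New level is a weighted blend of the latest value and the old level.
        level = alpha * value + (1 - alpha) * level
    return level

# Hypothetical monthly sales figures.
sales = [10, 20, 30]
forecast = simple_exponential_smoothing(sales, alpha=0.5)
print(forecast)
```

A larger alpha makes the forecast react faster to recent observations, while a smaller alpha
smooths out noise.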
Q: Explain the use of Predictive Models in Data Science.
A: Predictive models use patterns found in historical and transactional data to identify risks and
opportunities. Common applications include fraud detection and sales forecasting.
Q: What is a Recommendation Engine?
A: A recommendation engine is a system that suggests products, services, or information to users
based on analysis of data. Techniques include collaborative filtering and content-based filtering.
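A minimal sketch of user-based collaborative filtering is shown below. It assumes cosine
similarity between users' rating dictionaries and scores unseen items by similarity-weighted
ratings. The users, items, ratings, and function names are all hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two {item: rating} dictionaries."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[item] * b[item] for item in shared)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b)

def recommend(target, ratings, top_n=2):
    """Rank items the target user has not rated, weighted by user similarity."""
    scores = {}
    for other, other_ratings in ratings.items():
        if other == target:
            continue
        sim = cosine_similarity(ratings[target], other_ratings)
        for item, rating in other_ratings.items():
            if item not in ratings[target]:
                scores[item] = scores.get(item, 0.0) + sim * rating
    return sorted(scores, key=scores.get, reverse=True)[:top_n]

# Hypothetical ratings on a 1-5 scale.
ratings = {
    "alice": {"book_a": 5, "book_b": 4},
    "bob": {"book_a": 5, "book_b": 4, "book_c": 5},
    "carol": {"book_b": 1, "book_d": 4},
}
suggestions = recommend("alice", ratings)
print(suggestions)
```

Content-based filtering would instead compare item attributes (for example genre or keywords)
with the items a user already liked.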