Data Science Interview Questions For Freshers

Uploaded by

socraties25

DATA SCIENCE
Habib Shaikh, AI Expert

INTERVIEW QUESTIONS FOR FRESHERS

What is Data Science?

Data Science is a multidisciplinary field that
blends various techniques to analyze large data
sets and derive actionable insights.

Key Components
Data Collection: Gathering raw data from various sources.
Data Cleaning: Removing errors and inconsistencies for accuracy.
Data Storage: Warehousing and structuring data for accessibility.
Analysis Methods: Applying statistical, mathematical, and machine learning algorithms.
Visualization: Presenting findings through charts and graphs for better understanding.

Define the terms KPI, lift, model fitting, robustness, and DOE.

KPI (Key Performance Indicator): A measurable value that shows how effectively a business is achieving its objectives.
Lift: A metric that compares the performance of a target model against a random-choice baseline. A lift of 1 means the model predicts no better than using no model at all.
Model Fitting: How closely a model matches the observed data.
Robustness: The ability of a system or model to keep performing well when its inputs are noisy or vary.
DOE (Design of Experiments): A structured approach to planning experiments in which input variables (factors) are varied deliberately under controlled conditions, so that their effect on the outcome can be measured and explained.
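
To make the lift definition concrete, here is a minimal Python sketch. The campaign data, the `lift` helper, and the variable names are hypothetical and chosen only for illustration; lift is computed here as the response rate in the model-selected group divided by the overall response rate, which is one common way to express it.

```python
def lift(targeted_outcomes, all_outcomes):
    """Response rate in the model-selected group divided by the overall rate."""
    targeted_rate = sum(targeted_outcomes) / len(targeted_outcomes)
    baseline_rate = sum(all_outcomes) / len(all_outcomes)
    return targeted_rate / baseline_rate

# Hypothetical campaign: 1 = customer responded, 0 = did not respond.
all_customers = [1] * 10 + [0] * 90   # 10% respond overall
model_top_20 = [1] * 6 + [0] * 14     # 30% respond among the model's top 20

campaign_lift = lift(model_top_20, all_customers)
print(round(campaign_lift, 2))
```

A value above 1 suggests the model selects responders better than random choice would.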

What is the difference between data analytics and data science?

Data science involves transforming data using
advanced analysis methods to extract insights,
which can then be applied in various business
contexts. In contrast, data analytics focuses on
examining existing data to validate hypotheses
and support decision-making.
Data science is forward-looking, involving
predictive modeling and innovations for future
problem-solving, while data analytics is more
focused on understanding past trends for
immediate business decisions. Data science
encompasses a wider scope of techniques, tools,
and methodologies, while data analytics typically
deals with more specific, concentrated issues.

What are some sampling techniques and the advantages of sampling?

Because analyzing a large dataset in its entirety is often impractical, sampling selects a subset of data points for analysis. The two main categories of sampling techniques are:
Probability Sampling: Every unit has a known chance of being selected. Techniques include simple random sampling, stratified sampling, and cluster sampling.
Non-Probability Sampling: Selection is not random. Methods include convenience sampling, quota sampling, and snowball sampling.
Sampling keeps analysis manageable in time and cost. A well-designed probability sample can still represent the whole dataset, whereas non-probability samples are more prone to selection bias.
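
Two of the probability sampling techniques named above can be sketched in a few lines of Python. This is an illustrative sketch using only the standard library; the `population` records, the `region` field, and both helper functions are assumptions made up for the example.

```python
import random
from collections import defaultdict

def simple_random_sample(rows, n, rng):
    """Draw n rows without replacement; every row is equally likely."""
    return rng.sample(rows, n)

def stratified_sample(rows, key, fraction, rng):
    """Sample the same fraction from each stratum defined by key(row)."""
    strata = defaultdict(list)
    for row in rows:
        strata[key(row)].append(row)
    sample = []
    for group in strata.values():
        k = max(1, round(len(group) * fraction))  # at least one row per stratum
        sample.extend(rng.sample(group, k))
    return sample

rng = random.Random(42)  # fixed seed so the sketch is repeatable
# Hypothetical population: 80 rows from the north, 20 from the south.
population = [{"id": i, "region": "north" if i < 80 else "south"} for i in range(100)]

srs = simple_random_sample(population, 10, rng)
strat = stratified_sample(population, key=lambda r: r["region"], fraction=0.1, rng=rng)
```

The stratified sample keeps the 80/20 regional split of the population, while a simple random sample only matches it on average.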

List the conditions for overfitting and underfitting.

Overfitting: Occurs when a model performs well on training data but fails on new data. It is associated with low bias and high variance, and is often seen in deep, unpruned decision trees.
Underfitting: Happens when a model is too simple to capture the underlying relationships in the data, leading to poor performance even on training data. It is associated with high bias and low variance, for example when linear regression is applied to a strongly nonlinear relationship.

Differentiate between long and wide format data.

Long Format Data: Each row holds one observation, so a subject measured several times appears in several rows, with one column naming the variable and another holding its value. Common in R analysis and data logging.
Wide Format Data: Each subject has a single row, and its repeated observations are stored in separate columns. This format is typically used for repeated measures ANOVA in statistical packages.
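
The difference is easiest to see by converting one layout into the other. The sketch below uses plain Python dictionaries; the `wide` records, the column names, and the `wide_to_long` helper are hypothetical and only illustrate the reshaping.

```python
# Wide format: one row per subject, one column per repeated measurement.
wide = [
    {"subject": "A", "week1": 5.1, "week2": 5.6},
    {"subject": "B", "week1": 4.8, "week2": 5.0},
]

def wide_to_long(rows, id_col, value_cols, var_name="variable", value_name="value"):
    """Emit one output row per (subject, measurement) pair."""
    long_rows = []
    for row in rows:
        for col in value_cols:
            long_rows.append({id_col: row[id_col], var_name: col, value_name: row[col]})
    return long_rows

long = wide_to_long(wide, "subject", ["week1", "week2"])
for row in long:
    print(row)
```

Two wide rows with two measurements each become four long rows.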

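What are Eigenvectors and Eigenvalues?

An eigenvector of a square matrix A is a nonzero vector v whose direction is unchanged when A is applied to it, so that Av = λv. Eigenvectors are often normalized to unit length, but that is a convention rather than part of the definition.
The eigenvalue λ is the factor by which A scales the corresponding eigenvector. These concepts are essential in machine learning techniques like PCA, where the eigenvectors of the covariance matrix give the directions of greatest variance and the eigenvalues give the amount of variance along each direction.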
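
As a small check of the relation Av = λv that ties an eigenvalue λ to its eigenvector v, the sketch below verifies two eigenpairs of a hand-picked 2x2 matrix. The matrix and the `mat_vec` helper are illustrative assumptions; the vectors are deliberately left unnormalized to keep the arithmetic exact.

```python
def mat_vec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]

A = [[2, 1],
     [1, 2]]

# Known eigenpairs of A: eigenvalue 3 with (1, 1), eigenvalue 1 with (1, -1).
pairs = [(3, [1, 1]), (1, [1, -1])]

for eigenvalue, eigenvector in pairs:
    scaled = [eigenvalue * x for x in eigenvector]
    # Applying A only rescales the eigenvector; its direction is unchanged.
    print(mat_vec(A, eigenvector), scaled)
```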

What do high and low p-values mean?

A p-value is the probability, assuming the null hypothesis is true, of observing results at least as extreme as those actually observed.
Low p-value (< 0.05): The data would be unlikely under the null hypothesis, so the null hypothesis is rejected and the result is called statistically significant.
High p-value (> 0.05): The data are consistent with the null hypothesis, so it is not rejected. This does not prove that the null hypothesis is true.
p-value of 0.05: A borderline result that sits at the conventional significance threshold, so the decision could go either way.
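
A worked example may help. The sketch below computes an exact two-sided p-value for a coin-flip experiment under the null hypothesis that the coin is fair. The scenario (16 heads in 20 flips) and the `two_sided_binomial_p` helper are assumptions chosen for illustration; doubling one tail is valid here only because the fair-coin distribution is symmetric.

```python
from math import comb

def two_sided_binomial_p(heads, flips):
    """Exact two-sided p-value for the null hypothesis of a fair coin."""
    k = max(heads, flips - heads)  # distance from the middle, folded to one side
    tail = sum(comb(flips, i) for i in range(k, flips + 1)) / 2 ** flips
    return min(1.0, 2 * tail)      # double the tail, capped at 1

p_value = two_sided_binomial_p(16, 20)
print(f"{p_value:.4f}")
```

For 16 heads in 20 flips the p-value comes out near 0.012, below the 0.05 threshold, so the fair-coin hypothesis would be rejected at that level.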

When is resampling done?

Resampling is used to assess the stability and accuracy of a model or statistic by repeatedly drawing samples from the available data, as in bootstrapping and cross-validation. It helps quantify uncertainty and checks that the model performs consistently across different random selections of the data.
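
One common resampling method is the bootstrap. The sketch below estimates a percentile confidence interval for a mean by resampling with replacement; the `scores` values, the `bootstrap_ci` helper, and its default parameters are hypothetical and meant only to show the mechanics.

```python
import random
from statistics import mean

def bootstrap_ci(data, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the mean of data."""
    rng = random.Random(seed)
    # Each resample draws len(data) values with replacement.
    means = sorted(
        mean(rng.choices(data, k=len(data))) for _ in range(n_resamples)
    )
    lower = means[int(n_resamples * alpha / 2)]
    upper = means[int(n_resamples * (1 - alpha / 2)) - 1]
    return lower, upper

# Hypothetical accuracy scores from eight evaluation runs.
scores = [0.71, 0.74, 0.69, 0.80, 0.77, 0.73, 0.75, 0.72]
low, high = bootstrap_ci(scores)
print(low, high)
```

A narrow interval suggests the estimate is stable across random selections of the data; a wide one signals more uncertainty.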

What is imbalanced data?

Imbalanced data refers to a dataset where one or more classes are heavily underrepresented relative to the others. Models trained on it tend to favor the majority class, and overall accuracy becomes a misleading measure of performance.

Are there differences between expected value and mean value?

Both describe central tendency. The expected value is the theoretical, probability-weighted average of a random variable, determined by its probability distribution. The mean usually refers to the arithmetic average of observed data (the sample mean). As the number of observations grows, the sample mean tends toward the expected value.
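
A fair six-sided die makes the distinction concrete: the expected value comes from the probabilities alone, while a mean is computed from observed rolls. The sketch below is illustrative; the seed and the number of simulated rolls are arbitrary assumptions.

```python
import random
from fractions import Fraction
from statistics import mean

# Expected value: each face weighted by its probability of 1/6.
faces = range(1, 7)
expected_value = sum(Fraction(face, 6) for face in faces)  # exactly 7/2

# Sample mean: average of simulated rolls.
rng = random.Random(1)
rolls = [rng.randint(1, 6) for _ in range(10_000)]
sample_mean = mean(rolls)

print(expected_value, sample_mean)
```

The expected value is fixed at 3.5, while the sample mean varies from run to run and only approaches 3.5 as the number of rolls grows.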

What is Survivorship Bias?

Survivorship bias is the error made when
focusing on successful subjects and ignoring
those that failed, leading to false conclusions
based on incomplete data. This can distort
analyses and create misleading interpretations.

What is Gradient and Gradient Descent?

Gradient: The vector of partial derivatives of a function with respect to its inputs. It points in the direction of steepest increase, and its magnitude is the slope in that direction.
Gradient Descent: A technique that minimizes a function by iteratively moving the input a small step in the direction opposite to the gradient, which is the direction of steepest decrease. It is often applied to minimize loss functions in machine learning.
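
The update rule can be sketched on a one-dimensional function. The example below minimizes f(x) = (x - 3)², whose gradient is 2(x - 3); the function, the learning rate, and the step count are assumptions picked so the sketch converges quickly.

```python
def gradient_descent(grad, start, learning_rate=0.1, steps=100):
    """Repeatedly step against the gradient to approach a minimum."""
    x = start
    for _ in range(steps):
        x -= learning_rate * grad(x)  # move opposite to the slope
    return x

def grad_f(x):
    """Gradient of f(x) = (x - 3)**2."""
    return 2 * (x - 3)

minimum = gradient_descent(grad_f, start=0.0)
print(minimum)
```

Starting from 0, the iterates move toward x = 3, where the gradient is zero. A learning rate that is too large would make the steps overshoot instead of converge.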

Define confounding variables.

Confounding variables are extraneous factors
that affect both the independent and dependent
variables, causing spurious relationships that can
distort the conclusions drawn from data analysis.

Explain the bias-variance trade-off.

Bias: Error introduced by oversimplifying the model. Complex models like deep decision trees have low bias, while simpler models like linear regression have high bias.
Variance: Error caused by the model's sensitivity to the particular training set. Overly complex models have high variance, so they may overfit and perform poorly on unseen data.
The trade-off is that as model complexity increases, bias decreases but variance tends to increase, leading to overfitting. The goal is to find the level of complexity that minimizes total error on unseen data.

Define the confusion matrix.

A confusion matrix is a table used in classification tasks to evaluate a model's performance by comparing predicted classes with actual classes. For binary classification it is a 2x2 matrix whose cells count the four outcomes: True Positive (TP), False Positive (FP), True Negative (TN), and False Negative (FN). From these, metrics like accuracy, precision, recall, and F-score are derived.
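
The four counts and the metrics derived from them can be computed directly. The sketch below uses made-up labels for a binary classifier; the `confusion_counts` helper and the data are hypothetical.

```python
def confusion_counts(actual, predicted, positive=1):
    """Return (TP, FP, FN, TN) for a binary classification task."""
    pairs = list(zip(actual, predicted))
    tp = sum(a == positive and p == positive for a, p in pairs)
    fp = sum(a != positive and p == positive for a, p in pairs)
    fn = sum(a == positive and p != positive for a, p in pairs)
    tn = sum(a != positive and p != positive for a, p in pairs)
    return tp, fp, fn, tn

# Hypothetical ground truth and model predictions (1 = positive class).
actual    = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

tp, fp, fn, tn = confusion_counts(actual, predicted)
accuracy = (tp + tn) / len(actual)
precision = tp / (tp + fp)   # of predicted positives, how many were right
recall = tp / (tp + fn)      # of actual positives, how many were found
f1 = 2 * precision * recall / (precision + recall)

print(tp, fp, fn, tn, accuracy, precision, recall, f1)
```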

What is logistic regression? Provide an example of its use.

Logistic regression models the probability of a binary outcome by passing a linear combination of predictor variables through the logistic (sigmoid) function. For example, predicting whether a candidate wins an election based on factors like campaign spending and political history.
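
The prediction step of logistic regression can be sketched as follows. The coefficients below are hand-picked, hypothetical values rather than a fitted model, and the two features (campaign spending in millions, an incumbency flag) are assumptions loosely based on the election example.

```python
from math import exp

def sigmoid(z):
    """Map any real number to a probability between 0 and 1."""
    return 1 / (1 + exp(-z))

def predict_probability(features, weights, bias):
    """Linear combination of the predictors, passed through the sigmoid."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(z)

# Hypothetical, hand-picked coefficients; a real model would learn these.
weights = [0.8, 1.5]   # campaign spending (millions), incumbent (0 or 1)
bias = -2.0

p_win = predict_probability([1.0, 1.0], weights, bias)
print(round(p_win, 3))
```

A probability above a chosen threshold, often 0.5, is then read as a predicted win.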

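What is Linear Regression and its drawbacks?

Linear regression models the relationship between a continuous dependent variable and one or more independent variables as a straight line (or hyperplane). Drawbacks include the assumption of linearity, sensitivity to outliers, poor suitability for binary outcomes, and a risk of overfitting when there are many predictors relative to the number of observations.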
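
For a single predictor, the least-squares line has a closed-form solution. The sketch below fits it on a tiny made-up dataset; the `fit_line` helper and the data points are illustrative assumptions.

```python
from statistics import mean

def fit_line(xs, ys):
    """Ordinary least squares for one predictor: returns (slope, intercept)."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept

# Made-up data lying exactly on y = 2x + 1.
xs = [1, 2, 3, 4]
ys = [3, 5, 7, 9]

slope, intercept = fit_line(xs, ys)
print(slope, intercept)
```

Replacing one `ys` value with an extreme outlier shifts the fitted slope noticeably, which illustrates the sensitivity of least squares to unusual points.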

What is Random Forest and how does it work?

Random Forest is an ensemble learning technique that builds multiple decision trees and combines their outputs to improve accuracy and reduce overfitting. Each tree is trained on a bootstrap sample of the data and considers a random subset of features at each split. For classification, the prediction is the majority vote across all trees; for regression, it is the average of their outputs.
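
The random-subset and majority-vote ideas can be sketched in plain Python. This is a deliberately simplified illustration, not a full Random Forest: each "tree" is a one-split decision stump on a single randomly chosen feature, and the data, function names, and parameters are all hypothetical.

```python
import random
from collections import Counter

def fit_stump(rows, feature):
    """Pick the threshold on one feature that misclassifies the fewest rows."""
    best = None
    for x, _ in rows:
        t = x[feature]
        for left, right in ((0, 1), (1, 0)):
            errors = sum((left if xi[feature] <= t else right) != yi for xi, yi in rows)
            if best is None or errors < best[0]:
                best = (errors, t, left, right)
    _, t, left, right = best
    return feature, t, left, right

def predict_stump(stump, x):
    feature, t, left, right = stump
    return left if x[feature] <= t else right

def fit_forest(rows, n_trees=25, seed=0):
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        sample = [rng.choice(rows) for _ in rows]   # bootstrap sample of the rows
        feature = rng.randrange(len(rows[0][0]))    # random feature for this stump
        forest.append(fit_stump(sample, feature))
    return forest

def predict_forest(forest, x):
    votes = Counter(predict_stump(stump, x) for stump in forest)
    return votes.most_common(1)[0][0]               # majority vote

# Hypothetical two-feature data: class 0 near (1, 1), class 1 near (5, 5).
rows = [((1.0, 1.2), 0), ((1.5, 0.8), 0), ((0.9, 1.7), 0), ((2.0, 1.1), 0),
        ((5.0, 5.2), 1), ((4.6, 5.5), 1), ((5.3, 4.7), 1), ((4.9, 5.1), 1)]

forest = fit_forest(rows)
print(predict_forest(forest, (0.5, 0.5)), predict_forest(forest, (6.0, 6.0)))
```

Individual stumps can be wrong when their bootstrap sample is unrepresentative, but the vote across many of them is far more stable, which is the core idea behind the ensemble.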

Calculate the chance of seeing a shooting star within an hour given a 0.2 probability every 15 minutes.

With a 0.2 chance of seeing a shooting star in 15 minutes, the probability of not seeing any in 15 minutes is 0.8. Assuming the four 15-minute intervals in an hour are independent, the chance of seeing no stars is 0.8⁴ = 0.4096. Thus, the chance of seeing at least one star is 1 - 0.4096 = 0.5904, or about 59%.
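
The same arithmetic, without intermediate rounding, can be checked in a few lines of Python. The variable names are illustrative, and the calculation assumes the four intervals are independent.

```python
p_interval = 0.2   # chance of at least one shooting star in 15 minutes
intervals = 4      # number of 15-minute intervals in an hour

p_none = (1 - p_interval) ** intervals   # no star in any interval
p_at_least_one = 1 - p_none              # complement: at least one star

print(f"{p_at_least_one:.4f}")   # prints 0.5904
```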

What is deep learning and its difference from machine learning?

Deep learning is a subset of machine learning that uses neural networks with many layers to learn from data. Traditional machine learning typically relies on simpler models and hand-engineered features, whereas deep learning, loosely inspired by the structure of the human brain, learns features directly from raw data. It often achieves higher accuracy on large datasets, at the cost of more data and computation.
