
Data Science - 2 Sets

The document outlines various topics and subtopics related to Data Science, including Python Programming, Data Mining, Basic Statistics, and Machine Learning, categorized by difficulty levels. It also includes questions and options related to data handling, machine learning concepts, and statistical methods. Additionally, references to educational resources are provided for further learning.

Uploaded by

akhil1940155
Copyright
© All Rights Reserved

S. NO. | Topic        | Sub Topic                                              | Difficulty
1      | Data Science | Python Programming & Algorithm Concepts                | Medium
2      | Data Science | Python Programming & Algorithm Concepts                | Medium
3      | Data Science | Python Programming & Algorithm Concepts                | Hard
4      | Data Science | Data Mining                                            | Easy
5      | Data Science | Data Mining                                            | Medium
6      | Data Science | Data Mining                                            | Medium
7      | Data Science | Data Mining                                            | Hard
8      | Data Science | Basic Statistics                                       | Easy
9      | Data Science | Basic Statistics                                       | Easy
10     | Data Science | Basic Statistics                                       | Medium
11     | Data Science | Basic Statistics                                       | Medium
12     | Data Science | Basic Statistics                                       | Hard
13     | Data Science | Machine Learning Concepts                              | Easy
14     | Data Science | Machine Learning Concepts                              | Medium
15     | Data Science | Machine Learning Concepts                              | Hard
16     | Data Science | Exploratory Data Analysis and the Data Science Process | Easy
17     | Data Science | Exploratory Data Analysis and the Data Science Process | Medium
18     | Data Science | Exploratory Data Analysis and the Data Science Process | Hard
19     | Data Science | Data Visualization using Excel                         | Medium
20     | Data Science | Data Visualization using Excel                         | Medium
S. NO. | Question | Option A
1 | You have created a Web Application using Python. You are supposed to deploy, run and manage your application on a Platform as a Service. | AWS Elastic Beanstalk
2 | The given function should add two values if the value of the "maths" parameter is "add" | do_math(?,?)
3 | While applying penalty in Logistic Regression classifier, a senior data scientist recommends using L2 instead of L1 penalty. What could be the [...] | It tends to use all weights equally
4 | While working on a dataset, you find out that the range of data in one column is very large. You need to make sure that the data becomes [...] | Missing value treatment
5 | Identify the correct main points to keep in mind while performing Reinforcement Learning | 1, 2 and 4
6 | The Deep Q-Network algorithm contains three primary steps. Select the correct sequence from the given options. | 3, 1 and 2
7 | You are working on a project and the CEO of the company wants the data in a more comprehensive idea of data and to visualize data in-depth | viewing\ projection/ workstation
8 | Consider you are working on a project of analyzing the price of the diamond which varies with the diamond's quantity. You face difficulty in analyzing [...] | Add boxplot
9 | Imagine in the table where rows represent objects and columns represent quantitative characteristics, there are 18 objects and 2 [...] | Edge detection algorithm
10 | Consider a data file where you collect the zip codes of your respondents and you want to attach demographic data (average income). You will [...] | Join data (combining two or more data sets)
11 | Consider you want to remove the NA values from the prior to calculate the mean and sum values of your [...] | isnull()
12 | The marks obtained by 10 students in respective order for two subjects - Biology and Physics are given below. | 0.88
13 | A team is working on a project that contains data types like string, integers, list, etc. They showed the python tuple known as tuple packing. | Tuple using list
14 | You are creating a binary classifier to predict if the person will default on loan or not. There exists a column in data which has categorical values. | Mean
15 | A random forest model is giving high accuracy on training data but it goes down significantly when tested on test data. It is a classic problem of [...] | Increase the number of trees
16 | Lucifer has been asked to solve a non-binary classification, his team [...] | Logistic Regression
17 | You were working on a project and you were presented with the below code snippet:- | batchSpaceND
18 | You train a decision tree model which has small minimum samples per split. What will be the result on the overall model because of small [...] | Underfitting
19 | If a dataset is given, and Christin wants to know about features which occur together and features that are correlated, about the standard [...] | By plotting on the graph
20 | You work for the movie app and want to make a supervised machine learning model to classify movies for different genres. You have collected [...] | Small data
S. NO. | Option B | Option C | Option D
1 | Heroku | Openshift | Google Cloud
2 | Use else if instead of else | do_math(a, b, maths='add') | No error.
3 | It is much faster | It tends to turn most of the weights close to zero, but not zero | It works better when the data is not properly formatted
4 | Drop rows | Normalize data | Use data as is
5 | 2, 3 and 4 | 1, 2, and 3 | 1 and 2
6 | 2, 3 and 1 | 1, 2 and 3 | 3, 2 and 1
7 | Data positioning /bars over circle story | Adding \ graphs reducing charts | Add color to the maps \tables and charts
8 | Covariation | Display density | Reorder
9 | Principal component analysis | Text analysis | Cluster analysis
10 | Rebuilding missing data | Exporting data | Importing data
11 | use (na.rm = TRUE) | dropna() | filna()
12 | 0.23 | 0.5 | 0.66
13 | Tuple with single parentheses | Tuple with more than 3 parentheses | Tuple without parentheses
14 | Median | Mode | There is no need for data imputation in this case.
15 | Decrease the threshold in the trees | Increase the depth of trees | Decrease the number of trees
16 | Decision Trees | Random Forest | Boosted Trees
17 | batchToSpace | batchToSpaceND | ToSpaceND
18 | Overfitting | No effect on the overall model | Just right
19 | By applying various machine learning models | Associative Rule Mining | IQR technique
20 | Data cluster | Subset | Test data
S. NO. | Correct Option | Explanation
1 | 4 | GCP is a PaaS platform used to run and manage your application, so it can be used for the given requirement.
2 | 3 | do_math(a, b, maths='add')
3 | 3 | Self Explanatory
4 | 3 | Self Explanatory
5 | 1 | Self Explanatory
6 | 2 | Self Explanatory
7 | 1 | Self Explanatory
8 | 3 | Displaying density will make the comparison easy.
9 | 4 | Cluster analysis will be the best way to visualize data in this scenario.
10 | 1 | Self Explanatory
11 | 2 | na.rm = TRUE removes the NA values prior to calculating the mean and sum values of your [...]
12 | 1 | Self Explanatory
13 | 4 | Tuple packing is the tuple without parentheses.
14 | 3 | Mode can be used to impute the missing values in the given column.
15 | 1 | The overall performance can be improved by increasing the number of trees.
16 | 1 | Self Explanatory
17 | 3 | Self Explanatory
18 | 2 | Overfitting will be the result on the overall model because of small minimum samples per split.
19 | 3 | Self Explanatory
20 | 4 |
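The explanations for questions 2 and 13 above can be illustrated with a short Python sketch. The body of `do_math` is not shown in the source, so the implementation below is an assumption; only the call `do_math(a, b, maths='add')` comes from the answer key, and the subtraction fallback is a hypothetical placeholder.

```python
def do_math(a, b, maths="add"):
    """Hypothetical body: the source only shows the call, not the function."""
    if maths == "add":
        return a + b
    # Assumed fallback for any other value of `maths` (not from the source).
    return a - b


# Question 2: the mode is passed as a keyword argument.
total = do_math(2, 3, maths="add")

# Question 13: tuple packing builds a tuple without parentheses.
packed = 1, "two", [3]

# Unpacking is the reverse operation.
first, second, third = packed
```

Writing `packed = (1, "two", [3])` would give the same tuple; the parentheses are optional in an assignment like this one.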
References
https://www.geeksforgeeks.org/data-science-tutorial/
https://www.geeksforgeeks.org/data-science-tutorial/
https://www.geeksforgeeks.org/data-science-tutorial/
https://www.techtarget.com/searchbusinessanalytics/definition/data-mining#:~:text=Data%20mining%20is%2
https://www.techtarget.com/searchbusinessanalytics/definition/data-mining#:~:text=Data%20mining%20is%2
https://www.techtarget.com/searchbusinessanalytics/definition/data-mining#:~:text=Data%20mining%20is%2
https://www.techtarget.com/searchbusinessanalytics/definition/data-mining#:~:text=Data%20mining%20is%2
https://www.geeksforgeeks.org/7-basic-statistics-concepts-for-data-science/
https://www.geeksforgeeks.org/7-basic-statistics-concepts-for-data-science/
https://www.geeksforgeeks.org/7-basic-statistics-concepts-for-data-science/
https://www.geeksforgeeks.org/7-basic-statistics-concepts-for-data-science/
https://www.geeksforgeeks.org/7-basic-statistics-concepts-for-data-science/
https://towardsdatascience.com/ten-machine-learning-concepts-you-should-know-for-data-science-interviews-7
https://towardsdatascience.com/ten-machine-learning-concepts-you-should-know-for-data-science-interviews-70107ca84755
https://towardsdatascience.com/ten-machine-learning-concepts-you-should-know-for-data-science-interviews-70107ca84756
https://www.geeksforgeeks.org/what-is-exploratory-data-analysis/
https://www.geeksforgeeks.org/what-is-exploratory-data-analysis/
https://www.geeksforgeeks.org/what-is-exploratory-data-analysis/
https://www.tutorialspoint.com/excel_data_analysis/excel_data_analysis_visualization.htm#:~:text=Visualizing%
https://www.tutorialspoint.com/excel_data_analysis/excel_data_analysis_visualization.htm#:~:text=Visualizing%
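Two answers in the first set, "Normalize data" (question 4) and "Mode" (question 14), describe simple preprocessing steps. The sketch below shows one possible reading of each using only the Python standard library. The function names and the sample values are made up for illustration and do not come from the source.

```python
from statistics import mode


def min_max_normalize(values):
    """Rescale numbers to the range [0, 1] (one common way to normalize)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values are equal, so there is no spread to rescale.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def impute_with_mode(values, missing=None):
    """Replace missing entries with the most frequent observed value."""
    observed = [v for v in values if v is not missing]
    fill = mode(observed)
    return [fill if v is missing else v for v in values]


# Made-up column with a very large range.
incomes = [20_000, 50_000, 1_000_000]
scaled = min_max_normalize(incomes)

# Made-up categorical column with missing entries.
jobs = ["clerk", None, "clerk", "driver", None]
filled = impute_with_mode(jobs)
```

Mean and median are not defined for categorical values, which is why the mode is the usual choice for a column like `jobs`.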
S. NO. | Topic        | Sub Topic                                              | Difficulty
1      | Data Science | Python Programming & Algorithm Concepts                | Medium
2      | Data Science | Python Programming & Algorithm Concepts                | Medium
3      | Data Science | Python Programming & Algorithm Concepts                | Hard
4      | Data Science | Data Mining                                            | Easy
5      | Data Science | Data Mining                                            | Medium
6      | Data Science | Data Mining                                            | Medium
7      | Data Science | Data Mining                                            | Hard
8      | Data Science | Basic Statistics                                       | Easy
9      | Data Science | Basic Statistics                                       | Easy
10     | Data Science | Basic Statistics                                       | Medium
11     | Data Science | Basic Statistics                                       | Medium
12     | Data Science | Basic Statistics                                       | Hard
13     | Data Science | Machine Learning Concepts                              | Easy
14     | Data Science | Machine Learning Concepts                              | Medium
15     | Data Science | Machine Learning Concepts                              | Hard
16     | Data Science | Exploratory Data Analysis and the Data Science Process | Easy
17     | Data Science | Exploratory Data Analysis and the Data Science Process | Medium
18     | Data Science | Exploratory Data Analysis and the Data Science Process | Hard
19     | Data Science | Data Visualization using Excel                         | Medium
20     | Data Science | Data Visualization using Excel                         | Medium
S. NO. | Question Text | Option A
1 | A team is working on a project which includes a survey. They received a data set and are unable to read it and that unnecessarily increased the [...] | Data Transformation (convert data from one form to another)
2 | During working on distance-based model KNNs, Elena speed up training in neural networks and changes the values to a common scale without [...] | Polynomial features
3 | During a project, many problems were faced in performing a proper analysis using unorganized information. What to do to check maximum [...] | Through Data integration (combining data from different sources)
4 | The manager of a tech company wants Damon to let him know if there is any fall in the sale, and asks you for a financial statement. What will be the [...] | Directly going through your client balance sheet
5 | Imagine he read automatic testing framework modular with addition benefits, instead of dividing the Application under test into [...] | Automation framework
6 | After cleaning the data, David starts storing it in a secure location while keeping a complete log of the entire process, but his entire process [...] | Develop a new location
7 | Suppose that you are a Growth PM in a data-driven company. You want to understand why certain segments of the user base have higher [...] | Regression
8 | Imagine working at a B2B SaaS company as PM for DS projects. Now when providing training data to the ML team, which of these points [...] | The data should be gathered data and not third party or licensed data
9 | You are providing a training at a company, and Ms. Jenny is struggling to understand dashboards and Sheets. According to her there is only one [...] | Sheet that is used on a dashboard can be deleted
10 | Consider the following options and identify the one which is a disadvantage of the context filter. | [...] is modified the info should recompute and rewrite
11 | Suppose, there are 10,000 Positive classes and 2,000 Negative classes available inside your Target column, then for sure your dataset is [...] | Use under sampling and over sampling techniques
12 | You are working on a project and you are presented with the below code snippet:- | This code will show compile time error
13 | Your boss wants a method to visualize individual users affected by a particular workbook permissions rule. Select which option should he [...] | Search on the host project's Permissions page, and then click the permissions rule.
14 | You are struggling to resolve the synchronization performance of an Azure Active Directory Group. Which of these is the correct option to [...] | Use the Settings page to synchronize on demand.
15 | A report is not getting updated daily. There is a requirement from the client that it should be refreshed on a daily basis. What is the possible reason [...] | Services on Data Extract are not running
16 | While in a discussion in a meeting, you are asked to give a justifiable reason to utilize a projectile chart. What should be the correct reason [...] | Looking at the genuine against the objective deals
17 | Your client asks you [...] manages the execution of .NET programs and provides additional services including memory management, type safety, [...] | Common language specification (CLS)
18 | While working on the dataset in the file, Ashley has used so many [...] | She has not created virtual environment
19 | While working on the dataset like mail spam filtering, digits recognition, where conditional probability is used, consider having input data as a [...] | Gaussian Naive Bayes
20 | Your team is working on a machine learning product that can act as an artificial opponent in video games that plays according to the situation of [...] | Semi-supervised machine learning
S. NO. | Option B | Option C | Option D
1 | Principal component analysis (reduce the dimensionality) | Export values (add new values, columns) | Store visit (person visit)
2 | Normalization of features | Log transformation | Interest elapsed
3 | By Removing duplicates (rows of values and columns have the same values or are duplicates) | By Export values (exporting new values in database) | Validate data (for checking the quality of your data)
4 | framework (database management task) | Share result to stakeholder | Store visit
5 | Modular framework | Linear framework | Library architecture testing framework
6 | Use informatica/talend | Reduce noisy data | Develop a data cleansing framework
7 | Correlation | Clustering | Classification
8 | The data should be wrangled | The data should be raw | The data should be extremely sensitive in nature and need not comply with guidelines
9 | Sheets which are used on dashboard cannot be hidden | Sheets which are used on dashboard can be hidden but it gives warning on hiding | Sheets that are used on a dashboard can be hidden but not deleted
10 | [...] context, Tableau creates a brief table which will need a reload | Both A and B | [...] context, Tableau creates a brief matrix which will need a reload
11 | Use One-Hot encoding for Target column | Use Label encoding for the target column | Use PCA or LDA algorithms
12 | The dimension index at which to insert shape of defaults to 0 (the first dimension) | The input tensor whole dimensions to be expanded | The significance is used to enable mirror padding
13 | On the menu, click on the content, then workbooks then list icon. | Click on workbook's permission page and click on permission rule. | On the menu, click on groups and then all users.
14 | Set the site role for users who have been removed from Active Directory. | Both A and B | Set the maximum site role for group users to be applied during Active Directory synchronization.
15 | The workers of Data Extract are taking a timeout | The data source configuration of Data Extract needs an update | Data Extract needs to be refreshed
16 | Adding information to canisters | Showing the business development for a specific year | Examining the pattern for an era and ascertaining tally measure
17 | Common language runtime (CLR) | common type system (CTS) | framework class library (FCL)
18 | She has saved the file name by library name | She is not giving proper indentation to the code | Her library version is old, she has to update it
19 | Multinomial Naive Bayes | Bernoulli Naive Bayes | GaussNB
20 | Reinforcement learning | Supervised machine learning | Unsupervised machine learning
S. NO. | Correct Option | Explanation
1 | A | Data transformation can be done for smoothing the data.
2 | B | Normalization of features should be used here.
3 | D | Self Explanatory
4 | B | Self Explanatory
5 | D | Self Explanatory
6 | D | Developing a data cleansing framework beforehand takes care of the repetition.
7 | C | Self Explanatory
8 | B | Self Explanatory
9 | D | Sheets that are used on a dashboard can be hidden but not deleted.
10 | C | Self Explanatory
11 | A | Under sampling and over sampling techniques can be used to cure this imbalance.
12 | B | Self Explanatory
13 | C | Self Explanatory
14 | C | Self Explanatory
15 | A | If the services on Data Extract are not running, the data will not be refreshed.
16 | A | Self Explanatory
17 | B | Common language runtime will be apt for the given scenario.
18 | B | Self Explanatory
19 | B | Self Explanatory
20 | B | Reinforcement learning has been used in the given scenario.
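The explanation for question 11 names under sampling and over sampling as the cure for the 10,000 versus 2,000 class imbalance. A minimal sketch of both, using only the Python standard library, might look like the following. The helper names, the row representation, and the fixed seed are assumptions made for illustration; libraries built for this purpose offer more refined strategies.

```python
import random


def random_oversample(minority, target_size, rng):
    """Duplicate randomly chosen minority rows until target_size is reached."""
    extra = [rng.choice(minority) for _ in range(target_size - len(minority))]
    return minority + extra


def random_undersample(majority, target_size, rng):
    """Keep a random subset of the majority rows, without replacement."""
    return rng.sample(majority, target_size)


rng = random.Random(0)  # fixed seed, assumed here only for repeatability

# Placeholder rows standing in for the 10,000 positive and 2,000 negative samples.
positives = [("pos", i) for i in range(10_000)]
negatives = [("neg", i) for i in range(2_000)]

# Option 1: grow the minority class to match the majority.
oversampled_neg = random_oversample(negatives, len(positives), rng)

# Option 2: shrink the majority class to match the minority.
undersampled_pos = random_undersample(positives, len(negatives), rng)
```

Over sampling keeps every original row but repeats some of them, while under sampling discards data, so the choice usually depends on how much data can be spared.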
References
https://www.geeksforgeeks.org/data-science-tutorial/
https://www.geeksforgeeks.org/data-science-tutorial/
https://www.geeksforgeeks.org/data-science-tutorial/
https://www.techtarget.com/searchbusinessanalytics/definition/data-mining#:~:text=Data%20mining%20is%2
https://www.techtarget.com/searchbusinessanalytics/definition/data-mining#:~:text=Data%20mining%20is%2
https://www.techtarget.com/searchbusinessanalytics/definition/data-mining#:~:text=Data%20mining%20is%2
https://www.techtarget.com/searchbusinessanalytics/definition/data-mining#:~:text=Data%20mining%20is%2
https://www.geeksforgeeks.org/7-basic-statistics-concepts-for-data-science/
https://www.geeksforgeeks.org/7-basic-statistics-concepts-for-data-science/
https://www.geeksforgeeks.org/7-basic-statistics-concepts-for-data-science/
https://www.geeksforgeeks.org/7-basic-statistics-concepts-for-data-science/
https://www.geeksforgeeks.org/7-basic-statistics-concepts-for-data-science/
https://towardsdatascience.com/ten-machine-learning-concepts-you-should-know-for-data-science-interviews-7
https://towardsdatascience.com/ten-machine-learning-concepts-you-should-know-for-data-science-interviews-70107ca84755
https://towardsdatascience.com/ten-machine-learning-concepts-you-should-know-for-data-science-interviews-70107ca84756
https://www.geeksforgeeks.org/what-is-exploratory-data-analysis/
https://www.geeksforgeeks.org/what-is-exploratory-data-analysis/
https://www.geeksforgeeks.org/what-is-exploratory-data-analysis/
https://www.tutorialspoint.com/excel_data_analysis/excel_data_analysis_visualization.htm#:~:text=Visualizing%
https://www.tutorialspoint.com/excel_data_analysis/excel_data_analysis_visualization.htm#:~:text=Visualizing%
