H.T.No.
Code No: CT3545 SRGEC-R20
IV B.Tech I Semester Regular/Supplementary Examinations, December 2024
DATA SCIENCE
(Computer Science and Engineering)
Time: 3 Hours Max. Marks: 70
Note: Answer one question from each unit.
All questions carry equal marks.
5 × 14 = 70M
UNIT-I
1. a) Consider the following set of points: {(-2, -2), (1, 1), (3, 3), (5, 5), (-5, -5)}. Find the
least-squares regression line for the given data points. (8M)
b) How can you assess model accuracy? (6M)
(OR)
2. a) Differentiate between supervised and unsupervised learning. (8M)
b) Differentiate between linear regression, classification, and non-linear regression. (6M)
UNIT-II
3. a) How can overfitting and underfitting of data be overcome? (7M)
b) What is the Naive Bayes classifier? Using the following data, determine whether the
player should play if the weather is sunny. (7M)
No.  Outlook   Play
0    Rainy     Yes
1    Sunny     Yes
2    Overcast  Yes
3    Overcast  Yes
4    Sunny     No
5    Rainy     Yes
6    Sunny     Yes
7    Overcast  Yes
8    Rainy     No
9    Sunny     No
10   Sunny     Yes
11   Rainy     No
12   Overcast  Yes
13   Overcast  Yes
(OR)
4. a) Explain why KNN is called a lazy learner. (7M)
b) Explain the following terms with respect to the confusion matrix. (7M)
(i) Accuracy (ii) Precision (iii) Recall (iv) F1-score
UNIT-III
5. a) How is a frequency polygon plotted? Explain with an example. (7M)
b) Discuss the Wilcoxon U-test:
(i) two-sample U-test (ii) one-sample U-test (7M)
(OR)
6. a) How is a box plot drawn? Explain with an example. (7M)
b) Discuss covariance, the paired t-test, and the U-test. (7M)
UNIT-IV
7. a) Explain data acquisition and cleaning with an example. (7M)
b) Explain dimensionality reduction. (7M)
(OR)
8. Explain data wrangling and data imputation. (14M)
UNIT-V
9. a) Outline a program for computing eigenvalues and eigenvectors. (7M)
b) Discuss various ways of data de-duplication and data summarization in R. (7M)
(OR)
10. a) Explain singular value decomposition (SVD). (7M)
b) Differentiate between data mining and OLAP. (7M)
*****