
Data Science

The document outlines the mid-term exam for the Introduction to Data Science course at Capital University of Science and Technology, scheduled for April 30, 2025. It includes instructions, the weightage of the exam, and a series of questions covering topics such as correlation, mean deviation, clustering, and KNN algorithm. The exam is divided into two parts, with specific marks allocated to each question.


Capital University of Science and Technology

Department of CS
Introduction to Data Science (CS4883)
Mid Term Exam
Semester: Spring 2025 Max Marks: 20 marks
Date: 30th April, 2025 Time: 90 Minutes
Instructors:
Dr. Nayyer Masood (S1), Dr. Masroor Ahmed (S2, S3)

Instructions:
⚫ This exam carries 20% weightage towards the final evaluation.
⚫ Attempt the whole paper on the answer sheet.

Name: Reg. No.

Part-I (CLO-1 Marks 10)

Question 1 [3, 1, 1 Marks]

Given the values for X and Y, compute the Pearson and Spearman correlations between the two. What
is the purpose of these two metrics? Also interpret the computed values.

X 10 20 30 40 50
Y 9 25 15 35 40
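
The two coefficients asked for in Question 1 can be checked with a short script. This is only a sketch under stated assumptions: the helper names `pearson`, `ranks` and `spearman` are illustrative and not part of the paper, and Spearman is computed here as the Pearson correlation of the ranks, which is valid because neither X nor Y contains tied values.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson r: sum of co-deviations over the product of the spreads."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    numerator = sum(a * b for a, b in zip(dx, dy))
    denominator = sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    return numerator / denominator

def ranks(values):
    """Rank 1 for the smallest value; assumes there are no ties."""
    order = sorted(values)
    return [order.index(v) + 1 for v in values]

def spearman(xs, ys):
    """Spearman rho as the Pearson correlation of the ranks."""
    return pearson(ranks(xs), ranks(ys))

X = [10, 20, 30, 40, 50]
Y = [9, 25, 15, 35, 40]

r = pearson(X, Y)
rho = spearman(X, Y)
print(f"Pearson r    = {r:.4f}")    # 0.8726
print(f"Spearman rho = {rho:.4f}")  # 0.9000
```

Both values are close to +1. Pearson measures the strength of the linear relationship between the raw values, while Spearman measures how consistently the two rankings agree.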

Question 2 [3, 2 Marks]

a) The number of cappuccinos sold at the Starbucks location in the Orange County Airport
between 4 and 7 p.m. for a sample of 5 days last year were 20, 40, 50, 60, and 80. Determine
the mean deviation for the number of cappuccinos sold.
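
A minimal sketch of the calculation in part (a), assuming "mean deviation" means the mean absolute deviation about the arithmetic mean. The function name `mean_deviation` is illustrative.

```python
def mean_deviation(values):
    """Average absolute distance of each value from the arithmetic mean."""
    mean = sum(values) / len(values)
    return sum(abs(v - mean) for v in values) / len(values)

sales = [20, 40, 50, 60, 80]
md = mean_deviation(sales)
print(md)  # 16.0 (mean is 50; absolute deviations 30, 10, 0, 10, 30)
```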

b) Consider the data given below. Define the probability distribution for this data, then plot its
probability distribution function (pdf) and cumulative probability distribution (cdf).

2, 5, 9, 5, 11, 9, 5, 9, 11, 9
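
One way to tabulate the distribution for part (b) is sketched below. It assumes each distinct value's probability is its relative frequency among the ten observations. The plot itself is left out; the printed table holds the points one would draw. Variable names are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import accumulate

data = [2, 5, 9, 5, 11, 9, 5, 9, 11, 9]

counts = Counter(data)                  # frequency of each distinct value
values = sorted(counts)                 # [2, 5, 9, 11]
pmf = [Fraction(counts[v], len(data)) for v in values]
cdf = list(accumulate(pmf))             # running total of the probabilities

for v, p, c in zip(values, pmf, cdf):
    print(f"x = {v:>2}   P(X = x) = {float(p):.1f}   P(X <= x) = {float(c):.1f}")
```

The probabilities come out as 0.1, 0.3, 0.4 and 0.2 for the values 2, 5, 9 and 11, with cumulative values 0.1, 0.4, 0.8 and 1.0.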

Part-II (CLO-2 Marks 10)

Question 3 [7 Marks]

Cluster the following eight points (with (x, y) representing locations) into three clusters:

▪ A1(2, 10)
▪ A2(2, 5)
▪ A3(8, 4)
▪ A4(5, 8)
▪ A5(7, 5)
▪ A6(6, 4)
▪ A7(1, 2)
▪ A8(4, 9).

Initial cluster centres are A1(2, 10), A4(5, 8) and A7(1, 2). The distance function between two points
a = (x1, y1) and b = (x2, y2) is defined as ρ(a, b) = |x2 - x1| + |y2 - y1|. Use the k-means algorithm to
find the three cluster centres after the second iteration.
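
The two iterations can be traced with the sketch below. It assumes one iteration means assigning every point to its nearest centre under the distance ρ given above, then moving each centre to the arithmetic mean of its members. The names `manhattan` and `kmeans_step` are illustrative; an empty cluster would keep its old centre, though that case does not occur here.

```python
def manhattan(a, b):
    """rho(a, b) = |x2 - x1| + |y2 - y1|, as defined in the question."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

def kmeans_step(points, centres):
    """One iteration: assign points to the nearest centre, then recompute means."""
    clusters = [[] for _ in centres]
    for p in points:
        nearest = min(range(len(centres)), key=lambda i: manhattan(p, centres[i]))
        clusters[nearest].append(p)
    new_centres = []
    for members, old in zip(clusters, centres):
        if members:
            mean_x = sum(x for x, _ in members) / len(members)
            mean_y = sum(y for _, y in members) / len(members)
            new_centres.append((mean_x, mean_y))
        else:
            new_centres.append(old)  # assumption: keep the centre of an empty cluster
    return new_centres, clusters

# A1 .. A8 in order
points = [(2, 10), (2, 5), (8, 4), (5, 8), (7, 5), (6, 4), (1, 2), (4, 9)]
centres = [(2, 10), (5, 8), (1, 2)]  # initial centres A1, A4, A7

for iteration in (1, 2):
    centres, clusters = kmeans_step(points, centres)
    print(f"after iteration {iteration}: {centres}")
```

Under these assumptions the centres are (2, 10), (6, 6) and (1.5, 3.5) after the first iteration, and (3, 9.5), (6.5, 5.25) and (1.5, 3.5) after the second.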

Question 4 [3 Marks]

Consider the dataset given below:

Point   Coordinates   Class Label
A1      (2, 10)       C2
A2      (2, 6)        C1
A3      (11, 11)      C3
A4      (6, 9)        C2
A5      (6, 5)        C1
A6      (1, 2)        C1
A7      (5, 10)       C2
A8      (4, 9)        C2
A9      (10, 12)      C3
A10     (7, 5)        C1
A11     (9, 11)       C3
A12     (4, 6)        C1
A13     (3, 10)       C2
A15     (3, 8)        C2

Using the KNN algorithm, find the class label of the point P = (5, 7).
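
The question fixes neither k nor the distance measure, so the sketch below makes two loud assumptions: Euclidean distance and k = 5. The function name `knn_predict` is illustrative. Squared distances are compared because they give the same neighbour ordering as the distances themselves.

```python
from collections import Counter

def knn_predict(query, dataset, k):
    """Majority class label among the k points closest to `query`."""
    def squared_distance(p):
        return (p[0] - query[0]) ** 2 + (p[1] - query[1]) ** 2

    neighbours = sorted(dataset, key=lambda row: squared_distance(row[1]))[:k]
    votes = Counter(label for _, _, label in neighbours)
    return votes.most_common(1)[0][0]

dataset = [
    ("A1", (2, 10), "C2"), ("A2", (2, 6), "C1"), ("A3", (11, 11), "C3"),
    ("A4", (6, 9), "C2"), ("A5", (6, 5), "C1"), ("A6", (1, 2), "C1"),
    ("A7", (5, 10), "C2"), ("A8", (4, 9), "C2"), ("A9", (10, 12), "C3"),
    ("A10", (7, 5), "C1"), ("A11", (9, 11), "C3"), ("A12", (4, 6), "C1"),
    ("A13", (3, 10), "C2"), ("A15", (3, 8), "C2"),
]

P = (5, 7)
label = knn_predict(P, dataset, k=5)  # k = 5 is an assumption, not given in the paper
print(label)  # C2
```

With k = 5 the neighbours are A12, A4, A5, A8 and A15 (squared distances 2, 5, 5, 5, 5), giving three votes for C2 against two for C1. The answer depends on k: with k = 1 the single nearest point is A12, so the label would be C1. With k = 3 the four points tied at squared distance 5 make the neighbour set ambiguous.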

-----------------------------------------------------The End-----------------------------------------------------
