Of the questions attempted, the answers to only the first required number of
questions (as stipulated in the question paper) will be evaluated.
So, PLEASE DO NOT ATTEMPT EXTRA QUESTIONS.
MODULE 1
(25 Marks)
ANSWER ANY ONE FROM Q1-Q2 AND ANY TWO FROM Q3-Q5
1. What do you mean by data wrangling? How do you achieve this for a dataset handed to you?
2. You wish to study the state of mind of the members of a population with respect to music. How would
you study the preference on
a) A nominal scale,
b) An ordinal scale,
c) A ratio scale.
In each case, how can you get an overall impression of the data acquired? [3+2=5]
3. Call the dataset "warpbreaks" in R. It contains three variables: breaks, wool, tension. Type ??warpbreaks
to read the description of the variables. Consider the variables breaks and wool.
i) Through an appropriate diagram, compare the distributions of the breaks for the wool types "A" and
"B". (You need not submit a printout of this diagram.)
ii) Substantiate your observations in (i) through appropriate measures. Report the measures along with
their values. [3+7=10]
4. Call the dataset "immer" from the MASS library in R. Type ??immer to read the description of the
variables. Using two to four diagrams, bring out the features of the data. Submit a printout of the diagrams
used (all clubbed in one page). [10]
5. Call the dataset "survey" from the MASS library in R. Type ??survey to read the description of the
variables.
i) Using an appropriate measure for each, comment on the association between
a) Span of Writing Hand and Height of the student
b) Span of Writing Hand and Sex of the student
ii) Which of Span of Non-Writing Hand and Height would you use to predict the Span of Writing Hand
of a student? Justify your choice.
iii) Obtain a scatterplot of the Span of Writing Hand against the predictor you choose in (ii). In the
scatterplot, differentiate the points with respect to Sex of the student. Draw the best fit regression line on
the plot. (Submit a printout of the plot.) [(2+2)+3+3=10]
MODULE 2
(25 Marks)
ANSWER QUESTION 6 AND ANY TWO FROM THE REST.
6. Answer ANY ONE of the following:
(a) Write an algorithm to find whether the parentheses in an arithmetic expression are balanced or not.
(Example: A+(B-(C+D*E))/F is balanced; however, A+(B-(C+D*E)/F is not.)
(b) Write an algorithm to delete the last node of a linked list.
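For part (a), one possible approach is a single left-to-right scan with a stack. The Python sketch below is illustrative only and is not a model answer; the function name `is_balanced` is an assumption, and only round parentheses are considered.

```python
def is_balanced(expression: str) -> bool:
    """Return True if every '(' in the expression has a matching ')'."""
    stack = []  # holds the opening parentheses that are still unmatched
    for ch in expression:
        if ch == "(":
            stack.append(ch)
        elif ch == ")":
            if not stack:
                # a closing parenthesis with nothing left to match
                return False
            stack.pop()
    # balanced only if no opening parenthesis is left unmatched
    return not stack


if __name__ == "__main__":
    print(is_balanced("A+(B-(C+D*E))/F"))  # balanced expression
    print(is_balanced("A+(B-(C+D*E)/F"))   # one ')' missing
```

A simple counter would also work for a single bracket type; the stack version extends more easily if square or curly brackets must be matched as well.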
2023 2
7. (a) Write necessary algorithms to implement a circular queue.
(b) Describe the process of balancing a node in an AVL tree whose balance factor is 2 and whose left
child has a balance factor of -1. Give an appropriate example. [4+6]
8. (a) Write an algorithm to find the K-th smallest element in an
array without using any sorting algorithm.
(b) Draw a trie constructed from the following set of keys
built from the letters a, e, r:
aear re rare ea are ere era rarer rear err
[4+6]
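One way to find the K-th smallest element without sorting is quickselect, which partitions around a pivot and keeps only the part that can contain the answer. The Python sketch below is a hedged illustration, not a model answer; the name `kth_smallest` and the 1-indexed convention for `k` are assumptions.

```python
import random


def kth_smallest(values, k):
    """Return the k-th smallest element (1-indexed) using quickselect."""
    if not 1 <= k <= len(values):
        raise ValueError("k must be between 1 and len(values)")
    items = list(values)  # work on a copy so the input is not modified
    while True:
        pivot = random.choice(items)
        smaller = [v for v in items if v < pivot]
        equal_count = sum(1 for v in items if v == pivot)
        if k <= len(smaller):
            # the answer lies among the elements below the pivot
            items = smaller
        elif k <= len(smaller) + equal_count:
            # the answer is the pivot itself
            return pivot
        else:
            # discard the pivot group and everything below it
            k -= len(smaller) + equal_count
            items = [v for v in items if v > pivot]


if __name__ == "__main__":
    print(kth_smallest([7, 2, 9, 4, 1], 2))
```

The expected running time is linear in the array length, although an unlucky sequence of pivots can make it quadratic.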
9. (a) Construct a BST whose postorder traversal is as follows:
10 20 86 25 22 40 35
(b) Write algorithms to do the following in a linked list:
(i) Find out if a given element, DATA, is present in a linked list or not.
(ii) Find out whether the elements in a linked list are present in sorted order. [4+(3+3)]
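The two linked-list tasks in part (b) can be sketched as simple traversals. The Python code below is illustrative only; the `Node` class, the helper `from_list`, and the reading of "sorted" as non-decreasing order are assumptions made for this sketch.

```python
class Node:
    """A singly linked list node (hypothetical minimal definition)."""

    def __init__(self, data, next=None):
        self.data = data
        self.next = next


def from_list(values):
    """Build a linked list from a Python list and return its head."""
    head = None
    for v in reversed(values):
        head = Node(v, head)
    return head


def contains(head, data):
    """Return True if data occurs in the list starting at head."""
    node = head
    while node is not None:
        if node.data == data:
            return True
        node = node.next
    return False


def is_sorted(head):
    """Return True if the elements appear in non-decreasing order."""
    node = head
    while node is not None and node.next is not None:
        if node.data > node.next.data:
            return False
        node = node.next
    return True


if __name__ == "__main__":
    head = from_list([3, 5, 8, 13])
    print(contains(head, 8), is_sorted(head))
```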
Of the questions attempted, the answers to only the first required number of
questions (as stipulated in the question paper) will be evaluated.
So, PLEASE DO NOT ATTEMPT EXTRA QUESTIONS.
Answer ANY TWO from QUESTION NOS. 1 TO 4 and ANY FOUR from QUESTION NOS. 5 TO 10.
1. Suppose 10 similar biscuits are distributed among 3 children at random. Find the probability that a
particular child will receive 4 biscuits.
2. Suppose a coin, not necessarily unbiased, is given to you. How will you find out the probability of getting
head in that coin?
3. Suppose in a large population of patients, 5% are suffering from a particular disease. 7 patients are selected
at random, one by one and without replacement from this population. Find the probability that 3 of these
selected patients are suffering from the said disease.
4. If A and B are two events such that P(A) = 3/4 and P(B) = [?]/8, then show that [?]/8 ≤ P(A ∩ B) ≤ [?].
5. Suppose 20 balls numbered as 1, 2, ..., 20 are distributed among 20 boxes also numbered as 1, 2, ..., 20.
The boxes are such that each box may contain only one ball. What is the probability that none of the balls
will go to the box having the same number? [10]
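This is the classical matching (derangement) problem. By inclusion-exclusion, the probability that no ball lands in its own box is the sum of (-1)^k / k! for k = 0, ..., n. The Python sketch below evaluates that sum exactly; it is a hedged illustration, and the function name `derangement_probability` is an assumption.

```python
from fractions import Fraction
from math import factorial


def derangement_probability(n):
    """Probability that a random one-to-one placement of n numbered balls
    into n numbered boxes leaves no ball in the box with its own number."""
    # inclusion-exclusion: sum over k = 0..n of (-1)^k / k!
    return sum(Fraction((-1) ** k, factorial(k)) for k in range(n + 1))


if __name__ == "__main__":
    p = derangement_probability(20)
    print(float(p))  # very close to 1/e
```

For n = 20 the exact value differs from 1/e only far beyond the precision of ordinary floating-point arithmetic.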
6. Suppose at a busy road crossing, the daily number of accidents taking place is monitored for 2 years (730
days). The data is given below:
No. of Accidents    0    1    2    3    4    5   Total
No. of Days       402  157   94   49   21    7     730
Fit an appropriate distribution to the above data and comment on the fit. (No printout is required). [10]
7. (a) State Bayes' Theorem. Describe a real-life situation where this theorem can be used to find
probabilities.
(b) Why may the notion of a finite sample space be considered as a limitation of the classical definition of
probability? [6+4]
8. (a) Draw an appropriate graph of a Binomial distribution with parameters n = 10 and p = 0.3.
Attach a printout of this graph in your answer script.
(b) Comment on the skewness of the Binomial distribution of part (a).
(c) Give a theoretical justification of your comment on part (b). [4+1+5]
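For parts (b) and (c), the skewness of a Binomial(n, p) distribution has the known closed form (1 - 2p) / sqrt(np(1 - p)), which is positive when p < 0.5. The Python sketch below checks this numerically for n = 10 and p = 0.3 by computing the moments from the probability mass function; it is illustrative only, and the function names are assumptions.

```python
from math import comb, sqrt


def binomial_pmf(n, p):
    """Return the list [P(X = 0), ..., P(X = n)] for X ~ Binomial(n, p)."""
    return [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]


def skewness(pmf):
    """Moment coefficient of skewness for a pmf on 0, 1, ..., len(pmf) - 1."""
    mean = sum(k * w for k, w in enumerate(pmf))
    var = sum((k - mean) ** 2 * w for k, w in enumerate(pmf))
    third = sum((k - mean) ** 3 * w for k, w in enumerate(pmf))
    return third / var**1.5


if __name__ == "__main__":
    n, p = 10, 0.3
    numeric = skewness(binomial_pmf(n, p))
    closed_form = (1 - 2 * p) / sqrt(n * p * (1 - p))
    print(numeric, closed_form)  # both positive, so the distribution is right-skewed
```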
9. Consider the following frequency distribution of marks of a number of students:
Marks            60-65  65-70  70-75  75-80  80-85  85-90  90-95  95-100
No. of Students      3     21    150    335    326    135     26       4
Fit an appropriate distribution to the above data and comment on the fit. (No printout is required). [10]
10. (a) Explain with an example what is meant by a truncated distribution.
(b) Write down the PDF of a log-normal distribution. Give a rough sketch of the same. Mention one
use of this distribution. [4+6]
Of the questions attempted, the answers to only the first required number of
questions (as stipulated in the question paper) will be evaluated.
So, PLEASE DO NOT ATTEMPT EXTRA QUESTIONS.
MODULE A
ANSWER ANY ONE FROM Q. NOS. 1-2 & ANY TWO FROM Q. NOS. 3-5 (No printouts needed):
1. Consider the following system of equations Ay = b, y = (x, y, z, w):
x + 2y + 3z + 5w = b1
2x + 4y + 8z + 12w = b2
3x + 6y + 7z + 13w = b3
Describe the column space and the null space of A. Also find the condition on b1, b2 and b3 to have a
solution.
2. If AB = 0 then show that the column space of B is contained in the null space of A. Give an example of
A and B.
3. a) Given the basis vectors e1 = (1,1,1), e2 = (1,1,0), e3 = (1,0,0) for R^3, which vectors can be removed from
the basis and replaced by b = (2, -1, 4) such that it still remains a basis?
b) Let H be a 3-dimensional subspace of R^4 with basis
B = { v1 = (1,0,0,-1), v2 = (0,-1,0,2), v3 = (4,-1,0,0) }
i) Find an orthonormal basis for H.
ii) What are the components of the vector x = (1, 2, -1, 0) relative to this orthonormal basis?
[3+(3+4)]
4. a) Using QR factorization solve the following system of linear equations:
6w + 2x + 2y + 1z = 37
2w + 1x + 1y + 0z = 14
3w + 2x + 2y + 4z = 28
2w + 0x + 5y + 5z = 28
b) Reduce the following quadratic form into diagonal form and comment on its nature:
Q = 2x² - 2y² + 6z² + 2xy - 6xz + 6yz. Write the matrix of transformation. [6+4]
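5. a) Find the SVD of A = [matrix illegible in source]. Interpret the factors.
b) Let A = [matrix illegible in source] and B = [matrix illegible in source]. Find rank(A), rank(B) and rank(AB). [7+3]
(Surviving matrix entries, original positions unclear: 6 -1 1; 3; 1 -2 2 3 -1; -4 5 8; 2 -4; 0 1 -2 2; -1 4.)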
MODULE B
ANSWER ANY ONE FROM Q. NOS. 6-7 & ANY TWO FROM Q. NOS. 8-10:
6. Consider an exponential distribution with mean θ. Find a sufficient statistic T for θ based on a random
sample of size n from the distribution. Suggest two more sufficient statistics for θ based on T. [3+1+1]
7. Suppose X ~ N(0, σ²). Is s² = Σ(Xi - X̄)²/n consistent for σ²? Justify your answer. [1+4]
8. Draw 100 random samples each of size 15 from a Bivariate Normal(0, 0, 1, 1, 0.5) distribution. For each
sample:
a) Compute the sample correlation coefficient r.
b) Compute the statistic t = r√(n-2)/√(1-r²).
c) Plot the histogram of t along with the frequency curve. (A SINGLE PRINTOUT REQUIRED)
d) Comment on the nature of the distribution of t. [2+2+4+2]
9. a) Define a Minimum Variance Unbiased Estimator (MVUE) of a parameter.
b) Show that an MVUE, if it exists, is unique.
c) Let (X1, X2, ..., Xn) be i.i.d. N(θ, 1). Consider the following estimators of θ:
i) T1 = X1
ii) T2 = X2/2
iii) T3 = (X3 + X4)/2
iv) T4 = Vn X,n
Which of the above estimators would you prefer? Give reasons for your answer. [3+3+4]
10. a) Distinguish between a simple and a composite hypothesis with the help of an illustrative example.
b) Consider a random sample (X1, X2, ..., Xn) from a Rectangular (0, θ) distribution. The critical region
for testing θ = θ0 against θ > θ0 is given by w: {X(n) > c} where c is a constant, suitably chosen. Derive
the power function of the test. [(2+2)+6]
[The text of this question paper is scrambled in the source and the individual questions cannot be reconstructed. Surviving fragments: the instruction that only the first required number of attempted questions will be evaluated; "Answer ANY TWO" and "Answer ANY FOUR"; a request for short notes on ANY TWO topics; topics including ER diagram, functional dependency, ACID properties, conflict serializability, deadlock handling, checkpoint, OLAP, data cube, star schema, dimension table, data mart, data warehouse, materialized view, concept hierarchy, Apriori pruning technique, association mining, big data pipeline, and a supply chain management system; mark allocations [(2+2+3)+3], [(2+2)+4+2], [(2+2)+6], [2+3+5], [10], [2x5], (2x5), (4x10).]