STATISTICS-II (Intermediate - II ICS - 2)

The document outlines a statistics syllabus for Intermediate Part-II, detailing a chapter-wise paper setting scheme and the structure of questions including MCQs, short questions, and long questions. It focuses on Normal Distribution, defining its properties, equations, and applications, as well as providing examples and solutions related to the normal distribution. Additionally, it includes objective questions to test understanding of the concepts presented.

Uploaded by

abrishbookcenter
Copyright
© All Rights Reserved

AASSAN

STATISTICS

INTERMEDIATE
PART-II

DARSHAN JEE
LECTURER IN STATISTICS
GOVT. KHAWAJA FAREED GRADUATE COLLEGE, RAHIM YAR KHAN

MUHAMMAD ASIM
ASSISTANT PROFESSOR IN STATISTICS
GOVT. KHAWAJA FAREED GRADUATE COLLEGE, RAHIM YAR KHAN

Chapter wise Paper Setting Scheme BISE BWP

TOTAL 85 Marks
Q.1 MCQS (1 Mark Each) 17 Marks

Q.2 Attempt 8 Short Questions from 12 Questions. (2 Marks Each) 16 Marks

Q.3 Attempt 8 Short Questions from 12 Questions. (2 Marks Each) 16 Marks

Q.4 Attempt 6 Short Questions from 9 Questions. (2 Marks Each) 12 Marks

Attempt 3 Questions from Q.5, Q.6, Q.7, Q.8 and Q.9. (8 Marks Each) 24 Marks

Chapter   Multiple Choice Questions   Short Questions   Long Questions                   Weightage   Marks
10        3                           5                 Question (a & b Part)            24.70%      21
11        3                           6                 Question (a & b Part)            27.05%      23
12        2                           2                 Half Question (1 Part)           11.76%      10
13        1                           3                 Half Question (1 Part)           12.94%      11
14        3                           6                 Question (a & b Part)            27.05%      23
15        2                           3                 Half Question (1 Part)           14.12%      12
16        2                           6                 Half Question (1 Part)           25.88%      22
17        1                           2                 -                                5.88%       05
Chapter No.10
Normal Distribution
Define Normal distribution.
Normal distribution is the limiting form of the binomial distribution when the number of trials “n” is very
large, i.e. n > 30, and “p” is not very small, i.e. p ≅ q. It is also called the Gaussian distribution.
What is Normal probability distribution?
A continuous random variable “X” defined on the interval (−∞, +∞) is said to have a normal
distribution if its probability density function (p.d.f.) is:
$f(x) = y = \frac{1}{\sigma\sqrt{2\pi}} \, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2}, \quad -\infty < x < +\infty$
where
mean = μ, S.D = σ
π = a constant ≅ 22/7 ≅ 3.1416
e = a constant ≅ 2.7183
x = abscissa, y = ordinate
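The density formula can be evaluated directly. The following is a minimal sketch in Python; the function name `normal_pdf` is an illustrative choice and is not part of the notes.

```python
import math

def normal_pdf(x, mu, sigma):
    """Ordinate y = f(x) of the normal curve with mean mu and S.D sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))

# Ordinate of the standard normal curve (mu = 0, sigma = 1) at its centre
peak = normal_pdf(0.0, 0.0, 1.0)

# Symmetry: points equally far from the mean have the same ordinate
left, right = normal_pdf(-1.0, 0.0, 1.0), normal_pdf(1.0, 0.0, 1.0)
```

The ordinate at the centre equals 1/√(2π), roughly 0.3989, and the two symmetric ordinates agree.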
How many parameters does the Normal distribution have?
The normal distribution has two parameters, the population mean μ and the population variance σ², and it is
denoted by $X \sim N(\mu, \sigma^2)$.
Write down the importance of Normal distribution.
Some points of importance of the normal distribution are given below:
i) The normal random variable does frequently occur in practical problems such as heights and weights
of individuals, error of measurements.
ii) It is the limiting form of many other probability laws.
iii) It is used in solving problems both in probability and in statistical inference.
iv) Normal distribution is very important in applied statistics.
Describe the Normal Frequency Distribution.
When the Normal probability density function is multiplied by the total number of observations “N”, we
get the Normal frequency distribution.
$N f(x) = \frac{N}{\sigma\sqrt{2\pi}} \, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2}, \quad -\infty < x < +\infty$
What is meant by Normal curve?
The graph of Normal distribution is called Normal curve.
The Normal curve is symmetrical and bell-shaped.

What is the shape of the normal curve?
The shape of the normal curve is uni-modal,
symmetrical and bell-shaped.

Why is the normal curve bell-shaped?
In a normal distribution β2 = 3, so it is a mesokurtic distribution, and a mesokurtic distribution is
bell-shaped.
Define normal probability density function.
A continuous random variable “X” is normally distributed if and only if its probability density
function (p.d.f.) is
$f(x) = N(x; \mu, \sigma) = \frac{1}{\sigma\sqrt{2\pi}} \, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2}, \quad -\infty < x < \infty$

Write down the equation of the normal curve with mean μ and standard deviation σ.
The equation of the normal curve with mean μ and standard deviation σ is:
$f(x) = \frac{1}{\sigma\sqrt{2\pi}} \, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2}, \quad -\infty < x < \infty$

Enlist the properties of Normal distribution.


The properties of Normal distribution are:
i) Normal distribution is a continuous probability distribution.
ii) The total area under the Normal curve is unity.
iii) The range of the Normal random variable is from −∞ to +∞, i.e., −∞ < x < +∞.
iv) In the Normal distribution all odd-order moments about the mean are equal to zero, i.e., μ1 = μ3 = μ5 = … = 0.
v) The Normal curve is uni-modal, symmetrical and bell-shaped.
vi) The two parameters of the Normal distribution are the mean μ and the variance σ².
vii) In the normal distribution the quartile deviation and S.D are related by:
$Q.D = 0.6745\sigma \cong \frac{2}{3}\sigma$
viii) In the normal distribution the mean deviation and S.D are related by:
$M.D = 0.7979\sigma \cong \frac{4}{5}\sigma$
ix) In the Normal distribution the lower quartile is $Q_1 = \mu - 0.6745\sigma$ and the upper quartile is $Q_3 = \mu + 0.6745\sigma$.
x) The Normal distribution has its maximum ordinate (height) at x = μ, and its value is $\frac{1}{\sigma\sqrt{2\pi}}$.
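Several of the properties listed above can be checked numerically. This is a small sketch using Python's standard `statistics.NormalDist`; the values μ = 50 and σ = 10 are assumed purely for illustration, and the mean deviation uses the closed form σ√(2/π), which is where the constant 0.7979 comes from.

```python
import math
from statistics import NormalDist

mu, sigma = 50.0, 10.0            # illustrative values, not from the notes
dist = NormalDist(mu, sigma)

q1 = dist.inv_cdf(0.25)           # lower quartile
q3 = dist.inv_cdf(0.75)           # upper quartile
quartile_deviation = (q3 - q1) / 2

# Mean deviation about the mean of a normal variable: sigma * sqrt(2 / pi)
mean_deviation = sigma * math.sqrt(2 / math.pi)

# Height of the curve at x = mu
max_ordinate = dist.pdf(mu)

print(round(quartile_deviation / sigma, 4))   # ratio Q.D / sigma
print(round(mean_deviation / sigma, 4))       # ratio M.D / sigma
```

The two printed ratios should agree with 0.6745 and 0.7979 to four decimal places.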
Why is β1 zero in normal distribution?
Since $\beta_1 = \frac{\mu_3^2}{\mu_2^3}$ and all odd-order moments about the mean of the normal distribution are zero (so μ3 = 0), the value of β1 is zero.
What is the range of a normal distribution?
The range of the Normal distribution is from −∞ 𝑡𝑜 + ∞.
What is the role of 𝝁 in the Normal curve?
μ is the mean of the normal distribution; its role is to control the location of
the normal curve.
What is the role of 𝝈 in the Normal curve?
σ is the standard deviation of the normal distribution; its role is to control the
relative spread of the normal curve.
What is the relationship among mean, median and mode of a normal distribution?
The normal distribution is symmetrical, so its mean, median and mode
are all equal, i.e.,
Mean = Median = Mode = 𝜇
Define the point of inflection in normal distribution.
In normal probability distribution the points of inflection are points at which the bend in the curve
changes direction. The points of inflection of the normal distribution are at x = µ − 𝜎 and x = µ + 𝜎.
What is standard normal distribution?
A normal distribution that has a mean of zero and a standard deviation of one is called the standard normal
distribution or standardized normal distribution.
If $Z = \frac{X-\mu}{\sigma}$ is the standard normal random variable, then Z has the probability density function
$f(z) = \frac{1}{\sqrt{2\pi}} \, e^{-\frac{z^2}{2}}, \quad -\infty < z < \infty$
In a standard normal distribution what is the value of mode and median?
The standard normal distribution is symmetrical, so its mean, median
and mode are all equal to zero, i.e.,
Mean = Median = Mode = μ = 0
What is standardized random variable?
Any variable having zero mean and unit variance is called a standardized random variable. When the original
variable is normal, it is the standard normal random variable, denoted by “Z”.

For x: $Z = \frac{X-\mu}{\sigma}$; for y: $Z = \frac{Y-\mu}{\sigma}$, where μ and σ are the mean and S.D of the variable being standardized.

What is standard normal distribution density function?


The standard normal density function is
$f(z) = \frac{1}{\sqrt{2\pi}} \, e^{-\frac{z^2}{2}}, \quad -\infty < z < \infty$
Q. Write down the equation of normal distribution with mean = 24 and 𝝈 = 4.
Solution
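Given data: μ = 24, σ = 4
$f(x) = \frac{1}{\sigma\sqrt{2\pi}} \, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2}, \quad -\infty < x < \infty$
$f(x) = \frac{1}{4\sqrt{2\pi}} \, e^{-\frac{1}{2}\left(\frac{x-24}{4}\right)^2}, \quad -\infty < x < \infty$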

Q. If X~ N (100, 25). Find the value of maximum ordinate.


Solution
Given data: X ~ N(100, 25), μ = ?, σ = ?
μ = 100
σ² = 25 ⇒ σ = 5
The maximum ordinate occurs at x = μ:
Maximum ordinate = $\frac{1}{\sigma\sqrt{2\pi}}$
Maximum ordinate = $\frac{1}{5\sqrt{2(3.14159)}} = 0.0798$

Q. In the normal distribution mean = 40, find median and mode.


Solution
Given data: Mean = 40 Median =? Mode =?
The normal distribution is symmetrical, so its mean, median and mode
are all equal, i.e.,
Mean = Median = Mode = 𝜇
So, Median = 40
And Mode = 40
Q. A normal distribution has mean 80 and standard deviation 36. Find 𝑸𝟏 , 𝝁 𝒂𝒏𝒅 𝑸𝟑.
Solution
Given data: Mean = 𝜇 = 80, S. D = 𝜎 = 36
$Q_1 = \mu - 0.6745\sigma$ , $Q_3 = \mu + 0.6745\sigma$
$Q_1 = 80 - 0.6745(36)$ , $Q_3 = 80 + 0.6745(36)$
$Q_1 = 55.718$ , $Q_3 = 104.282$
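As a cross-check of the solution above, the quartiles can also be obtained from the inverse cumulative distribution function. This is only a sketch using Python's `statistics.NormalDist`; it is not the method used in the notes, which relies on the constant 0.6745.

```python
from statistics import NormalDist

dist = NormalDist(mu=80, sigma=36)

q1 = dist.inv_cdf(0.25)   # value below which 25% of the area lies
q3 = dist.inv_cdf(0.75)   # value below which 75% of the area lies

print(round(q1, 2), round(q3, 2))
```

The printed values should agree with 55.718 and 104.282 to about two decimal places; small differences come from rounding the constant 0.6745.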

Q. In normal distribution 𝜎 = 9 find quartile deviation.
Solution

Given data: 𝜎 = 9, Q.D = ?

Q.D = 0.6745𝜎

Q.D = 0.6745(9)

Q.D = 6.0705
Q. In normal distribution mean is 100 and S.D is 10. Find M.D and Q.D.
Solution
Given data: µ = 100, 𝜎 = 10, M.D = ?, Q.D = ?
M.D = 0.7979 𝜎 , Q.D = 0.6745𝜎
M.D = 0.7979 (10) , Q.D = 0.6745 (10)
M.D = 7.979 , Q.D = 6.745
Q. In normal distribution the mean and S.D are 25 and 5 respectively. Find M.D and Q.D.
Solution
Given data: µ = 25, 𝜎 = 5, M.D = ? , Q.D = ?
M.D = 0.7979 𝜎
M.D = 0.7979 (5)
M.D = 3.9895
Q.D = 0.6745𝜎
Q.D = 0.6745 (5)
Q.D = 3.3725

Q. If 𝑿~𝑵(𝟏𝟓, 𝟒). Find the value of Z if x = 18.


Solution
Given data: X ~ N(15, 4), μ = 15, σ² = 4, σ = 2, x = 18, Z = ?
$z = \frac{x-\mu}{\sigma}$
$z = \frac{18-15}{2} = 1.5$
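The same standardization can be sketched in Python. The helper name `z_score` is hypothetical, and the extra step of looking up the area to the left of z goes slightly beyond what the question asks.

```python
from statistics import NormalDist

def z_score(x, mu, sigma):
    """Standardize x using the mean mu and S.D sigma."""
    return (x - mu) / sigma

z = z_score(18, 15, 2)             # X ~ N(15, 4), so sigma = 2
area_left = NormalDist().cdf(z)    # area under the standard normal curve to the left of z

print(z)                           # 1.5
```

Here `NormalDist()` with no arguments is the standard normal distribution (mean 0, S.D 1), so `area_left` plays the role of a table lookup for z = 1.5.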

(OBJECTIVE)
1. The range of normal distribution is:
(a) 0 to ∞ (b) −∞ to 0 (c) −∞ to ∞ (d) -1 to +1
2. The mean of standard normal distribution is :
(a) 1 (b) 0 (c) ∞ (d) 𝜇
3. The normal distribution has parameters:
(a) 1 (b) 2 (c) 3 (d) 5
4. If X~𝑁(40, 36) the mode is:
(a) 36 (b) 6 (c) 40 (d) 20
5. In normal distribution M.D is equal to:
(a) $\frac{2}{5}\sigma$ (b) $\frac{2}{3}\sigma$ (c) $\frac{4}{5}\sigma$ (d) $\frac{3}{5}\sigma$

6. The lower and upper quartiles of normal distribution are equidistant from its:
(a) Variance (b) Standard deviation (c) Mean (d) None of these
7. In normal distribution 𝜇 ± 0.6745𝜎 includes:
(a) 25% area (b) 50% area (c) 68.27% area (d) 95.45% area
8. In a normal curve ordinate is highest at:
(a) Mean (b) Variance (c) 𝑄1 (d) 𝜎
9. In normal distribution the area to the left of Z = 1 is:
(a) 0.6413 (b) 0.7413 (c) 0.8413 (d) 0.3413
10. In normal distribution 𝜇3 is always:
(a) < 0 (b) > 0 (c) = 3 (d) = 0
11. Area under the standard normal curve is:
(a) 1.0 (b) 0.5 (c) 2.0 (d) 0
12. Normal distribution is:
(a) Uni - modal (b) Bi - modal (c) Tri - modal (d) Multi - modal
13. In normal distribution points of inflection are :
(a) ± 𝜎 (b) 𝝁 ± 𝝈 (c) 𝜇 ± 2𝜎 (d) 𝜇 ± 3𝜎
14. If X~𝑁(40, 49) the S.D is:
(a) 49 (b) 7 (c) 40 (d) 20
15. In normal distribution Q.D is equal to:
(a) $\frac{2}{5}\sigma$ (b) $\frac{2}{3}\sigma$ (c) $\frac{4}{5}\sigma$ (d) $\frac{3}{5}\sigma$

16. In normal distribution $E(X - \mu)^2$ is:
(a) Q.D (b) S.D (c) Variance (d) Mean
17. In a normal distribution 𝜎 2 = 5 then 𝜇4 = ⋯
(a) 25 (b) 75 (c) 125 (d) 0
18. In a normal distribution, all odd ordered moments about mean are equal to :
(a) One (b) zero (c) additive (d) Mean
19. In normal distribution 𝛽2 is always:
(a) < 0 (b) > 0 (c) = 3 (d) = 0
20. Second moment about mean of normal distribution is called:
(a) One (b) zero (c) variance (d) Mean
21. In a normal distribution, Q.D (x) is equal to
(a) 0.9545(σ) (b) 0.9827(σ) (c) 0.6745(σ) (d) 0.7979(σ)
22. In a normal distribution, M.D (x) is equal to
(a) 0.9545(σ) (b) 0.7979(σ) (c) 0.6745(σ) (d) 0.7945(σ)

23. If µ2 = 4, then what will be µ4?

(a) 32 (b) 24 (c) 48 (d) 40


24. Standard normal random variable has mean and variance:
(a) 0, 1 (b) µ, σ (c) 1, 0 (d) None of these
25. A standard normal curve has maximum ordinate at:
(a) z = 0 (b) z = 1 (c) z = -1 (d) None of these

Prepared by: Darshan Jee


Lecturer: Govt. Khawaja Fareed Graduate College, R.Y. Khan
(03081845584)

Chapter No.11
Sampling and Sampling Distributions
What is population?
A population consists of the totality of the observations with which we are concerned. The population size
is denoted by “N”.
OR
The whole aggregate of material about which some information is desired is called a population.
Examples:
The heights of students in Punjab University.
The population of teachers in Rahim Yar Khan.
The population of stars on the sky.
Define the term sample.
A sample is a part of the population that represents the characteristics of the
population. The sample size is denoted by “n”.
Examples:
A medical doctor takes a sample of medicine to check its effectiveness.
Taking blood in a test tube is an example of sample.
What do you mean by Parameter?
Any numerical value or quantity calculated from population data is called a parameter. Parameters are
fixed quantities and are denoted by Greek letters.
Examples: Population mean “µ” and population standard deviation “σ”.
$\mu = \frac{\sum X_i}{N}, \quad \sigma = \sqrt{\frac{\sum (X_i - \mu)^2}{N}}$
Define the term statistic.
Any numerical value or quantity calculated from sample data is called a statistic. Statistics are denoted by
English or Roman letters.
Examples: Sample mean $\bar{X}$ and sample standard deviation S.
$\bar{X} = \frac{\sum X_i}{n}, \quad S = \sqrt{\frac{\sum (X_i - \bar{X})^2}{n}}$
What is finite population?
A population that consists of countable number of observations is called finite population.
For example, the number of students in a class.
Define the term infinite population?
A population that consists of an uncountable (unlimited) number of observations is called an infinite population.
For example, the stars in the sky.
What is target population?
A population about which some information is desired is called a target population.
For example, all the human population of Punjab will be the target population.
Define the term sampled population.
A population from which the sample is drawn is called a sampled population.

What is census or complete enumeration?


The collection of information from each and every member of the population is called census or
complete enumeration.

What is sampling?
Sampling is a procedure of selecting a representative sample from a given population.
What is sampling unit?
The basic units of the given population are called sampling unit. Sampling unit must be distinctive.
What is the main purpose or aims (objective) of sampling?
The main purposes of sampling are as follows:
i) To provide sufficient information about the population without examining every unit of the
population.
ii) To find the reliability of the estimate derived from the sample.
Write down some advantages of sampling.
The advantages of sampling are given below:
Reduced cost: A sample study costs less than a complete count.
Saves time: A sample study takes less time than a complete count.
Greater speed: Results from a sample study are available more quickly.
Greater accuracy: A sample study can provide greater accuracy, because fewer units allow more careful measurement.
Greater scope: Sampling is the only method of obtaining information when the population is infinite.
Give some disadvantages (Limitations) of sampling.
The disadvantages (Limitations) of sampling are given below:
i) Sample results are less reliable as compared to census.
ii) Faulty sampling frames and personal biases give misleading results.
iii) Sometimes complete count (census) is necessary and sampling is not used.
iv) It is difficult to decide which sampling technique is most suitable.
What is sampling frame?
A complete list of all the elements included in the population is called sampling frame.
Examples:
The list of all doctors who are foreign qualified in Punjab.
The complete list of all the students admitted in the K.F.G. College in 2022.
What is Sampling Design?
A clear-cut or definite statistical plan concerned with all the principal steps taken in the selection of a
sample is called a sampling design.
OR
Sampling design is a procedure or plan for obtaining a sample from a given population prior to
collecting any data.
What are the two main types of sampling design?
The two main types of sampling design are:
i) Probability sampling ii) Non – Probability sampling

What is probability sampling or random sampling?


When each unit of the population has some known probability of being included in the sample, such
sampling is called probability sampling. Probability sampling is also called random sampling.
OR
Probability sampling is a procedure in which the sample is selected in such a way that every unit of the
population has a known and non-zero probability of being included in the sample.
What are the types of probability sampling?
The types of probability sampling are:
(i) Simple random sampling
(ii) Stratified random sampling
(iii) Systematic random sampling
(iv) Cluster sampling
Define Non-probability sampling or non – random sampling.
When the selection of the sample from the population depends upon the personal judgment, such kind
of sampling is called non-probability sampling. Non - Probability sampling is also called non - random
sampling.
Name the types of non-probability sampling.
(i) Judgment sampling
(ii) Purposive sampling
(iii) Quota sampling
Define simple random sampling.
If each unit in the population has an equal probability of being selected and each possible sample
of the same size has an equal probability of being selected, the procedure is called simple random sampling.
Write down the method of selection of simple random sample.
A simple random sample can be selected by the following methods.
i) Lottery method ii) Random number table method iii) Using computer iv) Using calculator
Describe the stratified random sampling.
When a heterogeneous population is divided into small non-overlapping homogeneous groups called
strata, a simple random sample is drawn from each stratum, and the overall sample is formed by combining the
samples from all strata, the whole procedure is called stratified random sampling.
Differentiate between sampling with and without replacement.
Sampling with replacement
Sampling is said to be with replacement when a sampling unit is drawn from a finite population,
observed and then returned to the population before the next unit is selected.
Number of all possible samples: $n(S) = N^n$
Sampling without Replacement
Sampling is said to be without replacement when a sampling unit is drawn from a population, observed
and then not returned to the population before the next unit is selected.
Number of all possible samples: $n(S) = \binom{N}{n} = {}^{N}C_{n}$
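The two counting formulas can be checked by listing every possible sample. The sketch below uses a made-up population of four values (N = 4) and samples of size n = 2; the population is an assumption for illustration only.

```python
from itertools import combinations, product
from math import comb

population = [2, 4, 6, 8]   # hypothetical population, N = 4
n = 2                       # sample size
N = len(population)

# With replacement: ordered samples, a unit may appear more than once
with_replacement = list(product(population, repeat=n))

# Without replacement: unordered samples, each unit appears at most once
without_replacement = list(combinations(population, n))

print(len(with_replacement), N ** n)          # 16 16
print(len(without_replacement), comb(N, n))   # 6 6
```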
What is meant by sampling error?
The difference between a sample statistic and the corresponding population parameter is called sampling error.
Sampling error = $\bar{X} - \mu$
How can sampling errors occur?
Sampling errors occur due to:
(i) Wrong selection of the sampling technique.
(ii) A sample that is not large enough.
How can sampling errors be reduced?
Sampling error can be reduced by:
(i) Selecting the proper sampling technique.
(ii) Increasing the sample size.
Define the term non-sampling error.
The error which occurs at the stages of gathering and processing data, and due to faulty collection of
population data, is called non-sampling error.
How can non-sampling errors occur?
Non-sampling errors occur due to:
(i) Incomplete sampling frame.
(ii) Faulty reporting of facts.
(iii) Untrained investigators.
How can non-sampling errors be reduced?
Non-sampling errors can be reduced by:
(i) The correct reporting of facts.
(ii) Selecting expert or trained investigators.
What is meant by Bias?
The difference between the expected value of a statistic and the population parameter is called bias. For the
sample mean it is defined as Bias = $E(\bar{X}) - \mu$.
How can you define sampling distribution?
OR
What is sampling distribution of a statistic?
The probability distribution of the values of any statistic, such as the mean ($\bar{X}$), standard deviation (S) or
proportion ($\hat{p}$), computed from all possible samples of the same size which might be selected with or
without replacement from a population is called the sampling distribution of that statistic.
Define the term standard error.
The standard deviation of the sampling distribution of any statistic is called the standard error of that
statistic. It is denoted by S.E.
Define sampling distribution of mean.
The probability distribution of the means ($\bar{X}$) of all possible random samples of the same size that could
be selected from a given population is called the sampling distribution of the mean ($\bar{X}$).
Enlist the properties of sampling distribution of mean.
The properties of sampling distribution of mean are:
i) The mean of the sampling distribution of the sample mean is equal to the population mean, that is, $\mu_{\bar{x}} = \mu$
ii) The variance of the sampling distribution of the mean is given by:
$\sigma_{\bar{x}}^2 = \frac{\sigma^2}{n}$ (In case of sampling with replacement)
$\sigma_{\bar{x}}^2 = \frac{\sigma^2}{n} \times \left[\frac{N-n}{N-1}\right]$ (In case of sampling without replacement)
iii) The standard deviation of the sampling distribution of the mean, or standard error of the mean, is given by:
$S.E(\bar{x}) = \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$ (In case of sampling with replacement)
$S.E(\bar{x}) = \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} \times \sqrt{\frac{N-n}{N-1}}$ (In case of sampling without replacement)
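These properties can be verified by brute force on a small population. The following sketch enumerates every sample of size 2 from a hypothetical population of four values and compares the mean and variance of the sample means with the formulas above; the population values are assumptions for illustration.

```python
from itertools import combinations, product
from statistics import mean, pvariance

population = [2.0, 4.0, 6.0, 8.0]   # hypothetical population
N, n = len(population), 2

mu = mean(population)               # population mean
sigma2 = pvariance(population)      # population variance (divisor N)

# Means of all possible samples of size n
means_wr = [mean(s) for s in product(population, repeat=n)]     # with replacement
means_wor = [mean(s) for s in combinations(population, n)]      # without replacement

var_wr = pvariance(means_wr)        # compare with sigma2 / n
var_wor = pvariance(means_wor)      # compare with (sigma2 / n) * (N - n) / (N - 1)

print(mean(means_wr), mean(means_wor), mu)
print(var_wr, sigma2 / n)
print(var_wor, sigma2 / n * (N - n) / (N - 1))
```

Each printed line should show matching values, up to floating-point rounding.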

Write down the properties of sampling distribution of difference between means.
The properties of sampling distribution of difference between means are:
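i) The mean of the sampling distribution of the difference between sample means is equal to the difference
between the population means, that is, $\mu_{\bar{x}_1 - \bar{x}_2} = \mu_1 - \mu_2$

ii) The variance of the sampling distribution of the difference between sample means is given by:
$\sigma_{\bar{x}_1 - \bar{x}_2}^2 = \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}$ (In case of sampling with replacement)

$\sigma_{\bar{x}_1 - \bar{x}_2}^2 = \frac{\sigma_1^2}{n_1}\left[\frac{N_1 - n_1}{N_1 - 1}\right] + \frac{\sigma_2^2}{n_2}\left[\frac{N_2 - n_2}{N_2 - 1}\right]$ (In case of sampling without replacement)

iii) The standard deviation (standard error) of the sampling distribution of the difference between sample
means is given by:

$S.E(\bar{x}_1 - \bar{x}_2) = \sigma_{\bar{x}_1 - \bar{x}_2} = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$ (In case of S.W.R)

$S.E(\bar{x}_1 - \bar{x}_2) = \sigma_{\bar{x}_1 - \bar{x}_2} = \sqrt{\frac{\sigma_1^2}{n_1}\left[\frac{N_1 - n_1}{N_1 - 1}\right] + \frac{\sigma_2^2}{n_2}\left[\frac{N_2 - n_2}{N_2 - 1}\right]}$ (In case of S.W.O.R)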
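The standard error of the difference between two sample means can be wrapped in one small function. This is a sketch; the function name `se_diff_means` and the convention of passing `N1` and `N2` only for sampling without replacement are illustrative choices, and the numbers in the usage lines are made up.

```python
import math

def se_diff_means(sigma1, n1, sigma2, n2, N1=None, N2=None):
    """Standard error of (x̄1 - x̄2).

    Leave N1 and N2 as None for sampling with replacement (S.W.R);
    pass the population sizes for sampling without replacement (S.W.O.R).
    """
    v1 = sigma1 ** 2 / n1
    v2 = sigma2 ** 2 / n2
    if N1 is not None:
        v1 *= (N1 - n1) / (N1 - 1)   # finite population correction, population 1
    if N2 is not None:
        v2 *= (N2 - n2) / (N2 - 1)   # finite population correction, population 2
    return math.sqrt(v1 + v2)

# Made-up values: sigma1 = 3, n1 = 9, sigma2 = 4, n2 = 16
se_wr = se_diff_means(3, 9, 4, 16)                  # sqrt(1 + 1)
se_wor = se_diff_means(3, 9, 4, 16, N1=50, N2=60)   # smaller, because of the correction
```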

Describe the sampling distribution of proportion.


The probability distribution of the proportion (𝑝̂ ) of all possible random samples of the same size that
could be selected from a given population is called the sampling distribution of proportion(p̂).
Enlist the properties of sampling distribution of proportion.
The properties of sampling distribution of proportion are:
i) The mean of the sampling distribution of the sample proportion is equal to the population proportion, i.e., $\mu_{\hat{P}} = P$
ii) The variance of the sampling distribution of the proportion is given by:
$\sigma_{\hat{P}}^2 = \frac{Pq}{n}$ (In case of sampling with replacement)
$\sigma_{\hat{P}}^2 = \frac{Pq}{n} \times \frac{N-n}{N-1}$ (In case of sampling without replacement)
iii) The standard deviation of the sampling distribution of the proportion, or standard error of the proportion, is
given by:

$S.E(\hat{P}) = \sigma_{\hat{P}} = \sqrt{\frac{Pq}{n}}$ (In case of sampling with replacement)

$S.E(\hat{P}) = \sigma_{\hat{P}} = \sqrt{\frac{Pq}{n}} \times \sqrt{\frac{N-n}{N-1}}$ (In case of sampling without replacement)
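A minimal sketch of the standard error of a proportion follows, taking q = 1 − P. The function name `se_proportion` and the example values P = 0.5, n = 100, N = 1000 are assumptions for illustration.

```python
import math

def se_proportion(P, n, N=None):
    """Standard error of the sample proportion.

    Leave N as None for sampling with replacement; pass the population
    size N for sampling without replacement.
    """
    q = 1 - P
    variance = P * q / n
    if N is not None:
        variance *= (N - n) / (N - 1)   # finite population correction
    return math.sqrt(variance)

se_wr = se_proportion(0.5, 100)            # sqrt(0.25 / 100)
se_wor = se_proportion(0.5, 100, N=1000)   # multiplied by sqrt(900 / 999)
```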

Write down the properties of sampling distribution of difference between proportions.
The properties of sampling distribution of difference between proportions are:
i) The mean of the sampling distribution of the difference between sample proportions is equal to the
difference between the population proportions, i.e., $\mu_{\hat{P}_1 - \hat{P}_2} = P_1 - P_2$

ii) The variance of the sampling distribution of the difference between sample proportions is given by:
$\sigma_{\hat{P}_1 - \hat{P}_2}^2 = \frac{P_1 q_1}{n_1} + \frac{P_2 q_2}{n_2}$ (In case of S.W.R)
$\sigma_{\hat{P}_1 - \hat{P}_2}^2 = \frac{P_1 q_1}{n_1}\left[\frac{N_1 - n_1}{N_1 - 1}\right] + \frac{P_2 q_2}{n_2}\left[\frac{N_2 - n_2}{N_2 - 1}\right]$ (In case of S.W.O.R)
iii) The standard deviation (standard error) of the sampling distribution of the difference between sample
proportions is given by:

$S.E(\hat{P}_1 - \hat{P}_2) = \sigma_{\hat{P}_1 - \hat{P}_2} = \sqrt{\frac{P_1 q_1}{n_1} + \frac{P_2 q_2}{n_2}}$ (In case of S.W.R)

$S.E(\hat{P}_1 - \hat{P}_2) = \sigma_{\hat{P}_1 - \hat{P}_2} = \sqrt{\frac{P_1 q_1}{n_1}\left[\frac{N_1 - n_1}{N_1 - 1}\right] + \frac{P_2 q_2}{n_2}\left[\frac{N_2 - n_2}{N_2 - 1}\right]}$ (In case of S.W.O.R)

Describe the sampling distribution of variance.


The probability distribution of the variance (𝑆 2 ) of all possible random samples of the same size that
could be selected from a given population is called the sampling distribution of variance (𝑆 2 ).
Give the properties of sampling distribution of variance (Biased sample variance 𝑺𝟐 ).
The properties of the sampling distribution of variance (biased sample variance $S^2$) are:
In case of sampling with replacement
$E(S^2) = \mu_{S^2} = \left(\frac{n-1}{n}\right)\sigma^2$, where $S^2 = \frac{\sum (x - \bar{x})^2}{n}$
In case of sampling without replacement
$E(S^2) = \mu_{S^2} = \frac{N}{N-1}\left(\frac{n-1}{n}\right)\sigma^2$
Give the properties of sampling distribution of variance (Unbiased sample variance $s^2$).
The properties of the sampling distribution of variance (unbiased sample variance $s^2$) are:
In case of sampling with replacement
$E(s^2) = \mu_{s^2} = \sigma^2$, where $s^2 = \frac{\sum (x - \bar{x})^2}{n-1}$
In case of sampling without replacement
$E(s^2) = \mu_{s^2} = \frac{N}{N-1}\sigma^2$
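The with-replacement results can be checked by enumeration. The sketch below averages both versions of the sample variance over all 16 ordered samples of size 2 from a hypothetical population; the population values are assumed for illustration.

```python
from itertools import product
from statistics import mean, pvariance, variance

population = [2.0, 4.0, 6.0, 8.0]   # hypothetical population
n = 2
sigma2 = pvariance(population)      # population variance (divisor N)

samples = list(product(population, repeat=n))   # sampling with replacement

# pvariance divides by n (biased S^2); variance divides by n - 1 (unbiased s^2)
expected_biased = mean([pvariance(s) for s in samples])
expected_unbiased = mean([variance(s) for s in samples])

print(expected_biased, (n - 1) / n * sigma2)
print(expected_unbiased, sigma2)
```

Each printed pair should match, illustrating that $S^2$ underestimates σ² on average while $s^2$ does not.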

Q. Given 𝝈 = 𝟔 and n = 30, find 𝝈𝒙 .

Solution:
Given data: σ = 6, n = 30, $\sigma_{\bar{x}}$ = ?
$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$ (In case of sampling with replacement)
$\sigma_{\bar{x}} = \frac{6}{\sqrt{30}}$
$\sigma_{\bar{x}} = 1.095$
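Q. For an infinite population μ = 30, σ = 5, n = 100, find $\mu_{\bar{x}}$ and $\sigma_{\bar{x}}^2$.

Solution:
Given data: μ = 30, σ = 5, σ² = 25, n = 100, $\mu_{\bar{x}}$ = ?, $\sigma_{\bar{x}}^2$ = ?
We know that
$\mu_{\bar{x}} = \mu$
$\mu_{\bar{x}} = 30$
$\sigma_{\bar{x}}^2 = \frac{\sigma^2}{n}$ (In case of sampling with replacement)
$\sigma_{\bar{x}}^2 = \frac{25}{100} = 0.25$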

Q. Given 𝒏 = 𝟑𝟔 , 𝝁𝒙 = 𝟓𝟎 , 𝝈𝟐𝒙 = 𝟓. Find 𝝁 and 𝝈𝟐 .

Solution:
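Given data: n = 36, $\mu_{\bar{x}} = 50$, $\sigma_{\bar{x}}^2 = 5$, μ = ?, σ² = ?
We know that
$\mu_{\bar{x}} = \mu$
50 = μ, so μ = 50
$\sigma_{\bar{x}}^2 = \frac{\sigma^2}{n}$ (In case of sampling with replacement)

$\sigma_{\bar{x}}^2 \times n = \sigma^2$
5 × 36 = σ², so σ² = 180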
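The three solved questions above can be reproduced with a short helper. This is a sketch; the function name `se_mean` is illustrative and is not part of the notes.

```python
import math

def se_mean(sigma, n, N=None):
    """Standard error of the sample mean; pass N for sampling without replacement."""
    se = sigma / math.sqrt(n)
    if N is not None:
        se *= math.sqrt((N - n) / (N - 1))   # finite population correction
    return se

se_1 = se_mean(6, 30)            # first question: sigma = 6, n = 30
var_2 = se_mean(5, 100) ** 2     # second question: variance of the mean for sigma = 5, n = 100
sigma2_3 = 5 * 36                # third question: sigma^2 = (variance of the mean) * n

print(round(se_1, 3))            # 1.095
print(sigma2_3)                  # 180
```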

(OBJECTIVE)
1) Any value computed from population is called:
(a) Parameter (b) Statistic (c) Target population (d) Sampling unit
2) A plan for obtaining a sample from a population is called :
(a) Population Design (b) Sampling Design (c) Sampling Frame (d) Sampling Distribution
3) Sampling in which a sampling unit can be repeated more than once is called:
(a) Simple sampling (b) Sampling with replacement
(c) Sampling without replacement (d) None of these
4) Another name of probability sampling is:
(a) Non Random sampling (b) Judgment sampling (c) Purposive sampling (d) Random sampling
5) Selection of questions by the students to solve a paper is:
(a) Non random sampling (b) probability sampling
(c) Sampling with replacement (d) Random sampling
6) The difference between statistic and parameter is called:
(a) Random error (b) Sampling error (c) Standard error (d) Error
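7) In sampling with replacement $\sigma_{\bar{X}}$ is equal to:
(a) $\frac{\sigma}{\sqrt{n}}$ (b) $\frac{\sigma}{n}$ (c) $\frac{\sigma}{\sqrt{n}}\sqrt{\frac{N-n}{N-1}}$ (d) $\frac{\sigma}{n}\sqrt{\frac{N-n}{N-1}}$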

8) Probability distribution of $\bar{X}$ is called its:


(a) Expected value (b) Standard error (c) Sampling distribution of means (d) Sampling error
9) The selection of Cricket team for the world cup is called :
(a) Random sampling (b) Judgment sampling (c) Purposive sampling (d) Cluster sampling
10) Which characteristic relates to the sample:
(a) Parameter (b) Statistic (c) Sampling error (d) Sampling distribution
11) In sampling with replacement the population becomes:
(a) Infinite (b) Existent (c) Finite (d) Hypothetical
12) S.E of $\bar{X}$ for without replacement sampling is:
(a) $\frac{\sigma}{\sqrt{n}}$ (b) $\frac{\sigma}{n}$ (c) $\frac{\sigma}{\sqrt{n}}\sqrt{\frac{N-n}{N-1}}$ (d) $\frac{\sigma^2}{n}$

13) Standard deviation of sampling distribution of any statistic is called:


(a) Sampling error (b) Non sampling error (c) Standard error (d) All of these
14) List of all the units in the population is called:
(a) Sampling design (b) Sampling frame (c) Sampling error (d) Distribution

15) The errors which arise due to processing of data and faulty sampling frame are called:
(a) Non-Sampling error (b) Sampling error (c) Standard error (d) Error
16) Sampling error can be reduced by:
(a) Increasing sample size (b) Decreasing sample size (c) Cannot be reduced (d) None of these
17) Any numerical value computed from sample is called:

(a) Parameter (b) Statistic (c) Sampling (d) Sampling unit


18) A population about which some information is required is called:
(a) Infinite population (b) Sampled population (c) Target population (d) None of these
19) In sampling without replacement, an element can be chosen:
(a) Once (b) More than once (c) Twice (d) None of these
20) In sampling with replacement, an element can be chosen:
(a) Once (b) More than once (c) Twice (d) None of these

Chapter No.12

Estimation
What do you mean by statistical inference?
The process of drawing inferences about a population on the basis of sample information is called
statistical inference.
OR
The procedure of drawing inferences about unknown population parameters on the basis of sample data is
called statistical inference.
Into how many types is statistical inference divided?
Statistical inference is divided into two major types
(i) Estimation of parameter (ii) Testing of hypothesis of parameter
Elaborate the term Estimation.
Estimation is a procedure of obtaining the unknown values of population parameters using the
sample information.
OR
Estimation is the procedure of making a judgment about the unknown value of a population parameter by
using sample observations.
OR
The process of estimating the true unknown value of a population parameter on the basis of sample
data is called estimation.
What are the different types of estimation?
There are two types of estimation.
(i) Point Estimation (ii) Interval Estimation
Define the term point estimation.
The process of obtaining a single value from the sample as an estimate of the unknown population
parameters, is called point estimation.
What is meant by point estimate?
A single numerical value obtained from sample data by using a formula is called a point estimate.
OR
A single value calculated from the sample data is called point estimate.
For example, suppose we wish to estimate the average age of 1st year college students. We select a random sample
of 100 students and their average age comes out to be 16 years; this value, 16 years, is the point estimate.
Explain the term estimator.
A formula or method which is used to estimate the unknown population parameter is called estimator.
OR
A formula or function used to estimate the population parameter is called estimator.
Explain the terms Estimate and Estimator.
Estimate: A numerical value obtained by substituting the sample observations into the formula is called an estimate.
Estimator: A formula or function used to estimate the population parameter is called an estimator.
For example, let the sample values be 5, 10, 12, 18, 15; then $\bar{Y} = \frac{\sum y}{n} = \frac{60}{5} = 12$
Here $\bar{Y} = \frac{\sum y}{n}$ is the estimator, which is being used to estimate the population mean µ, and the value 12 is
called the estimate.
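In programming terms, the estimator corresponds to a function and the estimate to the value it returns for one particular sample. A small sketch, using the five values from the example above (the function name `sample_mean` is an illustrative choice):

```python
def sample_mean(values):
    """Estimator: the rule (formula) sum(y) / n."""
    return sum(values) / len(values)

sample = [5, 10, 12, 18, 15]      # sample values from the example above
estimate = sample_mean(sample)    # Estimate: the number the rule produces for this sample

print(estimate)                   # 12.0
```

A different sample passed to the same function would generally give a different estimate, while the estimator itself stays the same.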
Distinguish between point estimate and interval estimate.
Point estimate: A single value calculated from the sample data is called a point estimate, e.g., if on the
basis of a sample we have $\bar{Y} = 5.0$, it is a point estimate of µ.
Interval estimate: If the estimate of an unknown population parameter is expressed as a
range of values, it is called an interval estimate, e.g., if on the basis of a sample we have 2 ≤ µ ≤ 5, it
is an interval estimate of µ.
Describe the term interval estimation.
Interval estimation is a procedure in which we estimate a range of values within which the
population parameter is expected to lie with a certain degree of confidence.
For example, it is very difficult to say that the average age of 1st year students is 16 years, rather we
may say that average age of the 1st year students lies between 15 and 17 years.
What are the properties of good point estimator?
A point estimator is considered good if it has the following properties:
1) Unbiasedness 2) Consistency 3) Sufficiency 4) Efficiency
Define an unbiased estimator.
An estimator is said to be unbiased if its expected value (i.e., its mean) is equal to the corresponding
population parameter. The estimator $\hat{\theta}$ is an unbiased estimator of θ if $E(\hat{\theta}) = \theta$.

What is biased estimator?

An estimator is said to be biased if its expected value (i.e., its mean) is not equal to the
corresponding population parameter. The estimator $\hat{\theta}$ is a biased estimator of θ if $E(\hat{\theta}) \neq \theta$.
Define bias.
If an estimator T of a population parameter θ is biased, the amount of bias is Bias = E(T) – θ.
What is meant by unbiasedness?
The property that the expected value of a statistic is equal to its parameter is called unbiasedness.

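Let $\hat{\theta}$ be an estimator of the population parameter θ. Then:

If $E(\hat{\theta}) = \theta$, then $\hat{\theta}$ is an unbiased estimator of θ.

If $E(\hat{\theta}) \neq \theta$, then $\hat{\theta}$ is a biased estimator of θ.

If $E(\hat{\theta}) > \theta$, then $\hat{\theta}$ is a positively biased estimator of θ.

If $E(\hat{\theta}) < \theta$, then $\hat{\theta}$ is a negatively biased estimator of θ.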

Write down the formula of two estimators of population S.D.

(i) The estimator of the population S.D. based on the unbiased sample variance is s = √[∑(x − x̄)² / (n − 1)], with E(s²) = σ².
(ii) The biased estimator of the population S.D. is S = √[∑(x − x̄)² / n], with E(S²) ≠ σ².

Give the formula of unbiased and biased variances.


(i) The unbiased estimator of the population variance is s² = ∑(x − x̄)² / (n − 1), with E(s²) = σ².
(ii) The biased estimator of the population variance is S² = ∑(x − x̄)² / n, with E(S²) ≠ σ².
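As a rough illustration, the two variance formulas differ only in the divisor. The sketch below uses a hypothetical sample and hypothetical function names; it is not taken from the syllabus.

```python
from math import sqrt


def sum_of_squared_deviations(xs):
    """Return the sum of (x - mean)^2 over the sample."""
    mean = sum(xs) / len(xs)
    return sum((x - mean) ** 2 for x in xs)


def biased_variance(xs):
    """S^2: divisor n."""
    return sum_of_squared_deviations(xs) / len(xs)


def unbiased_variance(xs):
    """s^2: divisor n - 1."""
    return sum_of_squared_deviations(xs) / (len(xs) - 1)


# Hypothetical sample, chosen only for illustration.
sample = [5, 10, 12, 18, 15]
S2 = biased_variance(sample)
s2 = unbiased_variance(sample)
print("Biased variance S^2:", S2, " S:", sqrt(S2))
print("Unbiased variance s^2:", s2, " s:", sqrt(s2))
```

For the same sample, the divisor n − 1 always gives the larger value, and the gap shrinks as n grows.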

What do you understand by Confidence interval?


A range of values calculated from the sample data that is expected to include the population parameter with a
certain degree of confidence is known as a confidence interval.
OR
The interval (L , U) that includes the population parameter with a high probability 1 − α, i.e., with
100(1 − α)% confidence, is known as a confidence interval.


What are Confidence Limits?
The end points that bound the confidence interval are called the lower and upper limits for population
parameter θ i. e. (L < θ < U).
What is Level of Confidence?
The probability that the unknown population parameter lies in the confidence interval is known as
level of confidence or confidence coefficient. It is denoted by (1 − 𝛼).

If sample size is increased, what will be change in 100(𝟏 − 𝜶)% confidence interval?

If the sample size is increased, the width of the 100(1 − α)% confidence interval decreases, i.e., the interval becomes narrower.


Define degree of freedom.
The number of values that can be selected independently is called the degrees of freedom. It is denoted by
ν (nu).
What is Confidence Coefficient?
The probability that the unknown population parameter lies in the confidence interval is known as
confidence coefficient. It is denoted by (1 − 𝛼).
What is the point estimator of the population mean µ?
The point estimator of the population mean µ is x̄ = ∑x / n.

Write down the Confidence interval for population mean 𝝁 when population standard deviation
𝝈 is known.

The confidence interval for the population mean µ when the population standard deviation σ is known is
X̄ ± Z_{α/2} · σ/√n, that is,
P[ X̄ − Z_{α/2} · σ/√n < µ < X̄ + Z_{α/2} · σ/√n ] = 1 − α
Write down the Confidence interval for population mean 𝝁 when population standard deviation
𝝈 is unknown but sample size is large i.e., 𝒏 > 𝟑𝟎.

The confidence interval for the population mean µ when the population standard deviation σ is unknown and
the sample size is large, i.e., n > 30, is
X̄ ± Z_{α/2} · S/√n, that is,
P[ X̄ − Z_{α/2} · S/√n < µ < X̄ + Z_{α/2} · S/√n ] = 1 − α
where S = √[∑(x − x̄)² / n] is the biased estimator of the population S.D.

Write down the Confidence interval for population mean 𝝁 when population standard deviation
𝝈 is unknown but sample size is small i.e., 𝒏 ≤ 𝟑𝟎.

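The confidence interval for the population mean µ when the population standard deviation σ is unknown and
the sample size is small, i.e., n ≤ 30, is
X̄ ± t_{α/2}(ν) · s/√n, that is,
P[ X̄ − t_{α/2}(ν) · s/√n < µ < X̄ + t_{α/2}(ν) · s/√n ] = 1 − α
where s = √[∑(x − x̄)² / (n − 1)] is the sample S.D. computed with divisor n − 1, and ν = n − 1 is the degrees of freedom.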

Write down theoretically the Confidence interval for Population Mean 𝝁 when Population Variance
𝝈𝟐 is to be unknown.

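The confidence interval for the population mean µ when the population variance σ² is unknown is
X̄ ± t_{α/2}(ν) · s/√n, that is,
P[ X̄ − t_{α/2}(ν) · s/√n < µ < X̄ + t_{α/2}(ν) · s/√n ] = 1 − α
where s = √[∑(x − x̄)² / (n − 1)] is the sample S.D. computed with divisor n − 1, and ν = n − 1 is the degrees of freedom.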

Write down theoretically the Confidence interval for Population Proportion 𝑷.

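The confidence interval for the population proportion P is
p̂ ± Z_{α/2} · √(p̂q̂ / n), that is,
P[ p̂ − Z_{α/2} · √(p̂q̂ / n) < P < p̂ + Z_{α/2} · √(p̂q̂ / n) ] = 1 − α
where p̂ is the sample proportion and q̂ = 1 − p̂.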
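The z-based intervals can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names and all numbers are hypothetical, and Z_{α/2} is obtained from the standard library's NormalDist rather than from a printed table. For the t-based intervals, the value t_{α/2}(ν) would be read from a t-table and used in place of Z_{α/2}.

```python
from math import sqrt
from statistics import NormalDist


def z_critical(confidence):
    """Return Z_{alpha/2} for a given confidence coefficient 1 - alpha."""
    alpha = 1 - confidence
    return NormalDist().inv_cdf(1 - alpha / 2)


def mean_interval_sigma_known(xbar, sigma, n, confidence=0.95):
    """Confidence interval for the mean when sigma is known."""
    half_width = z_critical(confidence) * sigma / sqrt(n)
    return xbar - half_width, xbar + half_width


def proportion_interval(p_hat, n, confidence=0.95):
    """Large-sample confidence interval for a population proportion."""
    q_hat = 1 - p_hat
    half_width = z_critical(confidence) * sqrt(p_hat * q_hat / n)
    return p_hat - half_width, p_hat + half_width


# Hypothetical inputs, chosen only for illustration.
lower, upper = mean_interval_sigma_known(xbar=50, sigma=10, n=100, confidence=0.95)
print("95% interval for the mean:", round(lower, 2), "to", round(upper, 2))

p_lower, p_upper = proportion_interval(p_hat=0.4, n=200, confidence=0.90)
print("90% interval for the proportion:", round(p_lower, 3), "to", round(p_upper, 3))
```

Calling mean_interval_sigma_known with a larger n gives a narrower interval, which matches the statement that the width decreases as the sample size increases.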

(OBJECTIVE)
1) The statistical inference has types:
(a) One (b) Two (c) Three (d) Four
2) A specific value calculated by using the sample data is called:
(a) Estimator (b) Estimate (c) Estimation (d) None
3) A rule or formula used to estimate an unknown parameter is called:
(a) Estimator (b) Estimate (c) Estimation (d) Bias
4) A single number used to estimate the unknown population parameter:
(a) Biased estimate (b) Unbiased estimate (c) Interval estimate (d) Point estimate
5) The process of obtaining a single number used to estimate the unknown population parameter:
(a) Biased estimation (b) Unbiased estimation (c) Interval estimation (d) Point estimation
6) A range of values used to estimate the unknown population parameter:
(a) Biased estimate (b) Unbiased estimate (c) Interval estimate (d) Point estimate

7) An interval estimate is associated with:


(a) Probability (b) Non-probability (c) Error (d) Bias

8) The point estimate of 𝜇 is:

(a) X̄ (b) σ (c) σ² (d) µ


9) A function that is used to estimate a parameter is called:
(a) Bias (b) Estimate (c) Estimator (d) Estimation
10) An estimator T is said to be an unbiased estimator of θ if:
(a) E(𝑇) > 𝜃 (b) E(𝑇) < 𝜃 (c) E(𝑻) = 𝜽 (d) E(𝑇) ≠ 𝜃

11) Two types of estimation are:
(a) One sided and two sided (b) Type - I & Type - II errors
(c) Point estimation and interval estimation (d) Estimation of parameter and Testing of hypothesis
12) 1 – 𝛼 is called:
(a) Power of test (b) Confidence coefficient (c) Size of test (d) Level of significance
13) The probability that the confidence interval does not contain the parameter is denoted by:
(a) 𝜶 (b) 1− 𝛼 (c) 𝛽 (d) 1 − 𝛽
14) A sample is considered a small sample if its size is:
(a) 50 (b) Less than 30 (c) 30 or less (d) 100
15) t – distribution is used when:
(a) Population is normal (b) 𝜎 is unknown (c) Sample size is small (d) All of these
16) In a Z test the no. of degrees of freedom is:
(a) n -1 (b) n - 2 (c) n (d) None of these
17) The width of confidence interval decreases if the confidence coefficient is:
(a) Increased (b) Decreased (c) Fixed (d) One
18) The level of confidence is denoted by:
(a) 𝛼 (b) 1− 𝜶 (c) 𝛽 (d) 1 − 𝛽
19) Level of Confidence is also known as :
(a) Power of test (b) Confidence coefficient (c) Size of test (d) Level of significance
20) If 1 − α = 0.90, then the value of Z_{α/2} is :
(a) 1.645 (b) 1.96 (c) 2.326 (d) 2.575

Prepared by: Darshan Jee


Lecturer: Govt. Khawaja Fareed Graduate College, R.Y. Khan
(03081845584)

Chapter No. 13

Testing of Hypothesis
Define the term hypothesis.
In general the “hypothesis” may be a statement, which may or may not be true about a phenomenon.
OR
Any assumption which may or may not be true is called hypothesis.
What is meant by statistical hypothesis?
Any assumption made about the population parameter which may or may not be true is known as
statistical hypothesis.
Examples:
On the basis of sample data, we can test the claim.
i) Whether the coin is unbiased or not.
ii) Whether a certain drug is effective or not.
What is testing of hypothesis?
Testing of hypothesis is a procedure of accepting or rejecting a statement about a population parameter on
the basis of the given sample information.
OR
Testing of hypothesis is a procedure which enables us to decide whether a specified statement about
population parameter is true or not, on the basis of sample information.
Explain Null Hypothesis.
Any assumption about the population parameter which is to be tested for the possible rejection under the
assumption that it is true is called null hypothesis. It is denoted by 𝐻0 .
For example, suppose the average height of college students is 62′′ . This statement is taken as hypothesis
𝐻𝑂 : 𝜇 = 62′′ .
Define Alternative Hypothesis.
Any other hypothesis which is different from null hypothesis is known as an alternative
hypothesis. It is denoted by 𝐻1 𝑜𝑟 𝐻𝐴 .
OR
Any hypothesis which is accepted when the null hypothesis has been rejected is called alternative
hypothesis. It is denoted by 𝐻1 𝑜𝑟 𝐻𝐴 .
For example, suppose the average height of college students is more than 62′′ . This statement is taken as
hypothesis 𝐻1 : 𝜇 > 62′′ .

Distinguish between Simple Hypothesis and Composite Hypothesis.
Simple Hypothesis:

A hypothesis which specifies all the parameters of the distribution is called simple hypothesis.

For example, the average height of girls in a college is 63′′ . 𝐻𝑂 : 𝜇 = 63′′

e.g. i) 𝑋~𝑁(20 , 36) ii) 𝑋~𝑏(5 , 0.3)

Composite Hypothesis

A hypothesis which does not specify all the parameters of the distribution is called composite
hypothesis. For example, the average height of girls in a college is at least 63′′ . 𝐻𝑂 : 𝜇 ≥ 63′′

e.g. i) 𝑋~𝑁(𝜇 , 36) ii) 𝑋~𝑏(𝑛 , 0.3)

Explain Level of Significance.

The probability of rejecting a true null hypothesis (H0) is called the level of significance. It is also called
the size of the critical region or the size of the test. It is denoted by α.

α = P(Type-I error) = P(Reject H0 when H0 is true)


OR
The probability of committing type – I error is called level of significance. It is denoted by 𝛼 (alpha).
e.g., if α = 5%, it means that there is a 5% chance of rejecting the null hypothesis when it is actually true.
In other words, when H0 is true we are 95% sure of making the correct decision.
What is meant by Test-Statistic?
The formula which provides the base whether to accept or reject any assumption about the
population parameter which may or may not be true is known as test-statistic. Commonly used
test – statistics are z, t, χ2 and F.
OR

A statistic on which the decision can be based whether to accept or reject a hypothesis is called
test - statistic. Commonly used test – statistics are z, t, χ2 and F.

What is meant by Critical Region?

The part of the sampling distribution of the test statistic which leads to the rejection of the null
hypothesis is known as the critical region or rejection region. The size (probability) of the rejection
region is denoted by α. In a diagram of the distribution, the shaded area is the critical region.

Discuss Acceptance Region and Rejection Region.

Acceptance Region
The part of the sampling distribution of the test statistic which leads to the acceptance of the null
hypothesis is known as the acceptance region. The size (probability) of the acceptance region is
denoted by 1 − α. In a diagram of the distribution, the non-shaded area is the acceptance region.
Rejection Region

The part of the sampling distribution of the test statistic which leads to the rejection of the null
hypothesis is known as the rejection region. The size (probability) of the rejection region is denoted
by α. In a diagram of the distribution, the shaded area is the rejection region. It is also called the
critical region.

Distinguish between one-tailed test and two-tailed test.

One -Tailed test


If the critical region is located in only one tail of the sampling distribution of the test statistic, the test is
called a one – tailed test or one - sided test.

a) It is right sided test: 𝐻𝑂 : 𝜇 ≤ 20

𝐻1 : 𝜇 > 20

b) It is left sided test: 𝐻𝑂 : 𝜇 ≥ 20

𝐻1 : 𝜇 < 20

Two – tailed test
If the critical region is located on both tails of the sampling distribution of test statistic, it is called two
sided or two tailed test.

It is both sided test: 𝐻𝑂 : 𝜇 = 32

𝐻1 : 𝜇 ≠ 32

Define right tailed test.


If the critical region is located in only the right tail of the sampling distribution of the test statistic, the test
is called a right-tailed test.

a) It is right sided test: 𝐻𝑂 : 𝜇 ≤ 20

𝐻1 : 𝜇 > 20

Define left tailed test.


If the critical region is located in only the left tail of the sampling distribution of the test statistic, the test
is called a left – tailed test.

a) It is left sided test: 𝐻𝑂 : 𝜇 ≥ 20

𝐻1 : 𝜇 < 20

Explain the concept of Type - I error and Type - II error.


Type - I error
Rejecting H0 when H0 is actually true is called a type - I error. The probability of a type - I error is called
the level of significance. It is denoted by α.
OR
If we reject a hypothesis which is in fact true, the error is called a type - I error.
For example, an innocent driver may be held by a traffic constable.
Type - II error
Accepting H0 when H0 is actually false is called a type - II error. The probability of a type - II error is
denoted by β.
For example, suppose a person is guilty, while the judge declares him innocent. In this situation, the judge
has committed a type - II error.

What is 𝜶 (Alpha)?

The probability of making a type - I error is called the level of significance. It is denoted by α.

α = P(Type-I error) = P(Reject H0 when H0 is true)

What is meant by degree of freedom?
Degree of freedom is the number of independent observations in a sample minus the number of population
parameters estimated from the sample. It is denoted by 𝜈(𝑛𝑢).

What is meant by test of significance? OR What is meant by level of confidence?


The probability of accepting a true null hypothesis is called the level of confidence. It is denoted by 1 − α.

1 − 𝛼 = 𝑃(Accept Ho when Ho is true)

Explain power of test?

The probability of rejecting the false null hypothesis is called the power of test. It is denoted by 1 − 𝛽.

1 − 𝛽 = 𝑃(Reject Ho when Ho is false)

When z-test is used for testing population mean?


The z-test is used for testing the population mean when the population standard deviation (σ) is known,
whether the sample size is large or small (n > 30 or n ≤ 30).
When t-test is used for testing population mean?
The t-test is used for testing population mean when population standard deviation (σ) is unknown and
sample size is small (n < 30).
What are the conditions or assumptions using z-test?
Following are the assumptions using z-test:
1) The population from which the sample is drawn is normally distributed.
2) The population standard deviation "𝜎" is known.
3) The samples are selected randomly and independently.
4) The sample size “n” is smaller or larger i.e., 𝑛 ≤ 30 𝑜𝑟 𝑛 > 30.
What are the conditions or assumptions using t-test?
Following are the assumptions using t-test:
1) The population from which the sample is drawn is normally distributed.
2) The population standard deviation "𝜎" is unknown.
3) The samples are selected randomly and independently.
4) The sample size “n” is smaller i.e., 𝑛 < 30 .

Q. Given 𝑯𝒐 : 𝝁 = 𝟏𝟐 , 𝑯𝟏 : 𝝁 > 12 , 𝑛 = 64 , 𝐱 = 𝟏𝟓 , 𝛔 = 𝟏𝟎. Calculate 𝒁𝒄 .
Solution:
Given data: H0 : µ = 12 , H1 : µ > 12 , n = 64 , x̄ = 15 , σ = 10 , Z_c = ?

Z_c = (x̄ − µ) / (σ/√n)

Z_c = (15 − 12) / (10/√64) = 3 / (10/8) = 3 / 1.25

Z_c = 2.4
Q. Given 𝐗 = 𝟏𝟎𝟎 , 𝛔𝐱 = 𝟏𝟔 , 𝛍𝐨 = 𝟗𝟎. Find 𝒁.
Solution:
Given data: X̄ = 100 , σ_x̄ = 16 , µ0 = 90 , Z = ?

Z = (X̄ − µ0) / σ_x̄

Z = (100 − 90) / 16 = 10 / 16 = 0.625
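The two z calculations above can be checked with a small Python sketch. The function names are hypothetical, and the second call assumes, as the solution does, that 16 is already the standard error of the mean.

```python
from math import sqrt


def standard_error(sigma, n):
    """Standard error of the mean: sigma / sqrt(n)."""
    return sigma / sqrt(n)


def z_statistic(xbar, mu0, se):
    """z = (sample mean - hypothesised mean) / standard error."""
    return (xbar - mu0) / se


# First example: n = 64, x-bar = 15, sigma = 10, mu = 12.
z_c = z_statistic(15, 12, standard_error(10, 64))

# Second example: the given 16 is used directly as the standard error.
z = z_statistic(100, 90, 16)

print("Z_c =", round(z_c, 3))
print("Z =", round(z, 3))
```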

Q. Find the value of t-statistic if 𝒏 = 𝟐𝟎 , 𝐗 = 𝟔 , 𝐬 = 𝟑, 𝛍 = 𝟒.


Solution:
Given data: X̄ = 6 , n = 20 , s = 3 , µ = 4 , t = ?

t = (X̄ − µ) / (s/√n)

t = (6 − 4) / (3/√20) = 2 / (3/4.47) = 2 / 0.67

t = 2.98

Q. Given 𝛍 = 𝟓 , 𝒏 = 𝟗 , 𝐱 = 𝟐 , 𝐬 = 𝟒. 𝟓. Calculate t-statistic.


Solution:
Given data: µ = 5 , n = 9 , x̄ = 2 , s = 4.5 , t = ?

t = (x̄ − µ) / (s/√n)

t = (2 − 5) / (4.5/√9) = −3 / (4.5/3)

t = −3 / 1.5

t = −2
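The t calculations follow the same pattern as the z calculations, with the sample S.D. s in place of σ. A minimal sketch, using a hypothetical function name, that also returns the degrees of freedom ν = n − 1:

```python
from math import sqrt


def t_statistic(xbar, mu0, s, n):
    """Return (t, degrees of freedom) for a one-sample t-test."""
    se = s / sqrt(n)  # estimated standard error of the mean
    return (xbar - mu0) / se, n - 1


# First example: n = 20, x-bar = 6, s = 3, mu = 4.
t1, df1 = t_statistic(xbar=6, mu0=4, s=3, n=20)

# Second example: n = 9, x-bar = 2, s = 4.5, mu = 5.
t2, df2 = t_statistic(xbar=2, mu0=5, s=4.5, n=9)

print("t =", round(t1, 2), "with", df1, "degrees of freedom")
print("t =", round(t2, 2), "with", df2, "degrees of freedom")
```

The computed t would then be compared with the tabulated t value for the chosen α and ν to reach a decision.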

(OBJECTIVE)

1) A hypothesis which is to be tested for possible rejection is:

(a) Simple (b) Composite (c) Null (d) Alternative

2) Which of the following can be 𝐻1

(a) 𝜃 > 𝜃𝑜 (b) 𝜃 < 𝜃𝑜 (c) 𝜽 ≠ 𝜽𝒐 (d) All of these

3) Which of the following is the simple hypothesis?

(a) 𝝁 = 𝟐𝟎 (b) 𝜇 ≠ 20 (c) 𝜇 > 20 (d) 𝜇 < 20

4) The probability of rejecting 𝐻𝑜 when 𝐻𝑜 is true is called:

(a) Type - I error (b) Type-II error (c) Non error (d) None of these

5) 1 – 𝛽 is called:

(a) Power of test (b) Confidence coefficient (c) Size of test (d) Level of significance

6) A judge can release a guilty person is an example of:

(a) Type - I error (b) Type - II error (c) Non error (d) Correct decision

7) The probability of rejecting false 𝐻𝑜 is called:

(a) Power of test (b) Confidence coefficient (c) Level of significance (d) None of these

8) The critical region is located on both sides, it is called:

(a) One tailed test (b) Two tailed test (c) One sided test (d) None of these

9) In a t-test the no. of degrees of freedom is:

(a) n - 1 (b) n - 2 (c) n (d) None of these

10) For 𝛼 = 5% the critical value of 𝑍0.05 is equal to:

(a) 1.96 (b) 1.645 (c) 2.33 (d) 2.58

11) The critical region is located on one side, it is called:

(a) One tailed test (b) Two tailed test (c) two sided test (d) None of these

12) The probability of type - II error is:

(a) α (b) β (c) 1 − α (d) 1 − β

13) The range of t – distribution is:

(a) 0 to ∞ (b) −∞ to 0 (c) −∞ to ∞ (d) -1 to +1

14) The probability of type - I error is:

(a) α (b) β (c) 1 − α (d) 1 − β

15) A null hypothesis is denoted by:

(a) H0 (b) H1 (c) Ha (d) Hb

16) An alternative hypothesis is denoted by:

(a) H0 (b) H1 (c) H (d) Hb

17) A hypothesis which completely specifies the population parameter is called:

(a) Simple Hypothesis (b) Composite Hypothesis (c) Null Hypothesis (d) Alternative Hypothesis

18) A hypothesis which does not completely specify the population parameter is called:

(a) Simple Hypothesis (b) Composite Hypothesis (c) Null Hypothesis (d) Alternative Hypothesis

19) The choice of one tailed test and two tailed test depends upon________ hypothesis :

(a) Simple Hypothesis (b) Composite Hypothesis (c) Null Hypothesis (d) Alternative Hypothesis

20) The level of significance is denoted by:

(a) 𝜶 (b) 1− 𝛼 (c) 𝛽 (d) 1 − 𝛽

Chapter No.14

SIMPLE REGRESSION AND CORRELATION

Discuss the term regression.


The dependence of one random variable upon one or more other (non-random) variables is called regression.
OR
The process in which we estimate one variable on the basis of another variable is called regression.
Note: The term regression was introduced by the English biometrician, Sir Francis Galton in 1885.
The term regression means “to step back or to regress”.
Define regression analysis.

In regression analysis, we obtain an equation which can be used to estimate the values of the dependent
variable on the basis of independent variable whose values are known.

OR

The technique used to develop the equation and provide the estimates is called regression analysis.

Explain the simple linear regression model.

The simple linear regression model is Y = α + βX, where X is the independent variable and Y is the dependent variable.
In this model, α is the y-intercept and β is the slope of the line, or regression coefficient.

Define independent variable or Regressor.

A variable whose values do not depend on any other variable is called independent variable or regressor.
OR

The variable that provides the basis of estimation or prediction is called as independent variable or
regressor.

Define Regressand or dependent variable.


A variable whose values depend on the values of another variable is called dependent variable or
regressand. OR
The variable which we want to estimate or predict on the basis of independent variable is called
dependent variable or regressand.

What is scatter diagram?
The graphical representation of the paired observations (xi , yi) is called a scatter diagram.
A scatter diagram helps us to find out the relationship between two variables. We plot the X values on the
X-axis and the Y values on the Y-axis, and mark each pair (Xi , Yi) as a point on graph paper.

Define slope or regression coefficient for the simple regression line.


What is regression coefficient?
Slope or regression coefficient is the rate of change in the dependent variable as per unit change in the
independent variable. It is denoted by β.
Give the properties of regression co-efficient.
The properties of regression co-efficient are
1) Regression Coefficients (𝑏𝑦𝑥 , 𝑏𝑥𝑦 ) always have the same sign.

2) If regression coefficient 𝑏𝑦𝑥 is greater than 1, then 𝑏𝑥𝑦 will be less than 1.
3) Regression coefficient is independent of change of origin but dependent on change of scale.
4) Two regression coefficients are not symmetric with respect to x and y. 𝑏𝑦𝑥 ≠ 𝑏𝑥𝑦

5) The correlation coefficient is the geometric mean of two regression coefficients. r = ± √𝑏𝑦𝑥 × 𝑏𝑥𝑦
Define intercept of straight line or regression line.
In a regression model, the average value of the dependent variable when the independent variable is zero
is called the intercept. In the simple linear regression model Y = a + bX + ei,

a = intercept, i.e., the average value of “Y” when “X = 0”.

What do you understand by simple linear regression?
In simple linear regression, the dependent variable “y” is expressed as a linear function of one
independent variable “x”.
OR
If the dependent variable depends on a single independent variable, the model is called a simple linear
regression model. The simple linear regression model is Y = a + bX + ei
Where Y = Dependent variable
X = Independent variable
a = Intercept, i.e., average value of “Y” when “X = 0”
b = Regression coefficient (coefficient of the independent variable), i.e., the slope of the regression line
ei = Random error
Enlist the properties of regression line.

1. The least squares regression line always passes through the point of means, i.e., (X̄ , Ȳ).

2. The sum of the observed values equals the sum of the estimated values, i.e., ∑Y = ∑Ŷ

3. The sum of the residuals is always equal to zero, i.e., ∑ei = ∑(Y − Ŷ) = 0

4. The means of the observed and estimated values are equal, i.e., ∑Y / n = ∑Ŷ / n

5. The sum of squared deviations of the observed values from the estimated values is minimum,
i.e., ∑(Y − Ŷ)² = Minimum

Define principle of least square for fitting a regression line.


OR
State the principle of least square.
The principle of least squares states that the sum of squared deviations of the observed values from the
estimated values should be least or minimum.

Define the term residual.


The difference between an observed value and its estimated value is called a residual or error,
i.e., ei = Yi − Ŷi. For a least squares line the residuals sum to zero: ∑ei = ∑(Y − Ŷ) = 0

Write the Normal equation of regression line Y on X.


The Normal equations of regression line Y on X are

𝑌 = 𝑎 + 𝑏𝑋
∑ 𝑌 = 𝑛𝑎 + 𝑏 ∑ 𝑋…………………. (i)

∑ 𝑋𝑌 = 𝑎 ∑ 𝑋 + 𝑏 ∑ 𝑋 2 ……………. (ii)
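Solving the two normal equations for a and b gives the least squares line. The Python sketch below does this for a small hypothetical data set; the function name fit_line and the data are assumptions made for illustration.

```python
def fit_line(xs, ys):
    """Solve the normal equations of Y on X and return (a, b)."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    # Slope from eliminating a between equations (i) and (ii).
    b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    # Intercept from equation (i): sum(Y) = n*a + b*sum(X).
    a = (sum_y - b * sum_x) / n
    return a, b


# Hypothetical paired observations.
xs = [1, 2, 3, 4, 5]
ys = [3, 5, 7, 9, 11]

a, b = fit_line(xs, ys)
fitted = [a + b * x for x in xs]
residual_sum = sum(y - y_hat for y, y_hat in zip(ys, fitted))

print("Intercept a =", a, " Slope b =", b)
print("Sum of residuals =", residual_sum)
```

The printed sum of residuals illustrates the property that the residuals of a least squares line add up to zero.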

Give the formulas of intercept and slope of the line in the equation?
The formulas of the slope and intercept of the line are

b_yx = [n∑XY − (∑X)(∑Y)] / [n∑X² − (∑X)²]
or b_yx = ∑(X − X̄)(Y − Ȳ) / ∑(X − X̄)²
or b_yx = (∑XY − nX̄Ȳ) / (∑X² − nX̄²)

a_yx = [(∑X²)(∑Y) − (∑X)(∑XY)] / [n∑X² − (∑X)²]
or a_yx = Ȳ − b_yx X̄
or a_yx = [∑Y − b_yx(∑X)] / n

Explain the term correlation.


A measure of the strength or closeness of the linear relationship between two variables is called simple
correlation.
Examples
i) The height and weight of children are correlated with age.
ii) The supply and demand of goods are correlated with price.
Give two examples of correlation.
Examples of correlation are height and weight of children, ages of husband and ages of wives at the time
of their marriage, temperature and length of copper wire.
Define positive correlation.

The correlation is said to be positive if the two random variables tend to move in the same direction i.e.,
increase or decrease simultaneously.

Example: The length of iron bar increases as temperature increases.


Elaborate the term negative correlation.

The correlation is said to be negative if the two random variables tend to move in opposite direction i.e.,
one random variable increases as the other random variable decreases.

Example: The demand of items increases as price of items decreases.

Describe perfect positive correlation.


The correlation is said to be perfect positive if the relationship between the two variables is perfectly
linear with positive slope, i.e., r = +1.
Define perfect negative correlation.
The correlation is said to be perfect negative if the relationship between the two variables is perfectly
linear with negative slope, i.e., r = −1.

Define the term correlation coefficient.
A numerical value which measures the degree of strength of the linear relationship between any two
variables is called the correlation coefficient. It is denoted by "r". It is given as
r_XY = r_YX = [n∑XY − (∑X)(∑Y)] / √{[n∑X² − (∑X)²] [n∑Y² − (∑Y)²]}
or
r_XY = r_YX = ∑(X − X̄)(Y − Ȳ) / √[∑(X − X̄)² × ∑(Y − Ȳ)²]
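The first form of the formula translates directly into code. The following Python sketch uses a hypothetical function name and hypothetical data.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Correlation coefficient from the raw-sums formula."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator


# Hypothetical paired observations.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

r = pearson_r(xs, ys)
print("r =", round(r, 4))
```

Swapping the two lists gives the same value, which reflects the symmetry r_XY = r_YX.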

What is the range of correlation coefficient?

The range of the correlation coefficient is −1 ≤ r ≤ +1 .
If r = +1 , r = −1 , r = 0, what does it show?
r = +1 shows that there is perfect positive correlation.
r = −1 shows that there is perfect negative correlation.
r = 0 shows that there is no linear correlation, i.e., the variables are uncorrelated.
State any five properties of correlation coefficient.
The properties of the correlation coefficient are as follows:
i). The correlation coefficient is symmetric with respect to x and y, i.e., r_xy = r_yx
ii). The correlation coefficient lies between −1 and +1 inclusive, i.e., −1 ≤ r ≤ +1
iii). The correlation coefficient is the G.M. of the two regression coefficients, i.e., r = ± √(b_xy × b_yx)
iv). The regression coefficients and the correlation coefficient have the same sign.
v). The correlation coefficient is independent of change of origin and scale, i.e., r_xy = r_uv

What do you understand by zero – correlation?

If a change in one variable does not affect the other variable, then there is no correlation, or zero
correlation.

What is the relationship between regression co – efficient and correlation coefficient?

The correlation coefficient is the geometric mean of the two regression coefficients b_yx and b_xy, i.e.,
r = ± √(b_xy × b_yx), where r takes the common sign of the two regression coefficients.

Write any three formulas of correlation coefficient.

r = Sxy / (Sx · Sy) or r = Sxy / √(Sx² · Sy²) or r = ± √(bxy × byx)

Q. Given 𝐫𝐱𝐲 = 𝟎. 𝟗𝟕, explain or interpret it.

There is a high positive correlation between the variables.

Q. Given 𝐛𝐱𝐲 = 𝟎. 𝟖𝟐 , 𝐫𝐱𝐲 = 𝟎. 𝟗𝟕. Find 𝐛𝐲𝐱 =?

Solution:

Given data: bxy = 0.82 , rxy = 0.97 , byx =?

rxy = ± √(bxy × byx)

Squaring both sides:

r²xy = byx × bxy

(0.97)² = byx (0.82)

0.9409 = byx (0.82)

byx = 0.9409 / 0.82

byx = 1.15

Q. If 𝐛𝐱𝐲 = 𝟎. 𝟐𝟕 , 𝐛𝐲𝐱 = 𝟎. 𝟔𝟎 Find 𝐫𝐲𝐱 =?

Solution:

Given data: bxy = 0.27 , byx = 0.60 , rxy =?

rxy = ± √(bxy × byx)

rxy = √(0.27 × 0.60) = √0.162 = 0.402

Q. Given 𝐒𝐱 = 𝟒 , 𝐫𝐱𝐲 = 𝟎. 𝟖 , 𝐒𝐱𝐲 = 𝟐𝟎. Find 𝐒𝐲 =?

Solution:

Given data: Sx = 4 , rxy = 0.8 , Sxy = 20 , Sy =?

rxy = Sxy / (Sx · Sy)

0.8 = 20 / (4 · Sy)

Sy = 20 / (4 × 0.8) = 6.25
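The relations used in these solutions, r² = byx × bxy and r = Sxy / (Sx · Sy), can be rearranged in code as a quick check. The helper names below are hypothetical; the inputs are the figures given in the questions above.

```python
from math import sqrt


def r_from_regression_coefficients(byx, bxy):
    """Geometric mean of the two coefficients, carrying their common sign."""
    if byx * bxy < 0:
        raise ValueError("regression coefficients must have the same sign")
    magnitude = sqrt(byx * bxy)
    return -magnitude if byx < 0 else magnitude


def other_regression_coefficient(r, known_b):
    """From r^2 = byx * bxy, recover the unknown coefficient."""
    return r ** 2 / known_b


def sd_from_covariance(sxy, r, sx):
    """From r = Sxy / (Sx * Sy), recover Sy."""
    return sxy / (r * sx)


byx = other_regression_coefficient(r=0.97, known_b=0.82)
r = r_from_regression_coefficients(byx=0.60, bxy=0.27)
sy = sd_from_covariance(sxy=20, r=0.8, sx=4)

print("byx =", round(byx, 2))
print("r =", round(r, 3))
print("Sy =", round(sy, 2))
```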

Q. Given 𝐛𝐲𝐱 = 𝟎. 𝟖𝟔 , 𝐛𝐱𝐲 = 𝟎. 𝟗𝟓 Find 𝐫𝐲𝐱 =?

Solution:

Given data: byx = 0.86 , bxy = 0.95 , rxy =?

rxy = ± √(bxy × byx)

rxy = √(0.86 × 0.95) = √0.817

rxy = 0.9039

There is a positive correlation between the variables.

Q. Given 𝐫𝐱𝐲 = 𝟎. 𝟖 , 𝐛𝐱𝐲 = 𝟎. 𝟒𝟓 find 𝐛𝐲𝐱 .

Solution:

Given data: rxy = 0.8 , bxy = 0.45 , byx =?

rxy = ± √(byx × bxy)

Squaring both sides:

r²xy = byx × bxy

(0.8)² = byx (0.45)

0.64 = byx (0.45)

byx = 0.64 / 0.45 = 1.42

Q. Given 𝐛𝐲𝐱 = 𝟎. 𝟑𝟖 , 𝐛𝐱𝐲 = 𝟎. 𝟔𝟕 Find 𝒓𝐱𝐲 =?


Solution:
Given data: byx = 0.38 , bxy = 0.67 , rxy =?

rxy = ± √(bxy × byx)

rxy = √(0.38 × 0.67) = √0.2546

rxy = 0.5046

There is a positive correlation between the variables.

Q. Given X̄ = 1, Ȳ = 8 and b = 2, find the value of intercept a.
Solution:

a = Ȳ − bX̄

a = 8 − 2(1)
a = 6
(OBJECTIVE)
1. The Variable whose value is Predicted is called:
a) Independent variable b) random variable c) regressand d) regressor
2. Simple linear regression model contains:
a) One variable b) Two variables c) Three variable d) More than three
3. Independent Variable is also called:
a) Regressor b) Regressand c) Predectand d) None of these
4. The Regression line always passes through the points:
a) (x̄ , ȳ) b) x̄ c) ȳ d) (X , Y)
5. In Regression Analysis, byx and bxy always have:
a) Same Signs b) Opposite Signs c) No Signs d) None of these
6. The Dependent Variable is also called :
a) Regressand b) Predictand c) Explained d) All of these
7. In the Regression equation : 𝑦̂ = 𝑎 + 𝑏𝑥, the constant ‘a’ is called :
a) X – Intercept b) Y – Intercept c) Dependent d) Error
8. The Regression Coefficient is independent of:
a) Origin b) Scale c) Origin and Scale d) All of these
9. In the Regression equation model, Y = α + βX + ϵ , the β represents,
a) Intercept b) Dependent variable c) Random error d) Slope
10. The dependence of one variable upon other variable is called:
a) Association b) Correlation c) Regression d) Covariance
11. If both Regression Coefficients are Negative, then the Correlation Co – efficient will be:
a) Negative b) Positive c) Zero d) 1
12. The value of Correlation Coefficient always lies between :
a) 0 to 1 b) -1 to 0 c) -1 to 1 d) −∞ 𝑡𝑜 + ∞
13. Strength of Relationship between two variables is called :
a) Association b) Correlation c) Regression d) Covariance
14. Perfect Positive Correlation is signified by :
a) -1 b) +1 c) 0 d) ± 1

15. If the both variables move in the opposite direction then the correlation will be :
a) Negative b) Positive c) Zero d) None of these
16. If regression coefficients 𝑏𝑦𝑥 = 1.6 and 𝑏𝑥𝑦 = 0.4, then 𝑟𝑥𝑦 will be:
a) 0.4 b) 0.64 c) 0.8 d) - 0.8
17. When both variables move in same direction then correlation is :
a) Zero b) Positive c) Negative d) None of these
18. The Correlation Coefficient is independent of:
a) Origin b) Scale c) Origin and Scale d) None of these
19. The Range of Rank Correlation Coefficient 𝒓𝒔 always lies between:
a) −∞ 𝑡𝑜 + ∞ b) -1 to 0 c) -1 to 1 d) 0 to 1
20. The correlation coefficient is the _____ of two regression coefficients.
a) Arithmetic mean b) Geometric mean c) Harmonic mean d) Median
21. The term regression was used by :
a) Newton b) Pearson c) Galton d) Spearman
22. In regression, ∑ 𝑦̂ is equal to :
a) 0 b) ∑ 𝒚 c) a d) bx
23. If ŷ = 2 + 0.6x, the value of slope is :
a) 2 b) 0.6 c) 0 d) 0.3
24. If 𝑦̂ = 2 + 0.6𝑥, the value of y – intercept is :
a) 2 b) 0.6 c) 0 d) 0.3
25. When two variables are uncorrelated, the value of “r” is:

a) Negative b) Remained unchanged c) Zero d) Positive

Chapter No. 15

ASSOCIATION OF ATTRIBUTES
What is association?
The strength of relationship between two attributes is called association. If two attributes A and B are not
independent, they are said to be associated.
Define attribute.
A characteristic which varies only in quality from one individual to other is called an attribute. For
example, eye colour, religion, beauty, gender etc.
Differentiate between variable and attribute.
Variable: A characteristic which varies only in quantity from one individual to another is called variable.
For example: age, height, weight and temperature.
Attribute: A characteristic which varies only in quality from one individual to other is called an attribute.
For example: eye colour, religion, beauty, gender etc.
What is class frequency?
A class frequency is the number of objects which are distributed in a class. Class frequency is denoted by
enclosing the class symbols in the brackets. i.e., (A)
What is ultimate class frequency?
The frequency of classes of highest order is called ultimate class frequency.
negative classes.
Discuss negative and positive attributes.
Positive attributes: Positive attributes denoted by capital letters A, B, C,.... represent the
presence of attributes.
Negative attributes: Negative attributes denoted by Greek letters α, β, 𝛾, .. represent the
absence of attributes.
What is meant by order of class?
The number of attributes specifying a class is known as the order of the class. For example, a class specified
by one attribute is a class of order 1.
What is consistency of data?
In contingency table if no ultimate class frequency is negative, then the data is said to be consistent.
Define the term of dichotomy.
The process of dividing the objects into two distinct mutually exclusive and complementary classes is
called dichotomy.
Explain independence of attributes.
Two attributes A and B are said to be independent if there is no relationship between them. Two attributes
A and B are independent if the observed frequency (AB) is equal to the expected frequency, i.e.,
(AB) = (A)(B) / n
Express the term negatively associated.
Two attributes A and B are said to be negatively associated, or simply disassociated, if the observed
frequency (AB) is less than the expected frequency, i.e.,
(AB) < (A)(B) / n

Define the term positive association.
Two attributes A and B are said to be positively associated if the observed frequency (AB) is greater than
the expected frequency, i.e.,
(AB) > (A)(B) / n
Explain Coefficient of Association.
The numerical measure of association between two attributes A and B is known as the coefficient of
association. Yule’s coefficient of association, denoted by Q, is given as
Q = [(AB)(αβ) − (Aβ)(αB)] / [(AB)(αβ) + (Aβ)(αB)]
Interpreting the meaning of coefficient of association Q when Q = 0, Q = +1, Q = -1.
The strength of association between two attributes A and B is known as coefficient of association.
If Q = 0, the two attributes are independent.
If Q = +1, the two attributes are completely associated.
If Q = -1, the two attributes are completely disassociated.
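Yule's Q and the comparison of (AB) with (A)(B)/n can both be sketched in a few lines of Python. The function names are hypothetical, and the four class frequencies are illustrative values for a 2 × 2 classification.

```python
def yules_q(ab, alpha_beta, a_beta, alpha_b):
    """Yule's coefficient of association from the four ultimate class frequencies."""
    numerator = ab * alpha_beta - a_beta * alpha_b
    denominator = ab * alpha_beta + a_beta * alpha_b
    return numerator / denominator


def association_type(ab, a_total, b_total, n):
    """Compare the observed (AB) with the expected (A)(B)/n."""
    expected = a_total * b_total / n
    if ab > expected:
        return "positive"
    if ab < expected:
        return "negative"
    return "independent"


# Illustrative frequencies: (AB), (alpha beta), (A beta), (alpha B).
ab, alpha_beta, a_beta, alpha_b = 110, 510, 290, 90

a_total = ab + a_beta                       # (A) = (AB) + (A beta)
b_total = ab + alpha_b                      # (B) = (AB) + (alpha B)
n = ab + alpha_beta + a_beta + alpha_b      # total number of observations

q = yules_q(ab, alpha_beta, a_beta, alpha_b)
kind = association_type(ab, a_total, b_total, n)

print("Q =", round(q, 3))
print("Association:", kind)
```

The sign of Q and the comparison with the expected frequency should always agree on the direction of the association.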
Define contingency table.
A table consisting of “r” rows and “c” columns in which the data are classified according to two
attributes is called an 𝑟 × 𝑐 contingency table.
Describe the Chi-square (χ²) statistic.
A numerical quantity that compares the observed and expected frequencies is known as the
Chi-square (χ²) statistic. This test statistic is used to determine whether the difference between the
observed and expected frequencies is statistically significant.
$$\chi^2 = \sum \left[ \frac{(f_o - f_e)^2}{f_e} \right] \sim \chi^2_{\nu}$$
Degrees of freedom: $\nu = (r - 1)(c - 1)$

What is the Chi-square (χ²) distribution?
The Chi-square (χ²) distribution is a positively skewed distribution ranging from 0 to ∞.
The shape of a chi-square distribution depends on its degrees of freedom ν. The Chi-square distribution is
useful for testing independence between two attributes.
Write the general procedure for test of independence between the attributes.
i) Setup null and alternative hypotheses.
𝐻𝑜 : The attributes are independent.
𝐻1 : The attributes are dependent.
ii) Level of significance 𝛼 is chosen.
iii) Test statistic:
$$\chi^2 = \sum \left[ \frac{(f_o - f_e)^2}{f_e} \right] \sim \chi^2_{\nu}, \quad \nu = (r - 1)(c - 1)$$
iv) Computation of the test statistic.
v) Critical region: $\chi^2 \geq \chi^2_{\alpha,\,(r-1)(c-1)}$
vi) Conclusion: Accept $H_0$ if the calculated value of $\chi^2$ falls in the acceptance region; otherwise reject $H_0$.
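The procedure above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `chi_square_independence` is hypothetical, the 2 × 2 table reuses the class frequencies 110, 290, 90 and 510 from the association example in this chapter, and the critical value 3.841 is the usual chi-square table value for α = 0.05 with 1 degree of freedom.

```python
def chi_square_independence(table):
    """Return (chi-square statistic, degrees of freedom) for an r x c table
    of observed frequencies given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi_sq = 0.0
    for i, row in enumerate(table):
        for j, f_o in enumerate(row):
            # Expected frequency under H0 (independence).
            f_e = row_totals[i] * col_totals[j] / n
            chi_sq += (f_o - f_e) ** 2 / f_e
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi_sq, df


# Illustrative table: rows are A and alpha, columns are B and beta.
observed = [[110, 290],
            [90, 510]]
chi_sq, df = chi_square_independence(observed)

critical = 3.841  # table value for alpha = 0.05 and df = 1 (assumed level)
decision = "reject H0" if chi_sq >= critical else "accept H0"
print(chi_sq, df, decision)
```

For this table the expected frequencies are 80, 320, 120 and 480, so the statistic is 23.4375 with 1 degree of freedom, which falls in the critical region at the 5% level.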

Define degrees of freedom.
The number of independent values which can be assigned to a statistical distribution is called the degrees of
freedom. It is commonly denoted by “𝜈” or “n”.
Q. For given data, (AB) = 110, (αB) = 90, (Aβ) = 290, (αβ) = 510. Discuss the association.
Yule’s coefficient of association, denoted by Q, is given by
$$Q = \frac{(AB)(\alpha\beta) - (A\beta)(\alpha B)}{(AB)(\alpha\beta) + (A\beta)(\alpha B)}$$
$$Q = \frac{(110)(510) - (290)(90)}{(110)(510) + (290)(90)} = \frac{56100 - 26100}{56100 + 26100}$$
$$Q = \frac{30000}{82200}$$
$$Q \approx 0.36 \quad \text{(positive association)}$$
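The calculation above can be checked with a short Python sketch. The function name `yules_q` is an illustrative assumption; the four class frequencies are the ones given in the question.

```python
def yules_q(ab, a_beta, alpha_b, alpha_beta):
    """Yule's coefficient of association for a 2 x 2 classification."""
    numerator = ab * alpha_beta - a_beta * alpha_b
    denominator = ab * alpha_beta + a_beta * alpha_b
    return numerator / denominator


# (AB) = 110, (A beta) = 290, (alpha B) = 90, (alpha beta) = 510
q = yules_q(110, 290, 90, 510)
print(round(q, 3))
```

Since the result is greater than zero, the attributes are positively associated, in agreement with the hand calculation.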


Explain the term rank correlation.
The rank correlation describes the relationship between the two sets of rankings that is, between the
rankings of the one variable and the rankings of the other variable.
What is Spearman’s rank correlation co-efficient?
A method of measuring and testing the degree of association between the two variables measured at the
ordinal level is called Spearman’s rank correlation co-efficient. It is denoted by "𝑟𝑠 ".
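$$r_s = 1 - \frac{6 \sum d_i^2}{n(n^2 - 1)}, \quad -1 \le r_s \le +1$$
where $d_i$ is the difference between the two ranks of the i-th pair and n is the number of pairs.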
Enlist the two properties of Spearman’s rank correlation co-efficient.
i) The value of $r_s$ always lies between −1 and +1.
ii) $r_s = 0$ when the ranks are not correlated.
Write the formula of Spearman’s rank correlation co-efficient $r_s$.
The formula of Spearman’s rank correlation co-efficient $r_s$ is
$$r_s = 1 - \frac{6 \sum d_i^2}{n(n^2 - 1)}, \quad -1 \le r_s \le +1$$
Q. If $\sum d_i^2 = 10$ and $n = 5$, calculate Spearman’s rank correlation co-efficient.
Solution:
$\sum d_i^2 = 10$ and $n = 5$, $r_s = ?$
$$r_s = 1 - \frac{6 \sum d_i^2}{n(n^2 - 1)}$$
$$r_s = 1 - \frac{6(10)}{5(5^2 - 1)}$$
$$r_s = 1 - \frac{60}{5(24)}$$
$$r_s = 1 - 0.5 = 0.5$$
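The same formula can be sketched in Python, either starting from a known Σd² or from two lists of ranks. The function names and the rank lists below are illustrative assumptions; the ranks were chosen so that Σd² = 10 with n = 5, matching the question above.

```python
def spearman_from_d2(sum_d2, n):
    """Spearman's rank correlation from the sum of squared rank differences."""
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))


def spearman_from_ranks(rank_x, rank_y):
    """Spearman's rank correlation from two equal-length lists of ranks
    (assumes no tied ranks)."""
    sum_d2 = sum((rx - ry) ** 2 for rx, ry in zip(rank_x, rank_y))
    return spearman_from_d2(sum_d2, len(rank_x))


print(spearman_from_d2(10, 5))

# Hypothetical ranks with d = -2, 0, 2, -1, 1, so the sum of d squared is 10.
print(spearman_from_ranks([1, 2, 3, 4, 5], [3, 2, 1, 5, 4]))
```

Both calls give 0.5, the value obtained in the solution.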

(OBJECTIVE)
1. A characteristic which varies in quality from one individual to another is:
(a) Statistic (b) Attribute (c) Variable (d) Regression
2. Qualitative variable is also called:
(a) Frequency (b) Attribute (c) Variable (d) Classes
3. Eye colour of a man is an example of:
(a) Frequency (b) Attribute (c) Variable (d) Classes
4. The process of dividing the objects into two distinct and mutually exclusive classes is called:
(a) Bichotomy (b) Trihotomy (c) Dichotomy (d) Classes
5. If any ultimate class frequency is negative, the data will be:
(a) Consistent (b) Inconsistent (c) Correlated (d) Independent
6. The strength or degree of linear relationship between two attributes is called:
(a) Correlation (b) Regression (c) Association (d) None of these
7. The value of chi-square statistic is always
(a) One (b) 0 (c) Negative (d) Positive
8. The range of Chi - Square distribution is:
(a) 0 to ∞ (b) −∞ to 0 (c) −∞ to ∞ (d) -1 to +1
9. The shape of the chi-square distribution is:
(a) Positively skewed (b) Negatively skewed (c) Symmetrical (d) None of these
10. For an (r × c) contingency table, the degrees of freedom should be:
(a) rc (b) (r – 1)( c – 1) (c) rc – 1 (d) (r – 1) c
11. For 2 × 2 contingency table, the degree of freedom should be:
(a) 1 (b) 4 (c) 3 (d) 6
12. If $6\sum d^2 = n(n^2 - 1)$, the rank correlation coefficient will be:
(a) 1 (b) – 1 (c) 0 (d) 0 to 1
13. The two attributes are said to be positively associated if:
(a) $\frac{(A)(B)}{n} = (AB)$ (b) $\frac{(A)(B)}{n} > (AB)$ (c) $\frac{(A)(B)}{n} < (AB)$ (d) None of these
14. The two attributes are said to be negatively associated if:
(a) $\frac{(A)(B)}{n} = (AB)$ (b) $\frac{(A)(B)}{n} > (AB)$ (c) $\frac{(A)(B)}{n} < (AB)$ (d) None of these
15. The two attributes are said to be independent if:
(a) $\frac{(A)(B)}{n} = (AB)$ (b) $\frac{(A)(B)}{n} > (AB)$ (c) $\frac{(A)(B)}{n} < (AB)$ (d) None of these
16. If $\frac{6\sum d^2}{n(n^2 - 1)} = 0$, then the rank correlation coefficient will be:
(a) 1 (b) – 1 (c) 0 (d) 0 to 1
17. The range of Rank Correlation Coefficient 𝑟𝑠 is:
(a) 0 to ∞ (b) 0 to 1 (c) −∞ to ∞ (d) -1 to +1
18. The value of Chi-square ($\chi^2$) cannot be:
(a) Negative (b) Zero (c) Positive (d) None of these
19. The Coefficient of Association 𝑄 always lies between :
(a) 0 and ∞ (b) 0 and 1 (c) −∞ and ∞ (d) -1 and +1
20. If 𝑄 = 0, then attributes are:
(a) Associated (b) Disassociated (c) Independent (d) Positively associated

Chapter No.16

Analysis of Time Series


Describe the term time series.
The arrangement of statistical data by successive time periods is called time series.
OR
A time series is a sequence of observations on a variable, that are arranged with respect to time.
Examples:
i) The hourly temperature recorded in a city.
ii) The monthly sale of motor cycles in Rahim Yar Khan.
What do you mean by analysis of time series?
OR
What do you mean by decomposition of a time series?
The analysis of time series is the decomposition of a time series into its different components for their
separate study.
OR
The analysis of time series consists of describing and measuring the different components in the time series.
Give any two examples of time series data.
1) The hourly temperature recorded by weather bureau of statistics.
2) Total monthly sales of a book shop.
3) Annual rainfall recorded at Lahore.
What is the purpose of analysis of time series?
The study of time series is mainly required for prediction, estimation and forecasting.
Write a short note on Signal in time series.
The systematic component of time series which follows regular patterns of variations is called signal.
Write a short note on Noise in time series.
The unsystematic component of time series which follows irregular patterns of variations is called noise.
Elaborate the historigram.
The graph of a time series, which shows the changes that occurred over different time periods, is called a historigram.
It is constructed by taking time period along x-axis and observed values along y-axis.
Define histogram.
The graph of frequency distribution is called histogram. It is constructed by taking class boundaries along
x-axis and frequency along y-axis.

What are the components of time series?
The factors that are responsible for bringing about changes in a time series are called the components of
a time series. They are as follows:
(i) Secular Trend (T) (ii) Seasonal Variations (S)
(iii) Cyclical Movements (C) (iv) Irregular Movements (I)
Explain the secular trend.
A secular trend is a long-term movement that indicates the general direction of the variation in a time
series.
Example:
Increasing demand for food due to an increase in population.
Give two examples of secular trend.
The examples of secular trend are:
(i) A decline in death rate due to advancement in science.
(ii) Increasing demand for food due to an increase in population.
(iii) Continually increasing demand for smaller automobiles in a country.
Explain the seasonal variation.
Seasonal variations are short term movements that indicate the identical changes in a time series
during the corresponding seasons.
Example:
The prices of soft drinks rise in summer and fall in winter.
Give two examples of seasonal variations.
1) Increased demand of ice-cream in the summer.
2) Increase in the sales of clothes and shoes near Eid.
3) Decrease in the sales of coolers in the winter.
Explain the cyclical variations.
Cyclical movements are the long-term fluctuations about the trend line, which recur over
periods of more than one year.
Give an example of cyclical variation.
An example of cyclical variation is the business cycle, which consists of four phases.
What is business cycle?
A business cycle describes the expansion and contractions of economic activity in an economy over a
period of time.
What are the four phases of business cycle?
A business cycle has the four phases:
(i) Prosperity (Boom)
(ii) Recession (Contraction)
(iii) Depression (Trough)
(iv) Recovery (Expansion)
Explain irregular or random variations.
Irregular or random variations are unsystematic in nature and they occur in a completely
unpredictable manner by chance events, such as wars, floods, earthquakes, strikes, etc.
They are also called erratic, residual or accidental variations.
Examples:
Floods, strikes, earthquakes, wars, etc.
Write four examples of the irregular or random variations.
Examples of irregular variations are:
i) Delay of production due to fire in factory.
ii) Delay of production due to strike of workers.
iii) Rise in prices of food due to flood, etc.
Define model of time series.
The mathematical relationship between the four components of a time series is called the model of the time series.
There are two types of time series models.
Y=T×S×C×I Multiplicative Model
Y=T+S+C+I Additive Model
Explain the two models of the time series.
There are two types of time series models.
Y=T×S×C×I Multiplicative Model
Y=T+S+C+I Additive Model
What is multiplicative model of time series?
In the multiplicative model, it is assumed that the value Y of a composite series is the product of the four
components T, S, C, I.
Symbolically: Y = T × S × C × I
What is additive model of time series?
In the additive model, it is assumed that the value Y of a composite series is the sum of the four components
T, S, C, I.
Symbolically: Y = T + S + C + I
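The two models can be illustrated with a small Python sketch. The function names and all component values are hypothetical. One point worth noting is that in the multiplicative model S, C and I are usually expressed as ratios around 1, whereas in the additive model every component is in the same units as Y.

```python
def multiplicative_model(t, s, c, i):
    """Y = T x S x C x I (S, C, I given as ratios, e.g. 1.10 means +10%)."""
    return t * s * c * i


def additive_model(t, s, c, i):
    """Y = T + S + C + I (all components in the units of Y)."""
    return t + s + c + i


# Hypothetical components for one time period.
y_mult = multiplicative_model(200, 1.10, 0.95, 1.02)
y_add = additive_model(200, 15, -8, 3)
print(round(y_mult, 2), y_add)
```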
Give names of different methods of measuring secular trend.
Following are the four methods used to measure secular trend.
i) The method of freehand curve.
ii) The method of semi-averages.
iii) The method of moving averages.
iv) The method of least squares.
Explain free hand curve method.
In free hand curve method, time series is plotted on a graph and plotted points are linked with the help
of a free hand curve.

Write the merits of the free hand curve.
i) It is the simplest method for measuring secular trend.
ii) It saves much mathematical calculations.
iii) The trend line or curve smoothes out the seasonal variations.
Write the demerits of the free hand curve.
i) It depends too much on personal judgment.
ii) It is a rough and crude method.
iii) It requires too much practice to get a good fit.
Explain semi average method.
In the semi-average method, the observed time series is divided into two equal or approximately equal
parts. The average is computed for each part and placed against the centre of that part. These averages are then
used to fit a linear trend.
The mathematical equation for the semi-average trend line is:
y = a + bx, where
$$b = \frac{\bar{y}_2 - \bar{y}_1}{\bar{x}_2 - \bar{x}_1}$$
$$a = \bar{y} - b\bar{x}$$
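A minimal Python sketch of the semi-average method is given below. It rests on several assumptions that are not in the notes: the function name `semi_average_trend` is hypothetical, the data are made up, the middle value is left out when the number of observations is odd (a common convention), and the intercept is taken from the first semi-average point, which lies on the trend line.

```python
def semi_average_trend(x, y):
    """Fit y = a + b*x through the two semi-average points.

    With an odd number of observations the middle one is left out,
    so that both parts contain the same number of values.
    """
    n = len(y)
    half = n // 2
    x_first, y_first = x[:half], y[:half]
    x_second, y_second = x[n - half:], y[n - half:]

    x_bar1, y_bar1 = sum(x_first) / half, sum(y_first) / half
    x_bar2, y_bar2 = sum(x_second) / half, sum(y_second) / half

    b = (y_bar2 - y_bar1) / (x_bar2 - x_bar1)  # slope
    a = y_bar1 - b * x_bar1                    # intercept
    return a, b


# Hypothetical series for six consecutive years coded 1..6.
years = [1, 2, 3, 4, 5, 6]
values = [10, 12, 14, 19, 21, 23]
a, b = semi_average_trend(years, values)
trend = [a + b * x for x in years]
print(a, b, trend)
```

Here the semi-averages are 12 (centred at year 2) and 21 (centred at year 5), so the slope is 3 and the intercept is 6.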

Write the merits of the semi average method.


i) This method is very easy.
ii) It smoothes out seasonal variation.
iii) It gives an objective result.
Write the demerits of the semi average method.

i) The arithmetic mean used in semi average method is greatly affected by extreme values.
ii) This method is not applicable if the trend is not linear.
Define the method of moving average.
In the moving average method, we find simple averages successively by taking a specific number of
values at a time. The process is continued until all the values of the series are exhausted.
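The idea can be sketched in Python as follows. The function name `moving_average` and the series are illustrative assumptions. For an odd period k, each average is placed against the middle period of its group, so the result has k − 1 fewer values than the original series.

```python
def moving_average(values, k):
    """Return the k-period simple moving averages of a series."""
    return [sum(values[i:i + k]) / k for i in range(len(values) - k + 1)]


# Hypothetical series of six observations with a 3-period moving average.
series = [2, 4, 6, 8, 10, 12]
averages = moving_average(series, 3)
print(averages)  # prints [4.0, 6.0, 8.0, 10.0]
```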
Write the merits of the moving average method.
i) The method is easy and simple.
ii) It is used to eliminate cyclical and seasonal movements.
Write the demerits of the moving average method.
i) It does not give the trend values at the beginning and at the end.
ii) The moving averages are highly affected by extreme values.
Explain the principle of least square.
The principle of least squares states that the sum of squared deviations of the observed values from the
estimated values should be least or minimum.

Write down the properties of least square line.
The properties of least square line are:
i) The least squares line always passes through the point of means, i.e., $(\bar{X}, \bar{Y})$.
ii) The sum of the observed values equals the sum of the estimated values, i.e., ΣY = ΣŶ
iii) The sum of the residuals is always equal to zero, i.e., Σe = Σ(Y – Ŷ) = 0
iv) The mean of the observed values equals the mean of the estimated values, i.e., $\bar{Y} = \bar{\hat{Y}}$
Write the merits of the least square method.
i) The least square estimates are unbiased.
ii) It is easy to calculate and interpret.
iii) This method gives most satisfactory measurement of secular trend.
Write the demerits of the least square method.
i) The method is not applicable for non-linear types of curves.
ii) This method gives too much weight to extremely large deviation from the trend.
Define residual.
The difference between observed and estimated value is called residual or error. i.e., e = (Y - Ŷ)
Write down the normal equations of a straight line.
Equation of straight line:
Y = a + bx
Normal equations of straight line:
Σy = na + bΣx …………….. (i)
Σxy = aΣx + bΣx² …………….. (ii)
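Solving the two normal equations for a and b gives the least squares trend line. The Python sketch below does this directly from the sums. The function name `fit_line` and the data are hypothetical, and the time variable is coded as −2, …, 2 so that Σx = 0.

```python
def fit_line(x, y):
    """Solve the normal equations
         sum(y)  = n*a      + b*sum(x)
         sum(xy) = a*sum(x) + b*sum(x^2)
    for the intercept a and slope b."""
    n = len(x)
    sx, sy = sum(x), sum(y)
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    sxx = sum(xi * xi for xi in x)
    b = (n * sxy - sx * sy) / (n * sxx - sx ** 2)
    a = (sy - b * sx) / n
    return a, b


# Hypothetical series for five years, with coded time x = year - middle year.
x = [-2, -1, 0, 1, 2]
y = [10, 12, 15, 17, 21]
a, b = fit_line(x, y)
trend = [a + b * xi for xi in x]
residuals = [yi - ti for yi, ti in zip(y, trend)]
print(a, b, round(sum(residuals), 10))
```

For these data Σy = 75, Σxy = 27 and Σx² = 10, giving a = 15 and b = 2.7. The residuals sum to zero apart from floating-point rounding, as the properties of the least squares line require.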
Give the normal equations of a second-degree parabola.
The equation of the second-degree parabola is:
Y = a + bx + cx²
The normal equations are:
Σy = na + bΣx + cΣx² ……………. (i)
Σxy = aΣx + bΣx² + cΣx³ …………..(ii)
Σx²y = aΣx² + bΣx³ + cΣx⁴ …………(iii)
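The three normal equations form a system of three linear equations in a, b and c. The Python sketch below solves them with Cramer's rule, which is one convenient choice here and not a method prescribed by the notes. The function names `det3` and `fit_parabola` and the data are hypothetical.

```python
def det3(m):
    """Determinant of a 3 x 3 matrix given as a list of rows."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def fit_parabola(x, y):
    """Solve the three normal equations of y = a + b*x + c*x^2."""
    n = len(x)
    s1 = sum(x)
    s2 = sum(v ** 2 for v in x)
    s3 = sum(v ** 3 for v in x)
    s4 = sum(v ** 4 for v in x)
    sy = sum(y)
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    sx2y = sum(xi ** 2 * yi for xi, yi in zip(x, y))

    coeff = [[n, s1, s2],
             [s1, s2, s3],
             [s2, s3, s4]]
    rhs = [sy, sxy, sx2y]

    d = det3(coeff)
    solution = []
    for col in range(3):
        # Cramer's rule: replace one column by the right-hand side.
        m = [row[:] for row in coeff]
        for r in range(3):
            m[r][col] = rhs[r]
        solution.append(det3(m) / d)
    return tuple(solution)  # (a, b, c)


# Hypothetical data generated from y = 3 + 2x + x^2 with coded x values.
x = [-2, -1, 0, 1, 2]
y = [3, 2, 3, 6, 11]
a, b, c = fit_parabola(x, y)
print(a, b, c)
```

Because the illustrative data lie exactly on y = 3 + 2x + x², the fitted constants are a = 3, b = 2 and c = 1.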

(OBJECTIVE)
1) A sequence of observation recorded over time is called:
a) Geographical data b) Time series c) Grouped data d) None of these
2) Graphical representation of a time series is called:
a) Historigram b) Histogram c) Ogive d) None of these
3) Decomposition of a time series is called:
a) Analysis of time series b) Deseasonalization c) Detrending d) Time series
4) Systematic component of variation in a time series is called:
a) Signal b) Time series c) Noise d) None of these
5) The unsystematic sequence which follows irregular pattern of variation is called:
a) Signal b) Time series c) Noise d) None of these
6) The additive model of the time series is regarded as:
a) Y = T × S × C × I b) Y = T + S + C + I c) Y = T – S – C – I d) None of these
8) The multiplicative model of the time series is regarded as:
a) Y = T × S × C × I b) Y = T + S + C + I c) Y = T – S – C – I d) None of these
9) A time series has components:
a) 2 b) 5 c) 3 d) 4
10) Secular trend is measured by methods:
a) 4 b) 2 c) 3 d) 5
11) Movements in the secular trend are:
a) Smooth b) Steady c) Regular d) All of these
12) A decline in the death rate due to advancement in science is an example of:
a) Seasonal variations b) irregular variations c) Cyclical variations d) Secular trend

13) Repetitive movements around the trend line in one year or less is:
a) Seasonal variations b) irregular variations c) Cyclical variations d) Secular trend
14) A business cycle has phases:
a) 2 b) 3 c) 4 d) 5
15) The fire in a factory is an example of:
a) Seasonal variations b) Irregular variations c) Cyclical variations d) Secular trend
16) In semi average method, data is divided into:
a) Two groups b) Three groups c) Four groups d) No group
17) The best fitted trend is one for which, the sum of squares of residuals is:
a) Maximum b) Least or Minimum c) Zero d) None of these

18) The difference between actual value and trend value is called:
a) Slope b) Intercept c) Residual d) None of these
19) In fitting a straight line to the time series, the sum of residuals is:
a) Zero b) Least c) Most d) Positive
20) A second degree parabola has constants.
a) Zero b) One c) Two d) Three
21) Long-term, regular, smooth, slowly moving and steady variation in a time series is called:
a) Seasonal variations b) Irregular variations c) Cyclical variations d) Secular trend
22) Short-term variations which are regularly repeated every year due to seasons, religious festivals
and social customs are called:
a) Seasonal variations b) Irregular variations c) Cyclical variations d) Secular trend
23) If a straight line is fitted to the observed time series, then:
a) ∑y > ∑ŷ b) ∑y = ∑ŷ c) ∑y < ∑ŷ d) ∑y ≠ ∑ŷ
24) Seasonal variations in a time series can occur within a period of :
a) One year b) Two years c) Five years d) Ten years
25) A set of observations recorded at equal intervals of time is called:
a) Geographical data b) Time series c) Grouped data d) Array data

Prepared by: Darshan Jee


Lecturer: Govt. Khawaja Fareed Graduate College R.Y. Khan
(03081845584)

Chapter No.17
Orientation of Computers
What is Computer?
A computer is an electronic device that accepts input data, stores and processes it, and helps us to solve a
wide range of problems efficiently and quickly.
What are the types of computers?
Following are the types of computers:
(i) Digital Computer (ii) Analog Computer (iii) Hybrid Computer
What is digital computer?
A computer that is based on two digits (0 and 1) is called digital computer. It operates by counting
digits and gives output in digital form. For example, digital clocks, digital thermometer, etc.
What is analog computer?
An analog computer measures physical quantities directly and gives output on a scale. For example, a dial
clock, a weighing machine, etc.
What is hybrid computer?
A hybrid computer has features of both digital and analog computers. For example, modems,
robots, modern petrol pumps, etc.
What is the classification of computers?
Following are the classification of computers:
(i) Micro Computers (ii) Main Frame Computers (iii) Mini Computers (iv) Super Computers
What are micro computers?
Micro computers are designed to be used by one user at a time. They are used at homes, offices, etc.
What are main frame computers?
Main frame or macro computers are very powerful general-purpose computers. They are used in
banks, research institutions and weather forecasting departments, etc.
What is mini-computer?
Mini computers are smaller versions of main frame computers. They are used for maintaining the records
of large business organizations, analyzing the results of experiments, or controlling the production activity of
a factory.
What are super computers?
Super computers are the largest and fastest computers of the era and are designed to process complex jobs.
They can process millions of instructions per second.
What are the basic components of a computer?
There are two basic components of a computer. (i) Software (ii) Hardware
What is meant by hardware?
The physical parts of the computer are called hardware, e.g., CPU, monitor, etc.
What is Central Processing Unit (CPU)?
CPU is the main part of the computer that performs all the operations according to the program
instructions. It carries out instructions and tells other parts of the computer what to do.

Define Arithmetic Logic Unit (ALU)?
The ALU is the place where the actual execution of the instructions takes place during the
processing operation. It carries out the arithmetic and logical operations.
Define Read Only Memory (ROM).
It is used to store data and programs that are permanent. New data cannot be written on it.
Define Random Access Memory (RAM).
It is the memory that computer uses temporarily to store the information as it is being processed.
What is meant by Software?
The set of instructions given to the computer to perform a specific task is called software.
Define Operating System (OS).
An operating system is an integrated set of programs that is used to manage the various hardware
resources of the computer system.
What do you know about DOS?
DOS stands for Disk Operating System. It was once one of the most widely used operating systems for personal computers.
Differentiate between input and output devices.
The devices which are used to give input to the computer are called input devices. For example,
keyboard, mouse, etc.
The devices which are used to take information from the computer are called output devices. For
example, monitor and printer, etc.
Name some input devices?
Keyboard, Mouse, Joy Stick, Scanner and Touch Pad, etc.
Name some output devices?
Monitor, Printer, LCD, Plotters and Speakers, etc.
Differentiate between hard copy and soft copy.
The output received from the printer on the paper is called hard copy.
The temporary output on the screen of monitor or LCD is called soft copy.
What are the types of programming language?
There are two types of programming languages.
(i) Low level languages (ii) High level languages.
What is low level language?
The languages that are close to machine code are called low level languages.
What is high level language?
The languages that are close to human languages are called high level languages.
What is assembler?
An assembler is a system program that translates an assembly language to a machine language.
What is compiler?
A compiler is a system program that translates high level language to machine language.
Explain the term byte in Computer use.
A collection of eight bits is called a byte. The byte is the basic unit for storing data in computer memory. One byte
stores a single character in memory.
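The relationship between a character, a byte and its eight bits can be seen in a short Python sketch. It assumes the ASCII encoding, in which each character occupies exactly one byte; in other encodings such as UTF-8, some characters take more than one byte.

```python
char = "A"
encoded = char.encode("ascii")    # one ASCII character becomes one byte
bits = format(encoded[0], "08b")  # the eight bits stored in that byte

print(len(encoded))  # number of bytes used
print(bits)          # bit pattern of the byte
```

The character "A" has the code 65, which is stored as the bit pattern 01000001.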

(OBJECTIVE)
1. The base of Binary number system is :
(a) 2 (b) 8 (c) 10 (d) 16
2. The base of Decimal number system is :
(a) 2 (b) 8 (c) 10 (d) 16
3. An octal number system has base :
(a) 2 (b) 8 (c) 10 (d) 16
4. DOS and Microsoft Windows are :
(a) Multi Media software (b) Application software (c) System software (d) Translator
5. There are _____ types of computers :
(a) 1 (b) 2 (c) 3 (d) 4
6. The types of language processor are :
(a) 1 (b) 2 (c) 3 (d) 4
7. 1024 MB = :
(a) 1 KB (b) 1 MB (c) 1GB (d) 1 TB
8. Which of the following is not an input device?
(a) Mouse (b) Keyboard (c) Speakers (d) Scanner
9. One byte = :
(a) 4 bits (b) 6 bits (c) 8 bits (d) 16 bits
10. The physical parts of the computer system are called:
(a) Software (b) AutoCAD (c) Program (d) Hardware
11. The most common input devices are:
(a) Monitor and mouse (b) Keyboard and mouse (c) Monitor and printer (d) None
12. Brain of computer system is called:
(a) CPU (b) Main memory (c) Hard disk (d) Monitor
13. The most common number system used in computer is:
(a) Octal (b) Hexadecimal (c) Decimal (d) Binary
14. Drag and drop is a term associated with:
(a) Keyboard (b) Printer (c) Monitor (d) Mouse
15. Digital computers work with digits:
(a) 0 and 1 (b) 1 and 2 (c) 2 and 3 (d) 0 and 9
16. The primary storage unit is also known as:
(a) Storage register (b) Disk memory (c) Accumulator (d) Main memory
17. In computer, RAM stands for:
(a) Read actual memory (b) Random access memory (c) Read any memory (d) None
18. CPU is an example of:
(a) Software (b) Hardware (c) Program (d) Output
19. ALU stands for:
(a) All logical units (b) Arithmetic logical unit (c) Alone logical unit (d) Binary
20. Which of the following is not an output device?
(a) Modem (b) Monitor (c) Printer (d) Scanner


“Keep going. Everything you need will come to you at the perfect time.”
