Statistics for Medical Students

Uploaded by

PranavKulkarni
DR PRANAV KULKARNI

I MDS
PREVIOUSLY COVERED
• DEFINITION
• COMMON STATISTICAL TERMS
• DATA
• TYPES OF DATA
• PRESENTATION OF DATA
• MEASURES OF CENTRAL TENDENCY
• USES OF DATA
• USES OF STATISTICS

CONTENTS
• DEFINITION
• STANDARD DEVIATION
• VARIANCE
• SAMPLING AND SAMPLE DESIGNS
• METHODS OF SAMPLING
• TESTS OF SIGNIFICANCE
DEFINITION
• A statistic or datum is a measured or counted fact or piece of information stated as a figure, such as the height of one person, the birth of a baby, etc.
• Biostatistics: an art and science of collection, compilation, presentation, analysis and logical interpretation of biological data affected by a multiplicity of factors.
• It is the term used when the tools of statistics are applied to data derived from biological sciences such as medicine or dentistry.
Standard deviation
• Standard deviation is a measure of the variation or dispersion of a group of values around their arithmetic mean
• A low standard deviation indicates that the values tend to be close to the mean (also called the expected value) of the set, while a high standard deviation indicates that the values are spread out over a wider range.
• σ = √[ Σ (value − mean)² / (n − 1) ]
Variance
• Nothing else but the square of the standard deviation, denoted by σ²

Day of reporting   No. of patients reported with complaint   |x − x̄|²
1                  10                                          32.49
2                  25                                          86.49
3                  35                                         372.49
4                  05                                         114.49
5                  10                                          32.49
6                  10                                          32.49
7                  15                                           0.49
Total              110                                        671.43

Mean = 110/7 = 15.7
Variance = 671.43 / (7 − 1) = 111.9
SD = √111.9 = 10.6
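The calculation in the table can be checked with a short Python sketch. It takes the seven daily patient counts from the table (10, 25, 35, 5, 10, 10, 15) and applies the n − 1 formula given for σ; the function name sample_variance is an illustrative choice, not part of the slides.

```python
import math

# Daily patient counts for days 1 to 7, taken from the table.
counts = [10, 25, 35, 5, 10, 10, 15]

def sample_variance(values):
    """Sum of squared deviations from the mean, divided by n - 1."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / (n - 1)

mean = sum(counts) / len(counts)
variance = sample_variance(counts)
sd = math.sqrt(variance)  # standard deviation is the square root of variance

print(f"mean = {mean:.1f}, variance = {variance:.1f}, SD = {sd:.1f}")
```

Note that the sum of squared deviations is divided by n − 1 = 6 before the square root is taken.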
Uses of Standard Deviation:-
• Summarizes the deviation of a large distribution from the mean in one figure, used as a unit of variation
• Indicates whether the variation is real or due to a special reason
• Helps in comparing two samples
• Helps in finding the suitable sample size for a valid conclusion
Merits of SD:-
• Rigidly defined
• Based on all observations
• Doesn't ignore the algebraic sign of deviation
• Capable of further mathematical treatment
• Not much affected by sample fluctuation

Demerits of SD:-
• Difficult to understand and calculate
• Cannot be calculated for qualitative data and distributions with open-end classes
• Unduly affected by extreme deviations
Normal curve
• Properties
Bell shaped
Symmetrical
Height is maximum at mean
Mean = Median = Mode
Maximum number of observations at mean, decreasing on either side
Relation between mean and standard deviation forms basis of tests of significance
[Figure: Normal distribution or Gaussian curve]
Coefficient of Dispersion or Variation

• Following are some definitions of coefficient of dispersion

i) (Mean deviation / Median) × 100
ii) (Mean deviation / Mean) × 100
iii) (Standard deviation / Mean) × 100

• The third (iii) is the most commonly used and is called the coefficient of variation (CV)


Coefficient of Dispersion or Variation

• When the variability of two series is compared:
• The series having the greater CD is said to have more variation
• The series having the lower CD is said to be more homogeneous

CD = 100 × SD / Mean
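As a rough illustration of comparing two series by CD, the sketch below uses two made-up series with the same mean (both data sets and the function name are hypothetical, chosen only for the example). It uses the sample standard deviation (n − 1 divisor) from Python's statistics module.

```python
import statistics

def coefficient_of_dispersion(values):
    """CD = 100 x SD / Mean, using the sample SD (n - 1 divisor)."""
    return 100 * statistics.stdev(values) / statistics.mean(values)

# Hypothetical series: both have mean 110/7, but very different spread.
series_a = [10, 25, 35, 5, 10, 10, 15]
series_b = [14, 15, 16, 15, 17, 16, 17]

cd_a = coefficient_of_dispersion(series_a)
cd_b = coefficient_of_dispersion(series_b)

print(f"CD of series A = {cd_a:.1f}%")
print(f"CD of series B = {cd_b:.1f}%")
print("More homogeneous:", "A" if cd_a < cd_b else "B")
```

Because CD is a ratio, it can also compare series measured in different units, which the SD alone cannot do.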
Skewness
• Skewness means lack of symmetry… indicates whether the curve is turned more to one side than to the other
• +ve skewness if the curve is more elongated to the right, i.e. if the mean is more than the mode
• -ve skewness if the curve is more elongated to the left, i.e. the mean is less than the mode
Skewness
• Formula for Skewness
• Karl Pearson's coefficient of skewness = (Mean − Mode) / SD
• Coefficient of skewness is 0 for a symmetrical distribution
• If coefficient of skewness is positive, distribution is positively skewed
• If coefficient of skewness is negative, distribution is negatively skewed
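A minimal sketch of Karl Pearson's coefficient is shown here. It assumes the data have a single mode and uses the sample SD (n − 1 divisor); both data sets are illustrative, not taken from the slides.

```python
import statistics

def pearson_skewness(values):
    """Karl Pearson's coefficient of skewness: (mean - mode) / SD."""
    mean = statistics.mean(values)
    mode = statistics.mode(values)   # assumes one clear mode
    sd = statistics.stdev(values)    # sample SD, n - 1 divisor
    return (mean - mode) / sd

# Illustrative data: a long right tail pulls the mean above the mode.
right_tailed = [10, 25, 35, 5, 10, 10, 15]
# Illustrative symmetrical data: mean and mode coincide.
symmetrical = [1, 2, 2, 2, 3]

skew_right = pearson_skewness(right_tailed)
skew_sym = pearson_skewness(symmetrical)

print(f"right-tailed: {skew_right:.2f}")  # positive, so positively skewed
print(f"symmetrical:  {skew_sym:.2f}")    # zero
```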
Sampling and Sample designs
• A sample is a part of a population, which is called the 'universe', 'reference' or 'parent' population.
• Sampling is the process or technique of selecting a sample of appropriate characteristics and adequate size.
• A sampling frame is the total of the elements of the survey population, redefined according to certain specifications.
• It consists of sampling units, which are the individual entities that form the focus of the study.
Advantages
• Reduces the cost of investigation, the time and the number of personnel involved.
• Allows thorough investigation of the units of observation.
• Provides adequate and in-depth coverage of the sample units.
Ideal requirements
Efficiency
Representativeness
Measurability
Size
Coverage
Goal orientation
Feasibility
Economy and cost efficiency
Actual sample selection can be accomplished in two
ways

A. Purposive sampling:
• Representing the sample as a whole.
• Great temptation to deliberately or purposively select the
individuals….

B. Random selection:
• All the characteristics of the population are reflected in the
sample
• Selecting the units at random
Methods

PROBABILITY SAMPLING
• Simple random
• Stratified random
• Systematic random
• Cluster
• Multiphase
• Multistage

NON-PROBABILITY SAMPLING
• Accidental/incidental
• Judgment/purposive
• Quota
• Sequential
• Snowball
PROBABILITY SAMPLING

1. Simple random:
• Sample units - individuals from the community or group.
• Sampling frame - community or group from where the sample will be drawn.
• The word 'random' suggests haphazard, which is a misnomer in this sampling context.

The basic procedure is:
• Prepare a sampling frame
• Decide on the size of the sample
• Select the required no. of units

• The sample can be drawn by using a random number table or the lottery method
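The three-step procedure can be sketched in Python, with the standard library's random.sample standing in for the lottery method. The sampling frame of 100 numbered patients, the sample size of 10 and the function name are assumptions made for illustration only.

```python
import random

def simple_random_sample(sampling_frame, sample_size, seed=None):
    """Draw sample_size distinct units; every unit has an equal chance."""
    rng = random.Random(seed)  # a seed only makes the draw repeatable
    return rng.sample(sampling_frame, sample_size)

# Step 1: prepare a sampling frame (hypothetical patient numbers 1 to 100).
frame = list(range(1, 101))
# Step 2: decide on the size of the sample.
size = 10
# Step 3: select the required number of units.
sample = simple_random_sample(frame, size, seed=42)

print(sorted(sample))
```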


2. Stratified random sampling:
• The population to be sampled is subdivided into groups known as strata, such that each group is homogeneous in its characteristic.
• A simple random sample is then chosen from each stratum.
• Is used when the population is heterogeneous.
• This method ensures more representativeness, provides greater accuracy and can concentrate on a wider geographical area.
• The limitation: care should be taken while dividing the population into strata regarding the homogeneity in each stratum.
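A possible sketch of stratified random sampling follows. The slides do not say how many units to take per stratum, so drawing the same fraction from every stratum is an assumption here, as are the urban/rural strata and the function name.

```python
import random

def stratified_sample(strata, fraction, seed=None):
    """Take a simple random sample of the given fraction from each stratum."""
    rng = random.Random(seed)
    sample = {}
    for name, units in strata.items():
        # At least one unit per stratum so that no stratum is left out.
        k = max(1, round(len(units) * fraction))
        sample[name] = rng.sample(units, k)
    return sample

# Hypothetical heterogeneous population split into two strata.
strata = {
    "urban": [f"U{i}" for i in range(60)],
    "rural": [f"R{i}" for i in range(40)],
}
chosen = stratified_sample(strata, fraction=0.10, seed=1)

for name, units in chosen.items():
    print(name, units)
```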
3. Systematic random sampling:
• It is formed by selecting one unit at random and then selecting additional units at evenly spaced intervals till the required sample size has been reached.
• Used when a complete list of the population is available.
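A minimal sketch of systematic sampling, assuming the interval is the list length divided by the required sample size (the slides do not give a rule for the interval) and a hypothetical list of 100 numbered units:

```python
import random

def systematic_sample(frame, sample_size, seed=None):
    """Pick a random start, then every k-th unit, where k = N // n."""
    rng = random.Random(seed)
    interval = len(frame) // sample_size   # evenly spaced gap between units
    start = rng.randrange(interval)        # random unit within the first gap
    return [frame[start + i * interval] for i in range(sample_size)]

# Hypothetical complete list of the population, numbered 1 to 100.
frame = list(range(1, 101))
sample = systematic_sample(frame, sample_size=10, seed=7)

print(sample)
```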


4. Cluster sampling:
• Used when the population forms natural groups or clusters such as villages, wards, blocks, children in school, etc.
• Sampling units - clusters
• Sampling frame - group of clusters
• First, using simple random sampling, the clusters are selected, and then all the units in each of the selected clusters are surveyed.
• Method is simple and less expensive than random sampling.
• Example: to find out the prevalence of fluorosis in a rural community.
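Cluster sampling could be sketched as follows: whole clusters are drawn by simple random sampling and every unit inside a chosen cluster is kept. The village names, household labels and function name are hypothetical.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Select clusters at random, then survey all units in each one."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)  # sampling units are clusters
    return {name: list(clusters[name]) for name in chosen}

# Hypothetical sampling frame: four villages and their households.
villages = {
    "Village A": ["A1", "A2", "A3"],
    "Village B": ["B1", "B2"],
    "Village C": ["C1", "C2", "C3", "C4"],
    "Village D": ["D1", "D2", "D3"],
}
surveyed = cluster_sample(villages, n_clusters=2, seed=3)

for village, households in surveyed.items():
    print(village, households)
```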


5. Multiphase sampling:
• Part of the information is collected from the whole sample and a part from a sub-sample.
• This method may be adopted when there is interest in any specific disease.
• Survey by this procedure is less costly, less laborious and more purposeful.
6. Multistage sampling:
• The first stage is to select the groups or clusters. Then subsamples are taken in as many subsequent stages as necessary to obtain the desired sample size.

E.g.: 1st stage: choice of states within countries
2nd stage: choice of towns within each state
3rd stage: choice of neighborhoods within each town
NON-PROBABILITY SAMPLING

1. Accidental/ incidental/ convenience:
• Here the sample is selected by the convenience of the situation for the examiner.

2. Judgment/ purposive/ deliberate sampling:
• It is a non-representative subset of some larger population, and is constructed to serve a very specific need or purpose.
3. Snowball sampling:
• A snowball sample (chain referral sampling) is a subset of a purposive sample.
• So named because one picks up the sample along the way, analogous to a snowball accumulating snow.
• Snowball samples are particularly useful in hard-to-track populations, such as those with illegal behavior (e.g. drug users), homeless people, etc.
4. Quota sampling:
• The general composition of the sample is decided in advance.
• The only requirement is that the right number of people be somehow found to fill these quotas.
• This is generally done to ensure the inclusion of a particular segment of the population.
Tests of significance
• Parametric tests:-
tests in which population constants such as mean, variance, etc. are used, and the data tend to follow one assumed or established distribution such as normal, binomial, Poisson, etc.
• Non-parametric tests:-
no constants are used, the data do not follow any specific distribution and no assumptions are made. E.g. to classify good, better and best you allocate an arbitrary no. to each category.
Tests of significance
Classification of tests

Parametric tests              Non-parametric tests
• t-test - paired/unpaired    • Mann Whitney U test
• ANOVA                       • Wilcoxon signed rank test
• Chi square test             • McNemar's test
• Fisher's test               • Kruskal Wallis test
CHI SQUARE TEST
• Developed by Karl Pearson
• Used to determine whether there is a statistically significant difference between observed values and expected values in one or more categories.
• Used when the sample size is > 30 and when dealing with 2 or more groups of subjects.

Steps
1. State the null hypothesis to be tested
2. Then the χ² statistic is calculated:
χ² = Σ (O − E)² / E
O = observed frequency
E = expected frequency
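The χ² formula can be tried on a small 2 × 2 table. The sketch below uses invented counts (two groups, improved versus not improved) and assumes the usual rule that each expected frequency is row total × column total / grand total, which the slides do not spell out.

```python
def chi_square(observed):
    """Sum of (O - E)^2 / E over every cell of a table of counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(observed):
        for j, o in enumerate(row):
            # Expected frequency under the null hypothesis of no association.
            e = row_totals[i] * col_totals[j] / grand_total
            statistic += (o - e) ** 2 / e
    return statistic

# Hypothetical counts: rows are groups, columns are improved / not improved.
table = [
    [30, 10],  # treated group
    [20, 20],  # control group
]
chi2 = chi_square(table)

print(f"chi-square = {chi2:.2f}")
```

The calculated value is then compared with the tabulated χ² value for the appropriate degrees of freedom, here (2 − 1) × (2 − 1) = 1.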
Student's t test
• Designed by W.S. Gosset
• Applied to test the significance of the difference between two means

Criteria for applying t test
1. Random samples
2. Quantitative data
3. Sample size < 30
4. Variable normally distributed

Unpaired t test
• When the samples in two groups give individual values, to test for the difference between the groups

Paired t test
• When each individual gives a pair of observations, to test for the difference in the pair of values
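The two variants can be contrasted with a short sketch. The formulas are the standard textbook ones, which the slides do not state: a pooled-variance t for the unpaired case and a t on the within-pair differences for the paired case. All numbers are invented for illustration.

```python
import math
import statistics

def unpaired_t(group_a, group_b):
    """Two independent groups: difference of means over its pooled standard error."""
    n_a, n_b = len(group_a), len(group_b)
    var_a = statistics.variance(group_a)
    var_b = statistics.variance(group_b)
    pooled = ((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2)
    se = math.sqrt(pooled * (1 / n_a + 1 / n_b))
    return (statistics.mean(group_a) - statistics.mean(group_b)) / se

def paired_t(before, after):
    """Each individual gives a pair: test the mean of the differences."""
    diffs = [b - a for b, a in zip(before, after)]
    se = statistics.stdev(diffs) / math.sqrt(len(diffs))
    return statistics.mean(diffs) / se

# Hypothetical scores.
first = [12, 14, 11, 13, 15]
second = [10, 9, 11, 8, 12]

t_unpaired = unpaired_t(first, second)  # treats the lists as two separate groups
t_paired = paired_t(first, second)      # treats them as before/after per person

print(f"unpaired t = {t_unpaired:.2f} (df = {len(first) + len(second) - 2})")
print(f"paired t   = {t_paired:.2f} (df = {len(first) - 1})")
```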
Analysis of Variance (ANOVA) test
• Compares more than two samples drawn from corresponding normal populations
• E.g.: to check if different agents used for subgingival irrigation have an effect on the decrease in microbial load. Use 3 groups (chlorhexidine, saline, povidone iodine)
• To assess this difference in means, the ANOVA test is important
• If the difference between their means is significant, the different agents used do have different effects on the decrease in microbial load
ANOVA
• ONE WAY CLASSIFICATION
• Suppose there are three different preparations of tonic, say A, B and C, which are to be compared for their effectiveness in controlling anaemia.
• Three groups of 7 patients each receive one preparation, and one control group gets no tonic at all.
• The three preparations and the one group having none (called the control) are referred to as 'treatments' in ANOVA (likewise fertilisers, drugs, etc., are also called treatments).
ANOVA
• The response of patients in the 4 treatment groups will vary from one patient to another and from one treatment to the other.
• Part of the variation may be due to pure chance, which is called error
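The split of total variation into a between-treatment part and an error part can be sketched as a one-way ANOVA F ratio. The formula is the standard one (mean square between treatments divided by mean square error), which the slides do not write out; the four tiny groups are invented and smaller than the 7-patient groups described.

```python
def one_way_anova_f(groups):
    """F = (between-treatment mean square) / (error mean square)."""
    all_values = [x for g in groups for x in g]
    grand_mean = sum(all_values) / len(all_values)
    group_means = [sum(g) / len(g) for g in groups]

    # Variation from one treatment to the other.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    # Variation from one patient to another within a treatment (error).
    ss_error = sum((x - m) ** 2
                   for g, m in zip(groups, group_means) for x in g)

    df_between = len(groups) - 1
    df_error = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_error / df_error)

# Hypothetical responses for control and tonics A, B, C.
treatments = [
    [1, 2, 3],  # control
    [3, 4, 5],  # tonic A
    [5, 6, 7],  # tonic B
    [7, 8, 9],  # tonic C
]
f_ratio = one_way_anova_f(treatments)

print(f"F = {f_ratio:.1f}")
```

A large F means the differences between treatment means are large compared with the chance variation within groups.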


ANOVA
• TWO-WAY ANOVA CLASSIFICATION
• Two-way analysis of variance is utilized when there is a
need to study the impact of two factors on variations in
a specific variable.
• E.g:
• i. The effect of age and sex on variations in height. It is
known that in children height increases with increasing
age. Also males are taller than females
• ii. The effect of protein levels and calories on the gain in
body weights.
ANOVA
• The assumptions made in this type of ANOVA are:
• i) The subjects must be chosen at random
• ii) The variable under study must have normality characteristics (i.e. coeff. of skewness = 0)
• iii) Variances between comparable groups are mostly the same or homogeneous
• iv) There is no interaction between the two factors

Z test
• Used to test the significance of difference in means for large samples (>30)
• Pre-requisites are:
Sample must be randomly selected
Quantitative data
Sample size >30
Variable normally distributed
• Z = (Observation − mean) / standard deviation
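As a small worked example of the Z formula, the sketch below uses invented values (an observation of 130 against a mean of 120 and a standard deviation of 10); the function name is illustrative.

```python
def z_score(observation, mean, standard_deviation):
    """Z = (observation - mean) / standard deviation."""
    return (observation - mean) / standard_deviation

# Hypothetical values: the observation lies one SD above the mean.
z = z_score(observation=130, mean=120, standard_deviation=10)

print(f"Z = {z:.2f}")
```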
THANK YOU !!!
