
Statistics Notes

The document provides a comprehensive overview of statistics, defining it as a branch of mathematics focused on data collection, organization, analysis, and interpretation. It covers key concepts such as measures of central tendency (mean, median, mode), measures of variability (range, variance, standard deviation), and the importance of these measures in summarizing and comparing data. Additionally, it discusses various statistical methods, sampling techniques, and limitations of statistics.

Uploaded by

ktd.temp
Copyright
© All Rights Reserved

Koel Das

XI ‘A’
Sophia High School

Statistics

Definition of Statistics -
Statistics can be defined as the branch of mathematics concerned with collecting, organizing,
analysing and drawing conclusions from numerical data.

Uses of Statistics -
1. Statistics simplifies complex information and is used in summarizing or describing
large amounts of data.
2. Since psychology is concerned with human behaviour, statistics helps in comparing
individuals in various ways, with regard to traits, scores, anxiety, creativity and so on.
3. Statistics also helps in determining whether different aspects of behaviour are related.
4. It helps us in predicting future behaviours from current information. From the findings
obtained in a sample, inferences can be drawn about the population.

Branches of Statistics -
➢ Descriptive Statistics:
• It is the branch of statistics that provides means of summarizing data.
• It deals with the collection and presentation of data.
➢ Inferential Statistics:
• It involves drawing conclusions about a population from the analysis of sample
data, building on the summaries produced by descriptive statistics.

Measures of Central Tendency -


• A measure of central tendency is a single value that attempts to describe a set of data
by identifying the central position within that set of data.
• It is the most representative score in the distribution of scores.
• The measures of central tendency are mean, median and mode.

Mean -
The mean is the most popular and well-known measure of central tendency. It is derived by
adding all the scores and dividing the sum by the number of scores.
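The calculation described above can be sketched in a few lines of Python. The scores are hypothetical and chosen only for illustration.

```python
# Hypothetical scores, assumed only for this illustration.
scores = [4, 8, 6, 5, 7]

# Mean = sum of all the scores divided by the number of scores.
mean = sum(scores) / len(scores)

print(mean)  # 6.0
```

Note that 6.0 happens to be one of the scores here, but the mean need not be a value that appears in the distribution.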

Advantages:
• Its value is always definite. It is clearly defined.
• It is based on all the observations of the scores in the distribution.
• It can be used to calculate complex statistics like variability.
• It is easy to calculate and simple to understand.
• Of the three measures, it is the least affected by fluctuations of sampling.

Disadvantages:
• Since it is computed from all the items, extreme scores can affect the mean.
• It cannot be calculated without all the items in a series.
• In certain cases, the median and mode can be found by a quick inspection of the scores,
but the mean cannot.
• It can be a figure that does not exist in the distribution at all.

Uses:
• When the central tendency with the greatest stability is necessary.
• When other statistics like standard deviation need to be calculated later.
• When the scores are distributed symmetrically.
• When each score is important to determine the measure of central tendency.
• If the purpose is to find the average score, mean is the ideal measure.

Median -
It is the measure of central tendency indicating the mid-point of an array of scores, arranged
in order of magnitude.
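As a minimal sketch of this definition, the function below sorts the scores and picks the mid-point. The function name and the scores are assumptions made for illustration; taking the average of the two middle scores when the number of scores is even is the usual convention, which the notes do not spell out.

```python
def median(scores):
    """Return the mid-point of the scores once they are arranged in order."""
    ordered = sorted(scores)          # arrange in order of magnitude
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:                    # odd count: the single middle score
        return ordered[mid]
    # even count: average of the two middle scores (common convention)
    return (ordered[mid - 1] + ordered[mid]) / 2

odd_scores = [7, 3, 9, 5, 60]         # hypothetical; 60 is an extreme score
even_scores = [7, 3, 9, 5]            # hypothetical

print(median(odd_scores))   # 7
print(median(even_scores))  # 6.0
```

The extreme score of 60 in the first list does not pull the median away from the middle of the remaining scores.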

Advantages:
• It is clearly defined.
• It depends on the order of all the scores in a distribution, though not on their exact values.
• It is not affected by extreme values.
• It can be easily located by inspection.

Disadvantages:
• It is affected more than the mean by sampling fluctuations.
• It is sometimes not a good representation of a distribution of scores.
• It cannot readily be used for further calculations.
• Arranging the scores in ascending or descending order can be tedious.

Uses:
• When there are extreme scores, since they do not affect the median.
• Useful for measuring aspects like intelligence and sociability which cannot be directly
measured but can be ranked.
• Can be computed quickly and easily.
• When the midpoint of a distribution is required, median is used.

Mode -
It is a measure of central tendency that indicates the most frequent score in an array of
scores. The mode is the value that appears most often in a set of data, and can therefore
sometimes be considered the most popular option.
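A short sketch of finding the most frequent score is given below. The helper name `modes` and the scores are hypothetical; it returns a list because a set of scores can have more than one mode.

```python
from collections import Counter

def modes(scores):
    """Return every score that occurs with the highest frequency."""
    counts = Counter(scores)                  # score -> how often it appears
    highest = max(counts.values())
    return sorted(score for score, count in counts.items() if count == highest)

print(modes([2, 3, 3, 5, 7, 3, 5]))  # [3]
print(modes([1, 1, 2, 2, 3]))        # [1, 2]  (two modes)
```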

Advantages:
• It is simple to understand.
• It is easy to calculate.
• It is not necessary to know all the values in the distribution, to calculate the mode.
• Being the most common item, the mode is not an isolated value like the median.

Disadvantages:
• It is not always well defined; a distribution may have more than one mode.
• It is not based on all scores in the distribution.
• It cannot be used for further statistical calculations.
• It may not be representative in some cases.

Uses:
• When the measure of the most typical value or score is required, mode is used.
• When a quick approximate measure of central tendency is required, mode is used.
• We use mode with categorical, ordinal and discrete data.

Measures of Variability -
It is the index of the spread or variability of scores in a distribution. The range, variance and
standard deviation are the most commonly used measures of variability.

Range:
It is the difference between the highest and lowest scores in the distribution of scores. It is
the most straightforward measure of variability to calculate and the simplest to understand.

Variance:
It is a measure of dispersion reflecting the average squared distance between each score and
the mean. Unlike range, the variance includes all values in the calculation by comparing each
value to the mean. While it is difficult to interpret the variance itself, the standard deviation
resolves the problem.

Standard Deviation:
It is a measure of dispersion reflecting the typical distance between each score and the
mean, and is calculated as the square root of the variance. When the values in a dataset are
grouped closer together, the standard deviation is smaller. On the other hand, when the
values are spread out more, the standard deviation is larger. The standard deviation uses the
original units of the data, which makes interpretation easier.
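The three measures can be worked through on a small set of hypothetical scores. This sketch divides by the number of scores, matching the "average squared distance" wording above; some textbooks divide by one less than the number of scores when working with a sample.

```python
import math

# Hypothetical scores, assumed only for this illustration.
scores = [2, 4, 4, 4, 5, 5, 7, 9]

# Range: highest score minus lowest score.
score_range = max(scores) - min(scores)

# Variance: average squared distance of each score from the mean.
mean = sum(scores) / len(scores)
variance = sum((x - mean) ** 2 for x in scores) / len(scores)

# Standard deviation: square root of the variance, in the original units.
standard_deviation = math.sqrt(variance)

print(score_range)         # 7
print(variance)            # 4.0
print(standard_deviation)  # 2.0
```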

➢ Uses:
• The standard deviation has all the properties which can ideally measure dispersion or
variability.
• Its value is always definite.
• It is based on all the scores in a distribution.
• It can be used when we want to find the statistics having the greatest stability.
• When co-efficient of correlation and other statistics have to be calculated it is very
useful.

Importance of Central tendency and Variability -
• After the scores have been tabulated into a frequency distribution, usually the task is to
calculate one or more measures of central tendency in order to reduce the amount and
complexity of information and to compare them easily.
• It is useful to reduce the data into one common figure, a figure which does not lie in
the higher or lower extremes.
• Measures of central tendency provide us with information about where the centre of a
distribution lies.
• It is the most representative index or reflection of the group or set of scores.
• However, measures of central tendency are not sufficient themselves.
• Variability is the spread or dispersion of scores in a distribution.
• Two sets of scores can have identical means and yet differ in variability.
• Measures of variability provide a means of describing the spread of scores in a
distribution.
• If the scores are less spread out, we know that the group is homogeneous or similar.
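The point that two sets of scores can share a mean but differ in variability can be checked with a small sketch. Both groups are hypothetical, and the standard library's `statistics` module is used for brevity.

```python
from statistics import mean, pstdev

# Two hypothetical groups of scores with the same mean.
group_a = [48, 49, 50, 51, 52]   # scores close together (homogeneous group)
group_b = [10, 30, 50, 70, 90]   # scores widely spread out

print(mean(group_a), mean(group_b))   # 50 50

# pstdev divides by the number of scores (population standard deviation).
print(round(pstdev(group_a), 2))      # 1.41
print(round(pstdev(group_b), 2))      # 28.28
```

The means alone would suggest the groups are alike; the standard deviations show that they are not.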

Normal Probability Curve -


A normal distribution curve is a symmetrical bell-shaped frequency distribution. Most scores
are found near the middle and fewer and fewer occur towards the extremes. Many
psychological characteristics like intelligence, personality traits etc. are distributed in this
manner.

Characteristics:
• The curve is perfectly symmetrical.
• For this curve, the mean, median and mode are the same and lie at the centre of the curve.
• Since there is only one maximum point in the curve, the normal curve is unimodal,
i.e., it has only one mode.
• The tails are asymptotic, which means that they approach the horizontal axis but never
touch it.
• There are two equal halves (50%-50%).
• Tables exist so that we can find the proportion of scores above and below any part of
the curve, expressed in standard deviation units. The scores expressed in standard
deviation units are referred to as Z-scores.
Advantages:
• It is often used even when it is only a rough approximation, because it is easy to
handle.
• Helps to determine the proportion of scores between the mean and a particular score.
• Helps to determine the number of people within a particular range of scores.
• It helps to determine percentile ranks, that is, the proportion of scores that lie above
and below a subject's score in a distribution.
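The proportions listed above are normally read from printed tables; the sketch below obtains the same figures with `NormalDist` from Python's standard `statistics` module (Python 3.8 or later). The mean of 100, the standard deviation of 15 and the group of 1000 people are assumed values for illustration only.

```python
from statistics import NormalDist

# Hypothetical test: mean 100, standard deviation 15 (assumed values).
mean, sd = 100, 15
curve = NormalDist(mu=mean, sigma=sd)

score = 115
z = (score - mean) / sd                 # score in standard deviation units

# Proportion of scores below the given score (its percentile rank as a fraction).
below = curve.cdf(score)

# Proportion between the mean and the score: half the curve lies below the mean.
between_mean_and_score = below - 0.5

# Number of people, out of a hypothetical group of 1000, scoring between 85 and 115.
group_size = 1000
people_in_range = round(group_size * (curve.cdf(115) - curve.cdf(85)))

print(z)                                 # 1.0
print(round(below, 4))                   # 0.8413
print(round(between_mean_and_score, 4))  # 0.3413
print(people_in_range)                   # 683
```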

Skewness (Measuring divergence from Normality) -


In the normal curve there is a perfect balance between the right and left halves of the figure
and all three measures of central tendency coincide at the mid-point.
A distribution is said to be skewed when the mean, median and mode fall at different points
in the distribution and the balance is shifted to one side or the other, that is, either towards
the right or towards the left. Thus, we can say skewness is the measure of degree of
asymmetry of a distribution.

Negative Skew:
A distribution is said to be skewed negatively or to the left when the scores are amassed at
the high end of the scale (right end) and spread out gradually at the low end of the scale (left
end).
If the left tail is more pronounced than the right tail, the function is said to have negative
skewness. The left tail is longer, the mass of the distribution is concentrated on the right end
of the figure. The distribution is said to be left skewed.

Positive Skew:
A distribution is said to be skewed positively or to the right when the scores are amassed at
the low end of the scale (left end) and spread out gradually at the high end (right end). A
distribution is said to be right skewed when the right tail is longer and the mass of the
distribution is concentrated on the left of the figure.
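One common way to put a number on the degree of asymmetry is the moment coefficient of skewness (the third central moment divided by the cube of the standard deviation). The notes do not name a formula, so this choice is an assumption, and the three sets of scores are hypothetical.

```python
def skewness(scores):
    """Moment coefficient of skewness: negative = left skew, positive = right skew.

    Assumes the scores are not all identical (otherwise the variance is zero).
    """
    n = len(scores)
    mean = sum(scores) / n
    m2 = sum((x - mean) ** 2 for x in scores) / n   # variance
    m3 = sum((x - mean) ** 3 for x in scores) / n   # third central moment
    return m3 / m2 ** 1.5

left_skewed = [1, 6, 7, 8, 8, 9, 9, 9]    # scores amassed at the high end
right_skewed = [1, 1, 1, 2, 2, 3, 4, 9]   # scores amassed at the low end
symmetrical = [1, 2, 3, 4, 5]

print(skewness(left_skewed) < 0)    # True
print(skewness(right_skewed) > 0)   # True
print(skewness(symmetrical))        # 0.0
```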

Measurement Scale -
One may define measurement as the application of rules for assigning numbers to objects.
The rules help to convert specific qualities like intelligence, anxiety, creativity etc. into
numbers.
The type of data selected determines the correct measurement scale. The measurement scale
in turn determines the appropriate statistical procedure for analysing the data and drawing
conclusions from it. Each measurement scale has a specific use. There are six common
types of scales -
1. Age Scales: A scale or test in which items are grouped not by the type of task but by
the average age at which the children pass each item. Scores are expressed as mental
age.

2. Nominal Scales: They are composed of sets of categories in which objects are
classified. Data used in the construction of a nominal scale is frequency data or the
number of subjects in each category.

3. Ordinal Scales: They indicate the order of the data according to some criteria.
Ordinal scales tell nothing about the distance between units of the scale and supply
information only about the order of preference.

4. Interval Scales: They have equal distances between the scale units and permit
statements to be made about those units as compared to other units. They do not allow
conclusions that one unit is a multiple of the other because on interval scales there is no
true zero, that is, zero on the scale does not indicate the complete absence of the
phenomenon being measured. Interval scales permit statements of "more than", "less than"
and "how much more", but not of "how many times more".

5. Ratio Scales: They have equal distances between scale units as well as an absolute
zero. Most measures encountered in daily life are based on the ratio scale.

6. Continuous or Discontinuous Scales: A continuous scale is one in which the
variable under consideration can assume an infinite number of values.
A discontinuous scale expresses the measurement of the variable under consideration
in a finite number of ways.

Standard Scores -
Raw scores from one test are usually not comparable to raw scores from a test using a
different unit of measurement (e.g. time in seconds versus length in inches). Standard scores
allow us to make comparisons between tests using different measurement units.

Types of Standard Scores:


1. Percentiles: They are specific scores or points within a distribution of scores.
Percentiles divide the total frequency for a set of observations into hundredths. They
indicate the score below which a defined percentage of scores lies. A percentile rank is
the percentage of scores that are lower than a given score.

2. Z-scores: A raw score by itself does not convey enough information for us to make
meaningful assessments or accurate interpretations of the data; it has to be related to
the mean and standard deviation of its distribution. When a score is expressed in
standard deviation units, it is referred to as a z-score. It is the difference between a
score and the mean, divided by the standard deviation.

3. T-scores: They are derived scores with a mean of 50 and a standard deviation of 10.
The average T-score for a group of scores would be 50. T-scores were developed to
create an easily interpreted standard score. T-scores are represented by positive
whole numbers. ∴ T = 10z + 50
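The three standard scores can be sketched as small functions. The function names, the class mean of 60, the standard deviation of 10 and the list of scores are all hypothetical; the T-score line uses the usual conversion T = 10z + 50.

```python
def z_score(score, mean, sd):
    """Difference between a score and the mean, divided by the standard deviation."""
    return (score - mean) / sd

def t_score(z):
    """Convert a z-score to a T-score (mean 50, standard deviation 10)."""
    return 10 * z + 50

def percentile_rank(score, scores):
    """Percentage of scores in the distribution that are lower than the given score."""
    lower = sum(1 for s in scores if s < score)
    return 100 * lower / len(scores)

# Hypothetical values: a raw score of 75 on a test with mean 60 and SD 10.
z = z_score(75, mean=60, sd=10)
print(z)            # 1.5
print(t_score(z))   # 65.0

# Hypothetical distribution of ten scores.
class_scores = [55, 60, 65, 70, 75, 80, 85, 90, 95, 100]
print(percentile_rank(75, class_scores))  # 40.0
```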

Terminology -
Sample: It is simply described as a part of a population.
Random sampling: A sample drawn such that every member of a population has an equal
chance of being selected.
Stratified sampling: A sample drawn such that identified sub-groups in a population are
represented proportionally, that is, the population is divided into subgroups and random
sampling is done within each subgroup.
Biased sampling: It is one in which the method used to create the sample results in samples
that are systematically different from the population.
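Random and stratified sampling, as defined above, can be sketched with Python's standard `random` module. The population of 100 students, the two subgroups and the function name are hypothetical; the seed is fixed only so that the sketch is repeatable.

```python
import random

rng = random.Random(0)  # fixed seed, assumed only to make the sketch repeatable

# Hypothetical population: 60 science students and 40 arts students.
population = [("science", i) for i in range(60)] + [("arts", i) for i in range(40)]

# Random sampling: every member has an equal chance of being selected.
simple_sample = rng.sample(population, 10)

def stratified_sample(population, size, rng):
    """Sample randomly within each subgroup, in proportion to the subgroup's size."""
    strata = {}
    for member in population:
        strata.setdefault(member[0], []).append(member)   # group by subgroup label
    sample = []
    for members in strata.values():
        # Proportional share; rounding may need adjusting for awkward proportions.
        share = round(size * len(members) / len(population))
        sample.extend(rng.sample(members, share))
    return sample

strat = stratified_sample(population, 10, rng)
print(sum(1 for group, _ in strat if group == "science"))  # 6
print(sum(1 for group, _ in strat if group == "arts"))     # 4
```

A simple random sample of 10 may by chance contain, say, 8 science students; the stratified sample always keeps the 60:40 proportion.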

Variable: An event or condition which can have different values. Ideally, in experiments, an
event or condition which can be measured, and which varies quantitatively.
Independent Variable: A condition selected or manipulated by an experimenter to see
whether it will influence behaviour.
Dependent Variable: The variable whose value depends or may depend on the value of the
independent variable.
Confounding variable: A variable which is not considered in the experiment, but which
has an unwanted impact on the subjects and the results of the experiment.
Example: the behaviour of a person or animal in an experiment.

Group: Two or more people who interact with one another, perceive themselves as part of a
group and are interdependent.
Experimental group: The group in an experiment that receives the independent variable but
is otherwise equivalent to the control group.
Control group: The group in an experiment which is equivalent to the experimental group
but which does not receive the independent variable.

Blind study: In this method, participants are unaware of whether they are part of the
experimental group or the control group.
Double blind study: In this method, both the subjects and the experimenter remain unaware
of who belongs to the experimental group and who belongs to the control group.

Limitations of Statistics -
• Statistics deals with groups and aggregates only.
• Statistics is not applicable to qualitative data.
• If sufficient care is not exercised in collecting, analysing and interpreting the data,
statistical results might be misleading.
• Statistics is liable to be misused.
• Statistical laws are not exact.
• There are too many methods to study problems.
• Results are true only on average.
