Psych Stats
Statistic
- a value (numerical) that describes a sample
- usually derived from measurements of the individuals in the
sample
I. INTRODUCTION AND DESCRIPTIVE STATISTICS
Data
➔ Measurements or observations
Datum
➔ Singular term for data
➔ Commonly called score or raw score
Data Set
➔ a collection of measurements or observations
Parameter
- a value (numerical) that describes a population
- usually derived from measurements of the individuals in
the population
Example
Individual Variables
- characteristics or attributes of individuals that are being
studied in a research project
- Describe individual variables as they naturally exist
1. Manipulation
- The researcher manipulates one
variable by changing its value from
one level to another. A second variable
is observed (measured) to determine
whether the manipulation causes
changes to occur.
2. Control
- The researcher must exercise control
over the research situation to ensure
that other extraneous variables do not
influence the relationship being
examined.
Two Variables in Experimental Method
● Independent Variable
- The variable that is manipulated by the
researcher
- The manipulated variable
● Dependent Variable
- The variable that is observed to assess the
effect of the treatment
- The variable being measured
Conditions In An Experiment
Pre-existing Groups
Interval
- Each measurement category defined by boundaries
Real Limits
- Boundaries of intervals for scores that are represented
on a continuous number line
- Positioned exactly halfway between adjacent scores
Example:
Two people who both claim to weigh 150 pounds are
probably not exactly the same weight. However, they are both
around 150 pounds. One person may actually weigh 149.6 and the
other 150.3.
For scores measured in whole numbers, simply subtract 0.5 from the
lowest score (lower real limit) and add 0.5 to the highest score
(upper real limit). For a score of 150, the real limits are 149.5
and 150.5.
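A minimal sketch of the real-limits rule, assuming scores are measured to the nearest whole unit. The function name `real_limits` and the `unit` parameter are made up for illustration.

```python
def real_limits(score, unit=1.0):
    """Return (lower, upper) real limits for a score measured to the nearest `unit`."""
    half = unit / 2
    return score - half, score + half

lower, upper = real_limits(150)
print(lower, upper)  # 149.5 150.5

# Both weights from the example fall inside the interval labeled "150".
for actual_weight in (149.6, 150.3):
    print(actual_weight, lower <= actual_weight < upper)
```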
Nominal Scale
- Set of categories that have different names
- Used to categorize
Ordinal Scale
- Set of categories organized in an ordered sequence
- Used to rank
Example:
- Socioeconomic status
Categories: upper, middle, lower
- T-shirt sizes
Categories: small, medium, large
Interval Scale
- Ordered categories that are all intervals of exactly the
same size (no absolute zero)
- Data can be categorized, ranked, and evenly spaced
Ratio Scale
- An interval scale with the additional feature of an
absolute zero point
V. Statistical Notations
X, Y - variables
X - raw score/original value
N - number of scores in a population
n - number of scores in a sample (sample size)
Σ - summation
Σx - summation of x
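A quick sketch of the notation in code. The scores are hypothetical, made up only to show what Σx means. The last two lines are an extra reminder (not from the notes above) that Σx² and (Σx)² are different quantities.

```python
scores = [3, 1, 7, 4]              # hypothetical raw scores (X values)

n = len(scores)                     # n: number of scores in the sample
sum_x = sum(scores)                 # Σx: add up all the scores
sum_x_squared = sum(x ** 2 for x in scores)  # Σx²: square each score first, then add
squared_sum_x = sum_x ** 2          # (Σx)²: add first, then square the total

print(n, sum_x, sum_x_squared, squared_sum_x)  # 4 15 75 225
```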
Frequency distribution
- An organized tabulation of the number of individuals
located in each category on the scale of measurement.
- Can be structured as a table or as a graph but should
present the same two elements:
1. Set of categories
2. Record of frequency/number of individuals in
each category
Types:
● Cumulative Frequency
- Sum of the frequencies less than or equal to
each value or class interval of a variable
See answers in p. 66
Additional:
Example
Scores:
82, 75, 88, 93, 53, 84, 87, 58, 72, 94, 69, 84, 61,
91, 64, 87, 84, 70, 76, 89, 75, 80, 73, 78, 60
Listing every score from 53 to 94 would take 42 rows, which is still
too many. Hence, let's use the systematic trial-and-error approach
(guidelines a and b) with the usual interval widths of 2, 5, and 10.
Notice that an interval width of 5 will result in about 10 intervals,
which is exactly what we want.
X-axis
- The horizontal line
- The abscissa
- Set of x values
Y-axis
- Vertical line
- The ordinate
- Frequencies
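The grouping of the 25 example scores can be sketched in code. This assumes the common convention that each interval starts at a multiple of the interval width. The variable names are only for illustration.

```python
scores = [82, 75, 88, 93, 53, 84, 87, 58, 72, 94, 69, 84, 61,
          91, 64, 87, 84, 70, 76, 89, 75, 80, 73, 78, 60]
width = 5

# One row per possible score would need this many rows.
rows_ungrouped = max(scores) - min(scores) + 1

# Assumed convention: the bottom interval starts at a multiple of the width.
lowest_limit = (min(scores) // width) * width
lows = range(lowest_limit, max(scores) + 1, width)

table = []
for low in reversed(lows):          # highest interval listed first
    high = low + width - 1
    f = sum(1 for x in scores if low <= x <= high)
    table.append((low, high, f))

print("ungrouped rows:", rows_ungrouped)
for low, high, f in table:
    print(f"{low}-{high}: {f}")
```

With these scores the table comes out to 9 intervals (90-94 down to 50-54), which is close to the target of about 10.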
Graphs for Interval or Ratio Data
Histograms
A. The height of the bar corresponds to the frequency for
that category.
B. For continuous variables, the width of the bar extends to
the real limits of the category. For discrete variables,
each bar extends exactly half the distance to the adjacent
category on each side.
Polygons
- List the numerical values along the x-axis
- A dot is centered above each score so that the vertical
position of the dot corresponds to the frequency for the
category.
Bar graphs
- A bar graph is essentially the same as a histogram,
except that spaces are left between adjacent bars.
IV. The Shape of a Frequency Distribution
3 Characteristics of distribution:
________
Symmetrical
- one side of the distribution is a mirror image of the other
Skewed
- the scores tend to pile up toward one end of the scale
and taper off gradually at the other end
Tail
- the section where the scores taper off toward one end of
a distribution
Positively skewed
- the tail points toward the positive end of the x-axis
(right)
Negatively skewed
- The tail points to the left
See p. 76 for answers
Percentile Rank
- The percentage of individuals in the distribution with
scores equal to or less than the particular value.
- Describes the position of individual scores within a
distribution
Cumulative percentage formula: c% = (cf / N) × 100%
Where:
cf = cumulative frequencies
N = total number of scores or the totality of the frequencies
Percentile
- A score that is identified by its percentile rank
Cumulative Frequency
- Shows the number of individuals located at or below each
score
- Obtained by adding up the frequencies in and below that
category
Example
Practice exercise
Bottom = 2
Second to the last: 4+2 = 6
Third: 2+4+8 = 14
Second: 2+4+8+5 = 19
First: 2+4+8+5+1 = 20
Cumulative percentage
- Convert frequencies into percentages
- Shows the percentage of individuals who are
accumulated as you move up the scale
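The practice exercise can be reproduced with a running total. This is a sketch that takes the frequencies from the bottom category upward (2, 4, 8, 5, 1) and assumes the usual conversion c% = (cf / N) × 100.

```python
from itertools import accumulate

freqs_bottom_up = [2, 4, 8, 5, 1]      # frequencies, bottom category first

N = sum(freqs_bottom_up)               # total number of scores
cf = list(accumulate(freqs_bottom_up)) # running total from the bottom up
c_pct = [100 * c / N for c in cf]      # cumulative percentage for each category

print(N)      # 20
print(cf)     # [2, 6, 14, 19, 20]
print(c_pct)  # [10.0, 30.0, 70.0, 95.0, 100.0]
```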
CHAPTER 3 - CENTRAL TENDENCY
I. Overview
Central tendency
- A statistical measure to determine a single score that
defines the center of a distribution
- Attempts to identify the “average” or “typical” individual
II. Mean
Mean
- Arithmetic average
- The sum of the scores divided by the number of scores
- Represented by x̄ or M (for a sample) and μ (for a
population)
- Balance point of the distribution
Example:
The mean is
M = $5 per boy
Find: the total money for the whole group (Σx)
In this case, we need to derive Σx from the mean formula (M = Σx/n,
so Σx = nM), or just use logic (but the formula is graded so…)
Notice that the mean balances the distances. That is, the total
distance below the mean is the same as the total distance above
the mean:
Weighted mean
To find the overall mean for two combined groups, determine:
- The overall sum of the scores for the combined group
(Σx), and
- The total number of scores in the combined group
Then divide the overall sum by the total number of scores.
Example:
First sample: there are 12 people and each person receives $6
n = 12
M = 6
Second sample: there are 8 people and each person receives $7
n = 8
M = 7
If the two samples are combined, what is the mean for the total
group?
Computing a Mean from a Frequency Distribution Table
Σx = 10+18+32+6
Σx = 66
Characteristics of the Mean:
Changing a score
- changing the value of any score changes the mean.
Example:
Quiz scores: 9, 8, 7, 5, and 1
n = 5
Σx = 30
Changing the score of 1 to 8 gives 9, 8, 7, 5, and 8
n = 5
Σx = 37
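A short sketch to check the weighted mean and the changing-a-score examples, using n = 12, M = 6 and n = 8, M = 7 as listed. The helper name `weighted_mean` is made up for illustration.

```python
def weighted_mean(groups):
    """groups: list of (n, M) pairs, one pair per sample."""
    overall_sum = sum(n * m for n, m in groups)  # Σx for each sample is n * M
    overall_n = sum(n for n, _ in groups)
    return overall_sum / overall_n

combined = weighted_mean([(12, 6), (8, 7)])
print(combined)  # (72 + 56) / 20 = 6.4

# Changing a score: replacing the 1 with an 8 changes Σx and therefore the mean.
original = [9, 8, 7, 5, 1]
changed = [9, 8, 7, 5, 8]
print(sum(original) / len(original))  # 30 / 5 = 6.0
print(sum(changed) / len(changed))    # 37 / 5 = 7.4
```

Note that the combined mean (6.4) is closer to 6 than to 7 because the first sample is larger.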
Introducing a new score or removing a score
- Adding a new score to a distribution, or removing an
existing score, usually changes the mean.
Example:
Original Sample
n=5
M=7
Σx = 35
See p.106 for answers
What happens to the mean if a new score of X = 13 is
added to the sample?
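One way to work this out is to recover Σx from n and M first, then update both. A minimal sketch:

```python
n, M = 5, 7
sum_x = n * M                 # Σx = nM = 35

new_score = 13
new_sum = sum_x + new_score   # 35 + 13 = 48
new_n = n + 1                 # 6 scores after adding one

new_mean = new_sum / new_n
print(new_mean)  # 8.0
```

The mean goes up because the new score (13) is above the original mean (7).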
III. Median
Median
- No specific symbols or notation to identify the median
- Point on the measurement scale below which 50% of the
scores in the distribution are located
IV. Mode
Mode
- The score or category that has the greatest frequency
- The term mode means the “customary fashion” or a
“popular style”
- Only measure of central tendency that corresponds to an
actual score in the data
- It is possible for a distribution to have more than one
mode
- Easily located by finding the peak in a frequency
distribution graph or table
Example of mode.
Mean = 20.3
Median = 11.50
Notice that the mean is not very representative of any score in this
distribution. Although most of the scores are clustered between 10
and 13, the extreme score of X = 100 inflates the value of ΣX and
distorts the mean.
See p. 113 for answers
V. Selecting a Measure of Central Tendency
Undetermined Values
- Unknown or the undetermined score
Example:
Open-ended distributions
- No upper limit (or lower limit) for one of the categories
Ordinal Scale
- Direction, not distance
- Ordinal measurements allow you to determine direction
(greater than or less than) but do not allow you to
determine distance
- Median: direction
- Mean: distance
Nominal Scale
- Differentiated only by names
- Do not measure quantity (distance/direction)
Discrete Variables
- Exist only in whole, indivisible categories
Describing Shape
- Because the mode requires little or no calculation, it is
often included as a supplementary measure along with
the mean or median as a no-cost extra.
- The value of the mode (or modes) in this situation is that
it gives an indication of the shape of the distribution as
well as a measure of central tendency.
Symmetrical distribution
● Median - middle
● Mean - equal to the median
● Mode - if there is only one mode, then it has the same
value too
CHAPTER 4 - MEASURES OF VARIABILITY
Variability
- Provides a quantitative measure of differences between
scores in a distribution
- Describes the degree to which the scores are spread out
or clustered together
Two purposes:
Range
● Formula: Xmax - Xmin
- Distance covered by the scores in a distribution
- Difference between the upper real limit (URL) for the
largest score (Xmax) and the lower real limit (LRL) for
the smallest score (Xmin)
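The two ways of stating the range can be compared side by side. This sketch uses hypothetical whole-number scores, so the real limits are taken as 0.5 above and below.

```python
scores = [3, 7, 12, 8, 5]     # hypothetical whole-number scores

x_max, x_min = max(scores), min(scores)

# Simple formula: Xmax - Xmin
simple_range = x_max - x_min

# Real-limits version: URL of Xmax minus LRL of Xmin (whole numbers assumed)
real_limit_range = (x_max + 0.5) - (x_min - 0.5)

print(simple_range)      # 9
print(real_limit_range)  # 10.0
```

For whole-number scores the real-limits version is always one point larger than Xmax - Xmin.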
Quartiles
- Measures of central tendency that divide a group of data
into four subgroups.
Q1 = 25th percentile
Q2 = 50th percentile
Q3 = 75th percentile
Interquartile Range
- The distance between the first and third quartiles (Q3 - Q1)
- Describes the variability, commonly transformed into the
semi-interquartile range.
Semi-Interquartile Range
- Half of the interquartile range: (Q3 - Q1)/2
- Gives a better and more stable measure of variability
than the range.
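A rough sketch of quartiles and the semi-interquartile range, using hypothetical scores and Python's `statistics.quantiles`. Textbooks and software use slightly different rules for locating quartiles, so the `method="inclusive"` choice here is an assumption and other conventions may give slightly different values.

```python
import statistics

scores = [2, 4, 6, 8, 10, 12, 14]   # hypothetical scores

# n=4 splits the data into four groups and returns the three cut points.
q1, q2, q3 = statistics.quantiles(scores, n=4, method="inclusive")

iqr = q3 - q1          # interquartile range
semi_iqr = iqr / 2     # semi-interquartile range

print(q1, q2, q3)      # 5.0 8.0 11.0
print(iqr, semi_iqr)   # 6.0 3.0
```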
Standard Deviation
- Most commonly used and the most important measure of
variability
- It provides a measure of the standard distance from the
mean
- The square root of the variance
Population Variance
- Equals the mean squared deviation
- The average squared distance from the mean
1. Find the deviation (distance from the mean) for each
score (X - μ)
2. Square each deviation (X - μ)²
3. Add the squared deviations to get the sum of squared
deviations (SS): SS = Σ(X - μ)²
4. Divide the SS by the number of scores in the population (N)
to get the variance: Σ(X - μ)²/N
5. Take the square root of the variance to get the standard
deviation or the standard distance from the mean.
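The steps above can be followed one at a time in code. The population of scores here is hypothetical, chosen so the numbers come out clean.

```python
import math

population = [1, 9, 5, 8, 7]         # hypothetical population of N = 5 scores
N = len(population)
mu = sum(population) / N             # population mean = 6.0

# Step 1: deviation of each score from the mean
deviations = [x - mu for x in population]
# Step 2: square each deviation
squared = [d ** 2 for d in deviations]
# Step 3: sum of squared deviations (SS)
ss = sum(squared)
# Step 4: variance = SS / N
variance = ss / N
# Step 5: standard deviation = square root of the variance
sd = math.sqrt(variance)

print(deviations)        # [-5.0, 3.0, -1.0, 2.0, 1.0]
print(ss, variance)      # 40.0 8.0
print(round(sd, 2))      # 2.83
```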
Example:
Population Variance: σ² = SS/N
Sample Variance: s² = SS/(n - 1)
Definitional formula: SS = Σ(X - μ)²
Computational formula: SS = ΣX² - (ΣX)²/N
● Adding a constant value to every score in a distribution
does not change the standard deviation.
● However, multiplying every score by a constant causes
the standard deviation to be multiplied by the same
constant.
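A small check of these properties, using hypothetical scores. The computational formula used here, SS = ΣX² - (ΣX)²/N, is the standard form and is an assumption about what the notes intended. The helper `pop_sd` is made up for illustration.

```python
import math

def pop_sd(xs):
    """Population standard deviation: square root of SS / N."""
    mu = sum(xs) / len(xs)
    ss = sum((x - mu) ** 2 for x in xs)
    return math.sqrt(ss / len(xs))

scores = [1, 9, 5, 8, 7]            # hypothetical scores
N = len(scores)
mu = sum(scores) / N

# Definitional vs computational formula for SS
ss_definitional = sum((x - mu) ** 2 for x in scores)
ss_computational = sum(x ** 2 for x in scores) - sum(scores) ** 2 / N
print(ss_definitional, ss_computational)   # 40.0 40.0

# Adding a constant shifts every score but leaves the spread alone.
shifted = [x + 10 for x in scores]
# Multiplying by a constant stretches the spread by that constant.
scaled = [x * 3 for x in scores]

print(math.isclose(pop_sd(shifted), pop_sd(scores)))      # True
print(math.isclose(pop_sd(scaled), 3 * pop_sd(scores)))   # True
```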
CHAPTER 5 - Z-SCORES: LOCATION OF SCORES AND STANDARDIZED DISTRIBUTIONS
See answers in p. 142
Z-score formula: z = (X - μ)/σ
Example 1:
A distribution of scores has a mean of 100 and a standard
deviation of 10. What z-score corresponds to a score of X = 130 in
this distribution?
Example 2:
A distribution of scores has a mean of 86 and a standard
deviation of 7. What z-score corresponds to a score of X = 95 in
this distribution?
DETERMINING A RAW SCORE (X) FROM A z-SCORE
Formula: X = μ + zσ
Example:
Example:
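Both directions can be sketched in a few lines, assuming the standard formulas z = (X - μ)/σ and X = μ + zσ. The function names are made up for illustration.

```python
def z_score(x, mu, sigma):
    """Location of a raw score X relative to the mean, in standard deviation units."""
    return (x - mu) / sigma

def raw_score(z, mu, sigma):
    """Convert a z-score back into a raw score X."""
    return mu + z * sigma

z1 = z_score(130, mu=100, sigma=10)   # Example 1
z2 = z_score(95, mu=86, sigma=7)      # Example 2

print(z1)             # 3.0
print(round(z2, 2))   # 1.29

# Going back from z to X recovers the original score.
print(raw_score(z1, mu=100, sigma=10))  # 130.0
```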
1. Shape
- The distribution of z-scores will have exactly
the same shape as the original distribution of
scores.
2. The mean
- The z-score distribution will always have a
mean of zero. In Figure 5.5, the original
distribution of X values has a mean of 100.
When this value, X = 100, is transformed into a
z-score, the result is z = (X - μ)/σ = (100 - 100)/σ = 0.
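Both properties can be seen by transforming a whole set of scores. This sketch uses hypothetical scores and treats them as a complete population.

```python
import math

scores = [1, 9, 5, 8, 7]    # hypothetical population of scores
N = len(scores)
mu = sum(scores) / N
sigma = math.sqrt(sum((x - mu) ** 2 for x in scores) / N)

# Transform every score into a z-score.
z_scores = [(x - mu) / sigma for x in scores]
mean_z = sum(z_scores) / N

# The mean of the z-scores is zero (up to floating-point rounding).
print(math.isclose(mean_z, 0.0, abs_tol=1e-12))

# Shape: each score keeps its relative position, so the rank order is unchanged.
order_x = sorted(range(N), key=lambda i: scores[i])
order_z = sorted(range(N), key=lambda i: z_scores[i])
print(order_x == order_z)
```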
Learning Check