Introduction to Statistics

Danny M. Vaughn, Ph.D., CMS

Introduction

The nature of statistical applications is introduced throughout the two Spectral
courses. While a detailed treatment is beyond the courses' principal thesis,
these quantitative measures need to be introduced as a means of understanding
the analysis of multispectral data. This is a short discussion; many of the
details, including more advanced formulas and spatial statistical applications,
are available upon request for those interested in going deeper into the
quantitative aspects of digital image processing, Geographic Information
Systems, spatial modeling, and analysis.

Statistics

Statistics deals with the collection, classification, description, presentation,
and analysis (interpretation) of data (numerical information). It is based upon
observations or measurements. Statistics is inductive: specific observations and
measurements yield a more general conclusion. Statistics also relies upon some
notion of repetition; from repeated observations, estimates can be derived and
the variation and uncertainty of an estimate understood. Statistics are often
used to describe and summarize data. They can also help to generalize complex
spatial patterns such as clustered, uniform, regular, and random spatial
patterns. Probabilities or estimates of outcomes can also be determined for an
event at a given location within specified limits. Statistics is sometimes
referred to as the study of variation in data sets.

Statistics allow us to make inferences (conclusions most often based upon known
and accepted facts) about a population of numerical data from a sample.
Populations are groups or aggregates of data; a sample is a portion of the total
population set of data. An estimate (statistic) is a property of a sample drawn
at random (by chance) from a population. Estimates are expressed by Roman
letters: a sample standard deviation is symbolized by $s$, the mean by
$\bar{X}$, and the variance by $s^2$. These measures are discussed in more
detail in later sections.

Selected Vocabulary

• Data – Numerical information.

• Data set – Groups of data in tabular format. A data set may consist of
observations, variables, and variates.

• Observation – An element of a phenomenon (e.g. individuals, actions,
processes).

• Variable – A property that can be measured, classified, or counted, e.g.
spectral values, male/female, discharge, velocity, etc.

• Variate – A particular value of a variable, e.g. a digital number in
multispectral imagery, discharge in m³ s⁻¹ (cubic meters per second), velocity
in m s⁻¹ (meters per second).

• Descriptive Statistics – A concise numerical or quantitative summary reported
for a variable or data set. Descriptive statistics include:

  ➢ Measures of central tendency.

  ➢ Measures of dispersion and variability.

  ➢ Measures of shape or relative position.

  ➢ Spatial data.

  ➢ Location issues.

• Inferential Statistics – A reported result (generalization) derived from a
sample of a larger population. Inferential statistics are based upon probability
theory.

• Population Statistics – Groups or aggregates of data.

• Sampling Statistics – A portion of the total set of population data.

• Parameter – A property descriptive of a population. Population parameters are
expressed by Greek letters: $\sigma$ is a population standard deviation, and
$\sigma^2$ is a population variance.

• Estimate (statistic) – A property of a sample drawn at random from a
population. Estimates are expressed by Roman letters: standard deviation is $s$,
mean is $\bar{X}$, and variance is $s^2$. The reliability of an estimate is
expressed through confidence intervals.

• Function – A relationship between two variables such that the values of one
are dependent upon the values of the other. If the functional relationship is
not known, causal conclusions cannot be inferred.

• Hypothesis testing – Examples include Z tests and t tests.

Characteristics of Geographic Data

• Primary Data – Data acquired directly from an original source, e.g. from
a field observation or measurement.

• Secondary Data – Pre-existing data from an agency or other source.

Variables of the Data Set

• Continuous Variable – A variable that can take any value within a specifically
identified range of values. Values which belong to a continuous series include
height, weight, chronological time, discharge, velocity, etc.

• Discontinuous (discrete) Variable – A variable that takes only specific values
(counted and limited to whole numbers). An eight-bit spectral digital number
ranges from 0 to 255. The size of a family (3) implies the exact size of the
group. Other examples include school enrollment and the number of books in a
library.

Levels of Measurement

• Nominal Variable – A qualitative property of equality or difference in
established categories. Categories must be exhaustive and mutually exclusive.

• Ordinal Variable – A property of equality or difference and rank order within
the data.

• Interval Variable – A property of equality or difference, order, and no true 0
(starting) point within the data (e.g. temperature in °F or °C).

• Ratio Variable – A property of equality or difference, order, and a true 0
(starting) point within the data. Interval data may be transformed to ratio data
by taking the differences between variates, which cancels out the arbitrary
origin.
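The transformation from interval to ratio data described above can be illustrated with temperature. The following is a minimal sketch; the readings and the helper name `c_to_f` are hypothetical and chosen only for illustration. It shows that a ratio of raw temperatures changes with the scale (the origin is arbitrary), while a ratio of temperature differences does not.

```python
def c_to_f(celsius: float) -> float:
    """Convert degrees Celsius to degrees Fahrenheit."""
    return celsius * 9.0 / 5.0 + 32.0

temps_c = [10.0, 20.0, 40.0]              # hypothetical readings in deg C
temps_f = [c_to_f(t) for t in temps_c]    # the same readings in deg F

# Ratios of raw interval values depend on the arbitrary origin.
raw_ratio_c = temps_c[2] / temps_c[1]
raw_ratio_f = temps_f[2] / temps_f[1]

# Ratios of differences do not: subtracting cancels the origin.
diff_ratio_c = (temps_c[2] - temps_c[1]) / (temps_c[1] - temps_c[0])
diff_ratio_f = (temps_f[2] - temps_f[1]) / (temps_f[1] - temps_f[0])

print(raw_ratio_c, raw_ratio_f)      # differ between the two scales
print(diff_ratio_c, diff_ratio_f)    # agree between the two scales
```

Saying that 40 °C is "twice as hot" as 20 °C therefore has no meaning, whereas saying that one temperature change is twice another holds in either scale.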

Measurement Concepts

• Precision – The degree of exactness, or a measure of repeatability: how
closely repeated positions are clustered. Precision is based on a relative
reference, e.g. a circle 1 inch in diameter.

• Accuracy – The closeness of a position to a known absolute reference system.

• Validity – Credibility based upon operational definitions of acceptance. A
subjective parameter, e.g. level of poverty, quality of …, etc.

• Reliability – How consistent, repeatable, or stable the data are as spatial
patterns change over time.

Basic Statistical Properties

Constant – A property common to all members of a group.

• Property 1 – Summing the products of a constant ($c$) and each score ($X_i$)
is equal to adding all the scores and multiplying the sum by the constant,
$c\sum X_i$:

$$\sum_{i=1}^{n} cX_i = cX_1 + cX_2 + cX_3 + \dots + cX_n = c\sum_{i=1}^{n} X_i$$

• Property 2 – Summing a constant ($C$) over $N$ terms equals $NC$. If a given
constant ($C$) equals 4, and there are 5 terms ($N = 5$), then:

$$\sum_{i=1}^{N} C = C + C + C + C + C = NC$$

$4 + 4 + 4 + 4 + 4 = 20$, and $N \times C = 5 \times 4 = 20$.

• Property 3 – The summation ($\sum$) of the sum of any number of terms is the
sum of the summations of these terms taken separately:

$$\sum (X_i + Y_i + Z_i) = (X_1 + Y_1 + Z_1) + (X_2 + Y_2 + Z_2) + (X_3 + Y_3 + Z_3) + \dots = \sum X_i + \sum Y_i + \sum Z_i$$

• Property 4 – The sum of the products of two sets of paired numbers is:

$$\sum X_i Y_i = X_1Y_1 + X_2Y_2 + \dots + X_nY_n$$

• Property 5 – Given a set of values ($X_n$), the sum of the squared values is
$\sum X_n^2$, where:

    $X_n$      $X_n^2$
      3           9
      2           4
      5          25
      6          36
      4          16
    ______     ______
    $\sum X_n = 20$    $\sum X_n^2 = 90$

• Property 6 – Given a set of values ($X_n$), the square of the sum of the
values is $(\sum X_n)^2$. For the values above, $\sum X_n = 20$, so:

$$\left(\sum X_n\right)^2 = (20)^2 = 400$$
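The summation properties above can be checked numerically. This is a minimal sketch using the five values from the Property 5 table; the paired list `y` is hypothetical and is included only so that Properties 3 and 4 have a second series to work with.

```python
x = [3, 2, 5, 6, 4]      # values from the Property 5 table
y = [1, 0, 2, 1, 3]      # hypothetical paired values, for illustration only
c = 4                    # the constant used in Property 2
n = len(x)

sum_x = sum(x)                                  # sum of the values
sum_of_squares = sum(v ** 2 for v in x)         # Property 5: sum of squared values
square_of_sum = sum_x ** 2                      # Property 6: square of the sum

prop1 = sum(c * v for v in x) == c * sum_x                    # Property 1
prop2 = sum(c for _ in range(n)) == n * c                     # Property 2
prop3 = sum(a + b for a, b in zip(x, y)) == sum_x + sum(y)    # Property 3
sum_of_products = sum(a * b for a, b in zip(x, y))            # Property 4

print(sum_x, sum_of_squares, square_of_sum)   # 20 90 400
print(prop1, prop2, prop3)                    # True True True
```

The sum of squares (90) and the square of the sum (400) are different quantities, a distinction that matters in the variance formula.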

The Normal Distribution of Scores

Frequency curves are conceptualized as extending along the x axis from minus
infinity to plus infinity, although in practice they taper off to barely above
the x axis. The curve never intersects the x axis, yet the total area under it
is finite, and the curve is scaled so that the total area equals unity (1). The
Normal Curve (Figure 1) is written in standard score (Z score) form with a
mean equal to 0, variance equal to 1, and standard deviation equal to 1.

Figure 1. A normal distribution illustrating the area percentages for plus or minus
three standard deviations.

Standard Scores (Z scores) are derived as a transformation from raw scores
(variates) to standard deviation units. They are used to compare a score
(variate) with a collection of scores (variates), including scores derived from
different procedures (e.g. an English vs. a Mathematics test). Relative position
is considered rather than the magnitude and measurement units of the scores. The
formula for computing a Z score from a raw score is:

$$Z = \frac{X_i - \bar{X}}{s}$$

where $X_i$ is a variate, $\bar{X}$ is the mean, and $s$ is the standard
deviation. Measures of central tendency and measures of dispersion are
presented later in this set of statistics notes.

Properties of a Z score:

• If a raw score is greater than $\bar{X}$, the Z score is positive.

• If a raw score is less than $\bar{X}$, the Z score is negative.

• If a raw score equals $\bar{X}$, the Z score is 0.

Standard scores (Z scores) have a mean of 0 and a standard deviation of 1; thus
they are readily amenable to algebraic manipulation. After computing a standard
score, locate the Z value in a table of Z scores (refer to any statistics
textbook). This gives the area between the mean and the Z score. To obtain the
percentile rank, express that area as a percentage: if the raw score is greater
than the mean, add it to 50; if the raw score is lower than the mean, subtract
it from 50.
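The table lookup described above can be sketched in code. This example assumes the scores are normally distributed and replaces the printed Z table with the error function from Python's standard library; the function names and the inputs (raw score 69, mean 62, standard deviation 7) are illustrative assumptions.

```python
import math

def z_score(x: float, mean: float, sd: float) -> float:
    """Standard score: distance from the mean in standard deviation units."""
    return (x - mean) / sd

def area_mean_to_z(z: float) -> float:
    """Area under the standard normal curve between the mean (0) and z."""
    return 0.5 * math.erf(abs(z) / math.sqrt(2.0))

def percentile_rank(z: float) -> float:
    """Add the area (as a percentage) to 50 above the mean, subtract it below."""
    area_pct = 100.0 * area_mean_to_z(z)
    return 50.0 + area_pct if z >= 0 else 50.0 - area_pct

z = z_score(69.0, 62.0, 7.0)            # illustrative raw score, mean, and s
print(z)                                # 1.0
print(round(area_mean_to_z(z), 4))      # 0.3413
print(round(percentile_rank(z), 2))     # 84.13
```

A raw score one standard deviation above the mean therefore sits at roughly the 84th percentile, and one standard deviation below the mean at roughly the 16th.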

Frequency Distribution

A scatterplot is a graphic display of the joint distribution of two variables as
points, e.g. brightness values (BVs, or Digital Numbers/DNs) for two spectral
bands. A frequency distribution shows the number of times each value occurs, and
arranges scores from lowest to highest.

Given a frequency distribution of values (for example, brightness values of 56,
57, 67, 99, and 120), a histogram plots frequency (counts) on the vertical axis
and the variable (brightness values in this case) on the horizontal axis.

The ogive, or cumulative frequency (Cf) plot, is a running count of frequencies
for each BV at or below a given level. The cumulative percentage frequency
(Cf%) is the ratio of the cumulative frequency to the total number of BVs (200
in this example), so that Cf% = Cf / 200. It is tabulated below as a proportion.

BV      Frequency     Cf      Cf%

56           5          5    0.025
57          34         39    0.195
67         100        139    0.695
99          45        184    0.920
120         16        200    1.000
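The cumulative columns of the table can be reproduced from the frequency column alone. This is a minimal sketch using the brightness values and frequencies given in the table; the variable names are illustrative.

```python
from itertools import accumulate

bvs = [56, 57, 67, 99, 120]       # brightness values from the table
freqs = [5, 34, 100, 45, 16]      # frequency of each brightness value

total = sum(freqs)                            # total number of BVs
cum_freq = list(accumulate(freqs))            # running total (Cf)
cum_prop = [cf / total for cf in cum_freq]    # Cf divided by the total

for bv, f, cf, cp in zip(bvs, freqs, cum_freq, cum_prop):
    print(f"{bv:>4} {f:>6} {cf:>6} {cp:>8.3f}")
```

Plotting `cum_freq` against `bvs` gives the ogive; the last cumulative frequency always equals the total count.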

A percentile rank is the percentile corresponding to a raw score. For example, a
score at the 90th percentile (a percentile rank of 90) is interpreted as
follows: 90 percent of the scores are at or below this value, while 10 percent
of the scores are above it. To obtain the percentile rank for a given score,
e.g. 67:

1. Calculate the lower true limit of the score (67) by subtracting 0.5 unit from the
score (66.5).

2. Subtract the lower limit (66.5) from the score whose percentile rank is being
estimated (67).

3. Multiply the result by the frequency of scores with a value of 67 (100).

4. Divide the result by the width of the class interval (1 in this case).

5. Add the result to the cumulative frequency of the scores below 67 (39).

6. Divide the result by the total number of frequencies (200).

(((67 - 66.5)(100) / 1) + 39) / 200 = 0.445; 0.445 x 100 = 44.5%.
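The six steps above can be sketched as a small function. This is a minimal sketch that assumes the conventional grouped-data form of the calculation, in which the cumulative frequency added in step 5 counts only the scores below the score of interest (39 for a score of 67). The function and argument names are hypothetical.

```python
def percentile_rank(score, values, freqs, width=1.0):
    """Percentile rank of `score` from a frequency table of distinct values."""
    total = sum(freqs)
    # Cumulative frequency of all scores below the score of interest.
    cum_below = sum(f for v, f in zip(values, freqs) if v < score)
    # Frequency of the score itself.
    freq_at = sum(f for v, f in zip(values, freqs) if v == score)
    lower_limit = score - width / 2.0                    # step 1
    within = (score - lower_limit) / width * freq_at     # steps 2 to 4
    return 100.0 * (cum_below + within) / total          # steps 5 and 6

bvs = [56, 57, 67, 99, 120]
freqs = [5, 34, 100, 45, 16]
print(percentile_rank(67, bvs, freqs))   # 44.5
```

The result lies between the cumulative percentage below 67 (19.5%) and the cumulative percentage at or below 67 (69.5%), as a percentile rank for that score must.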

In a given distribution of brightness values (frequency distribution), it is
important to recognize a variety of properties of the distribution.

Measures of Central Tendency

Central tendency describes the central value of a frequency distribution, around
which the scores are distributed. The four properties discussed here are the
mode, median, arithmetic mean, and deviation from the mean.

Arithmetic Mean ($\bar{X}$) – The arithmetic average: the sum of the values
$X_i$ ($\sum X_i$) divided by the number of measurements ($N$). When the data
are given as a frequency table, each distinct value $X_i$ is multiplied by the
frequency of its occurrence ($f_i$) before summing:

$$\bar{X} = \frac{\sum X_i}{N}$$

also

$$\bar{X} = \frac{f_1X_1 + f_2X_2 + \dots + f_nX_n}{N} = \frac{\sum f_iX_i}{N}$$

where $\sum X_i$ taken over all $N$ scores equals $\sum f_iX_i$ taken over the
distinct values.

Deviation from the mean ($x_i$) – The difference between a particular score
($X_i$) and the mean ($\bar{X}$):

$$x_i = X_i - \bar{X}$$

• Property 1 of the mean – The sum of the deviations of all the measurements in a
set from their arithmetic mean equals 0:

$$\sum (X_i - \bar{X}) = \sum X_i - \sum \bar{X} = N\bar{X} - N\bar{X} = 0$$

since $\bar{X} = \sum X_i / N$, and therefore $\sum X_i = N\bar{X}$.

• Property 2 of the mean – The sum of squared deviations from the arithmetic
mean, $\sum (X_i - \bar{X})^2$ or $\sum x_i^2$, is less than the sum of squared
deviations from any other value.

• Property 3 of the mean – The mean is that measure of central tendency about
which the sum of squares is a minimum. It follows that the mean is a measure of
central location in the least-squares sense.
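The frequency-weighted mean and the first two properties of the mean can be checked with the brightness-value table used earlier. This is a minimal sketch; the variable and function names are illustrative, and the comparison against `mean + 1.0` is just one arbitrary alternative center.

```python
bvs = [56, 57, 67, 99, 120]       # distinct brightness values
freqs = [5, 34, 100, 45, 16]      # frequency of each value

n = sum(freqs)                                        # N = 200
mean = sum(f * x for x, f in zip(bvs, freqs)) / n     # sum(f_i * X_i) / N

# Expand the table into individual scores to check the properties directly.
scores = [x for x, f in zip(bvs, freqs) for _ in range(f)]

def sum_sq_dev(center: float) -> float:
    """Sum of squared deviations of all scores about `center`."""
    return sum((x - center) ** 2 for x in scores)

dev_sum = sum(x - mean for x in scores)   # Property 1: zero up to rounding error

print(mean)                                         # 76.465
print(abs(dev_sum) < 1e-6)                          # True
print(sum_sq_dev(mean) < sum_sq_dev(mean + 1.0))    # True (Property 2)
```

Any center other than the mean gives a larger sum of squares, which is what "least-squares sense" refers to.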

Median – The point on the number scale such that half of the observations fall
above it and half below it.

Mode – The most frequently occurring value. If the frequency of occurrence is
equal for every value, there is no mode. Where two values share the highest
frequency, the mode is determined by adding the two brightness values and
dividing by the number of tied values (2). The mode represents the highest point
on a curve (histogram). In a normal distribution (symmetric bell curve,
Figure 2), the mean, mode, and median have the same value.
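The median and mode can be computed with Python's standard `statistics` module. This is a minimal sketch; `mode_per_notes` is a hypothetical helper that applies the tie rule stated in these notes (two tied values are averaged), and the score lists are made up for illustration.

```python
import statistics

def mode_per_notes(values):
    """Mode following the convention in these notes."""
    modes = statistics.multimode(values)      # all values sharing the top frequency
    if len(modes) > 1 and len(modes) == len(set(values)):
        return None                           # every value equally frequent: no mode
    if len(modes) == 2:
        return sum(modes) / 2                 # two tied values: average them
    # A single mode is returned as is; three or more ties are returned as a list.
    return modes[0] if len(modes) == 1 else modes

scores = [3, 2, 5, 6, 4, 5]                   # hypothetical scores
print(statistics.median(scores))              # 4.5
print(mode_per_notes(scores))                 # 5
print(mode_per_notes([56, 57, 57, 67, 67, 99]))   # 62.0
print(mode_per_notes([1, 2, 3]))              # None
```

`statistics.multimode` requires Python 3.8 or later.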

Figure 2. A normal curve illustrating the relationship between the mean, median,
and mode.

When a distribution is not symmetric, it is skewed. When the tail extends to the
right, the distribution is positively skewed (toward the high end of the
distribution). When the tail extends to the left, it is negatively skewed
(toward the low end of the distribution).

In a positively skewed distribution the mean, median, and mode are distributed as
illustrated in Figure 3 (typically mode < median < mean).

Figure 3. A positively skewed distribution illustrating the relationship between the
mean, median, and mode.

In a negatively skewed distribution the mean, median, and mode are distributed as
illustrated in Figure 4 (typically mean < median < mode).

Figure 4. A negatively skewed distribution illustrating the relationship between
the mean, median, and mode.

Measures of Dispersion and Variation

The sample (unbiased) variance ($s^2$) is the average of the squared deviations
around the mean, computed with $N - 1$ rather than $N$ in the denominator:

$$s^2 = \frac{\sum (X_i - \bar{X})^2}{N - 1}$$

The divisor of the sample variance is the degrees of freedom (df), defined as
the total number of observations ($N$) minus the number of constraints placed on
the data, i.e. the number of values free to vary. For example, given 5
measurements which sum to 100, four are free to vary, but the last must be the
value which, when combined with the other 4, gives 100; thus:

df = N - 1 = 4

In many cases a variate (e.g. a brightness value) is less than the mean of the
distribution, resulting in a negative deviation. Squaring the deviations removes
the negative signs, but leaves the variance in squared units. A preferred way of
reporting variability is therefore the standard deviation ($s$), the square root
of the variance, which is a measure of variation in the units of the original
measurements.

The sample standard deviation is written as:

$$s = \sqrt{\frac{\sum (X_i - \bar{X})^2}{N - 1}}$$
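The variance and standard deviation formulas can be worked through on the five values from the Property 5 table. This is a minimal sketch that computes each step by hand and then cross-checks the result against the standard `statistics` module, which uses the same N - 1 divisor.

```python
import math
import statistics

scores = [3, 2, 5, 6, 4]      # values from the Property 5 table

n = len(scores)
mean = sum(scores) / n                                 # 4.0
sum_sq_dev = sum((x - mean) ** 2 for x in scores)      # 10.0
variance = sum_sq_dev / (n - 1)                        # divide by df = N - 1
std_dev = math.sqrt(variance)

print(variance, round(std_dev, 4))    # 2.5 1.5811

# Cross-check against the library implementations.
print(math.isclose(variance, statistics.variance(scores)))   # True
print(math.isclose(std_dev, statistics.stdev(scores)))       # True
```

Individual deviations are negative for the scores below the mean (2 and 3), but the squared deviations, and hence the variance, are never negative.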

Standard deviation is a measure of the spread of data (variability) about the
mean value in the distribution of a data set (Figure 1). The area under the
normal curve is divided into standard deviation units such that the interval
from the mean to plus one (+1.0) standard deviation accounts for 0.3413 of the
total area (34.13%, since the total area under the curve equals unity, or 1).
Likewise, the interval from the mean to minus one (-1.0) standard deviation
accounts for 0.3413 (34.13%) of the total area under the curve. It follows that,
for normally distributed data, approximately 68% of the pixel values in the
distribution are found between minus 1.0 and plus 1.0 standard deviations of the
mean. For example, TM band 5 in the “Forest” class statistics has a mean DN
value of 62.0 with a standard deviation of 7.0. This indicates that
approximately 68% (68.2%) of the 702 pixels in the training data set for band 5
are found in the DN value range 55-69 (62 - 7 and 62 + 7). Within 2.0 standard
deviations of the mean, approximately 95% (95.4%) of the pixels in band 5 of the
702-pixel data set are found, with DN values that range between 48 (62 - 14) and
76 (62 + 14). Three standard deviations from the mean account for over 99%
(99.7%) of the data distribution, within the range of 41 (62 - 21) and 83
(62 + 21).
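The DN ranges and area percentages in the band 5 example can be reproduced as follows. This is a minimal sketch that assumes the DN values are normally distributed; the area within k standard deviations is computed with the error function rather than read from a table, so the one-standard-deviation figure prints as 68.3% rather than the rounded 68.2% quoted above. The function name is illustrative.

```python
import math

# Band 5 "Forest" class statistics quoted in the text.
mean_dn, sd_dn, n_pixels = 62.0, 7.0, 702

def area_within(k: float) -> float:
    """Fraction of a normal distribution within +/- k standard deviations."""
    return math.erf(k / math.sqrt(2.0))

for k in (1, 2, 3):
    low = mean_dn - k * sd_dn
    high = mean_dn + k * sd_dn
    frac = area_within(k)
    print(f"+/-{k} sd: DN {low:.0f}-{high:.0f}, "
          f"{100 * frac:.1f}% of the area, about {frac * n_pixels:.0f} pixels")
```

The pixel counts are expectations under the normality assumption, not counts taken from the actual training data.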

Correlation*

When two variables are related such that the values of one are dependent upon the
values of the other, this relationship is termed a function. Correlation is the
degree of relationship between variables. The correlation coefficient ranges
from r = +1 (direct relationship) to r = -1 (inverse relationship), with |r| = 1
as a perfect predictor and all points plotted on a straight line. The closer the
absolute value of r is to one, the higher the correlation. A value of r near 0
indicates no linear relationship between the variables. In image processing, a
high correlation between two bands suggests that using only one of the bands
would account for the majority of the variability in the spectral values
throughout the entire scene. It follows that correlation can be used to reduce
the dimensionality of the data (use fewer bands in a final classification) to a
more manageable number. This has important applications when using hyperspectral
data scenes with over 200 spectral bands, since computer processing time grows
with the number of bands used in the classification process.
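The correlation coefficient between two bands can be sketched as follows. The notes do not give a formula for r, so this example assumes the usual Pearson product-moment form (sum of the products of deviations divided by the square root of the product of the sums of squared deviations). The three bands are hypothetical DN samples.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two paired series."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    dx = [x - mean_x for x in xs]             # deviations from the mean of xs
    dy = [y - mean_y for y in ys]             # deviations from the mean of ys
    numerator = sum(a * b for a, b in zip(dx, dy))
    denominator = math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    return numerator / denominator

band_a = [56, 57, 67, 99, 120]      # hypothetical DN values
band_b = [60, 62, 70, 101, 125]     # hypothetical band that tracks band_a
band_c = [120, 118, 100, 70, 50]    # hypothetical band that moves inversely

r_ab = pearson_r(band_a, band_b)
r_ac = pearson_r(band_a, band_c)
print(round(r_ab, 3), round(r_ac, 3))
```

With r close to +1 between `band_a` and `band_b`, one of the two could be dropped with little loss of information, which is the dimensionality reduction described above.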

Covariance*

Covariance is the joint variation of two variables about their respective means.
It may also be stated as how much two random variables change together. When
plotted, the joint distribution of two normally distributed variables forms a
bivariate normal probability surface (the two-variable counterpart to the normal
distribution). The volume under any part of the surface may be expressed as the
probability of an individual pairing of digital numbers (brightness values)
between two spectral bands occurring within the delineated region.
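The covariance between two bands can be sketched in the same way as the sample variance. The notes give no formula for covariance, so this example assumes the sample form with an N - 1 divisor, by analogy with the sample variance; the band values and names are hypothetical.

```python
def sample_covariance(xs, ys):
    """Sample covariance: mean product of paired deviations, with N - 1 divisor."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    products = ((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sum(products) / (n - 1)

band_a = [56, 57, 67, 99, 120]      # hypothetical DN values for one band
band_b = [60, 62, 70, 101, 125]     # hypothetical DN values for a second band

cov_ab = sample_covariance(band_a, band_b)
var_a = sample_covariance(band_a, band_a)   # covariance of a band with itself
var_b = sample_covariance(band_b, band_b)   # is that band's variance

# Variance-covariance matrix for the two bands.
cov_matrix = [[var_a, cov_ab],
              [cov_ab, var_b]]
for row in cov_matrix:
    print([round(v, 2) for v in row])
```

Dividing `cov_ab` by the product of the two standard deviations rescales the covariance to the correlation coefficient r, which is why the two measures are usually reported together.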

* A more comprehensive treatment can be requested by contacting Dr. Danny M.
Vaughn at [email protected].
