Types of regression analysis
1. Linear Regression
2. Multiple Linear Regression
3. Polynomial Regression
4. Ridge Regression
5. Lasso Regression
6. Elastic Net Regression
7. Logistic Regression
8. Poisson Regression
9. Support Vector Regression (SVR)
10. Decision Tree and Random Forest Regression
1. Linear Regression:
Linear regression is the simplest form of regression that models the relationship between one
independent variable (X) and one dependent variable (Y) using a straight line.
Equation: Y = β0 + β1X + ϵ
Where:
Y = Dependent variable (target)
X = Independent variable (predictor)
β0 = Intercept
β1 = Slope
ϵ = Error term
Example: Suppose you want to predict a student's score based on the number of hours
studied.
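For one predictor, β1 and β0 can be computed in closed form from means and deviations. A minimal Python sketch (the study-hours data and the function name are made up for illustration):

```python
def fit_line(xs, ys):
    """Ordinary least squares for one predictor: returns (b0, b1)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # slope = covariance(x, y) / variance(x); intercept = mean_y - slope * mean_x
    b1 = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
          / sum((x - mean_x) ** 2 for x in xs))
    b0 = mean_y - b1 * mean_x
    return b0, b1

hours = [1, 2, 3, 4, 5]
scores = [52, 54, 56, 58, 60]   # exactly score = 50 + 2 * hours
b0, b1 = fit_line(hours, scores)
print(b0, b1)   # recovers intercept 50 and slope 2
```

Because the made-up scores lie exactly on a line, the fit recovers the intercept and slope exactly; with noisy data the estimates would only approximate them.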
2. Multiple Linear Regression
This regression model uses two or more independent variables to predict the dependent variable.
Equation: Y=β0+β1X1+β2X2+...+βnXn+ϵ
Example: Predicting house prices based on square footage and location rating.
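With several predictors, the coefficients solve the normal equations (XᵀX)β = Xᵀy. A sketch in plain Python, using a small Gaussian-elimination solver; the house-price data and coefficient values below are made up for illustration:

```python
def gauss_solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

def fit_multiple(X, y):
    """Least-squares coefficients [b0, b1, ..., bn], with an intercept column."""
    rows = [[1.0] + list(r) for r in X]          # prepend a column of ones
    n, p = len(rows), len(rows[0])
    XtX = [[sum(rows[i][a] * rows[i][c] for i in range(n)) for c in range(p)]
           for a in range(p)]
    Xty = [sum(rows[i][a] * y[i] for i in range(n)) for a in range(p)]
    return gauss_solve(XtX, Xty)

# price = 10 + 0.5 * sqft + 3 * location_rating (exact, so recovery is exact)
X = [(10, 5), (12, 7), (8, 6), (15, 9), (11, 4)]
y = [10 + 0.5 * a + 3 * b for a, b in X]
beta = fit_multiple(X, y)   # recovers [10, 0.5, 3]
```

In practice a library routine (e.g. a linear-algebra least-squares solver) would be used instead of hand-rolled elimination; the sketch only shows what such a routine computes.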
3. Polynomial Regression:
Polynomial regression fits a non-linear relationship by adding powers of the independent
variable.
Equation: Y = β0 + β1X + β2X² + ... + βnXⁿ + ϵ
Example: Predicting the growth of bacteria over time.
4. Ridge Regression:
Ridge regression is a regularization technique that adds an L2 penalty term (proportional to the squared coefficients) to prevent overfitting.
Example: Predicting house prices with multiple features (square footage, number of bedrooms, etc.) while avoiding overfitting.
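For a single predictor the effect of the penalty is easy to see: the penalty strength λ is added to the denominator of the slope estimate, shrinking the slope toward zero. A simplified one-predictor sketch (penalizing only the slope, with made-up data):

```python
def fit_ridge_line(xs, ys, lam):
    """Ridge-penalized fit for one predictor; lam = 0 gives ordinary least squares."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    b1 = sxy / (sxx + lam)      # larger lam -> smaller |slope|
    b0 = my - b1 * mx           # the intercept is left unpenalized
    return b0, b1

xs = [1, 2, 3, 4, 5]
ys = [3, 5, 7, 9, 11]           # slope 2 without any penalty
slopes = [fit_ridge_line(xs, ys, lam)[1] for lam in (0, 1, 10)]
print(slopes)                   # slope shrinks toward zero as lam grows
```

The shrinkage trades a little bias for lower variance, which is what protects against overfitting when there are many correlated features.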
5. Lasso Regression:
Lasso regression uses L1 regularization, which can shrink coefficients to zero, making it
effective for feature selection.
Example: Used in predicting stock prices by selecting only the most relevant features and
ignoring others.
6. Elastic Net Regression:
Elastic Net combines Ridge and Lasso penalties, making it useful for high-dimensional data.
Example: Used in genomics for predicting disease risk by selecting relevant genes.
7. Logistic Regression:
Logistic regression is used for binary classification problems; although called regression, it is a classification
algorithm.
Example: Predicting whether a student will pass or fail based on study hours.
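A minimal sketch of logistic regression fitted by gradient descent on the log-loss, using a made-up pass/fail dataset (the learning rate and epoch count are illustrative choices, not prescribed values):

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(xs, ys, lr=0.1, epochs=5000):
    """Gradient descent on the log-loss for one predictor."""
    b0 = b1 = 0.0
    n = len(xs)
    for _ in range(epochs):
        g0 = g1 = 0.0
        for x, y in zip(xs, ys):
            err = sigmoid(b0 + b1 * x) - y   # prediction error for this point
            g0 += err
            g1 += err * x
        b0 -= lr * g0 / n
        b1 -= lr * g1 / n
    return b0, b1

hours  = [1, 2, 3, 4, 5, 6, 7, 8]
passed = [0, 0, 0, 0, 1, 1, 1, 1]            # fail below ~4.5 hours of study
b0, b1 = fit_logistic(hours, passed)
p_pass = sigmoid(b0 + b1 * 6)                # predicted probability for 6 hours
```

The model outputs a probability between 0 and 1; a threshold (commonly 0.5) turns that probability into a pass/fail prediction.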
8. Poisson Regression:
Used for count data and models the rate of occurrence of an event.
Example: Modeling number of road accidents based on vehicle density.
9. Support Vector Regression (SVR):
Used in machine learning for regression problems by finding a hyperplane that fits the data within a margin of tolerance.
Example: Predicting stock prices with high-dimensional data.
10. Decision Tree & Random Forest Regression:
Decision Tree Regression: Uses a tree-like structure to model decisions.
Random Forest Regression: Uses multiple decision trees and averages the results for
better accuracy.
Example: Predicting house prices based on multiple variables (size, location, amenities).
Normal distribution
A continuous probability distribution for a variable is called a normal probability distribution, or
simply a normal distribution. It is also known as the Gaussian or Laplace-Gauss distribution.
The normal distribution is determined by two parameters, the mean and the variance. Normal
distributions are used to represent real-valued random variables whose distributions are
unknown, and they are used very frequently in the natural sciences and social sciences.
When the normal distribution is represented in the form of a graph, it is known as the normal
probability distribution curve, or simply the normal curve. A normal curve is a bell-shaped,
bilaterally symmetrical, continuous frequency distribution curve. Such a curve is formed by
plotting the frequencies of scores of a continuous variable in a large sample. The curve is known
as the normal probability distribution curve because its y-ordinates provide relative frequencies,
or probabilities, instead of the observed frequencies. A continuous random variable can be said
to be normally distributed if the histogram of its relative frequency has the shape of a normal curve.
It is important to understand the characteristics of the frequency distribution
of the Normal Probability Curve (NPC).
Importance of Normal Distribution
As discussed earlier, the normal distribution plays a very significant role in the fields of natural
science and other social sciences. Some of the relevance of the normal distribution is described
below:
• The normal distribution is a continuous distribution and plays a significant role in statistical
theory and inference.
• The normal distribution has various mathematical properties which make it convenient to
express a frequency distribution in its simplest form.
• It is useful in describing sampling distributions, as the sampling distributions of many
statistics are approximately normal.
• Many of the variables in the behavioural sciences, like weight, height, achievement and
intelligence, have distributions approximately like the normal curve.
• The normal distribution is a necessary component for many inferential statistics, like the
z-test, t-test and F-test.
Properties of Normal Distribution
As discussed earlier, the representation of normal distribution of random variable in graphic form
is known as Normal Probability Curve (NPC). The following are the properties of the normal
curve:
It is a bell-shaped, bilaterally symmetrical, continuous frequency distribution curve.
It is a continuous probability distribution for a random variable.
It has two halves (right and left), and the values of the mean, median and mode are equal
(mean = median = mode); that is, they coincide at the same point at the middle of the curve.
The normal curve is asymptotic; that is, it approaches but never touches the x-axis as it
moves farther from the mean.
The mean lies in the middle of the curve and divides the curve in to two equal halves.
Nearly the entire area of the normal curve lies within ±3σ of the mean.
For the standard (unit) normal curve, the total area under the curve is equal to one (N = 1),
the standard deviation is one (σ = 1), the variance is one (σ² = 1) and the mean is zero (μ = 0).
The points where the curve changes curvature (from curving upward to curving downward, or
vice versa) are called inflection points; for the normal curve these occur at μ ± σ.
The z-scores or the standard scores in NPC towards the right from the mean are positive
and towards the left from the mean are negative.
About 68% of the curve area falls within the limit of plus or minus one standard
deviation (±1 σ) unit from the mean; about 95% of the curve area falls within the limit of
plus or minus two standard deviations (±2 σ) unit from the mean and about 99.7% of the
curve area falls within the limit of plus or minus three standard deviations (±3 σ) unit
from the mean.
The normal distribution is free from skewness; that is, its coefficient of skewness
is zero.
The fractional area between any two given z-scores is identical in both halves of the
normal curve; for example, the area between the mean and a z-score of +1 is identical to
the area between the mean and a z-score of –1. Further, the height of the ordinate at a
particular z-score is the same in both halves of the normal curve; for example, the height
of the ordinate at +1z is equal to the height of the ordinate at –1z.
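The 68-95-99.7 figures above can be checked numerically. The standard normal CDF can be written with the error function, Φ(z) = ½(1 + erf(z/√2)), which Python's math module provides:

```python
import math

def phi(z):
    """Cumulative distribution function of the standard normal."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# area under the curve within ±1, ±2 and ±3 standard deviations of the mean
for k in (1, 2, 3):
    area = phi(k) - phi(-k)
    print(f"within ±{k}σ: {area:.4f}")   # ≈ 0.6827, 0.9545, 0.9973
```

The printed areas match the commonly quoted 68%, 95% and 99.7% figures.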
Standard Normal Distribution
Standard normal distribution, also known as the z-distribution, is a special type of normal
distribution. In this distribution, the mean (average) is 0 and the standard deviation (a measure of
spread) is 1. This creates a bell-shaped curve that is symmetrical around the mean.
STANDARD SCORES or Z-SCORES
A standard score or z-score is a transformed score which shows the number of standard deviation
units by which the value of an observation (the raw score) lies above or below the mean. The
standard score helps in determining the probability of a score in the normal distribution. It also
helps in comparing scores from different normal distributions.
The standard score is a score that informs about the value and also where that value lies in the
distribution. For example, a z-score of +5 means the value lies five standard deviation units
above the mean. It is a transformed score of a raw score.
A raw score or sample value is the unchanged score, the direct result of measurement. A raw
score (X) by itself cannot give any information about its position within a distribution.
Therefore, raw scores are transformed into z-scores to determine the location of the original
scores in the distribution.
The z-scores are also used to standardise an entire distribution. These scores (z) help compare
the results of a test with the "normal" population. Results from tests or surveys come in
thousands of possible values and units, and such results may not be meaningful unless they are
transformed. For example, a finding that the height of a particular person is 6.5 feet is
meaningful only when it is compared to the average height. In such a case, the z-score gives an
idea of where that person's height lies in comparison to the average height of the population.
Properties of z-score
Following are some of the properties of the Standard (z) Score:
The mean of the z-scores is always 0.
It is also important to note that the standard deviation of the z-scores is always 1.
Further, the graph of the z-score distribution always has the same shape as the original
distribution of sample values.
The z-scores above the value of 0 represent sample values above the mean, while z-
scores below the value of 0 represent sample values below the mean.
The shape of the distribution of the z-scores is identical to that of the original
distribution of the raw scores. Thus, if the original distribution is normal, then the
distribution of the z-scores will also be normal; converting data to z-scores does not,
by itself, normalize the distribution of that data.
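The first two properties are easy to verify: converting raw scores to z-scores always yields mean 0 and standard deviation 1, whatever the original units. A small sketch with made-up scores:

```python
def z_scores(xs):
    """Transform raw scores to z-scores using the population standard deviation."""
    n = len(xs)
    m = sum(xs) / n
    sd = (sum((x - m) ** 2 for x in xs) / n) ** 0.5
    return [(x - m) / sd for x in xs]

zs = z_scores([50, 60, 66, 70, 80])
z_mean = sum(zs) / len(zs)                          # always 0
z_sd = (sum(z * z for z in zs) / len(zs)) ** 0.5    # always 1 (zs already have mean 0)
```

The transformation only shifts and rescales the data, which is why the shape of the distribution is unchanged.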
Uses of z-score
z-scores are useful in the following ways:
It helps in identifying the position of observation(s) in a population distribution: As
mentioned earlier, z-scores help in determining the position/distance of a value or an
observation from the mean in units of standard deviation. Further, if the distribution
of the scores is like the normal distribution, then we are able to estimate the proportion of
the population falling above or below a particular value. z-scores have important
implications in studies related to the diet and nutrition of children, where they help in
assessing the height, weight and age of children with reference to nutrition.
It is used for standardising the raw data: It helps in standardising or converting the data
to enable standard measurements. For example, if you wish to compare your scores on
one test with the scores achieved in another test, comparison on the basis of raw score is
not possible. In such a situation, comparisons across tests can only be done when you
standardise both sets of test scores.
It helps in comparing scores that are from different normal distributions: As mentioned
in the previous example, z-scores help in comparing scores from different normal
distribution. Thus, z-scores can help in comparing the IQ scores received from two
different tests.
Computation of z-score
As mentioned earlier, a z-score gives the distance of a sample value from the mean in units of
standard deviation. A z-score can be computed for each value of the sample. The following
formula is used to compute the z-score of a sample value:
z = (X - M) / SD or z = (X - M) / σ
where,
X = a particular raw score
M = Sample mean
SD or σ = Standard Deviation
To illustrate, suppose the following are the marks obtained by students in mathematics. The
marks obtained are expressed here in terms of raw scores. The mean, SD and z-scores can be
then calculated accordingly:
Students    Raw Scores (X)    X - M    z
A           50                -15      -1.24
B           60                -5       -0.41
C           66                1        0.08
D           70                5        0.41
E           80                15       1.24

N = 5; Sum = 326; Mean (M) = 65; SD = 12.04
The above illustration shows the z-scores of the marks obtained by each student (A, B, C, D and
E). In this example, student A is 1.24 standard deviation units below the mean; similarly,
student E is 1.24 units above the mean. The standard deviation is used as the unit of
measurement in standard scores. The standard score helps in normalising or collapsing the data
to a common standard based on how many standard deviations the values lie from the mean.
In practice, z-scores mostly range from -3 standard deviations (the far left of the normal
distribution curve) to +3 standard deviations (the far right of the curve). Further, we need to
know the values of μ (the mean) and σ (the standard deviation) of the population.
Thus, if we want to compute the z-score for X = 70, M = 65 and SD = 12.04, we use the formula
z = (X - M) / SD
= (70 - 65) / 12.04
= 5 / 12.04
= 0.42
Thus, the z-score is obtained as 0.42
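The same computation in Python, with the values taken from the worked example above:

```python
def z_score(x, mean, sd):
    """z = (X - M) / SD: distance from the mean in standard deviation units."""
    return (x - mean) / sd

z = z_score(70, 65, 12.04)
print(round(z, 2))   # 0.42
```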