
Unit 2: Statistical Inference II

Measure of Relationship:
Definition: The statistical measures which show a relationship between two or more variables
are called Measures of Relationship. Correlation and Regression are commonly used measures
of relationship.

Covariance and Karl Pearson's Coefficient of Correlation are measures used in statistics to
quantify the relationship between two variables. Let's explore each of these measures:

1. Covariance:
- Definition: Covariance measures the extent to which two variables change together. It
indicates whether an increase in one variable corresponds to an increase or decrease in another.
- Formula: the sample covariance between two variables X and Y with n paired observations
(xᵢ, yᵢ) is calculated as

cov(X, Y) = Σᵢ (xᵢ − x̄)(yᵢ − ȳ) / (n − 1)

where x̄ and ȳ are the sample means (population covariance divides by n instead of n − 1).

Interpretation:
- Positive covariance indicates a direct relationship (both variables increase or decrease
together).
- Negative covariance indicates an inverse relationship (one variable increases while the other
decreases).
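
As an illustrative sketch (with made-up numbers, not data from these notes), the sample
covariance can be computed by hand or with NumPy, whose np.cov returns the full covariance
matrix:

import numpy as np

# Hypothetical paired observations of two variables
x = np.array([2.0, 4.0, 6.0, 8.0])
y = np.array([1.0, 3.0, 5.0, 11.0])

# Manual sample covariance: sum of (x_i - mean(x))(y_i - mean(y)) over n - 1
cov_manual = np.sum((x - x.mean()) * (y - y.mean())) / (len(x) - 1)

# np.cov returns the 2x2 covariance matrix; entry [0, 1] is cov(X, Y)
cov_numpy = np.cov(x, y)[0, 1]

print(cov_manual, cov_numpy)  # both print the same positive value (about 10.67)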

2. Karl Pearson's Coefficient of Correlation (Pearson's r):
- Definition: Pearson's coefficient of correlation measures the strength and direction of a linear
relationship between two variables. It is normalized, providing a value between -1 and 1.
- Formula: r = cov(X, Y) / (σ_X · σ_Y), i.e. the covariance of X and Y divided by the product of
their standard deviations. A value of +1 indicates a perfect positive linear relationship, −1 a
perfect negative one, and 0 no linear relationship.
Key Differences:
- Covariance is not normalized and depends on the scales of the variables, making it difficult to
compare covariances across different datasets.
- Pearson's correlation coefficient is normalized, making it more interpretable and comparable
across datasets.
- Pearson's correlation coefficient specifically measures linear relationships, while covariance
does not provide information about the strength or type of relationship.
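
Continuing the hypothetical data from the covariance sketch above, np.corrcoef performs the
normalization for us:

import numpy as np

x = np.array([2.0, 4.0, 6.0, 8.0])
y = np.array([1.0, 3.0, 5.0, 11.0])

# Pearson's r = cov(X, Y) / (std(X) * std(Y)); np.corrcoef normalizes for us
r = np.corrcoef(x, y)[0, 1]
print(r)  # about 0.956: a strong positive linear relationship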

Measures of Position:
Measures of position in statistics help us understand the relative location of a particular data
point within a dataset.

Percentile:
A percentile is a measure that indicates the relative standing of a particular value within a
dataset.
Percentiles divide a dataset into 100 equal parts, and each percentile represents the percentage of
data points below it.
For example, the 80th percentile indicates that 80% of the data points are below that particular
value.
Z-score (Standard Score):
The Z-score measures how many standard deviations a data point is from the mean (average) of a
dataset:

z = (x − μ) / σ

where μ is the mean and σ is the standard deviation. Z-scores help standardize data, making it
easier to compare values from different datasets.
Quartiles:
Quartiles divide a dataset into four equal parts, each containing approximately 25% of the data.
The three quartiles are:
First Quartile (Q1): The 25th percentile.
Second Quartile (Q2): The median or 50th percentile.
Third Quartile (Q3): The 75th percentile.
Interquartile Range (IQR) is the range between the first and third quartiles and is a measure of
the spread of the middle 50% of the data.
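
A minimal NumPy sketch of these measures of position, using a small made-up dataset:

import numpy as np

data = np.array([4, 7, 8, 10, 12, 15, 18, 21, 23, 30])

# Percentile: the value below which a given percentage of the data falls
p80 = np.percentile(data, 80)

# Z-scores: how many standard deviations each point is from the mean
z = (data - data.mean()) / data.std()

# Quartiles and the interquartile range (spread of the middle 50%)
q1, q2, q3 = np.percentile(data, [25, 50, 75])
iqr = q3 - q1

print(p80, (q1, q2, q3), iqr)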
Bayes' Theorem:
Bayes' Theorem is a mathematical formula that describes the probability of an event based on
prior knowledge of conditions that might be related to the event.
It is named after Thomas Bayes, an 18th-century statistician and theologian.
The theorem is often expressed as follows:

P(A|B) = P(B|A) · P(A) / P(B)

where P(A|B) is the posterior probability of A given evidence B, P(B|A) is the likelihood, P(A) is
the prior probability of A, and P(B) is the marginal probability of the evidence.

Bayes' Theorem is widely used in various fields, including statistics, machine learning, and
medical diagnosis. It provides a systematic way to update probabilities as new information
becomes available, making it a powerful tool for reasoning under uncertainty.
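
As a worked sketch with hypothetical numbers (a diagnostic-test setup invented for illustration,
not taken from these notes):

# Hypothetical numbers for a diagnostic-test example of Bayes' Theorem
p_disease = 0.01            # prior P(D)
p_pos_given_disease = 0.95  # sensitivity P(+|D)
p_pos_given_healthy = 0.05  # false-positive rate P(+|not D)

# Total probability of a positive test: P(+) = P(+|D)P(D) + P(+|~D)P(~D)
p_pos = p_pos_given_disease * p_disease + p_pos_given_healthy * (1 - p_disease)

# Posterior P(D|+) via Bayes' Theorem
p_disease_given_pos = p_pos_given_disease * p_disease / p_pos
print(round(p_disease_given_pos, 3))  # about 0.161: the disease is still fairly unlikely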
Bayes Classifier:
A Bayes classifier, also known as a Naive Bayes classifier, is a probabilistic machine learning
model based on Bayes' Theorem. It is a simple and efficient algorithm for classification tasks,
especially in situations with a large number of features. Despite its simplicity, Naive Bayes often
performs well in practice, making it a popular choice for text classification, spam filtering, and
other similar applications.
The basic idea behind a Bayes classifier is to use prior knowledge about the distribution of
classes and features in the training data to make predictions about the class of new, unseen data.
The working of the Naïve Bayes classifier can be understood with the help of the example below.
Suppose we have a dataset of weather conditions and a corresponding target variable "Play".
Using this dataset, we need to decide whether or not we should play on a particular day according
to the weather conditions. To solve this problem, we follow the steps below (a minimal sketch of
them appears after the list):
1. Convert the given dataset into frequency tables.
2. Generate a likelihood table by finding the probabilities of the given features.
3. Use Bayes' theorem to calculate the posterior probability for each class.
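
A hand-rolled sketch of those three steps on a hypothetical one-feature version of the weather
dataset (the scores printed are unnormalized posteriors; the larger one wins):

from collections import Counter

# Hypothetical weather dataset: (Outlook, Play)
data = [("Sunny", "No"), ("Sunny", "No"), ("Overcast", "Yes"), ("Rainy", "Yes"),
        ("Rainy", "Yes"), ("Rainy", "No"), ("Overcast", "Yes"), ("Sunny", "No"),
        ("Sunny", "Yes"), ("Rainy", "Yes"), ("Sunny", "Yes"), ("Overcast", "Yes"),
        ("Overcast", "Yes"), ("Rainy", "No")]

# Steps 1-2: frequency and likelihood tables
play_counts = Counter(play for _, play in data)

def likelihood(outlook, play):
    # P(outlook | play) read off the frequency table
    matches = sum(1 for o, p in data if o == outlook and p == play)
    return matches / play_counts[play]

# Step 3: posterior (up to normalization) P(play | Sunny) = P(Sunny | play) * P(play)
n = len(data)
for play in ("Yes", "No"):
    prior = play_counts[play] / n
    print(play, round(likelihood("Sunny", play) * prior, 3))  # "No" scores higher here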
Advantages of Naïve Bayes Classifier:

 Naïve Bayes is one of the fastest and simplest ML algorithms for predicting the class of a
dataset.
 It can be used for binary as well as multi-class classification.
 It performs well in multi-class prediction compared to many other algorithms.
 It is a popular choice for text classification problems.

Disadvantages of Naïve Bayes Classifier:

 Naive Bayes assumes that all features are independent or unrelated, so it cannot learn the
relationship between features.
Applications of Naïve Bayes Classifier:

 It is used for credit scoring.
 It is used in medical data classification.
 It can be used for real-time predictions because the Naïve Bayes classifier is an eager learner.
 It is used in text classification, such as spam filtering and sentiment analysis.

Types of Naïve Bayes Model:

There are three types of Naïve Bayes model, which are given below:

 Gaussian: The Gaussian model assumes that features follow a normal distribution. This
means if predictors take continuous values instead of discrete, then the model assumes
that these values are sampled from the Gaussian distribution.
 Multinomial: The Multinomial Naïve Bayes classifier is used when the data is multinomially
distributed. It is primarily used for document classification problems, i.e. deciding which
category a particular document belongs to, such as sports, politics, or education. The classifier
uses the frequency of words as the predictors.
 Bernoulli: The Bernoulli classifier works similarly to the Multinomial classifier, but the
predictor variables are independent Boolean variables, such as whether or not a particular word
is present in a document. This model is also popular for document classification tasks. (A short
scikit-learn sketch of all three models follows.)
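
A minimal scikit-learn sketch, assuming toy data invented for illustration, showing which feature
type pairs with which model:

import numpy as np
from sklearn.naive_bayes import GaussianNB, MultinomialNB, BernoulliNB

y = np.array([0, 0, 1, 1])

# Continuous features -> Gaussian model
X_cont = np.array([[1.2, 3.4], [0.8, 2.9], [5.1, 7.2], [4.9, 6.8]])
print(GaussianNB().fit(X_cont, y).predict([[1.0, 3.0]]))

# Word-count features -> Multinomial model
X_counts = np.array([[3, 0, 1], [2, 1, 0], [0, 4, 2], [0, 3, 3]])
print(MultinomialNB().fit(X_counts, y).predict([[1, 0, 0]]))

# Binary presence/absence features -> Bernoulli model
X_bool = (X_counts > 0).astype(int)
print(BernoulliNB().fit(X_bool, y).predict([[1, 0, 0]]))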
Bayesian network:
A Bayesian belief network is a key technique for dealing with probabilistic events and for
solving problems that involve uncertainty. We can define a Bayesian network as:
"A Bayesian network is a probabilistic graphical model which represents a set of variables and
their conditional dependencies using a directed acyclic graph."
It is also called a Bayes network, belief network, decision network, or Bayesian model.
Bayesian networks are probabilistic because they are built from a probability distribution and
use probability theory for prediction and anomaly detection.
Applications of Bayesian networks include:
1. Medical Diagnosis: Modeling the relationships between symptoms, diseases, and test
results to assist in diagnosing medical conditions.
2. Risk Assessment: Evaluating the probability and impact of different risks in a system.
3. Speech Recognition: Modeling dependencies between phonemes to improve the accuracy
of speech recognition systems.
4. Natural Language Processing: Capturing the probabilistic relationships between words in
a language to enhance language understanding.
Bayesian networks are valuable tools in decision support systems, where they help in reasoning
about uncertain and complex scenarios. They provide a principled framework for representing
and updating knowledge in the presence of uncertainty.
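
A hand-rolled two-node sketch (Rain -> WetGrass, with invented probabilities) showing how a
network's joint distribution factorizes into conditional probability tables and supports queries:

# P(Rain, WetGrass) = P(Rain) * P(WetGrass | Rain)
p_rain = {True: 0.2, False: 0.8}                     # prior P(Rain)
p_wet_given_rain = {True: {True: 0.9, False: 0.1},   # P(WetGrass | Rain=True)
                    False: {True: 0.2, False: 0.8}}  # P(WetGrass | Rain=False)

# Query P(Rain=True | WetGrass=True) by enumerating the joint distribution
joint_wet = {r: p_rain[r] * p_wet_given_rain[r][True] for r in (True, False)}
posterior = joint_wet[True] / sum(joint_wet.values())
print(round(posterior, 3))  # about 0.529: seeing wet grass raises belief in rain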
Discriminative learning with maximum likelihood:
Discriminative learning, often contrasted with generative learning, focuses on modeling the
decision boundary between different classes directly. Maximum Likelihood Estimation (MLE) is
a common approach in discriminative learning, aiming to find the parameters that maximize the
likelihood of the observed data given the model.
Discriminative models focus on finding the boundary that separates different classes, making
them well-suited for classification tasks. Maximum Likelihood Estimation provides a principled
way to estimate the parameters of the model based on the observed data.
It's worth noting that while maximum likelihood is a powerful and widely used approach, other
methods, such as maximum a posteriori estimation (MAP) or Bayesian approaches, also play
important roles in statistical learning.
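
As a sketch, logistic regression is a canonical discriminative model fit by maximum likelihood;
here is a minimal gradient-ascent version on invented one-dimensional data:

import numpy as np

# Toy 1-D data: labels 0 below x of about 2.25, labels 1 above (hypothetical)
x = np.array([0.5, 1.0, 1.5, 3.0, 3.5, 4.0])
y = np.array([0, 0, 0, 1, 1, 1])

w, b = 0.0, 0.0
lr = 0.1
for _ in range(1000):
    p = 1.0 / (1.0 + np.exp(-(w * x + b)))   # model P(y=1 | x)
    # Gradient ascent on the log-likelihood sum(y*log(p) + (1-y)*log(1-p))
    w += lr * np.sum((y - p) * x)
    b += lr * np.sum(y - p)

print(w, b)  # decision boundary is where w*x + b = 0, i.e. x = -b/w, near 2.25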
Probabilistic models with hidden variables:
Probabilistic models with hidden variables are models that involve both observable (measurable)
variables and unobservable or hidden variables. These models are widely used in various fields,
including machine learning, statistics, and artificial intelligence, to represent complex
relationships in data where not all variables can be directly observed.
Here are two common types of probabilistic models with hidden variables:
Hidden Markov Models (HMMs):
Hidden Markov Models are widely used in sequential data analysis, such as speech recognition,
natural language processing, and bioinformatics.
In an HMM, there are observable variables (emissions) and hidden states. The model assumes
that the observed data depend on an underlying sequence of hidden states.
The key components of an HMM include transition probabilities (probabilities of moving from
one hidden state to another), emission probabilities (probabilities of observing a particular value
given the hidden state), and an initial state distribution.
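
A minimal forward-algorithm sketch for a two-state HMM with invented probabilities (states
Rainy/Sunny, observations walk/shop/clean), computing the probability of an observed sequence:

import numpy as np

start = np.array([0.6, 0.4])       # initial state distribution [Rainy, Sunny]
trans = np.array([[0.7, 0.3],      # transition probabilities between hidden states
                  [0.4, 0.6]])
emit = np.array([[0.1, 0.4, 0.5],  # emission probabilities: P(obs | state)
                 [0.6, 0.3, 0.1]])

obs = [0, 1, 2]  # observed sequence: walk, shop, clean

# Forward pass: alpha[s] = P(observations so far, current hidden state = s)
alpha = start * emit[:, obs[0]]
for o in obs[1:]:
    alpha = (alpha @ trans) * emit[:, o]

print(alpha.sum())  # total probability of the observed sequence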
Latent Variable Models:
Latent variable models involve both observed variables and unobserved latent variables that help
explain the structure of the data.
Examples include Principal Component Analysis (PCA), Factor Analysis, and Gaussian Mixture
Models (GMMs).
In PCA and Factor Analysis, the latent variables represent underlying patterns or factors that
explain the observed variability in the data. In GMMs, each data point is assumed to be
generated by a mixture of different Gaussian distributions, and the latent variable indicates the
specific component (cluster) responsible for generating the data point.
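
A minimal scikit-learn sketch of a Gaussian Mixture Model on invented one-dimensional data; the
latent variable is the unobserved component assignment that the EM fit recovers:

import numpy as np
from sklearn.mixture import GaussianMixture

# Hypothetical 1-D data drawn from two clusters centered at 0 and 6
rng = np.random.default_rng(0)
data = np.concatenate([rng.normal(0, 1, 100), rng.normal(6, 1, 100)]).reshape(-1, 1)

gmm = GaussianMixture(n_components=2, random_state=0).fit(data)
print(gmm.means_.ravel())     # recovered component means, near 0 and 6
print(gmm.predict(data[:3]))  # inferred latent component for each point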

Linear models and regression analysis, particularly using the method of least squares, are
fundamental concepts in statistics and machine learning. Let's break down these terms:

1. Linear Models:
Linear models are mathematical representations used to describe the relationship between a
dependent variable (response) and one or more independent variables (features or predictors) in a
linear way.
Linear models describe a continuous response variable as a function of one or more predictor
variables. They can help to understand and predict the behaviour of complex systems or analyse
experimental, financial, and biological data.
Regression Analysis:
- Regression analysis is a statistical technique that aims to model and analyze the relationship
between a dependent variable and one or more independent variables.
- The primary goal is to understand how changes in the independent variables are associated
with changes in the dependent variable.
- Regression analysis can be used for prediction, understanding the strength and nature of
relationships, and making inferences about the population.
Least Squares:
Least squares is a method used to estimate the parameters (coefficients) of a linear model by
minimizing the sum of the squared differences between the observed and predicted values:

minimize Σᵢ (yᵢ − ŷᵢ)², where ŷᵢ is the model's prediction for observation i.

The ordinary least squares (OLS) method generalizes to multiple linear regression with multiple
independent variables.

The steps for performing least squares regression include (a minimal sketch follows the list):
1. Specify the model.
2. Collect data, including the values of the dependent and independent variables.
3. Estimate the model parameters by minimizing the sum of squared residuals.
4. Assess the fit of the model and make inferences about the relationships.
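
A minimal NumPy sketch of steps 2-4 on invented data, fitting y = b0 + b1*x by minimizing the
sum of squared residuals:

import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 4.3, 6.2, 8.0, 9.9])

# Design matrix with an intercept column; lstsq minimizes ||X b - y||^2
X = np.column_stack([np.ones_like(x), x])
coef, residuals, rank, _ = np.linalg.lstsq(X, y, rcond=None)
b0, b1 = coef

print(b0, b1)                    # intercept near 0, slope near 2
y_hat = X @ coef                 # fitted values
print(np.sum((y - y_hat) ** 2))  # sum of squared residuals (small)

np.linalg.lstsq solves the minimization directly; the same coefficients can also be obtained
from the normal equations (XᵀX)b = Xᵀy.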
