Measures of Information

• Hartley defined the first information measure:
  – H = n log s
  – n is the length of the message and s is the number of possible values for each symbol in the message
  – Assumes all symbols are equally likely to occur
• Shannon proposed a variant, Shannon's entropy:
  – H = Σ_i p_i · log(1/p_i)
  – Weighs the information based on the probability that an outcome will occur
  – The log(1/p_i) term shows that the amount of information an event provides is inversely proportional to its probability of occurring (both measures are computed in the sketch below)
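
As a minimal sketch of the two measures (not from the slides; the function names and example probabilities are illustrative):

    import math

    def hartley(n, s):
        # Hartley measure: message of length n, s equally likely values per symbol
        return n * math.log2(s)

    def shannon_entropy(probs):
        # H = sum_i p_i * log2(1/p_i); zero-probability terms contribute nothing
        return sum(p * math.log2(1.0 / p) for p in probs if p > 0)

    # For equally likely symbols the two measures agree:
    # one 8-valued symbol carries log2(8) = 3 bits either way.
    print(hartley(1, 8))                              # 3.0
    print(shannon_entropy([1/8] * 8))                 # 3.0
    print(shannon_entropy([0.97, 0.01, 0.01, 0.01]))  # ~0.24 bits: skewed, low entropy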
Three Interpretations of Entropy

• The amount of information an event provides
  – An infrequently occurring event provides more information than a frequently occurring event
• The uncertainty in the outcome of an event
  – Systems with one very common event have less entropy than systems with many equally probable events
• The dispersion in the probability distribution
  – An image of a single amplitude has a less dispersed histogram than an image of many greyscales
  – The lower dispersion implies lower entropy (illustrated in the sketch below)
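
To make the dispersion interpretation concrete, a hypothetical sketch (NumPy assumed; the image sizes and 256-level range are arbitrary choices):

    import numpy as np

    def hist_entropy(image, levels=256):
        # Entropy (in bits) of the grey-level histogram
        counts, _ = np.histogram(image, bins=levels, range=(0, levels))
        p = counts / counts.sum()
        p = p[p > 0]
        return -(p * np.log2(p)).sum()

    rng = np.random.default_rng(0)
    flat = np.full((64, 64), 128)                 # single amplitude everywhere
    spread = rng.integers(0, 256, size=(64, 64))  # many greyscales
    print(hist_entropy(flat))    # 0.0: no dispersion, no entropy
    print(hist_entropy(spread))  # close to 8 bits: highly dispersed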
Definitions of Mutual Information

• Three commonly used definitions:
  – 1) I(A,B) = H(B) - H(B|A) = H(A) - H(A|B)
    • Mutual information is the amount by which the uncertainty in B (or A) is reduced when A (or B) is known
  – 2) I(A,B) = H(A) + H(B) - H(A,B)
    • Maximizing the mutual information amounts to minimizing the joint entropy (the last term); a sketch of this definition follows the slide
    • The advantage of mutual information over joint entropy alone is that it includes the individual inputs' entropies
    • It works better than joint entropy in low-contrast image background regions: the joint entropy is low there, but this is offset by low individual entropies as well, so the overall mutual information is also low
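
A small sketch of definition 2) on a toy discrete joint distribution (the probability table is made up for illustration):

    import numpy as np

    def entropy(p):
        # Shannon entropy in bits of a probability vector
        p = p[p > 0]
        return -(p * np.log2(p)).sum()

    # Hypothetical joint probability table p(a, b)
    p_ab = np.array([[0.30, 0.10],
                     [0.05, 0.55]])
    p_a = p_ab.sum(axis=1)  # marginal of A
    p_b = p_ab.sum(axis=0)  # marginal of B

    # I(A,B) = H(A) + H(B) - H(A,B)
    print(entropy(p_a) + entropy(p_b) - entropy(p_ab.ravel()))  # ~0.36 bits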
Definitions of Mutual Information II

• 3) I(A,B) = Σ_{a,b} p(a,b) · log[ p(a,b) / (p(a)·p(b)) ]
• This definition is related to the Kullback-Leibler distance between two distributions
• It measures the dependence of the two distributions
• In image registration, I(A,B) will be maximized when the images are aligned (see the sketch below)
• In feature selection, choose features that minimize I(A,B) between themselves to ensure they are not related
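
A sketch of definition 3) computed from a joint grey-level histogram, the usual estimator in MI-based registration (the function name, bin count, and random test images are assumptions):

    import numpy as np

    def mutual_information(img_a, img_b, bins=32):
        # Joint histogram of co-occurring grey levels estimates p(a, b)
        joint, _, _ = np.histogram2d(img_a.ravel(), img_b.ravel(), bins=bins)
        p_ab = joint / joint.sum()
        p_a = p_ab.sum(axis=1, keepdims=True)  # marginal p(a)
        p_b = p_ab.sum(axis=0, keepdims=True)  # marginal p(b)
        m = p_ab > 0
        # I(A,B) = sum_{a,b} p(a,b) log[ p(a,b) / (p(a) p(b)) ]
        return (p_ab[m] * np.log2(p_ab[m] / (p_a @ p_b)[m])).sum()

    rng = np.random.default_rng(1)
    a = rng.integers(0, 256, size=(128, 128)).astype(float)
    shuffled = rng.permutation(a.ravel()).reshape(a.shape)
    print(mutual_information(a, a))         # high: identical (aligned) images
    print(mutual_information(a, shuffled))  # near 0: co-occurrence destroyed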
Additional Definitions of Mutual Information

• Two definitions exist for normalizing mutual information (both computed in the sketch below):
  – Normalized Mutual Information:
    NMI(A,B) = [H(A) + H(B)] / H(A,B)
  – Entropy Correlation Coefficient:
    ECC(A,B) = 2 - 2 / NMI(A,B)
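
Both normalizations follow directly from the three entropies; a sketch reusing the toy joint table from earlier (the helper name is mine):

    import numpy as np

    def h(p):
        # Shannon entropy in bits
        p = p[p > 0]
        return -(p * np.log2(p)).sum()

    p_ab = np.array([[0.30, 0.10],
                     [0.05, 0.55]])
    h_a, h_b, h_ab = h(p_ab.sum(axis=1)), h(p_ab.sum(axis=0)), h(p_ab.ravel())

    nmi = (h_a + h_b) / h_ab  # NMI(A,B) = [H(A) + H(B)] / H(A,B)
    ecc = 2.0 - 2.0 / nmi     # ECC(A,B) = 2 - 2/NMI(A,B)
    print(nmi, ecc)           # NMI >= 1, with equality iff A and B are independent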
Derivation of M. I. Definitions

H(A,B) = -Σ_{a,b} p(a,b) · log p(a,b),  where p(a,b) = p(a|b) · p(b)

H(A,B) = -Σ_{a,b} [p(a|b) · p(b)] · log[p(a|b) · p(b)]

H(A,B) = -Σ_{a,b} [p(a|b) · p(b)] · {log p(a|b) + log p(b)}

H(A,B) = -Σ_{a,b} p(b) · p(a|b) · log p(a|b) - Σ_{a,b} p(a|b) · p(b) · log p(b)

H(A,B) = -Σ_b p(b) Σ_a p(a|b) · log p(a|b) - Σ_b p(b) · log p(b) · Σ_a p(a|b)

H(A,B) = -Σ_b p(b) Σ_a p(a|b) · log p(a|b) - Σ_b p(b) · log p(b)   (since Σ_a p(a|b) = 1)

H(A,B) = H(A|B) + H(B)

therefore I(A,B) = H(A) - H(A|B) = H(A) + H(B) - H(A,B)
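
The identity H(A,B) = H(A|B) + H(B) can be verified numerically on the same toy table (a sketch):

    import numpy as np

    def h(p):
        p = p[p > 0]
        return -(p * np.log2(p)).sum()

    p_ab = np.array([[0.30, 0.10],
                     [0.05, 0.55]])
    p_b = p_ab.sum(axis=0)

    # H(A|B) = -sum_{a,b} p(a,b) log p(a|b), with p(a|b) = p(a,b)/p(b)
    h_a_given_b = -(p_ab * np.log2(p_ab / p_b)).sum()
    print(h(p_ab.ravel()))       # H(A,B)
    print(h_a_given_b + h(p_b))  # H(A|B) + H(B): identical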
Properties of Mutual Information

• MI is symmetric: I(A,B) = I(B,A)
• I(A,A) = H(A)
• I(A,B) <= H(A), I(A,B) <= H(B)
  – The information each image contains about the other cannot be greater than the information the images themselves contain
• I(A,B) >= 0
  – Knowing B cannot increase the uncertainty in A
• If A and B are independent, then I(A,B) = 0
• If A and B are jointly Gaussian with correlation coefficient ρ, then (checked numerically below):
  I(A,B) = -(1/2) · log(1 - ρ²)
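
The Gaussian closed form can be checked against a histogram estimate (a sketch; the sample size, bin count, and seed are arbitrary, and the histogram estimator is slightly biased):

    import numpy as np

    rho = 0.8
    rng = np.random.default_rng(2)
    a, b = rng.multivariate_normal([0, 0], [[1, rho], [rho, 1]], size=200_000).T

    # Closed form, in bits: I(A,B) = -1/2 log2(1 - rho^2)
    print(-0.5 * np.log2(1 - rho**2))  # ~0.737

    joint, _, _ = np.histogram2d(a, b, bins=64)
    p_ab = joint / joint.sum()
    p_a = p_ab.sum(axis=1, keepdims=True)
    p_b = p_ab.sum(axis=0, keepdims=True)
    m = p_ab > 0
    print((p_ab[m] * np.log2(p_ab[m] / (p_a @ p_b)[m])).sum())  # close to the closed form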
Mutual Information based Feature Selection

• Tested using a 2-class occupant sensing problem
  – Classes are RFIS (rear-facing infant seat) and everything else (children, adults, etc.)
  – Use an edge map of the imagery and compute features
    • Legendre moments to order 36
    • This generates 703 features, from which we select the best 51
• Tested 3 filter-based methods:
  – Mann-Whitney statistic
  – Kullback-Leibler statistic
  – Mutual information criterion
• Tested both single M.I. and joint M.I. (JMI)
Mutual Information based Feature Selection Method

• M.I. tests a feature's ability to separate the two classes
  – Based on definition 3) for M.I.:
    I(A,B) = Σ_a Σ_b p(a,b) · log[ p(a,b) / (p(a)·p(b)) ]
  – Here A is the feature vector and B is the classification
    • Note that A is continuous but B is discrete
  – By maximizing the M.I. we maximize the separability of the feature (a sketch follows)
    • Note that this method only tests each feature individually
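
A hypothetical sketch of the single-feature test: bin each continuous feature, compute its M.I. with the discrete class label, and keep the highest-scoring features (the names, bin count, and synthetic data are illustrative, not from the original experiment):

    import numpy as np

    def feature_mi(feature, labels, bins=16):
        # Discretize the continuous feature, then I(feature; class) via the joint histogram
        binned = np.digitize(feature, np.histogram_bin_edges(feature, bins=bins))
        joint = np.array([[np.mean((binned == v) & (labels == c))
                           for c in np.unique(labels)] for v in np.unique(binned)])
        p_f = joint.sum(axis=1, keepdims=True)
        p_c = joint.sum(axis=0, keepdims=True)
        m = joint > 0
        return (joint[m] * np.log2(joint[m] / (p_f @ p_c)[m])).sum()

    rng = np.random.default_rng(3)
    labels = rng.integers(0, 2, size=2000)           # 2-class problem
    informative = labels + rng.normal(0, 0.5, 2000)  # shifts with the class
    noise = rng.normal(0, 1.0, 2000)                 # unrelated to the class
    print(feature_mi(informative, labels))  # high M.I.: good separability, keep
    print(feature_mi(noise, labels))        # near 0: drop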
