Information Theory
Nguyen Phuong Thai
About me
Name: Nguyễn Phương Thái
Address: Computer Science Department, Faculty of
Information Technology, VNU UET
Email: [email protected]
Phone number: 090 205 9945
Homepage: http://coltech.vnu.edu.vn/~thainp/
Research interests
Natural language processing
Machine learning
Linguistics annotation
Related Courses in the CS Program
Probability and statistics
Elective courses (Group 1)
Artificial intelligence
Machine learning
Natural language processing
Bioinformatics
Speech processing
…
What is Information Theory?
Information theory studies the quantification,
representation, and communication of information.
It was originally proposed by Claude Shannon in 1948
https://youtu.be/d9alWZRzBWk
Claude Shannon, 1916-2001
Communication
Communication Theory
In 1948, Claude E. Shannon of Bell Labs published the
paper “A Mathematical Theory of Communication” and
single-handedly created this field. (The paper is on the
web.)
Shannon’s work grew out of solving problems of
electrical communication during WWII, but information
theory applies to many other fields as well.
Key idea: Communication is sending information from
one place and/or time to another place and/or time over a
medium that might have errors.
Main Contributions of the Paper
It was the first time probability theory was applied to
communication problems.
The innovative idea was that “information” (from any
source) has a digital nature.
The concept of “entropy” was invented and used to
measure the “complexity” or the “randomness” of the
source.
The paper also set a foundation for data compression,
coding, and decoding with error detection and correction.
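As a preview of how entropy quantifies the “randomness” of a source, here is a minimal sketch in Python (not from the slides; the `entropy` helper is illustrative):

```python
import math

def entropy(probs):
    """Shannon entropy H(X) = -sum p * log2(p), in bits per symbol."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# A uniform source over 8 symbols is maximally random: 3 bits/symbol.
print(entropy([1/8] * 8))
# A skewed source is more predictable, hence lower entropy.
print(entropy([0.7, 0.1, 0.1, 0.1]))
```

A more predictable source has lower entropy, which is exactly what lets data compression work.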
Shannon’s Communication Model
The paper introduced a model for communication:
encoding, sending a signal through a channel, and
decoding.
Source
Voice
Words
Pictures
Music, art
Human cells about to reproduce
Human parents about to reproduce
Sensory input of biological organism
or any signal at all (any binary data)
Channel
Telephone line
High frequency radio link
Storage (disk, tape, internet, TCP/IP, facebook, twitter),
transmission through time rather than space, could be
degradation due to decay
Biological organism (send message from brain to foot, or
from ear to brain)
Receiver
The destination of the information transmitted
Person
Computer
Disk
Analog Radio or TV
Internet streaming audio system
Noise
Represents our imperfect understanding of the universe.
Thus, we treat it as random, though it often obeys some
rules, such as a probability distribution.
Encoder
Processing done before placing info into channel
First stage: data reduction (keep only the important bits,
i.e., remove source redundancy), followed by redundancy
insertion tailored to the channel.
A code = a mechanism for representing the information of
one signal by another.
An encoding = representation of information in another
form.
Decoder
Exploit and then remove redundancy
Detect and fix any transmission errors
Restore the information to its original form
Example: Human Speech
Source = human thought, the speaker’s brain
Encoder = Human Vocal Tract
Channel = air, sound pressure waves
Noise = background noise
Decoder = human auditory system
Receiver = human thought, the listener’s brain
Example: Morse Code
Morse code: a series of dots and
dashes representing letters
The most frequent letter (“E”) is sent
with the shortest codeword: a single dot
Note: codewords might be prefixes
of each other (e.g., “E” and “F”).
Uses only binary data (single-current
telegraph, a size-two “alphabet”);
more symbols are possible (e.g., three
with a double-current telegraph), but
that is more susceptible to noise (one
reason computers are binary rather
than ternary).
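The idea of giving frequent letters shorter codewords can be sketched in Python with a small fragment of the real Morse table (the `encode` helper and letter selection are illustrative):

```python
# A fragment of International Morse Code: frequent letters
# get shorter codewords, rare letters longer ones.
MORSE = {
    "E": ".",      # most frequent English letter: a single dot
    "T": "-",
    "A": ".-",
    "F": "..-.",   # note: "E" (".") is a prefix of "F" ("..-.")
    "Q": "--.-",   # a rare letter gets a longer codeword
}

def encode(text, sep=" "):
    # A separator between letters is needed precisely because
    # Morse codewords can be prefixes of one another.
    return sep.join(MORSE[c] for c in text.upper())

print(encode("FEET"))  # "..-. . . -"
```

Because of the prefix property, Morse relies on pauses between letters; prefix-free codes (covered later in the course) avoid that need.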
On Error
How do we decrease errors in a communications system?
Physical: use more reliable components in circuitry, broaden
spectral bandwidth, use more precise and expensive
electronics, increase signal power.
All of this is more expensive.
Question: Given a fixed imperfect analog channel and
transmission equipment, can we achieve perfect
communication over an imperfect communication line?
Yes: Key is to add redundancy to signal.
Encoder adds redundancy appropriate for channel. Decoder
exploits and then removes redundancy.
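The encode/channel/decode pipeline with added redundancy can be sketched with the simplest possible scheme, a repetition code with majority-vote decoding (a toy illustration, not Shannon's construction; all names are illustrative):

```python
import random

def encode(bits, n=3):
    """Repetition code: add redundancy by repeating each bit n times."""
    return [b for b in bits for _ in range(n)]

def channel(bits, p_flip=0.1, rng=None):
    """Binary symmetric channel: each bit flips with probability p_flip."""
    rng = rng or random.Random(0)
    return [b ^ (rng.random() < p_flip) for b in bits]

def decode(bits, n=3):
    """Majority vote over each group of n repeats removes most errors."""
    return [int(sum(bits[i:i + n]) > n // 2) for i in range(0, len(bits), n)]

msg = [1, 0, 1, 1, 0]
received = channel(encode(msg))
print(decode(received))
```

Decoding succeeds whenever at most one bit per group of three is flipped; Shannon's theorems show far more efficient codes exist, which is a central topic of the course.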
The Meaning of Information
For years, telegraph messages were “compressed” by leaving
certain words out, or sending key words that stood for longer
messages, since costs were determined by the number of
words sent.
Yet people could easily read these abbreviated messages, since
they supplied the predictable words, such as “a” and “the.”
For Shannon, information is symbols that contain
unpredictable news, like our sentence, “only infrmatn esentil
to understandn mst b tranmitd.”
The predictable symbols that we can leave out, which Shannon
calls redundancy, are not really news.
The Meaning of Information
Another example is coin flipping. Each time we flip a
coin, we can transmit which way it lands, “heads” or
“tails,” by transmitting a code of “zero” or “one.”
But what if the coin has two “heads” and everyone knows
it? Since there is no uncertainty concerning the outcome
of a flip, no message need be sent at all.
Outline of the Course
Probability and statistics
Entropy, relative entropy, and
mutual information
Data compression
Gambling
Channel capacity
Grading Policy
Class attendance (x 0.1)
Midterm examination (x 0.3)
Final examination (x 0.6)
THANK YOU!