Natural Language Processing
Mahmmoud Mahdi
Language Modeling
N-Grams
Probabilistic Language Models
Goal: assign a probability to a sentence
● Machine Translation:
○ P(high winds tonite) > P(large winds tonite)
● Spell Correction
○ The office is about fifteen minuets from my house
■ P(about fifteen minutes from) > P(about fifteen minuets from)
● Speech Recognition
○ P(I saw a van) >> P(eyes awe of an)
● + Summarization, question-answering, etc., etc.!!
Probabilistic Language Modeling
● Goal: compute the probability of a sequence of words:
P(W) = P(w1,w2,w3,w4,w5…wn)
● Related task: probability of an upcoming word:
P(w5|w1,w2,w3,w4)
● A model that computes P(W) or P(wn|w1,w2…wn-1) is called a
language model.
● A better name would be "the grammar", but "language model" (LM) is the standard term
How to compute P(W)
● How to compute this joint probability:
○ P(its, water, is, so, transparent, that)
● Intuition: let’s rely on the Chain Rule of Probability
Reminder: The Chain Rule
● Recall the definition of conditional probabilities
P(B|A) = P(A,B) / P(A)
Rewriting: P(A,B) = P(A) P(B|A)
● More variables:
P(A,B,C,D) = P(A)P(B|A)P(C|A,B)P(D|A,B,C)
● The Chain Rule in General
P(x1,x2,x3,…,xn) = P(x1)P(x2|x1)P(x3|x1,x2)…P(xn|x1,…,xn-1)
The Chain Rule applied to compute joint probability of
words in sentence
P(“its water is so transparent”) =
P(its) × P(water|its) × P(is|its water)
× P(so|its water is) × P(transparent|its water is so)
How to estimate these probabilities
● Could we just count and divide?
○ No! Too many possible sentences!
○ We'll never see enough data to estimate these counts
Markov Assumption
Simplifying assumption (due to Andrei Markov):
P(the | its water is so transparent that) ≈ P(the | that)
Or maybe:
P(the | its water is so transparent that) ≈ P(the | transparent that)
Markov Assumption
P(w1,w2,…,wn) ≈ ∏ P(wi | wi-k,…,wi-1)
● In other words, we approximate each component in the product:
P(wi | w1,…,wi-1) ≈ P(wi | wi-k,…,wi-1)
Simplest case: Unigram model
P(w1,w2,…,wn) ≈ ∏ P(wi)
● Automatically generated sentences from a unigram model read like unordered word lists, because every word is sampled independently of its neighbors
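As a minimal sketch of what "generating from a unigram model" means, the snippet below samples each word independently from a toy distribution; the probabilities and the </s> stopping convention are invented for illustration, not taken from any corpus in these slides.

```python
import random

# Toy unigram distribution (invented for illustration only).
unigram_probs = {
    "the": 0.20, "of": 0.10, "is": 0.10, "water": 0.05,
    "transparent": 0.05, "so": 0.05, "its": 0.05, "</s>": 0.40,
}

def generate_unigram_sentence(probs, max_len=20):
    """Sample words independently until </s> or max_len is reached."""
    words = []
    choices, weights = list(probs), list(probs.values())
    for _ in range(max_len):
        w = random.choices(choices, weights=weights, k=1)[0]
        if w == "</s>":
            break
        words.append(w)
    return " ".join(words)

print(generate_unigram_sentence(unigram_probs))
```

Because each draw ignores the previous words, the output is word salad, which is exactly why the unigram model is only the simplest baseline.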
Bigram model
● Condition on the previous word:
P(wi | w1,w2,…,wi-1) ≈ P(wi | wi-1)
N-gram models
● We can extend to trigrams, 4-grams, 5-grams (a short sketch follows this list)
● In general this is an insufficient model of language, because language has long-distance dependencies
● But we can often get away with N-gram models
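A quick sketch of how the "N" just sets the context window size: the helper below extracts padded N-grams from a tokenized sentence (the function name and the <s>/</s> padding convention are my own choices, not something defined in these slides).

```python
def ngrams(tokens, n):
    """Return all n-grams of a sentence, padded with <s> and </s> markers."""
    padded = ["<s>"] * (n - 1) + tokens + ["</s>"]
    return [tuple(padded[i:i + n]) for i in range(len(padded) - n + 1)]

sent = "its water is so transparent".split()
print(ngrams(sent, 2))  # bigrams
print(ngrams(sent, 3))  # trigrams
```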
Language Modeling
Estimating N-gram Probabilities
Estimating bigram probabilities
● The Maximum Likelihood Estimate:
P(wi | wi-1) = count(wi-1, wi) / count(wi-1)
An example
<s> I am Sam </s>
<s> Sam I am </s>
<s> I do not like green eggs and ham </s>
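A minimal count-and-divide sketch of the maximum likelihood estimate on the three training sentences above; the variable and function names are mine, not part of the slides.

```python
from collections import Counter

corpus = [
    "<s> I am Sam </s>",
    "<s> Sam I am </s>",
    "<s> I do not like green eggs and ham </s>",
]

unigram_counts, bigram_counts = Counter(), Counter()
for line in corpus:
    tokens = line.split()
    unigram_counts.update(tokens)
    bigram_counts.update(zip(tokens, tokens[1:]))

def p_mle(word, prev):
    """P(word | prev) = count(prev, word) / count(prev)."""
    return bigram_counts[(prev, word)] / unigram_counts[prev]

print(p_mle("I", "<s>"))   # 2/3: <s> is followed by I in 2 of the 3 sentences
print(p_mle("am", "I"))    # 2/3
print(p_mle("Sam", "am"))  # 1/2
```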
More examples: Berkeley Restaurant Project sentences
● can you tell me about any good cantonese restaurants close by
● mid priced thai food is what i’m looking for
● tell me about chez panisse
● can you give me a listing of the kinds of food that are available
● i’m looking for a good place to eat breakfast
● when is caffe venezia open during the day
Raw bigram counts
● Out of 9222 sentences
Raw bigram probabilities
● Normalize by unigrams:
● Result:
Bigram estimates of sentence probabilities
P(<s> I want english food </s>) =
P(I|<s>)
× P(want|I)
× P(english|want)
× P(food|english)
× P(</s>|food)
= .000031
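The Berkeley bigram tables themselves are not reproduced here, but the same chaining can be sketched on the tiny Sam corpus from earlier; the hard-coded values below are the MLE bigram probabilities from those three sentences.

```python
# MLE bigram probabilities from the <s> I am Sam </s> corpus above.
bigram_p = {
    ("<s>", "I"): 2/3,
    ("I", "am"): 2/3,
    ("am", "Sam"): 1/2,
    ("Sam", "</s>"): 1/2,
}

def sentence_prob(tokens, probs):
    """Multiply P(w_i | w_{i-1}) over consecutive token pairs."""
    p = 1.0
    for prev, word in zip(tokens, tokens[1:]):
        p *= probs.get((prev, word), 0.0)  # unseen bigram -> probability 0
    return p

print(sentence_prob("<s> I am Sam </s>".split(), bigram_p))  # 1/9 ≈ 0.111
```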
What kinds of knowledge?
● P(english | want) = .0011
● P(chinese | want) = .0065
● P(to | want) = .66
● P(eat | to) = .28
● P(food | to) = 0
● P(want | spend) = 0
● P(i | <s>) = .25
Practical Issues
● We do everything in log space
○ avoids numerical underflow
○ adding is also faster than multiplying
○ log(p1 × p2 × p3 × p4) = log p1 + log p2 + log p3 + log p4
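A minimal sketch of the log-space trick: summing log probabilities gives the same answer as multiplying the probabilities, without the risk of underflow on long sentences. The probability values below are illustrative only, not real corpus estimates.

```python
import math

# Illustrative bigram probabilities for one sentence (not real corpus values).
probs = [0.25, 0.33, 0.0011, 0.5, 0.68]

direct = math.prod(probs)                  # repeated multiplication
log_sum = sum(math.log(p) for p in probs)  # addition in log space

print(direct)             # ~3.1e-05
print(math.exp(log_sum))  # same value, recovered from log space
```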
Questions?