Week 7
Pawan Goyal
Week 7, Lecture 1
What is Semantics?
The study of meaning: Relation between symbols and their denotata.
John told Mary that the train moved out of the station at 3 o'clock.
Computational Semantics
The study of how to automate the process of constructing and reasoning with meaning representations of natural language expressions.

Formal Semantics: Construction of precise mathematical models of the relations between expressions in a natural language and the world.
John chases a bat → ∃x[bat(x) ∧ chase(john, x)]

Distributional Semantics: The study of statistical patterns of human word usage to extract semantics.
The Distributional Hypothesis
“The meaning of a word is its use in the language.” (Wittgenstein, 1953)
→ Word meaning (whatever it might be) is reflected in linguistic distributions.
“Words that occur in the same contexts tend to have similar meanings.” (Zellig Harris, 1968)
“If we consider words or morphemes A and B to be more different in meaning than A and C, then we will often find that the distributions of A and B are more different than the distributions of A and C. In other words, difference in meaning correlates with difference of distribution.” (Zellig Harris, “Distributional Structure”)
Differential and not referential
Contextual representation
A word's contextual representation is an abstract cognitive structure that accumulates from encounters with the word in various linguistic contexts.
We learn new words based on contextual cues:
He filled the wampimuk with the substance, passed it around and we all drank some.
We found a little wampimuk sleeping behind the tree.
The semantic content is represented by a vector.
Vectors are obtained through the statistical analysis of the linguistic contexts of a word.
Alternative names: corpus-based semantics, statistical semantics, geometrical models of meaning, vector semantics, word space models.
Distributions are vectors in a multidimensional semantic space, that is, objects with a magnitude and a direction.
The semantic space has dimensions which correspond to possible contexts, as gathered from a given corpus.
For example: cat = [... dog 0.8, eat 0.7, joke 0.01, mansion 0.2, ...]
In practice, many more dimensions are used.
Small Dataset
An automobile is a wheeled motor vehicle used for transporting passengers.
A car is a form of transport, usually with four wheels and the capacity to carry around five passengers.
Transport for the London games is limited, with spectators strongly advised to avoid the use of cars.
The London 2012 soccer tournament began yesterday, with plenty of goals in the opening matches.
Giggs scored the first goal of the football tournament at Wembley, North London.
Bellamy was largely a passenger in the football match, playing no part in either goal.

Target words: ⟨automobile, car, soccer, football⟩
Term vocabulary: ⟨wheel, transport, passenger, tournament, London, goal, match⟩
Define a context window: a number of words surrounding the target word.
The context can in general be defined in terms of documents, paragraphs or sentences.
Count the number of times the target word co-occurs with the context words: this gives the co-occurrence matrix.
Build vectors out of (a function of) these co-occurrence counts.
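A minimal sketch of counting such window co-occurrences, producing a matrix like the one below. The whitespace tokenization, lemmatization (e.g. mapping "wheels" to "wheel") and window size are assumptions, not fixed by the slides:

```python
from collections import Counter

def cooccurrence(sentences, targets, contexts, window=5):
    """Count how often each target co-occurs with each context word
    within +/- `window` positions, sentence by sentence."""
    counts = {t: Counter() for t in targets}
    for tokens in sentences:                      # each sentence: a list of lemmatized tokens
        for i, tok in enumerate(tokens):
            if tok not in targets:
                continue
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i and tokens[j] in contexts:
                    counts[tok][tokens[j]] += 1
    return counts

# Toy usage (tokens assumed lemmatized, e.g. "wheels" -> "wheel"):
sents = [["automobile", "wheel", "motor", "vehicle", "transport", "passenger"]]
print(cooccurrence(sents, ["automobile"], ["wheel", "transport", "passenger"]))
```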
            wheel  transport  passenger  tournament  London  goal  match
automobile    1        1          1           0         0      0     0
car           1        2          1           0         1      0     0
soccer        0        0          0           1         1      1     1
football      0        0          1           1         1      2     1
[Figure: the four target words plotted in the 2-D subspace spanned by the transport (x-axis) and goal (y-axis) dimensions: automobile (1,0), car (2,0), soccer (0,1), football (0,2).]
Pawan Goyal
Week 7, Lecture 2
One-hot representation
[Figure: context words in a window around the target word; these words will represent "banking".]
The "mathematical" steps
Count the target-context co-occurrences
⇓
Weight the contexts (optional)
⇓
Build the distributional matrix
⇓
Reduce the matrix dimensions (optional)
⇓
Compute the vector distances on the (reduced) matrix
General Questions
How do the rows (words, ...) relate to each other?
How do the columns (contexts, documents, ...) relate to each other?
Which type of context?
Which weighting scheme?
Which similarity measure?
...
A specific parameter setting determines a particular type of DSM (e.g. LSA, HAL, etc.)
Parameters
Window size
Window shape: rectangular/triangular/other

Example text:
Suspected communist rebels on 4 July 1989 killed Col. Herminio Taylo, police chief of Makati, the Philippines' major financial center, in an escalation of street violence sweeping the Capitol area. The gunmen shouted references to the rebel New People's Army. They fled in a commandeered passenger jeep. The military says communist rebels have killed up to 65 soldiers and police in the Capitol region since January.
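The slides leave the exact window weighting open; a common convention (an assumption here) is that a rectangular window counts every position equally, while a triangular window down-weights more distant words linearly:

```python
def window_weights(window, shape="rectangular"):
    """Weight for a context word at distance d = 1..window from the target.
    Rectangular: every position counts 1; triangular: closer words count more."""
    if shape == "rectangular":
        return {d: 1.0 for d in range(1, window + 1)}
    if shape == "triangular":
        return {d: (window - d + 1) / window for d in range(1, window + 1)}
    raise ValueError(shape)

print(window_weights(4, "triangular"))   # {1: 1.0, 2: 0.75, 3: 0.5, 4: 0.25}
```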
Document length (|Di|): how many words appear in the document? F ∝ 1/|Di|
Document frequency (Nj): number of documents in which a word appears. F ∝ 1/Nj

Indexing weight: tf-idf
fij · log(N/Nj) for each term; then normalize the weight in a document with respect to the L2-norm.
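A minimal sketch of this tf-idf weighting with L2 normalization (the data and variable names below are illustrative, not from the slides):

```python
import math

def tfidf(doc_term_freqs):
    """doc_term_freqs: list of dicts mapping term -> raw frequency f_ij in that document.
    Returns the same structure with weights f_ij * log(N / N_j), L2-normalized per document."""
    N = len(doc_term_freqs)
    df = {}
    for doc in doc_term_freqs:
        for term in doc:
            df[term] = df.get(term, 0) + 1          # N_j: documents containing the term
    weighted = []
    for doc in doc_term_freqs:
        w = {t: f * math.log(N / df[t]) for t, f in doc.items()}
        norm = math.sqrt(sum(v * v for v in w.values())) or 1.0
        weighted.append({t: v / norm for t, v in w.items()})
    return weighted

docs = [{"oil": 3, "market": 1}, {"market": 2, "football": 1}]
print(tfidf(docs))
```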
Basic intuition

word1  word2         freq(1,2)  freq(1)   freq(2)
dog    small            855      33,338   490,580
dog    domesticated      29      33,338       918

Association measures are used to give more weight to contexts that are more significantly associated with a target word.
The less frequent the target and the context element are, the higher the weight given to their co-occurrence count should be.
⇒ Co-occurrence with the frequent context element small is less informative than co-occurrence with the rarer domesticated.
Different measures can be used, e.g., mutual information, log-likelihood ratio.
PMI(w1, w2) = log2 [ Pcorpus(w1, w2) / Pind(w1, w2) ] = log2 [ Pcorpus(w1, w2) / (Pcorpus(w1) · Pcorpus(w2)) ]

where Pcorpus(w1, w2) = freq(w1, w2) / N and Pcorpus(w) = freq(w) / N.
Positive PMI
All PMI values less than zero are replaced with zero.

PMI is biased towards rare events. Consider wj having the maximum association with wi, i.e. Pcorpus(wi) ≈ Pcorpus(wj) ≈ Pcorpus(wi, wj): PMI then increases as the probability of wi decreases. Also, consider a word wj that occurs only once in the corpus, and that occurrence is in the context of wi. A discounting factor proposed by Pantel and Lin:

δij = [fij / (fij + 1)] · [min(fi, fj) / (min(fi, fj) + 1)]
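A hedged sketch of PPMI with the Pantel-Lin discount. The total count N below is invented for illustration, and clamping at zero after discounting is one reasonable way to compose the two steps, not the only one:

```python
import math

def ppmi(f_ij, f_i, f_j, N, discount=True):
    """Positive PMI of target i and context j from raw counts.
    f_ij: co-occurrence count, f_i/f_j: marginal counts, N: total count."""
    if f_ij == 0:
        return 0.0
    pmi = math.log2((f_ij / N) / ((f_i / N) * (f_j / N)))
    if discount:                                  # Pantel & Lin discounting factor
        m = min(f_i, f_j)
        pmi *= (f_ij / (f_ij + 1)) * (m / (m + 1))
    return max(pmi, 0.0)

# dog/small vs dog/domesticated from the table above (N is illustrative, not from the slides)
N = 10_000_000
print(ppmi(855, 33_338, 490_580, N))   # frequent context 'small': low (here zero) weight
print(ppmi(29, 33_338, 918, N))        # rarer context 'domesticated': higher weight
```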
petroleum: oil:0.032 gas:0.029 crude:0.029 barrels:0.028 exploration:0.027 barrel:0.026 opec:0.026 refining:0.026 gasoline:0.026 fuel:0.025 natural:0.025 exporting:0.025
drug: trafficking:0.029 cocaine:0.028 narcotics:0.027 fda:0.026 police:0.026 abuse:0.026 marijuana:0.025 crime:0.025 colombian:0.025 arrested:0.025 addicts:0.024
insurance: insurers:0.028 premiums:0.028 lloyds:0.026 reinsurance:0.026 underwriting:0.025 pension:0.025 mortgage:0.025 credit:0.025 investors:0.024 claims:0.024 benefits:0.024
forest: timber:0.028 trees:0.027 land:0.027 forestry:0.026 environmental:0.026 species:0.026 wildlife:0.026 habitat:0.025 tree:0.025 mountain:0.025 river:0.025 lake:0.025
robotics: robots:0.032 automation:0.029 technology:0.028 engineering:0.026 systems:0.026 sensors:0.025 welding:0.025 computer:0.025 manufacturing:0.025 automated:0.025
Distributional Semantics: Applications, Structured Models
Pawan Goyal
Week 7, Lecture 3
Application to Query Expansion: Addressing Term Mismatch
User query: insurance cover which pays for long term care.
A relevant document may contain terms different from the actual user query.
Some relevant words concerning this query: {medicare, premiums, insurers}
Query Expansion using Unstructured DSMs
TREC Topic 104: catastrophic health insurance
Query Representation: surtax:1.0 hcfa:0.97 medicare:0.93 hmos:0.83 medicaid:0.8 hmo:0.78 beneficiaries:0.75 ambulatory:0.72 premiums:0.72 hospitalization:0.71 hhs:0.7 reimbursable:0.7 deductible:0.69
Broad expansion terms: medicare, beneficiaries, premiums . . .

TREC Topic 355: ocean remote sensing
Query Representation: radiometer:1.0 landsat:0.97 ionosphere:0.94 cnes:0.84 altimeter:0.83 nasda:0.81 meterology:0.81 cartography:0.78 geostationary:0.78 doppler:0.78 oceanographic:0.76
Broad expansion terms: radiometer, landsat, ionosphere . . .
Specific domain terms: CNES (Centre National d'Études Spatiales) and NASDA (National Space Development Agency of Japan)
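The slides do not spell out the exact expansion procedure; one simple unstructured-DSM approach (an assumption on my part) is to sum the weighted context vectors of the query terms and keep the highest-scoring dimensions as expansion terms:

```python
import numpy as np

def expansion_terms(query_terms, dsm, vocab, k=10):
    """Rank candidate expansion terms by summing the distributional vectors
    of the query terms and reading off the highest-weighted context dimensions.
    dsm: dict word -> np.array over `vocab` dimensions (e.g. PPMI-weighted)."""
    profile = np.zeros(len(vocab))
    for q in query_terms:
        if q in dsm:
            profile += dsm[q]
    ranked = sorted(zip(vocab, profile), key=lambda p: -p[1])
    return [(w, round(float(s), 2)) for w, s in ranked[:k] if w not in query_terms]

# toy vectors, purely illustrative
vocab = ["medicare", "premiums", "insurers", "goal"]
dsm = {"insurance": np.array([0.6, 0.9, 0.8, 0.0]),
       "health":    np.array([0.7, 0.2, 0.1, 0.0])}
print(expansion_terms(["insurance", "health"], dsm, vocab))
```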
Similarity Measures for Binary Vectors
Let X and Y denote the binary distributional vectors for words X and Y.

Similarity Measures
Dice coefficient: 2|X ∩ Y| / (|X| + |Y|)
Jaccard coefficient: |X ∩ Y| / |X ∪ Y|
Overlap coefficient: |X ∩ Y| / min(|X|, |Y|)

The Jaccard coefficient penalizes a small number of shared entries, while the Overlap coefficient uses the concept of inclusion.
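A direct translation of the three binary-vector measures, representing each word by the set of its 'on' dimensions (toy data):

```python
def dice(X, Y):
    """Dice coefficient of two binary vectors given as sets of 'on' dimensions."""
    return 2 * len(X & Y) / (len(X) + len(Y))

def jaccard(X, Y):
    return len(X & Y) / len(X | Y)

def overlap(X, Y):
    return len(X & Y) / min(len(X), len(Y))

X = {"wheel", "transport", "passenger"}
Y = {"transport", "passenger", "goal", "match"}
print(dice(X, Y), jaccard(X, Y), overlap(X, Y))   # 0.571..., 0.4, 0.666...
```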
Similarity Measures for Vector Spaces
X = [x1, x2, ..., xn], Y = [y1, y2, ..., yn]

Similarity Measures
Cosine similarity: cos(X, Y) = X · Y / (|X| |Y|)
Euclidean distance: |X − Y| = √( Σi (xi − yi)² )
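The two vector-space measures applied to rows of the toy co-occurrence matrix built earlier; football turns out to be much closer to soccer than to car:

```python
import math

def cosine(x, y):
    dot = sum(a * b for a, b in zip(x, y))
    norm = math.sqrt(sum(a * a for a in x)) * math.sqrt(sum(b * b for b in y))
    return dot / norm

def euclidean(x, y):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

car      = [1, 2, 1, 0, 1, 0, 0]   # rows of the co-occurrence matrix above
soccer   = [0, 0, 0, 1, 1, 1, 1]
football = [0, 0, 1, 1, 1, 2, 1]
print(cosine(football, soccer), cosine(football, car))   # ~0.88 vs ~0.27
print(euclidean(football, soccer))
```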
Similarity Measure for Probability Distributions
The distributions p and q are obtained from the (normalized) distributional vectors.

Similarity Measures
KL-divergence: D(p||q) = Σi pi log(pi / qi)
Information Radius: D(p || (p+q)/2) + D(q || (p+q)/2)
L1-norm: Σi |pi − qi|
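A minimal sketch of the three measures; p and q are assumed to be already-normalized distributions, and qi is assumed positive wherever pi is:

```python
import math

def kl(p, q):
    """KL-divergence D(p||q) = sum_i p_i log(p_i / q_i)."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def information_radius(p, q):
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return kl(p, m) + kl(q, m)

def l1(p, q):
    return sum(abs(pi - qi) for pi, qi in zip(p, q))

# distributions obtained by normalizing two rows (illustrative values)
p = [0.5, 0.3, 0.2]
q = [0.4, 0.4, 0.2]
print(information_radius(p, q), l1(p, q))
```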
Attributional Similarity vs. Relational Similarity
Attributional Similarity
The attributional similarity between two words a and b depends on the degree of correspondence between the properties of a and b.
Ex: dog and wolf
Relational Similarity
Two pairs (a, b) and (c, d) are relationally similar if they have many similar relations.
Ex: dog : bark and cat : meow
Relational Similarity: Pair-pattern matrix
Pair-pattern matrix
Row vectors correspond to pairs of words, such as mason : stone and carpenter : wood
Column vectors correspond to the patterns in which the pairs occur, e.g. X cuts Y and X works with Y
Compute the similarity of rows to find similar pairs
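A toy sketch of building a pair-pattern count matrix from (pair, pattern) observations; the observations themselves are invented for illustration, and similar rows indicate relationally similar pairs:

```python
from collections import Counter

# toy (pair, pattern) observations extracted from a corpus (illustrative, not from the slides)
observations = [
    (("mason", "stone"), "X cuts Y"),
    (("mason", "stone"), "X works with Y"),
    (("carpenter", "wood"), "X cuts Y"),
    (("carpenter", "wood"), "X works with Y"),
    (("dog", "bark"), "X lets out a Y"),
]

matrix = Counter(observations)                   # pair-pattern co-occurrence counts
pairs = sorted({p for p, _ in matrix})
patterns = sorted({pat for _, pat in matrix})
rows = {p: [matrix[(p, pat)] for pat in patterns] for p in pairs}
print(rows[("mason", "stone")], rows[("carpenter", "wood")])   # identical rows here
```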
Structured DSMs
Basic Issue
Words may not be the basic context units anymore
How to capture and represent syntactic information?
X solves Y and Y is solved by X

An Ideal Formalism
Should mirror semantic relationships as closely as possible
Incorporate word-based information and syntactic analysis
Should be applicable to different languages
Structured DSMs
[Figure: dependency parse of a sentence like "… eats a red apple", with the 'object' and 'modifier' relations marked.]
'eat' is not a legitimate context for 'red'.
The 'object' relation connecting 'eat' and 'apple' is treated as a different type of co-occurrence from the 'modifier' relation linking 'red' and 'apple'.
Structured DSMs: Words as 'legitimate' contexts
Co-occurrence statistics are collected using parser-extracted relations.
To qualify as context of a target item, a word must be linked to it by some (interesting) lexico-syntactic relation.
Word vectors are built from parser-extracted dependency triples such as <system, dobj, affects>, ...
Corpus-derived ternary data can also be mapped onto a 2-way matrix.
2-way matrix
The dependency information can be dropped:
<system, dobj, affects> ⇒ <system, affects>
<virus, nsubj, affects> ⇒ <virus, affects>
Alternatively, the link and one word can be concatenated and treated as attributes:
virus = {nsubj-affects: 0.05, . . .}
system = {dobj-affects: 0.03, . . .}
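A sketch of both mappings of ternary data onto a 2-way matrix; the triples and counts below are illustrative:

```python
from collections import defaultdict

triples = [                      # corpus-derived ternary data (illustrative counts)
    ("virus", "nsubj", "affects", 5),
    ("system", "dobj", "affects", 3),
    ("virus", "nsubj", "spreads", 2),
]

dropped = defaultdict(int)       # option 1: drop the dependency label
concat = defaultdict(int)        # option 2: concatenate link + word into an attribute
for word, rel, head, count in triples:
    dropped[(word, head)] += count
    concat[(word, f"{rel}-{head}")] += count

print(dict(dropped))   # {('virus', 'affects'): 5, ('system', 'affects'): 3, ...}
print(dict(concat))    # {('virus', 'nsubj-affects'): 5, ('system', 'dobj-affects'): 3, ...}
```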
Structured DSMs for Selectional Preferences
Structured DSMs can also be used to model selectional preference.
From a parsed corpus, noun vectors are calculated as shown for 'virus' and 'system'.

            obj-carry  obj-buy  obj-drive  obj-eat  obj-store  sub-fly  ...
car            0.1       0.4      0.8       0.02      0.2       0.05    ...
vegetable      0.3       0.5      0         0.6       0.3       0.05    ...
biscuit        0.4       0.4      0         0.5       0.4       0.02    ...
...            ...       ...      ...       ...       ...       ...     ...
Selectional Preferences
Suppose we want to compute the selectional preferences of the nouns as objects of the verb 'eat'.
The n nouns having the highest weight in the dimension 'obj-eat' are selected; let {vegetable, biscuit, . . .} be the set of these n nouns.
The complete vectors of these n nouns are used to obtain an 'object prototype' of the verb.
The 'object prototype' will indicate various attributes, e.g. that these nouns can be consumed, bought, carried, stored etc.
The similarity of a noun to this 'object prototype' is used to denote the plausibility of that noun being an object of the verb 'eat'.
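A sketch of the whole procedure on the toy table above. The slides do not fix how the prototype is built or which similarity is used; averaging the top-n vectors and using cosine similarity are assumptions:

```python
import math

nouns = {   # rows of the table above: obj-carry, obj-buy, obj-drive, obj-eat, obj-store, sub-fly
    "car":       [0.1, 0.4, 0.8, 0.02, 0.2, 0.05],
    "vegetable": [0.3, 0.5, 0.0, 0.6,  0.3, 0.05],
    "biscuit":   [0.4, 0.4, 0.0, 0.5,  0.4, 0.02],
}
OBJ_EAT = 3   # index of the 'obj-eat' dimension

def cosine(x, y):
    dot = sum(a * b for a, b in zip(x, y))
    return dot / (math.sqrt(sum(a * a for a in x)) * math.sqrt(sum(b * b for b in y)))

# 1. pick the n nouns with the highest 'obj-eat' weight
n = 2
top = sorted(nouns, key=lambda w: -nouns[w][OBJ_EAT])[:n]          # ['vegetable', 'biscuit']
# 2. average their full vectors to get the 'object prototype' of 'eat'
proto = [sum(nouns[w][i] for w in top) / n for i in range(6)]
# 3. plausibility of a candidate noun as object of 'eat' = similarity to the prototype
print(cosine(nouns["car"], proto), cosine(nouns["biscuit"], proto))  # car is far less plausible
```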
Word Embeddings - Part I
Pawan Goyal
Week 7, Lecture 4
One-hot representation
Each element of the vector is associated with a word in the vocabulary. The encoding of a given word is simply the vector in which the corresponding element is set to one, and all other elements are zero.
Suppose our vocabulary has only five words: King, Queen, Man, Woman, and Child.
We could encode the word 'Queen' as:
[Figure: the one-hot vector for 'Queen', e.g. (0, 1, 0, 0, 0).]
Word vectors are not comparable
Using such an encoding, there is no meaningful comparison we can make between word vectors other than equality testing.
Word embeddings
Instead, each word can be represented as wi ∈ R^d, i.e., a d-dimensional vector, which is mostly learnt!
Each word is represented by a distribution of weights across those elements. So instead of a one-to-one mapping between an element in the vector and a word, the representation of a word is spread across all of the elements in the vector, and each element in the vector contributes to the definition of many words.
d is typically in the range 50 to 1000.
Similar words should have similar embeddings.
SVD can also be thought of as an embedding method.
It has been found that the learned word representations in fact capture meaningful syntactic and semantic regularities in a very simple way.
Specifically, the regularities are observed as constant vector offsets between pairs of words sharing a particular relationship.
If we denote the vector for word i as xi, and focus on the singular/plural relation, we observe that
xapple − xapples ≈ xcar − xcars ≈ xfamily − xfamilies
and so on.
Perhaps more surprisingly, we find that this is also the case for a variety of semantic relations.
Good at answering analogy questions:
a is to b, as c is to ?
man is to woman as uncle is to ? (aunt)
A simple vector offset method based on cosine distance shows the relation.
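A minimal sketch of the vector offset method with cosine similarity; the 2-dimensional vectors are purely illustrative (real embeddings have 50-1000 dimensions):

```python
import numpy as np

def analogy(a, b, c, vectors, exclude=True):
    """Return the word d such that a : b :: c : d, using the vector offset method:
    find the word whose vector is most cosine-similar to x_b - x_a + x_c."""
    target = vectors[b] - vectors[a] + vectors[c]
    target /= np.linalg.norm(target)
    best, best_sim = None, -1.0
    for w, v in vectors.items():
        if exclude and w in (a, b, c):
            continue
        sim = np.dot(v, target) / np.linalg.norm(v)
        if sim > best_sim:
            best, best_sim = w, sim
    return best

vecs = {"man": np.array([1.0, 0.0]), "woman": np.array([1.0, 1.0]),
        "uncle": np.array([2.0, 0.0]), "aunt": np.array([2.0, 1.0])}
print(analogy("man", "woman", "uncle", vecs))   # -> 'aunt'
```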
Analogy Testing
[Figures: analogy-test examples; nearest neighbours of the composite vector come up with impressive answers.]
Pawan Goyal
Week 7, Lecture 5
"… capture a large number of syntactic and semantic word relationships."
Imagine a sliding window over the text, that includes the central word currently in focus, together with the four words that precede it, and the four words that follow it:
Continuous Bag-of-Words (CBOW)
The training objective is to maximize the conditional probability of observing the actual output word (the focus word) given the input context words, with regard to the weights.
In our example, given the input ("an", "efficient", "method", "for", "high", "quality", "distributed", "vector"), we want to maximize the probability of getting "learning" as the output.
Since our input vectors are one-hot, multiplying an input vector by the weight matrix W1 amounts to simply selecting a row from W1.
Given C input word vectors, the activation function for the hidden layer h amounts to simply summing the corresponding 'hot' rows in W1, and dividing by C to take their average.
From the hidden layer to the output layer, the second weight matrix W2 can be used to compute a score for each word in the vocabulary, and softmax can be used to obtain the posterior distribution of words.
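A minimal numpy sketch of the CBOW forward pass described above (toy sizes; the training/backpropagation step is omitted):

```python
import numpy as np

V, d = 10, 4                      # vocabulary size and embedding dimension (toy values)
rng = np.random.default_rng(0)
W1 = rng.normal(size=(V, d))      # input->hidden weights: row i is the vector of word i
W2 = rng.normal(size=(d, V))      # hidden->output weights

def cbow_forward(context_ids, W1, W2):
    """Forward pass of CBOW: average the W1 rows of the context words,
    score every vocabulary word with W2, and softmax into a distribution."""
    h = W1[context_ids].mean(axis=0)          # hidden layer: average of the 'hot' rows
    scores = h @ W2                           # one score per vocabulary word
    exp = np.exp(scores - scores.max())       # numerically stable softmax
    return exp / exp.sum()

p = cbow_forward([1, 3, 5, 7], W1, W2)        # probability of each word being the focus word
print(p.argmax(), p.sum())                    # predicted word id; probabilities sum to 1
```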
Skip-gram
The activation function for the hidden layer simply amounts to copying the corresponding row from the weight matrix W1 (linear) as we saw before.
At the output layer, we now output C multinomial distributions instead of just one.
The training objective is to minimize the summed prediction error across all context words in the output layer. In our example, the input would be "learning", and we hope to see ("an", "efficient", "method", "for", "high", "quality", "distributed", "vector") at the output layer.
Details
Predict surrounding words in a window of length c of each word.
Objective Function: maximize the log probability of any context word given the current center word:

J(θ) = (1/T) Σ_{t=1}^{T} Σ_{−c ≤ j ≤ c, j ≠ 0} log p(w_{t+j} | w_t)
p(wO | wI) = exp(v′_wO · v_wI) / Σ_{w=1}^{W} exp(v′_w · v_wI)

where v and v′ are the "input" and "output" vector representations of w (so every word has two vectors).
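The same softmax written as a small numpy function (toy, randomly initialized vectors):

```python
import numpy as np

def skipgram_prob(center_id, context_id, V_in, V_out):
    """p(w_O | w_I) with the softmax over all W output vectors.
    V_in[i] is the 'input' vector v_i, V_out[i] the 'output' vector v'_i."""
    scores = V_out @ V_in[center_id]          # v'_w . v_{w_I} for every word w
    scores -= scores.max()                    # numerical stability
    probs = np.exp(scores) / np.exp(scores).sum()
    return probs[context_id]

W, d = 10, 4
rng = np.random.default_rng(1)
V_in, V_out = rng.normal(size=(W, d)), rng.normal(size=(W, d))
print(skipgram_prob(2, 7, V_in, V_out))
```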
Each parameter θj is updated iteratively using the gradient of the objective:

θj^(new) = θj^(old) − α · ∂J(θ)/∂θj^(old)
Since every word has two vector representations, the input and output matrices can be combined into a final representation:
Lfinal = L + L′

A good tutorial to understand parameter learning:
https://arxiv.org/pdf/1411.2738.pdf

An interactive demo:
https://ronxin.github.io/wevi/
GloVe
Combines the best of both worlds – count-based methods as well as direct prediction methods
Fast training
Scalable to huge corpora
Good performance even with a small corpus, and small vectors
Code and vectors: http://nlp.stanford.edu/projects/glove/
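The slides only summarize GloVe's properties; for reference, a sketch of the weighted least-squares loss the model optimizes. The weighting parameters x_max and alpha follow the GloVe paper's defaults; everything else below is toy data:

```python
import numpy as np

def glove_loss(X, W, W_tilde, b, b_tilde, x_max=100, alpha=0.75):
    """GloVe weighted least-squares loss over a co-occurrence matrix X:
    sum_ij f(X_ij) * (w_i . w~_j + b_i + b~_j - log X_ij)^2."""
    loss = 0.0
    for i, j in zip(*np.nonzero(X)):
        f = min(1.0, (X[i, j] / x_max) ** alpha)          # weighting function f(X_ij)
        diff = W[i] @ W_tilde[j] + b[i] + b_tilde[j] - np.log(X[i, j])
        loss += f * diff * diff
    return loss

V, d = 5, 3
rng = np.random.default_rng(2)
X = rng.integers(0, 10, size=(V, V)).astype(float)        # toy co-occurrence counts
W, W_t = rng.normal(size=(V, d)), rng.normal(size=(V, d))
b, b_t = np.zeros(V), np.zeros(V)
print(glove_loss(X, W, W_t, b, b_t))
```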