Word Embeddings Topics
NLP Session 2-Vector semantics and Embedding by Dr. Chetana Gavankar [Autosaved].pdf
List all topics covered in this pdf
The topics covered in the document are as follows:
1. Vector Semantics and Word Embedding:
Lexical semantics
Vector semantics
Word vectors
TF-IDF
Word2Vec (Skip-gram, CBOW)
GloVe
Visualizing embeddings
2. Lexical Semantics:
Morphology
Language modeling
Distributional hypothesis
Synonymy, similarity, and relatedness
Antonymy and connotation
3. Word Similarity and Relatedness:
Semantic fields
Applications in question answering, plagiarism detection, etc.
4. Computational Models of Word Meaning:
Defining words by usage
Word embeddings as vectors
Importance of context and distribution
5. Types of Word Embeddings:
Frequency-based (Count vectors, TF-IDF, Co-occurrence)
Prediction-based (Word2Vec, GloVe)
6. Techniques for Embedding:
Word2Vec (Skip-gram, CBOW)
Training embeddings (positive and negative examples)
Stochastic Gradient Descent (SGD)
7. Embedding Properties:
Sparse vs. dense vectors
Contextual embeddings (e.g., ELMo, BERT)
Historical and cultural bias in embeddings
8. Applications of Word Embeddings:
Information retrieval
Analogical reasoning
Sentiment analysis
9. Advanced Topics:
Diachronic word embeddings
Bias quantification in embeddings
Optimization of computational efficiency in embeddings
10. Comparative Methods:
Skip-gram vs. CBOW
Use cases and advantages of each method
11. Visualization and Interpretation:
Visualizing word embeddings
Analogical relations using embeddings
12. GloVe Specifics:
Word-word co-occurrence matrix
Ratios of conditional probabilities
Intuition and advantages of GloVe over Word2Vec
13. References and Resources:
Links to related literature, tutorials, and videos for further understanding.
If you need detailed insights or explanations for any specific topic, feel free to ask!
Explain each topic in detail with a numerical example wherever required. Also show a flow
diagram or chart wherever necessary.
This is a comprehensive task, and I will break down each topic from the document in detail, with
numerical examples, flow diagrams, and charts where necessary. Here's the plan:
1. Topic Explanation: Each topic will be explained conceptually.
2. Numerical Examples: Relevant topics will include numerical examples for better understanding.
3. Flow Diagrams or Charts: Topics involving processes or workflows will be complemented with
flow diagrams or charts.
Let’s start with Vector Semantics and Word Embedding:
1. Vector Semantics and Word Embedding
Concept:
Vector Semantics: Words are represented as points in a multi-dimensional space, where the
distances between points indicate semantic similarity. For example, words like "king" and "queen"
are closer in this space than "king" and "cat."
Word Embedding: A technique that maps words to dense vectors of real numbers which capture
semantic meaning. Word embeddings are used throughout NLP for text analysis and prediction tasks.
Numerical Example:
Suppose we have the following sentences:
1. "I love apples."
2. "Apples are delicious."
The co-occurrence matrix for the words could look like this:
Word I love apples are delicious
I 0 1 1 0 0
love 1 0 1 0 0
apples 1 1 0 1 1
are 0 0 1 0 1
delicious 0 0 1 1 0
Each row or column represents a vector for that word. For example:
Vector for "apples": [1, 1, 0, 1, 1]
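To make the construction concrete, here is a minimal Python sketch (assuming simple lowercase tokenization and a whole-sentence context window, so the counts are illustrative) that builds the same kind of co-occurrence matrix:

```python
from collections import defaultdict

# Toy corpus from the example above, lowercased and tokenized by hand.
sentences = [["i", "love", "apples"], ["apples", "are", "delicious"]]

# Count how often each pair of distinct words appears in the same sentence.
cooc = defaultdict(lambda: defaultdict(int))
for sent in sentences:
    for i, word in enumerate(sent):
        for j, context in enumerate(sent):
            if i != j:
                cooc[word][context] += 1

vocab = ["i", "love", "apples", "are", "delicious"]
for word in vocab:
    print(word, [cooc[word][c] for c in vocab])
# apples -> [1, 1, 0, 1, 1], matching the matrix above
```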
2. Lexical Semantics
Concept:
The study of word meanings and their relationships, such as synonymy (similar meanings),
antonymy (opposite meanings), and polysemy (multiple meanings).
Distributional Hypothesis: Words that appear in similar contexts have similar meanings.
Numerical Example:
Consider the sentences:
1. "The cat sat on the mat."
2. "The dog lay on the carpet."
The co-occurrence matrix for "cat" and "dog" might show similar contexts like "sat" and "lay":
Word cat dog sat lay mat carpet
cat 0 0 1 0 1 0
dog 0 0 0 1 0 1
Here, "cat" and "dog" are related because they co-occur with similar words ("mat" and "carpet").
3. TF-IDF
Concept:
Term Frequency (TF): Frequency of a word in a document.
Inverse Document Frequency (IDF): A measure of how unique a word is across all documents.
Formula:
TF-IDF(t, d) = TF(t, d) × IDF(t)
Where:
TF(t, d) = count(t, d) / (total words in d),   IDF(t) = log(N / DF(t))
N: Total number of documents
DF(t): Number of documents containing t
Numerical Example:
Suppose we have two documents:
1. Doc1: "Apples are sweet."
2. Doc2: "Apples are tasty."
TF of "apples" in Doc1: TF = 1/3 ≈ 0.33
IDF of "apples": IDF = log(2/2) = 0 (since it appears in both documents)
TF-IDF of "apples": TF-IDF = 0.33 × 0 = 0
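A small Python sketch of the same calculation (the natural log is used here; the slides may use a different base, which only rescales the scores):

```python
import math

# The two example documents, tokenized.
docs = {
    "Doc1": ["apples", "are", "sweet"],
    "Doc2": ["apples", "are", "tasty"],
}

def tf_idf(term, doc_name):
    doc = docs[doc_name]
    tf = doc.count(term) / len(doc)                   # TF(t, d)
    df = sum(1 for d in docs.values() if term in d)   # DF(t)
    idf = math.log(len(docs) / df)                    # IDF(t)
    return tf * idf

print(tf_idf("apples", "Doc1"))   # 0.33 * log(2/2) = 0.0
print(tf_idf("sweet", "Doc1"))    # 0.33 * log(2/1) > 0: "sweet" is more distinctive
```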
4. Word2Vec
Concept:
Two key architectures:
Skip-gram: Predict surrounding words (context) given a target word.
CBOW: Predict the target word given surrounding words.
Flow Diagram:
Input: One-hot encoding of "apple"
↓
Hidden Layer: Word embedding vector
↓
Output Layer: Softmax probabilities for context words
Numerical Example:
For the sentence: "I love apples," the Skip-gram model might create training pairs:
("I", "love"), ("love", "apples")
The model learns embeddings to maximize the similarity between pairs.
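As an illustration, a hypothetical helper like the one below (my own sketch, not from the slides) generates (center, context) training pairs for a symmetric context window:

```python
# Hypothetical helper: generate (center, context) pairs for a symmetric window.
def skipgram_pairs(tokens, window=1):
    pairs = []
    for i, center in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((center, tokens[j]))
    return pairs

print(skipgram_pairs(["I", "love", "apples"], window=1))
# [('I', 'love'), ('love', 'I'), ('love', 'apples'), ('apples', 'love')]
```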
5. GloVe
Concept:
Combines global co-occurrence statistics and local context for word embedding.
Example:
Using the co-occurrence matrix:
Word apple orange fruit
apple 10 3 20
orange 3 8 15
fruit 20 15 50
The ratio of co-occurrence probabilities is used to compute embeddings:
Ratio = P(i | j) / P(k | j)
6. Word Similarity
Concept:
Dot Product: Measures similarity between word vectors.
Cosine Similarity: Normalized dot product to account for vector length.
Formula:
Cosine Similarity = (A · B) / (||A|| · ||B||)
Numerical Example:
Given two vectors:
A = [1, 2, 3]
B = [4, 5, 6]
Dot Product = 1 ⋅ 4 + 2 ⋅ 5 + 3 ⋅ 6 = 32
Cosine Similarity = 32 / (√(1² + 2² + 3²) × √(4² + 5² + 6²)) = 32 / (√14 × √77) ≈ 0.9746
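The same computation in a short Python sketch:

```python
import math

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))        # A · B
    norm_a = math.sqrt(sum(x * x for x in a))     # ||A||
    norm_b = math.sqrt(sum(x * x for x in b))     # ||B||
    return dot / (norm_a * norm_b)

A = [1, 2, 3]
B = [4, 5, 6]
print(cosine_similarity(A, B))   # ≈ 0.9746
```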
7. Applications
Question Answering: Using embeddings to retrieve relevant answers.
Plagiarism Detection: Semantic similarity measures to compare texts.
This covers the first few topics in detail. I will continue to elaborate on the remaining topics in
subsequent parts. Let me know if you'd like me to proceed with the next topics.
please proceed
Continuing with the explanation of topics:
8. Skip-gram and CBOW in Word2Vec
Concept:
Skip-gram: Predicts context words for a given center word.
CBOW (Continuous Bag of Words): Predicts the center word using surrounding context words.
Differences:
Feature            Skip-gram                    CBOW
Prediction Target  Context words                Center word
Training Data      Works well with small data   Requires large data
Focus              Rare words                   Frequent words
Training Speed     Slower                       Faster
Numerical Example:
Sentence: "I enjoy learning NLP."
Skip-gram:
Context window = 2
Training pairs: ("I", "enjoy"), ("I", "learning"), ("enjoy", "I"), ("enjoy", "learning"), etc.
CBOW:
Input: ["I", "learning"]
Output: "enjoy"
Flow Diagram:
Skip-gram:
Input: "NLP"
↓
Hidden Layer: Embedding vector for "NLP"
↓
Output: Predicts words like "learning" and "enjoy"
CBOW:
Input: Words ["I", "learning"]
↓
Hidden Layer: Aggregated embeddings
↓
Output: Predicts "enjoy"
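For comparison, a minimal sketch (my own illustration, not from the slides) that produces CBOW-style (context, target) examples for the same sentence:

```python
# Illustrative helper: build (context words, target word) examples for CBOW.
def cbow_examples(tokens, window=1):
    examples = []
    for i, target in enumerate(tokens):
        context = tokens[max(0, i - window):i] + tokens[i + 1:i + window + 1]
        if context:
            examples.append((context, target))
    return examples

print(cbow_examples(["I", "enjoy", "learning", "NLP"], window=1))
# [(['enjoy'], 'I'), (['I', 'learning'], 'enjoy'),
#  (['enjoy', 'NLP'], 'learning'), (['learning'], 'NLP')]
```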
9. GloVe
Concept:
Focuses on the global word co-occurrence matrix and uses ratios of conditional probabilities for
embedding computation.
Numerical Example:
Suppose the co-occurrence matrix X is:
Word apple fruit sweet
apple 50 30 20
fruit 30 40 10
sweet 20 10 25
Compute P(word | context):
P(apple | fruit) = 30 / 40 = 0.75,   P(apple | sweet) = 20 / 25 = 0.8
Ratio:
Ratio = P(apple | fruit) / P(apple | sweet) = 0.75 / 0.8 = 0.9375
This ratio is used to learn embeddings.
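A small sketch that reproduces the arithmetic of this worked example. Note that it follows the example's choice of denominator (the diagonal entry for the context word); standard GloVe defines P(i | j) = X_ij / X_j with X_j the row total, which would give different numbers:

```python
# Co-occurrence counts taken from the matrix X above.
X = {
    ("apple", "fruit"): 30, ("fruit", "fruit"): 40,
    ("apple", "sweet"): 20, ("sweet", "sweet"): 25,
}

# Following the worked example: divide by the context word's diagonal count.
p_apple_given_fruit = X[("apple", "fruit")] / X[("fruit", "fruit")]   # 30 / 40 = 0.75
p_apple_given_sweet = X[("apple", "sweet")] / X[("sweet", "sweet")]   # 20 / 25 = 0.80

print(p_apple_given_fruit / p_apple_given_sweet)                      # 0.9375
```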
10. Word Embedding Visualization
Concept:
Techniques like t-SNE or PCA reduce high-dimensional embeddings to 2D or 3D for visualization.
Example:
Words such as "king," "queen," "man," "woman" may cluster based on gender and royalty.
Visualization Chart:
Using t-SNE, a 2D plot might show:
woman--------queen
| |
| |
man---------king
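A hedged sketch of producing such a plot with scikit-learn's t-SNE (the 4-dimensional toy vectors below are invented purely for illustration; real embeddings would come from a trained model):

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.manifold import TSNE

words = ["king", "queen", "man", "woman"]
# Invented 4-dimensional embeddings, purely for illustration.
vectors = np.array([
    [0.9, 0.8, 0.1, 0.2],   # king
    [0.9, 0.2, 0.8, 0.2],   # queen
    [0.1, 0.8, 0.1, 0.9],   # man
    [0.1, 0.2, 0.8, 0.9],   # woman
])

# perplexity must be smaller than the number of points, hence the small value.
coords = TSNE(n_components=2, perplexity=2, random_state=0).fit_transform(vectors)

plt.scatter(coords[:, 0], coords[:, 1])
for word, (x, y) in zip(words, coords):
    plt.annotate(word, (x, y))
plt.show()
```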
11. Semantic Relations in Word Embeddings
Concept:
Analogies: king − man + woman = queen
Relations like synonyms, antonyms, or analogical reasoning.
Numerical Example:
Given vectors:
king = [0.8, 0.6], man = [0.7, 0.4], woman = [0.6, 0.8]
Compute:
queen = king − man + woman
queen = [0.8, 0.6] − [0.7, 0.4] + [0.6, 0.8] = [0.7, 1.0]
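The same arithmetic with NumPy:

```python
import numpy as np

king = np.array([0.8, 0.6])
man = np.array([0.7, 0.4])
woman = np.array([0.6, 0.8])

queen = king - man + woman
print(queen)   # approximately [0.7, 1.0]
```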
12. Bias in Word Embeddings
Concept:
Word embeddings can reflect societal biases present in training data.
Example: "man is to computer programmer as woman is to homemaker."
Visualization:
Biased embeddings:
man---programmer
|
|
woman---homemaker
13. TF-IDF vs Word2Vec
Aspect             TF-IDF               Word2Vec
Vector Type        Sparse               Dense
Basis              Term frequency       Learned embeddings
Context Awareness  No                   Yes
Use Case           Document retrieval   Semantic understanding
14. Applications of Word Embeddings
1. Question Answering:
Use embeddings to retrieve the most relevant answers.
2. Plagiarism Detection:
Compare text similarity using cosine similarity.
3. Sentiment Analysis:
Use embeddings to identify sentiments in a document.
15. Skip-gram Training Workflow
Flow Diagram:
1. Input Sentence: "I like apples."
2. Training Window: Context size = 2
3. Training Pairs:
Positive: ("I", "like"), ("like", "I"), etc.
Negative: ("I", "bananas"), etc.
Steps:
Compute dot product for each word pair.
Apply sigmoid to get probabilities.
Adjust embeddings via SGD.
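A minimal sketch of one such update for a skip-gram pair with negative sampling, following the steps above (vector size, learning rate, and initial vectors are assumptions chosen for illustration):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sgns_update(center, context, label, lr=0.1):
    """One SGD step; label = 1 for a positive pair, 0 for a negative pair."""
    score = sigmoid(np.dot(center, context))    # dot product -> probability
    grad = score - label                        # gradient of the logistic loss
    d_center = grad * context                   # gradient w.r.t. the center vector
    d_context = grad * center                   # gradient w.r.t. the context vector
    return center - lr * d_center, context - lr * d_context

rng = np.random.default_rng(0)
w_center, w_context = rng.normal(size=3), rng.normal(size=3)

# Positive pair ("I", "like"): the update pushes the two vectors closer together.
w_center, w_context = sgns_update(w_center, w_context, label=1)
```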
16. Final Summary
Key Concepts:
1. Vector Semantics: Words are vectors in space.
2. Word Embeddings: Represent words in dense vectors.
3. Word2Vec and GloVe: Different techniques for embedding learning.
4. Applications: From question answering to bias analysis.
If you need further expansion on specific sections or additional diagrams, let me know!