
COMP 3361 Natural Language Processing

Lecture 3: Text Classification

Spring 2024

Many materials from CS224n@Stanford and COS484@Princeton with special thanks!


Announcements
● Assignment 1 has been released (due in 4 weeks: 9:00 am, Feb 20)
● Once more, please sign up for the course's Slack workspace. This is included in your
class participation grade.
https://join.slack.com/t/slack-fdv4728/shared_invite/zt-2asgddr0h-6wIXbRndwKhBw2IX2~ZrJQ

● You should be able to access the course Moodle page now.


● The course page has updated details on the tentative schedule

Google form survey


https://forms.gle/FMQvFCuzUyJ3pB93A
Lecture plan
● Recap of language modeling
● Naive Bayes and sentiment classification
● Logistic Regression for text classification
Generating from language models
● Deterministic approach: with temperature=0, always select the word with the highest
probability at each step

How ChatGPT completes a sentence with temperature=0

https://www.atmosera.com/ai/understanding-chatgpt/
Generating from language models
● Probabilistic (stochastic) approach: e.g., with temperature=0.7, the next word is sampled
from a probability distribution over the possible words. More creative!

How ChatGPT completes a sentence with temperature=0.7

https://www.atmosera.com/ai/understanding-chatgpt/
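To make the temperature idea concrete, here is a minimal sketch (not from the lecture) of temperature-scaled sampling over a toy next-word distribution; the vocabulary and logit values are made up for illustration.

```python
import math
import random

def sample_next_word(logits, temperature):
    """Pick the next word given raw scores (logits) and a temperature."""
    if temperature == 0:
        # Deterministic: always take the highest-scoring word (argmax).
        return max(logits, key=logits.get)
    # Scale logits by temperature, then apply softmax to get probabilities.
    scaled = {w: s / temperature for w, s in logits.items()}
    max_s = max(scaled.values())  # subtract max for numerical stability
    exp = {w: math.exp(s - max_s) for w, s in scaled.items()}
    total = sum(exp.values())
    probs = {w: e / total for w, e in exp.items()}
    # Sample a word according to the resulting distribution.
    return random.choices(list(probs), weights=probs.values())[0]

# Toy next-word scores (illustrative only)
logits = {"learn": 4.5, "predict": 3.8, "create": 3.2, "understand": 2.9}
print(sample_next_word(logits, temperature=0))    # always "learn"
print(sample_next_word(logits, temperature=0.7))  # varies run to run
```

Lower temperatures sharpen the distribution toward the argmax; higher temperatures flatten it, which is why sampling at 0.7 produces more varied completions.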
Why text classification?

● Spam email detection
● Sentiment analysis

Q: any other examples?


Text classification
Prompting ChatGPT for text classification

Parse ChatGPT’s output
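A minimal sketch of the "parse the output" step, assuming the model returns free text that mentions one of a known label set; the response string and label set are made up for illustration.

```python
def parse_label(response: str, labels=("positive", "negative", "neutral")) -> str:
    """Extract the first known label mentioned in the model's reply."""
    text = response.lower()
    for label in labels:
        if label in text:
            return label
    return "unknown"

print(parse_label("Sentiment: Positive. The review praises the film."))  # positive
```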


Rule-based text classification
IF there exists word w in document d such that w in [good, great, extra-ordinary, …],
THEN output Positive
IF email address ends in [ithelpdesk.com, makemoney.com, spinthewheel.com, …]
THEN output SPAM
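As a minimal sketch (with made-up word and domain lists), the two rules above could look like this in Python:

```python
POSITIVE_WORDS = {"good", "great", "extra-ordinary"}   # illustrative list
SPAM_DOMAINS = ("ithelpdesk.com", "makemoney.com", "spinthewheel.com")

def classify_sentiment(document: str) -> str:
    # IF any positive word appears in the document, THEN output Positive.
    words = document.lower().split()
    return "Positive" if any(w in POSITIVE_WORDS for w in words) else "Unknown"

def classify_email(sender: str) -> str:
    # IF the sender's address ends in a known spam domain, THEN output SPAM.
    return "SPAM" if sender.endswith(SPAM_DOMAINS) else "OK"

print(classify_sentiment("what a great movie"))  # Positive
print(classify_email("win@spinthewheel.com"))    # SPAM
```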

● + Can be very accurate (if rules are carefully refined by an expert)


● - Rules may be hard to define (and some even unknown to us!)
● - Labor intensive and expensive
● - Hard to generalize and keep up-to-date
Supervised Learning: Let's use statistics!
Let the machine figure out the best patterns from data: learn a function F that maps a document d to a label y.

Key questions:
● What is the form of F?
● How do we learn F?
Types of supervised classifiers

● Logistic regression
● Naive Bayes
● Support vector machines
● Neural networks


Naive Bayes

Naive Bayes classifier
Simple classification model making use of Bayes rule
● Bayes rule: P(c | d) = P(d | c) P(c) / P(d)
How to represent P(d | c)?
● Option 1: represent the entire sequence of words
○ Too many sequences!
● Option 2: Bag of words

○ Assume position of each word doesn’t matter


○ Probability of each word is conditionally independent of the other words given
class c
Bag of words (BoW)
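A minimal sketch of the bag-of-words representation, assuming simple whitespace tokenization (the slides do not specify a tokenizer):

```python
from collections import Counter

def bag_of_words(document: str) -> Counter:
    """Map a document to word counts, ignoring word order."""
    return Counter(document.lower().split())

print(bag_of_words("I love this movie. I love it!"))
# Counter({'i': 2, 'love': 2, 'this': 1, 'movie.': 1, 'it!': 1})
```

Note that naive whitespace splitting leaves punctuation attached ('movie.', 'it!'); a real pipeline would tokenize more carefully.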
Predicting with Naive Bayes
How to estimate probabilities?
Data sparsity problem: a word never seen with class c in training gets P(w | c) = 0, which zeroes out the entire product

This sounds familiar…
Solution: Smoothing!
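A minimal sketch of Naive Bayes training and prediction with add-one (Laplace) smoothing, using log probabilities to avoid underflow; the toy training data is made up for illustration.

```python
import math
from collections import Counter, defaultdict

def train_nb(docs, labels):
    """Estimate log P(c) and smoothed log P(w | c) from labeled documents."""
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)  # per-class word counts
    for doc, c in zip(docs, labels):
        word_counts[c].update(doc.lower().split())
    vocab = {w for counts in word_counts.values() for w in counts}
    log_prior = {c: math.log(n / len(labels)) for c, n in class_counts.items()}
    log_likelihood = {}
    for c, counts in word_counts.items():
        total = sum(counts.values())
        # Add-one smoothing: every vocabulary word gets count + 1.
        log_likelihood[c] = {w: math.log((counts[w] + 1) / (total + len(vocab)))
                             for w in vocab}
    return log_prior, log_likelihood, vocab

def predict_nb(doc, log_prior, log_likelihood, vocab):
    """Return argmax_c [log P(c) + sum of log P(w | c)] over known words."""
    words = [w for w in doc.lower().split() if w in vocab]
    scores = {c: lp + sum(log_likelihood[c][w] for w in words)
              for c, lp in log_prior.items()}
    return max(scores, key=scores.get)

# Toy sentiment data (made up for illustration)
docs = ["great movie", "loved it great fun", "boring film", "terrible boring plot"]
labels = ["pos", "pos", "neg", "neg"]
model = train_nb(docs, labels)
print(predict_nb("great fun film", *model))  # pos
```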
Overall process
A worked example for sentiment analysis
Naive Bayes vs. language models
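The connection worth stating: a bag-of-words Naive Bayes classifier is equivalent to training one unigram language model per class, then picking the class whose language model, weighted by the class prior, assigns the document the highest probability: P(c | d) ∝ P(c) · Π_i P(w_i | c).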
Naive Bayes: pros and cons
Naive Bayes can use any features!

● In general, Naive Bayes can use any set of features, not just words:
○ URLs, email addresses, capitalization, …
○ Domain knowledge is crucial to performance

Top features for spam detection


Wait, we already have ChatGPT, why still NB?
● Computational efficiency, cost
● Simplicity and interpretability
● Small data performance
● Out of domain
○ Requires domain experts to design features
● …

[Comparison figure: Naive Bayes vs. transformers, neural networks, and many others, e.g., ChatGPT]
Logistic regression

Study this on your own!

https://machine-learning.paperspace.com/wiki/logistic-regression
Generative vs. discriminative models
Generative classifiers (e.g., Naive Bayes): model the joint distribution P(d, c) = P(d | c) P(c)
Discriminative classifiers (e.g., logistic regression): model P(c | d) directly
Overall process: Discriminative classifiers
1. Feature representation

Bag of words
Example: Sentiment classification
2. Classification function
Example: Sentiment classification
3. Loss function
Example: Computing CE loss
Properties of CE loss
Properties of CE loss
4. Optimization
Gradient for logistic regression
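A minimal end-to-end sketch (not from the slides) of binary logistic regression covering steps 2–4: the sigmoid classification function, the cross-entropy loss, and a gradient-descent update. The toy feature vectors and the learning rate are made up for illustration.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(w, b, x):
    # 2. Classification function: P(y=1 | x) = sigmoid(w · x + b)
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

def ce_loss(p, y):
    # 3. Cross-entropy loss: -[y log p + (1 - y) log(1 - p)]
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))

def sgd_step(w, b, x, y, lr=0.1):
    # 4. Optimization: gradient of the CE loss w.r.t. w is (p - y) * x
    p = predict(w, b, x)
    w = [wi - lr * (p - y) * xi for wi, xi in zip(w, x)]
    b = b - lr * (p - y)
    return w, b

# Toy features: [count of positive words, count of negative words]
data = [([3, 0], 1), ([0, 2], 0), ([2, 1], 1), ([1, 3], 0)]
w, b = [0.0, 0.0], 0.0
for epoch in range(100):
    for x, y in data:
        w, b = sgd_step(w, b, x, y)
total_loss = sum(ce_loss(predict(w, b, x), y) for x, y in data)
print(f"final loss: {total_loss:.4f}")
print(predict(w, b, [2, 0]))  # close to 1 (positive)
```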
Regularization
Multinomial logistic regression
Features in multinomial LR
Learning
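For the multinomial case, the sigmoid generalizes to a softmax over per-class scores; a minimal sketch, with class names made up for illustration:

```python
import math

def softmax(scores):
    # P(c | x) = exp(score_c) / sum over classes c' of exp(score_c')
    m = max(scores.values())  # subtract max for numerical stability
    exp = {c: math.exp(s - m) for c, s in scores.items()}
    total = sum(exp.values())
    return {c: e / total for c, e in exp.items()}

print(softmax({"positive": 2.0, "neutral": 0.5, "negative": -1.0}))
```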
Next lecture: word embeddings
