BIRLA INSTITUTE OF
TECHNOLOGY & SCIENCE, PILANI
WORK INTEGRATED LEARNING PROGRAMMES
COURSE HANDOUT
Part A: Content Design
Course Title Natural Language Processing
Course No(s)
Credit Units 4 units
Course Author Dr. Chetana Gavankar
Version No 1.0
Date September 2022
Course Objectives
No Course Objective
CO1 To learn the fundamental concepts and techniques of natural language processing (NLP),
including language models, word embeddings, part-of-speech tagging, and parsing
CO2 To learn computational properties of natural languages and the commonly used algorithms
for processing linguistic information
CO3 To introduce basic mathematical models and methods used in NLP applications to
formulate computational solutions.
CO4 To introduce students to research and development work in Natural Language Processing
Text Book(s)
T1 Jurafsky and Martin, SPEECH and LANGUAGE PROCESSING: An Introduction to
Natural Language Processing, Computational Linguistics, and Speech Recognition,
Pearson/Prentice Hall
T2 Manning and Schütze, Foundations of Statistical Natural Language Processing, MIT Press.
Cambridge, MA
Reference Book(s) & other resources
R1 Allen James, Natural Language Understanding
R2 Neural Machine Translation by Philipp Koehn
R3 Semantic Web Primer (Information Systems) By Antoniou, Grigoris; Van Harmelen, Frank
Modular Content Structure
1. Natural Language Understanding and Generation
The Study of Language.
Applications of Natural Language Understanding.
Evaluating Language Understanding Systems.
The Different Levels of Language Analysis.
The Organization of Natural Language Understanding Systems.
2. N-gram Language Modelling
N-Grams
Generalization and Zeros.
Smoothing
The Web and Stupid Backoff
Evaluating Language Models
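For illustration only (not drawn from the textbooks), a minimal Python sketch of bigram estimation with add-one (Laplace) smoothing on an invented toy corpus:

from collections import Counter

corpus = ["<s> i like nlp </s>", "<s> i like deep learning </s>"]
tokens = [sentence.split() for sentence in corpus]

unigram_counts = Counter(w for sent in tokens for w in sent)
bigram_counts = Counter((sent[i], sent[i + 1]) for sent in tokens for i in range(len(sent) - 1))
V = len(unigram_counts)  # vocabulary size used by add-one smoothing

def bigram_prob(prev, word, smoothed=True):
    # P(word | prev): relative frequency, optionally with add-one smoothing
    if smoothed:
        return (bigram_counts[(prev, word)] + 1) / (unigram_counts[prev] + V)
    return bigram_counts[(prev, word)] / unigram_counts[prev]

print(bigram_prob("i", "like"))        # seen twice: high probability
print(bigram_prob("like", "nlp"))      # seen once
print(bigram_prob("nlp", "learning"))  # unseen: non-zero only because of smoothing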
3. Neural Networks and Neural Language Models
Units
The XOR problem
Feed-Forward Neural Networks
Training Neural Nets
Neural Language Models
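A purely illustrative NumPy sketch of the XOR problem and of training a small feed-forward network by backpropagation (the 2-4-1 sigmoid architecture, learning rate and iteration count are arbitrary choices, and convergence can vary with the random seed):

import numpy as np

rng = np.random.default_rng(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)   # XOR targets

W1, b1 = rng.normal(size=(2, 4)), np.zeros((1, 4))
W2, b2 = rng.normal(size=(4, 1)), np.zeros((1, 1))
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

for _ in range(20000):
    hidden = sigmoid(X @ W1 + b1)       # forward pass, hidden layer
    output = sigmoid(hidden @ W2 + b2)  # forward pass, output unit
    # Backpropagation of the squared-error loss
    d_output = (output - y) * output * (1 - output)
    d_hidden = (d_output @ W2.T) * hidden * (1 - hidden)
    W2 -= 0.5 * hidden.T @ d_output
    b2 -= 0.5 * d_output.sum(axis=0, keepdims=True)
    W1 -= 0.5 * X.T @ d_hidden
    b1 -= 0.5 * d_hidden.sum(axis=0, keepdims=True)

print(np.round(output, 2))  # should approach [[0], [1], [1], [0]]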
4. Part-of-Speech Tagging
(Mostly) English Word Classes
The Penn Treebank Part-of-Speech Tag set
Part-of-Speech Tagging
Markov Chains
The Hidden Markov Model
HMM Part-of-Speech Tagging
Part-of-Speech Tagging for Morphologically Rich Languages
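For a quick look at Penn Treebank tags in practice, a short snippet using NLTK's off-the-shelf tagger (an averaged perceptron rather than the HMM tagger developed in this module; data package names may differ slightly across NLTK versions):

import nltk

nltk.download("punkt", quiet=True)
nltk.download("averaged_perceptron_tagger", quiet=True)

sentence = "The Penn Treebank tagset labels every word in the sentence."
tokens = nltk.word_tokenize(sentence)
print(nltk.pos_tag(tokens))
# e.g. [('The', 'DT'), ('Penn', 'NNP'), ('Treebank', 'NNP'), ('tagset', 'NN'), ...]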
5. Hidden Markov Models and MEMM
The Hidden Markov Model
Likelihood Computation: The Forward Algorithm
Decoding: The Viterbi Algorithm
HMM Training: The Forward-Backward Algorithm
Maximum Entropy Markov Models
Bidirectionality
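A compact sketch of Viterbi decoding over a two-state toy HMM; the states, transition probabilities and emission probabilities below are all invented for illustration:

states = ["NOUN", "VERB"]
start_p = {"NOUN": 0.6, "VERB": 0.4}
trans_p = {"NOUN": {"NOUN": 0.3, "VERB": 0.7}, "VERB": {"NOUN": 0.8, "VERB": 0.2}}
emit_p = {"NOUN": {"dogs": 0.5, "bark": 0.1}, "VERB": {"dogs": 0.1, "bark": 0.6}}

def viterbi(observations):
    # best[t][s] = probability of the best path ending in state s at time t
    best = [{s: start_p[s] * emit_p[s].get(observations[0], 1e-8) for s in states}]
    back = [{}]
    for t in range(1, len(observations)):
        best.append({})
        back.append({})
        for s in states:
            prev = max(states, key=lambda p: best[t - 1][p] * trans_p[p][s])
            best[t][s] = best[t - 1][prev] * trans_p[prev][s] * emit_p[s].get(observations[t], 1e-8)
            back[t][s] = prev
    # Recover the best tag sequence by following back-pointers from the best final state
    path = [max(states, key=lambda s: best[-1][s])]
    for t in range(len(observations) - 1, 0, -1):
        path.insert(0, back[t][path[0]])
    return path

print(viterbi(["dogs", "bark"]))  # expected: ['NOUN', 'VERB']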
6. Topic Modelling
Mathematical foundations for LDA: Multinomial and Dirichlet distributions
Intuition behind LDA
LDA Generative model
Latent Dirichlet Allocation Algorithm and Implementation
Gibbs Sampling
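A small, purely illustrative LDA fit with gensim (note that gensim's LdaModel uses variational inference rather than the collapsed Gibbs sampler discussed in this module; the documents and the number of topics are toy choices):

from gensim import corpora, models

docs = [["language", "model", "ngram", "smoothing"],
        ["ontology", "semantic", "web", "knowledge", "graph"],
        ["ngram", "language", "probability", "smoothing"],
        ["knowledge", "graph", "ontology", "triples"]]

dictionary = corpora.Dictionary(docs)                 # word <-> id mapping
corpus = [dictionary.doc2bow(doc) for doc in docs]    # bag-of-words vectors
lda = models.LdaModel(corpus, num_topics=2, id2word=dictionary,
                      passes=20, random_state=0)
for topic_id, topic_terms in lda.print_topics():
    print(topic_id, topic_terms)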
7. Vector Semantics and Embeddings
Lexical semantics
Vector semantics
Words and Vectors
TF-IDF
Word2Vec: Skip-gram and CBOW
GloVe
Visualizing Embeddings
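An illustrative TF-IDF and cosine-similarity sketch with scikit-learn (assumes scikit-learn 1.x for get_feature_names_out; the documents are toy examples):

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

docs = ["the cat sat on the mat",
        "the dog sat on the log",
        "word embeddings capture lexical semantics"]

vectorizer = TfidfVectorizer()
X = vectorizer.fit_transform(docs)           # sparse document-term matrix of TF-IDF weights
print(vectorizer.get_feature_names_out())    # the vocabulary (columns of X)
print(cosine_similarity(X[0], X[1]))         # related sentences: higher similarity
print(cosine_similarity(X[0], X[2]))         # unrelated sentence: similarity near zero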
8. Grammars and Parsing.
Grammars and Sentence Structure.
What Makes a Good Grammar
A Top-Down Parser.
Bottom-Up Chart Parser.
Top-Down Chart Parsing.
Finite State Models and Morphological Processing.
Grammars and Logic Programming.
9. Statistical Constituency Parsing
Probabilistic Context-Free Grammars
Probabilistic CKY Parsing of PCFGs
Ways to Learn PCFG Rule Probabilities
Problems with PCFGs
Improving PCFGs by Splitting Non-Terminals
Probabilistic Lexicalized CFGs
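A toy probabilistic parsing example with NLTK's ViterbiParser; the grammar and its rule probabilities are invented purely for illustration of PP-attachment ambiguity:

from nltk import PCFG
from nltk.parse import ViterbiParser

# A toy PCFG with a classic PP-attachment ambiguity (all probabilities invented)
grammar = PCFG.fromstring("""
    S -> NP VP [1.0]
    NP -> 'I' [0.2] | Det N [0.5] | NP PP [0.3]
    VP -> V NP [0.6] | VP PP [0.4]
    PP -> P NP [1.0]
    Det -> 'the' [1.0]
    N -> 'man' [0.5] | 'telescope' [0.5]
    V -> 'saw' [1.0]
    P -> 'with' [1.0]
""")

parser = ViterbiParser(grammar)
tokens = "I saw the man with the telescope".split()
for tree in parser.parse(tokens):
    print(tree)          # the single most probable parse
    print(tree.prob())   # its probability under the toy grammar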
10. Dependency Parsing
Dependency Relations
Dependency Formalisms
Dependency Treebanks
Transition-Based Dependency Parsing
Graph-Based Dependency Parsing
Dependency parsers using neural networks
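An illustrative dependency parse with spaCy's pretrained neural pipeline (assumes the small English model has been installed with `python -m spacy download en_core_web_sm`):

import spacy

nlp = spacy.load("en_core_web_sm")
doc = nlp("The students built a transition-based dependency parser.")
for token in doc:
    # Each token points to its syntactic head via a labelled dependency relation
    print(f"{token.text:<12} --{token.dep_}--> {token.head.text}")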
11. Encoder-Decoder Models, Attention and Contextual Embeddings
Neural Language Models and Generation
Encoder-Decoder Networks, Attention
Applications of Encoder-Decoder Networks
Self-Attention and Transformer Networks
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
Contextual Word Representations: A Contextual Introduction
The Illustrated BERT, ELMo, and co.
XLM
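An illustrative masked-token prediction with a pretrained BERT model via the Hugging Face transformers pipeline (model weights are downloaded on first use; the example sentence is invented):

from transformers import pipeline

unmasker = pipeline("fill-mask", model="bert-base-uncased")
for prediction in unmasker("Natural language processing is a [MASK] field."):
    # Each prediction carries the filled-in token and the model's score for it
    print(prediction["token_str"], round(prediction["score"], 3))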
12. Word sense disambiguation
Word Senses
Relations between Senses
WordNet: A Database of Lexical Relations
Word Sense Disambiguation
Alternate WSD algorithms and Tasks
Using Thesauruses to Improve Embeddings
Word Sense Induction
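An illustrative word sense disambiguation run using NLTK's simplified Lesk implementation over WordNet senses (requires the wordnet and punkt data packages):

import nltk
from nltk.corpus import wordnet as wn
from nltk.wsd import lesk

nltk.download("wordnet", quiet=True)
nltk.download("punkt", quiet=True)

context = nltk.word_tokenize("I went to the bank to deposit my money")
sense = lesk(context, "bank", "n")   # WordNet synset chosen by overlap with the context
print(sense.name(), "-", sense.definition())

# The full inventory of noun senses of 'bank', for comparison
for synset in wn.synsets("bank", pos="n"):
    print(synset.name(), "-", synset.definition())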
13. Semantic web ontology and Knowledge Graph
Introduction to semantic web
Semantic web ontology
Semantic web languages
Ontology Engineering
Ontology Learning
Knowledge graph: construction of the graph
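A minimal sketch of constructing a knowledge graph as RDF triples with rdflib (assumes rdflib 6.x; the namespace and triples are invented for illustration):

from rdflib import Graph, Literal, Namespace
from rdflib.namespace import RDF, RDFS

EX = Namespace("http://example.org/courses/")
graph = Graph()
graph.bind("ex", EX)

# Three toy triples: a course node, its type, and a topic it covers
graph.add((EX.NLP, RDF.type, EX.Course))
graph.add((EX.NLP, RDFS.label, Literal("Natural Language Processing")))
graph.add((EX.NLP, EX.coversTopic, EX.KnowledgeGraphs))

print(graph.serialize(format="turtle"))   # Turtle serialisation of the graph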
14. Introduction to NLP Applications
Brief introduction to state-of-the-art applications
Text Summarization
Machine Translation
Part B: Contact Session Plan
Academic Term
Course Title
Course No
Lead Instructor
Course Contents
Contact Session | List of Topic Title (from content structure in Part A) | Topic # (from content structure in Part A) | Text / Ref Book / External Resource
1 Natural Language Understanding and Generation Chapter 1 T2
1.1 The Study of Language.
1.2 Applications of Natural Language Understanding.
1.3 Evaluating Language Understanding Systems.
1.4 The Different Levels of Language Analysis.
1.5 The Organization of Natural Language Understanding
Systems.
2 N-gram Language Modelling Chapter 3 T1
N-Grams
Generalization and Zeros.
Smoothing
The Web and Stupid Backoff
Evaluating Language Models
3 Neural Network and Neural Language Modelling Chapter 4 R2
Units
The XOR problem
Feed-Forward Neural Networks
Training Neural Nets
Neural Language Models
4 Vector Semantics and Embeddings Chapter 6 T1, lecture notes and https://www.youtube.com/watch?v=hQwFeIupNP0
Lexical semantics
Vector semantics
Words and Vectors
TF-IDF
Word2Vec: Skip-gram and CBOW
GloVe
Visualizing Embeddings
5 Part-of-Speech Tagging Chapter 8 T1 and class notes
(Mostly) English Word Classes
The Penn Treebank Part-of-Speech Tag set
Part-of-Speech Tagging
Markov Chains
The Hidden Markov Model
HMM Part-of-Speech Tagging
Part-of-Speech Tagging for Morphologically Rich Languages
6 Hidden Markov Model Algorithms Appendix A T1 and class notes
Likelihood Computation: The Forward Algorithm
Decoding: The Viterbi Algorithm
HMM Training: The Forward-Backward Algorithm
Maximum Entropy Markov Model
Bidirectionality
7 Topic modelling Class Notes
Mathematical foundations for LDA
Multinomial and Dirichlet distributions
Intuition behind LDA
LDA Generative model
Latent Dirichlet Allocation Algorithm and
Implementation
Gibbs Sampling
8 Review of Modules 1 to 7
9 Grammars and Parsing Chapter 3 T2
Grammars and Sentence Structure.
What Makes a Good Grammar
A Top-Down Parser.
A Bottom-Up Chart Parser.
Top-Down Chart Parsing.
Finite State Models and Morphological Processing.
Grammars and Logic Programming.
10 Statistical Constituency Parsing Chapter 14 T1
Probabilistic Context-Free Grammars
Probabilistic CKY Parsing of PCFGs
Ways to Learn PCFG Rule Probabilities
Problems with PCFGs
Improving PCFGs by Splitting Non-Terminals
Probabilistic Lexicalized CFGs
11 Dependency Parsing Chapter 19 T1 and class notes
Dependency Relations
Dependency Formalisms
Dependency Treebanks
Transition-Based Dependency Parsing
Graph-Based Dependency Parsing
Dependency parsers using neural network
12 Encoder-Decoder Models, Attention and Contextual Embeddings Chapter 10 T1 and https://colab.research.google.com/drive/1iqs9Y5_zLI6R6mAwlnapcxcUbKjpv2CC?usp=sharing
Neural Language Models and Generation
Encoder-Decoder Networks, Attention
Applications of Encoder-Decoder Networks
Self-Attention and Transformer Networks
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
Contextual Word Representations: A Contextual Introduction
The Illustrated BERT, ELMo, and co.
XLM
13 Word Sense and WordNet Chapter 15 T1
Word Senses
Relations between Senses
WordNet: A Database of Lexical Relations
Word Sense Disambiguation
Alternate WSD algorithms and Tasks
Using Thesauruses to Improve Embeddings
Word Sense Induction
14 Semantic Web Ontology and Knowledge Graphs Chapter 24 R1 and class notes
Introduction
Ontology and Ontologies
Ontology Engineering
Ontology Learning
15 State-of-the-art applications Class notes and web references
16 Review of Sessions 9 to 15
Detailed Plan for Lab work
Lab No. | Lab Objective | Lab Sheet Access URL | Session Reference
1 | Introduction to NLTK, spaCy and other open-source tools | | 1
2 | Language Modelling - Neural | | 2, 3
3 | Part-of-speech tagging | | 4, 5
4 | Topic Modelling | | 7
5 | Parsing - Dependency - Neural | | 9, 10, 11
6 | WordNet, Ontology and Knowledge Graph | | 12, 13, 14
Evaluation Scheme
Evaluation Component | Name (Quiz, Lab, Project, Midterm exam, End semester exam, etc.) | Type (Open book, Closed book, Online, etc.) | Weight | Duration | Day, Date, Session, Time
EC-1 | Quiz | | 10% | | To be announced
EC-2 | Assignment | | 20% | | To be announced
EC-3 | Mid-term Exam | Open book | 30% | | To be announced
EC-4 | End Semester Exam | Open book | 40% | | To be announced
Important Information
Syllabus for Mid-Semester Test (Closed Book): Topics in Weeks 1-8 (1-18 Hours)
Syllabus for Comprehensive Exam (Open Book): All topics given in plan of study
Notes
Quiz and Assignment timelines will be announced on the Canvas portal.
Deadlines for evaluation components will NOT be extended; students are requested not to wait
for the deadline to start working on a Quiz/Assignment.
Syllabus for Mid-Semester Test (Closed Book): Topics in Session Nos. 1 to 8
Syllabus for Comprehensive Exam (Open Book): All topics (Session Nos. 1 to 16)
Strictly NO MAKEUPS for Quizzes and Assignments; submissions after the announced deadlines
will not be considered for evaluation.
All assignments will be subjected to a plagiarism check; violations will attract disciplinary
action apart from nullifying all the marks/grades assigned.
Important links and information:
Canvas: Students are expected to visit the Canvas portal on a regular basis and stay up to date with
the latest announcements and deadlines.
Contact sessions: Students should attend the online lectures as per the schedule provided.
Evaluation Guidelines:
1. EC-1 consists of Assignments and Quizzes. Announcements regarding the same will be made
in a timely manner.
2. For Closed Book tests: No books or reference material of any kind will be permitted.
Laptops/Mobiles of any kind are not allowed. Exchange of any material is not allowed.
3. For Open Book exams: Use of prescribed and reference text books, in original (not photocopies)
is permitted. Class notes/slides as reference material in filed or bound form are permitted.
However, loose sheets of paper will not be allowed. Use of calculators is permitted in all exams.
Laptops/Mobiles of any kind are not allowed. Exchange of any material is not allowed.
4. If a student is unable to appear for the Regular Test/Exam due to genuine exigencies, the student
should follow the procedure to apply for the Make-Up Test/Exam. The genuineness of the
reason for absence in the Regular Exam shall be assessed prior to giving permission to appear
for the Make-up Exam. Make-Up Test/Exam will be conducted only at selected exam centres.
It shall be the responsibility of the individual student to be regular in maintaining the self-study schedule
as given in the course handout, attend the lectures, and take all the prescribed evaluation components
such as Assignment/Quiz, Mid-Semester Test and Comprehensive Exam according to the evaluation
scheme provided in the handout.
Learning Outcomes:
No Learning Outcomes
LO1 Should have a good understanding of the field of natural language processing.
LO2 Should have knowledge of important techniques used in natural language processing, such
as language modelling and parsing
LO3 Should be able to apply NLP algorithms along with deep learning algorithms in state-of-
the-art areas such as word embeddings