Natural Language Processing Lab Manual (AIP-101)
L:T:P: 0:0:2 Credits: 1
Course Outcomes
At the end of the course, the student will be able to:
1. Use the NLTK and spaCy toolkits for NLP programming.
2. Analyze various corpora for developing programs.
3. Develop various pre-processing techniques for a given corpus.
4. Develop programming logic using NLTK functions.
5. Build applications using various NLP techniques for a given corpus.
List of Programs
Experiment 1: Installation and exploration of the features of the NLTK and spaCy tools.
Download the WordCloud package and a few corpora.
Objective
To install and configure the NLTK and spaCy libraries.
To explore and utilize basic features such as downloading corpora and generating word
clouds.
Outcomes
Understand the process of installing and setting up NLP libraries in Python.
Learn how to download and use corpora for further NLP tasks.
Explore the concept of word clouds and how to generate them for corpus analysis.
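Sample Program
A minimal sketch, assuming the packages are installed as shown in the comments and using the Gutenberg corpus and the austen-emma.txt file as illustrative choices:
```python
# Assumed one-time setup (shell):
#   pip install nltk spacy wordcloud matplotlib
#   python -m spacy download en_core_web_sm
import nltk
import spacy
import matplotlib.pyplot as plt
from wordcloud import WordCloud

nltk.download("gutenberg")                    # fetch a sample corpus
from nltk.corpus import gutenberg

nlp = spacy.load("en_core_web_sm")            # confirms the spaCy model loads
print([t.text for t in nlp("spaCy is ready.")])

text = gutenberg.raw("austen-emma.txt")       # raw text of one corpus file
cloud = WordCloud(width=800, height=400).generate(text)
plt.imshow(cloud, interpolation="bilinear")
plt.axis("off")
plt.show()
```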
Experiment 2: (i) Write a program to implement word, sentence, and paragraph
tokenizers.
(ii) Count how many words there are in a given corpus, and how many distinct
words it contains.
Objective
To implement word, sentence, and paragraph tokenization using NLTK and spaCy.
To analyze the number of total and distinct words in a given corpus.
Outcomes
Learn how tokenization works in NLP and how to break text into smaller units.
Gain knowledge about counting word frequencies and understanding the diversity of
vocabulary in a corpus.
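Sample Program
A minimal sketch using NLTK's punkt tokenizer models; since NLTK has no built-in paragraph tokenizer for raw strings, the paragraph tokenizer here is a simple blank-line split:
```python
import nltk
nltk.download("punkt")   # tokenizer models (newer NLTK may also need "punkt_tab")
from nltk.tokenize import word_tokenize, sent_tokenize

text = ("Natural language processing is fun. It has many uses.\n\n"
        "A blank line separates this second paragraph from the first.")

words = word_tokenize(text)
sentences = sent_tokenize(text)
paragraphs = [p for p in text.split("\n\n") if p.strip()]  # blank-line split

print("Words:", len(words))
print("Distinct words:", len(set(w.lower() for w in words)))
print("Sentences:", len(sentences))
print("Paragraphs:", len(paragraphs))
```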
Experiment 3: (i) Write a program to implement both user-defined and
predefined functions to generate (a) unigrams, (b) bigrams, (c) trigrams, and
(d) N-grams.
(ii) Write a program to find the word (w2) with the highest probability of
occurring after a given word (w1).
Objective
To implement various N-gram models to generate unigrams, bigrams, trigrams, and general
N-grams.
To calculate the conditional probability of one word occurring after another.
Outcomes
Understand N-gram models and their application in NLP.
Learn how to calculate and work with word probabilities for language modeling.
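Sample Program
A minimal sketch contrasting a user-defined N-gram generator with NLTK's predefined nltk.util.ngrams, and estimating the most probable successor of w1 from bigram counts on a toy sentence:
```python
from nltk.util import ngrams
from nltk import ConditionalFreqDist, bigrams

tokens = "the cat sat on the mat and the cat slept".split()

# user-defined N-gram generator
def my_ngrams(seq, n):
    return [tuple(seq[i:i + n]) for i in range(len(seq) - n + 1)]

print(my_ngrams(tokens, 1))          # unigrams
print(list(ngrams(tokens, 2)))       # bigrams via NLTK's predefined function
print(list(ngrams(tokens, 3)))       # trigrams

# most probable successor of w1, estimated from bigram counts
cfd = ConditionalFreqDist(bigrams(tokens))
w1 = "the"
w2, count = cfd[w1].most_common(1)[0]
print(w1, "->", w2, "P =", count / cfd[w1].N())
```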
Experiment 4: (i) Write a program to identify word collocations.
(ii) Write a program to print all words beginning with a given sequence of
letters.
(iii) Write a program to print all words longer than four characters.
Objective
To identify word collocations and patterns in a corpus.
To extract words based on specific criteria (prefix and word length).
Outcomes
Learn how to find common word pairs or collocations.
Develop skills to filter words based on given patterns or word length constraints.
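Sample Program
A minimal sketch using NLTK's collocation finder on the Gutenberg corpus; the corpus, the frequency cutoff, and the PMI scoring measure are illustrative assumptions:
```python
import nltk
nltk.download("gutenberg")
from nltk.corpus import gutenberg
from nltk.collocations import BigramAssocMeasures, BigramCollocationFinder

words = [w.lower() for w in gutenberg.words("austen-emma.txt") if w.isalpha()]

# (i) collocations: frequent bigrams scored by pointwise mutual information
finder = BigramCollocationFinder.from_words(words)
finder.apply_freq_filter(5)                      # ignore rare pairs
print(finder.nbest(BigramAssocMeasures.pmi, 10))

# (ii) words beginning with a given sequence of letters
prefix = "un"
print(sorted({w for w in words if w.startswith(prefix)})[:20])

# (iii) words longer than four characters
print(sorted({w for w in words if len(w) > 4})[:20])
```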
Experiment 5: (i) Write a program to identify mathematical expressions in a
given sentence.
(ii) Write a program to identify different components of an email address.
Objective
To write regular expressions to identify mathematical expressions and components of email
addresses.
Outcomes
Learn to use regular expressions for pattern matching.
Identify structured data within unstructured text, such as email components and
mathematical expressions.
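Sample Program
A minimal sketch using Python's re module; the patterns below are simplified illustrations, not complete grammars for arithmetic or RFC-compliant email addresses:
```python
import re

sentence = "The model computes 3 + 4 * 2 = 11 before reporting results."
# simple pattern: digits joined by + - * / = operators
math_exprs = re.findall(r"\d+(?:\s*[-+*/=]\s*\d+)+", sentence)
print(math_exprs)

email = "first.last@dept.example.co.in"
m = re.match(r"(?P<user>[\w.+-]+)@(?P<domain>[\w-]+(?:\.[\w-]+)*)\.(?P<tld>\w+)$", email)
if m:
    print("username:", m.group("user"))
    print("domain:", m.group("domain"))
    print("top-level domain:", m.group("tld"))
```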
Experiment 6: (i) Write a program to identify all antonyms and synonyms of a
word.
(ii) Write a program to find hyponymy, homonymy, polysemy for a given word.
Objective
To identify synonyms and antonyms using lexical databases like WordNet.
To explore word relationships such as hyponymy, homonymy, and polysemy.
Outcomes
Understand how to find relationships between words using NLP libraries.
Explore word semantics and develop an understanding of lexical ambiguity.
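Sample Program
A minimal sketch using WordNet via NLTK; the example words are arbitrary choices:
```python
import nltk
nltk.download("wordnet")
from nltk.corpus import wordnet as wn

# (i) synonyms and antonyms of a word
word = "good"
synonyms, antonyms = set(), set()
for syn in wn.synsets(word):
    for lemma in syn.lemmas():
        synonyms.add(lemma.name())
        for ant in lemma.antonyms():
            antonyms.add(ant.name())
print("Synonyms:", synonyms)
print("Antonyms:", antonyms)

# (ii) hyponymy: more specific concepts under a synset
dog = wn.synset("dog.n.01")
print("Hyponyms:", [h.name() for h in dog.hyponyms()[:5]])

# polysemy / homonymy: multiple senses of the same surface form
print("Senses of 'bank':", len(wn.synsets("bank")))
for s in wn.synsets("bank")[:3]:
    print(s.name(), "-", s.definition())
```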
Experiment 7: (i) Write a program to find all the stop words in any given text.
(ii) Write a function that finds the 50 most frequently occurring words of a text
that are not stopwords.
Objective
To identify and remove stop words from a given text.
To analyze the frequency distribution of non-stop words.
Outcomes
Learn how to filter stopwords from text data.
Understand word frequency analysis and its importance in text mining.
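Sample Program
A minimal sketch, assuming the stopwords and gutenberg resources have been downloaded; the corpus file is an illustrative choice:
```python
import nltk
nltk.download(["stopwords", "gutenberg"])
from nltk.corpus import stopwords, gutenberg
from nltk import FreqDist

stops = set(stopwords.words("english"))
words = [w.lower() for w in gutenberg.words("austen-emma.txt") if w.isalpha()]

# (i) stop words that actually occur in the text
print(sorted(set(words) & stops))

# (ii) the 50 most frequent words that are not stopwords
def top_content_words(tokens, n=50):
    fd = FreqDist(w for w in tokens if w not in stops)
    return fd.most_common(n)

print(top_content_words(words))
```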
Experiment 8: Write a program to implement various stemming techniques and
prepare a chart with the performance of each method.
Objective
To implement and compare different stemming techniques (e.g., Porter Stemmer, Lancaster
Stemmer).
Outcomes
Understand stemming and its role in text preprocessing.
Evaluate the performance of various stemming algorithms based on accuracy and efficiency.
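Sample Program
A minimal sketch comparing three NLTK stemmers on a small illustrative word list; a fuller experiment would tabulate agreement with gold-standard stems over a whole corpus:
```python
from nltk.stem import PorterStemmer, LancasterStemmer, SnowballStemmer

words = ["running", "flies", "happily", "studies", "maximum", "crying"]
stemmers = {
    "Porter": PorterStemmer(),
    "Lancaster": LancasterStemmer(),
    "Snowball": SnowballStemmer("english"),
}

# print a comparison chart, one row per word
print("{:<12}".format("word") + "".join("{:<12}".format(n) for n in stemmers))
for w in words:
    row = "{:<12}".format(w)
    row += "".join("{:<12}".format(s.stem(w)) for s in stemmers.values())
    print(row)
```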
Experiment 9: Write a program to implement various lemmatization
techniques and prepare a chart with the performance of each method.
Objective
To implement lemmatization techniques and compare their effectiveness.
Outcomes
Understand lemmatization and its importance in reducing words to their base form.
Compare lemmatization techniques based on performance.
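Sample Program
A minimal sketch comparing NLTK's WordNetLemmatizer (here applied with a verb part of speech) against spaCy's lemmatizer; it assumes the en_core_web_sm model is installed and that each example word tokenizes to a single spaCy token:
```python
import nltk
nltk.download("wordnet")
from nltk.stem import WordNetLemmatizer
import spacy

wnl = WordNetLemmatizer()
nlp = spacy.load("en_core_web_sm")   # assumes the model is installed

words = ["running", "better", "geese", "studies", "was"]
doc = nlp(" ".join(words))

print("{:<10}{:<15}{:<15}".format("word", "WordNet(v)", "spaCy"))
for w, tok in zip(words, doc):
    print("{:<10}{:<15}{:<15}".format(w, wnl.lemmatize(w, pos="v"), tok.lemma_))
```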
Experiment 10: (i) Write a program to implement Conditional Frequency
Distributions (CFD) for any corpus.
(ii) Find all the four-letter words in any corpus. With the help of a frequency
distribution (FreqDist), show these words in decreasing order of frequency.
(iii) Define a conditional frequency distribution over the names corpus that
allows you to see which initial letters are more frequent for males versus
females.
Objective
To implement conditional frequency distributions and analyze text corpus.
To explore frequency distribution and patterns in text data.
Outcomes
Understand the concept of conditional frequency distributions and their applications.
Analyze the frequency of words and patterns in specific corpora.
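Sample Program
A minimal sketch using the Brown and Names corpora (both illustrative choices):
```python
import nltk
nltk.download(["brown", "names"])
from nltk.corpus import brown, names
from nltk import ConditionalFreqDist, FreqDist

# (i) CFD over the Brown corpus: word frequency conditioned on genre
cfd = ConditionalFreqDist(
    (genre, word.lower())
    for genre in ["news", "romance"]
    for word in brown.words(categories=genre))
print(cfd["news"].most_common(5))

# (ii) four-letter words in decreasing order of frequency
fd = FreqDist(w.lower() for w in brown.words() if len(w) == 4 and w.isalpha())
print(fd.most_common(10))

# (iii) initial letters of male vs. female names
cfd2 = ConditionalFreqDist(
    (fileid, name[0])
    for fileid in ["male.txt", "female.txt"]
    for name in names.words(fileid))
cfd2.tabulate(samples="ABCDEFGH")   # compare first-letter counts
```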
Experiment 11: (i) Write a program to implement Part-of-Speech (PoS) tagging
for any corpus.
(ii) Write a program to identify which word has the greatest number of distinct
tags. What are they, and what do they represent?
(iii) Write a program to list tags in order of decreasing frequency. What do
the 20 most frequent tags represent?
(iv) Write a program to identify which tags nouns are most commonly found
after. What do these tags represent?
Objective
To implement part-of-speech tagging and analyze word classifications.
To explore the distribution of PoS tags and understand their significance.
Outcomes
Understand part-of-speech tagging and its application in NLP.
Analyze the distribution and frequency of different PoS tags in a corpus.
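Sample Program
A minimal sketch over the Brown corpus with the universal tagset (an illustrative choice); the tagger resource name may vary with the NLTK version:
```python
import nltk
nltk.download(["brown", "universal_tagset", "punkt", "averaged_perceptron_tagger"])
from nltk.corpus import brown
from nltk import ConditionalFreqDist, FreqDist

# (i) tag a fresh sentence with NLTK's default tagger
sent = nltk.word_tokenize("The quick brown fox jumps over the lazy dog.")
print(nltk.pos_tag(sent))

tagged = brown.tagged_words(tagset="universal")

# (ii) the word with the greatest number of distinct tags
cfd = ConditionalFreqDist((w.lower(), t) for w, t in tagged)
word = max(cfd.conditions(), key=lambda w: len(cfd[w]))
print(word, "->", sorted(cfd[word]))

# (iii) tags in decreasing order of frequency
print(FreqDist(t for _, t in tagged).most_common(20))

# (iv) tags most commonly found immediately before a noun
before_noun = FreqDist(a[1] for a, b in nltk.bigrams(tagged) if b[1] == "NOUN")
print(before_noun.most_common())
```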
Experiment 12: Write a program to implement TF-IDF for any corpus.
Objective
To implement Term Frequency-Inverse Document Frequency (TF-IDF) and use it for text
vectorization.
Outcomes
Understand how TF-IDF is used for text representation.
Implement the TF-IDF algorithm for text classification or retrieval.
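Sample Program
A minimal sketch computing TF-IDF from first principles on a toy document set, with tf(t, d) = count(t in d) / len(d) and idf(t) = log(N / df(t)); a library such as scikit-learn could be substituted for a production pipeline:
```python
import math

docs = ["the cat sat on the mat",
        "the dog sat on the log",
        "cats and dogs are pets"]
tokenized = [d.split() for d in docs]
N = len(tokenized)

def tf(term, doc):
    return doc.count(term) / len(doc)          # term frequency

def idf(term):
    df = sum(1 for doc in tokenized if term in doc)
    return math.log(N / df)                    # inverse document frequency

# top-scoring terms per document
for i, doc in enumerate(tokenized):
    scores = {t: tf(t, doc) * idf(t) for t in set(doc)}
    print("doc", i, sorted(scores.items(), key=lambda kv: -kv[1])[:3])
```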
Experiment 13: Write a program to implement chunking and chinking for any
corpus.
Objective
To implement chunking and chinking for identifying specific structures in text, such as noun
phrases or verb phrases.
Outcomes
Understand the concepts of chunking and chinking in text analysis.
Learn how to extract syntactic structures from text data.
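Sample Program
A minimal sketch using NLTK's RegexpParser, where { } rules chunk and } { rules chink; the noun-phrase grammar is an illustrative assumption:
```python
import nltk
nltk.download(["punkt", "averaged_perceptron_tagger"])

sentence = "The little yellow dog barked at the cat in the garden."
tagged = nltk.pos_tag(nltk.word_tokenize(sentence))

# chunking: group determiner + adjectives + noun(s) into NP chunks;
# chinking: remove verbs and prepositions from inside chunks
grammar = r"""
  NP: {<DT>?<JJ>*<NN.*>+}    # chunk noun phrases
      }<VB.*|IN>{            # chink verbs and prepositions
"""
parser = nltk.RegexpParser(grammar)
tree = parser.parse(tagged)
print(tree)
```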
Experiment 14: (i) Write a program to find all the misspelled words in a
paragraph.
(ii) Write a program to prepare a table with the frequency of misspelled words
in any given text.
Objective
To identify and correct spelling errors in text.
To analyze misspelled words and their frequency distribution.
Outcomes
Learn to detect and handle spelling mistakes in NLP tasks.
Develop skills in error analysis and frequency distribution.
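Sample Program
A minimal sketch that flags a token as misspelled if it is absent from NLTK's words word list; this dictionary-lookup approach is one simple assumption, not the only possible method:
```python
import nltk
nltk.download(["words", "punkt"])
from nltk.corpus import words as word_list
from nltk import FreqDist, word_tokenize

vocab = set(w.lower() for w in word_list.words())

paragraph = "Ths paragrph has sevral mispelled words in it."
tokens = [w.lower() for w in word_tokenize(paragraph) if w.isalpha()]

# (i) words not found in the reference vocabulary
misspelled = [w for w in tokens if w not in vocab]
print("Misspelled:", misspelled)

# (ii) frequency table of misspelled words
FreqDist(misspelled).tabulate()
```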
Experiment 15: Write a program to implement all the NLP Pre-Processing
Techniques required to perform further NLP tasks.
Objective
To implement common pre-processing techniques such as tokenization, stemming,
lemmatization, stop-word removal, etc.
Outcomes
Gain a comprehensive understanding of the essential pre-processing steps in NLP.
Develop a pipeline for preparing text for downstream NLP tasks.
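Sample Program
A minimal end-to-end sketch combining techniques from the earlier experiments (case folding, punctuation removal, tokenization, stop-word removal, lemmatization); the exact steps and their order are illustrative assumptions:
```python
import re
import nltk
nltk.download(["punkt", "stopwords", "wordnet"])
from nltk.corpus import stopwords
from nltk.stem import WordNetLemmatizer
from nltk.tokenize import word_tokenize

stops = set(stopwords.words("english"))
wnl = WordNetLemmatizer()

def preprocess(text):
    text = text.lower()                              # case folding
    text = re.sub(r"[^a-z\s]", " ", text)            # strip punctuation/digits
    tokens = word_tokenize(text)                     # tokenization
    tokens = [t for t in tokens if t not in stops]   # stop-word removal
    return [wnl.lemmatize(t) for t in tokens]        # lemmatization

print(preprocess("The striped bats were hanging on their feet, eating best fruits!"))
```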