PROGRAM:
1. Installing NLTK:
pip install nltk
2. Importing NLTK and downloading data:
import nltk
nltk.download('punkt')
nltk.download('stopwords')
OUTPUT:
True
3. Searching text:
text="natural language processing with NLTK is fun and educational."
word="NLTK"
if word in text:
    print(f"'{word}' found in the text!")
else:
    print(f"'{word}' not found in the text!")
OUTPUT:
'NLTK' found in the text!
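The membership test above is case-sensitive, so searching for "nltk" in lowercase would report "not found". A minimal sketch of a case-insensitive variant (plain Python, no NLTK needed):

```python
text = "natural language processing with NLTK is fun and educational."
word = "nltk"

# Lowercase both sides before the membership test so case differences are ignored
found = word.lower() in text.lower()
print(f"'{word}' {'found' if found else 'not found'} in the text!")
```

This prints `'nltk' found in the text!` even though the text spells it "NLTK".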
4. Counting Vocabulary:
from nltk.tokenize import word_tokenize
text="natural language processing with NLTK is fun and educational."
tokens=word_tokenize(text)
vocabulary=set(tokens)
print(f"vocabulary size:{len(vocabulary)}")
OUTPUT:
vocabulary size:10
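Note that `set(tokens)` treats "Natural" and "natural" as different vocabulary items and counts the trailing "." as a token. A sketch of a normalized count (lowercasing and stripping punctuation, using only the standard library rather than NLTK's tokenizer):

```python
import string

text = "Natural language processing with NLTK is fun and educational."

# Lowercase each word and strip surrounding punctuation before counting
tokens = [w.strip(string.punctuation).lower() for w in text.split()]
tokens = [t for t in tokens if t]  # drop any empty strings left after stripping

vocabulary = set(tokens)
print(f"vocabulary size: {len(vocabulary)}")  # 9: punctuation no longer counts
```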
5. Frequency Distribution:
from nltk import FreqDist
fdist=FreqDist(tokens)
print(f"Most Common Words:{fdist.most_common(5)}")
OUTPUT:
Most Common Words:[('natural', 1), ('language', 1), ('processing', 1), ('with', 1), ('NLTK', 1)]
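Because every word in the sample sentence occurs once, all counts above are 1. To see `FreqDist` distinguish frequencies, a text with repeats is needed; the same behavior can be sketched with the standard library's `collections.Counter`, which `FreqDist` is built on:

```python
from collections import Counter

# A sentence with repeated words so the counts differ
tokens = "to be or not to be that is the question".split()

fdist = Counter(tokens)
print(fdist.most_common(3))  # 'to' and 'be' each appear twice
```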
6. Collocation:
import nltk
from nltk.collocations import BigramCollocationFinder
from nltk.metrics import BigramAssocMeasures
text="Natural Language Processing with NLTK is fun and educational."
tokens=nltk.word_tokenize(text.lower())
finder=BigramCollocationFinder.from_words(tokens)
bigrams=finder.nbest(BigramAssocMeasures.likelihood_ratio,5)
print(f"Top 5 bigrams:{bigrams}")
OUTPUT:
Top 5 bigrams:[('and', 'educational'), ('educational', '.'), ('fun', 'and'), ('is', 'fun'), ('language', 'processing')]
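Under the hood, a bigram finder pairs each token with its successor and counts the pairs. A minimal sketch of that step with plain Python (no scoring measure, just raw pair counts):

```python
from collections import Counter

tokens = "natural language processing with nltk is fun".split()

# Pair each token with the token that follows it
bigrams = list(zip(tokens, tokens[1:]))

counts = Counter(bigrams)
print(counts.most_common(3))
```

`BigramCollocationFinder` adds association scoring (such as the likelihood ratio used above) on top of these raw counts to rank which pairs co-occur more often than chance.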