CCS339 Text and Speech Analysis Lab Manual

EXP NO:1
Create Regular Expressions in Python for Detecting Word Patterns and Tokenizing Text
Date:
AIM:
To create Regular expressions in Python for detecting word patterns and tokenizing text.
SOFTWARE SPECIFICATIONS:
• Anaconda Navigator
• Jupyter Notebook
• Google Colab
HARDWARE SPECIFICATIONS:
• Windows 10/11
• RAM - 16 GB
• Hard-disk - 1 TB
• Processor - Intel i5/i7
ALGORITHM:
• Import the re module.
• Define a function detect_word_patterns that compiles the word pattern \b\w+\b and returns all words found in the text.
• Define a function tokenize_text that compiles a pattern matching words, whitespace and punctuation, and returns all tokens found in the text.
• Define a main function with an example text.
• Execute the main function if the script is being run directly.
• Display the detected words and tokens in the example text.
PROGRAM:
import re

def detect_word_patterns(text):
    # Regular expression pattern for detecting words
    word_pattern = re.compile(r'\b\w+\b')
    # Find all matches for the word pattern in the text
    words = word_pattern.findall(text)
    return words

def tokenize_text(text):
    # Regular expression pattern for tokenizing text
    token_pattern = re.compile(r'\b\w+\b|\s|[^\w\s]')
    # Find all matches for the token pattern in the text
    tokens = token_pattern.findall(text)
    return tokens

def main():
    # Example text
    text = ("Rajalakshmi Institute of Technology was established in 2008. "
            "RIT is accredited with highest grade of A++ by NAAC. "
            "RIT is affiliated with Anna University Chennai. ")
    # Detect word patterns
    words = detect_word_patterns(text)
    print("Words:", words)
    # Tokenize text
    tokens = tokenize_text(text)
    print("Tokens:", tokens)

if __name__ == "__main__":
    main()
OUTPUT:
Words: ['Rajalakshmi', 'Institute', 'of', 'Technology', 'was', 'established', 'in', '2008', 'RIT', 'is',
'accredited', 'with', 'highest', 'grade', 'of', 'A', 'by', 'NAAC', 'RIT', 'is', 'affiliated', 'with', 'Anna',
'University', 'Chennai']
Tokens: ['Rajalakshmi', ' ', 'Institute', ' ', 'of', ' ', 'Technology', ' ', 'was', ' ', 'established', ' ', 'in', ' ',
'2008', '.', ' ', 'RIT', ' ', 'is', ' ', 'accredited', ' ', 'with', ' ', 'highest', ' ', 'grade', ' ', 'of', ' ', 'A', '+', '+', ' ',
'by', ' ', 'NAAC', '.', ' ', 'RIT', ' ', 'is', ' ', 'affiliated', ' ', 'with', ' ', 'Anna', ' ', 'University', ' ',
'Chennai', '.', ' ']
RESULT:
Thus, the Python program for creating regular expressions to detect word patterns and tokenize text was executed successfully and the output was verified.
EXP NO: 2a
Searching Text
Date:
AIM:
To create a Python program for searching text using NLTK.
SOFTWARE SPECIFICATIONS:
• Anaconda Navigator
• Jupyter Notebook
• Google Colab
HARDWARE SPECIFICATIONS:
• Windows 10/11
• RAM - 16 GB
• Hard-disk - 1 TB
• Processor - Intel i5/i7
ALGORITHM:
• Import the NLTK module (nltk) and install it if not already installed.
• Define the example text.
• Use a regular expression (Python's re module) to find all occurrences of the word "text" in the example text.
• Tokenize the text into words using NLTK's word_tokenize function and convert
them to lowercase.
• Create an NLTK Text object from the tokenized text.
• Use the similar method of the Text object to find similar words to "text".
• Print the occurrences of "text" and the similar words.
PROGRAM:
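Note: the original program listing is not reproduced in this copy of the manual. The following is a minimal sketch consistent with the algorithm above and the recorded output; the example sentence is an assumption.

import re
import nltk
nltk.download('punkt')
from nltk.tokenize import word_tokenize
from nltk.text import Text

# Example text (assumed for illustration)
text = "This is a sample text. Searching text with NLTK is easy."

# Find all occurrences of the word "text" using a regular expression
occurrences = re.findall(r'text', text.lower())
print(occurrences)

# Build an NLTK Text object and look for words used in similar contexts
tokens = word_tokenize(text.lower())
text_obj = Text(tokens)
print(text_obj.similar('text'))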
OUTPUT:
lib\site-packages (from nltk) (1.2.0)
[nltk_data] C:\Users\User\AppData\Roaming\nltk_data...
['text', 'text']
None
RESULT:
The searching text program is executed successfully and the output was verified.
EXP NO:2b
Counting Vocabulary
Date:
AIM:
To create a Python program for Counting Vocabulary.
SOFTWARE SPECIFICATIONS:
• Anaconda Navigator
• Jupyter Notebook
• Google Colab
HARDWARE SPECIFICATIONS:
• Windows 10/11
• RAM - 16 GB
• Hard-disk - 1 TB
• Processor - Intel i5/i7
ALGORITHM:
• Import NLTK and download the necessary resources (e.g., punkt tokenizer).
• Import the word_tokenize function and define the example text.
• Tokenize the text into words using word_tokenize and convert them to lowercase.
• Build the vocabulary as the set of unique tokens.
• Print the length of the vocabulary set.
• Display the number of unique words in the example text.
PROGRAM:
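Note: the original program listing is not reproduced in this copy of the manual. Below is a minimal sketch consistent with the algorithm above; the example text is an assumption (its ten unique tokens are consistent with the recorded output of 10).

import nltk
nltk.download('punkt')
from nltk.tokenize import word_tokenize

# Example text (assumed for illustration)
text = "This is a sample text with some repeated words."

# Tokenize and lowercase the text
tokens = word_tokenize(text.lower())

# The vocabulary is the set of unique tokens
vocabulary = set(tokens)
print(len(vocabulary))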
OUTPUT:
10
RESULT:
The Counting Vocabulary program is executed successfully and the output was verified.
EXP NO:2c
Frequency Distribution
Date:
AIM:
To create a python program for Frequency Distribution.
SOFTWARE SPECIFICATIONS:
• Anaconda Navigator
• Jupyter Notebook
• Google Colab
HARDWARE SPECIFICATIONS:
• Windows 10/11
• RAM - 16 GB
• Hard-disk - 1 TB
• Processor - Intel i5/i7
ALGORITHM:
• Import NLTK and download the necessary resources (e.g., punkt tokenizer).
• Import the FreqDist class from NLTK's probability module and the word_tokenize
function from the tokenize module.
• Define the example text.
• Tokenize the text into words using the word_tokenize function and convert them to
lowercase.
• Create a frequency distribution (FreqDist) for the tokens.
• Print the most common n words, where n is the desired number of most common words.
• Plot the frequency distribution of the tokens.
PROGRAM:
import nltk
nltk.download('punkt')
from nltk.probability import FreqDist
from nltk.tokenize import word_tokenize
text = "This is a sample text with some repeated words."
tokens = word_tokenize(text.lower())
# Create a frequency distribution for the words
fdist = FreqDist(tokens)
# Print the most frequent words
print(fdist.most_common(3))  # e.g. [('this', 1), ('is', 1), ('a', 1)]
# Plot the frequency distribution
fdist.plot(cumulative=False)
OUTPUT:
[nltk_data] Downloading package punkt to /root/nltk_data...
RESULT:
The frequency distribution program is executed successfully and the output was verified.
EXP NO:2d
Collocations
Date:
AIM:
To create a Python program for finding collocations.
SOFTWARE SPECIFICATIONS:
• Anaconda Navigator
• Jupyter Notebook
• Google Colab
HARDWARE SPECIFICATIONS:
• Windows 10/11
• RAM - 16 GB
• Hard-disk - 1 TB
• Processor - Intel i5/i7
ALGORITHM:
• Import NLTK and download the necessary resources (e.g., punkt tokenizer).
• Import BigramCollocationFinder and BigramAssocMeasures from NLTK's collocations module.
• Define the example text and tokenize it into words using word_tokenize.
• Create a Bigram Collocation Finder from the tokenized words.
• Use the nbest method to find the top n bigrams with the highest Pointwise Mutual Information (PMI), where n is the desired number of top bigrams.
• Print the top n bigrams with the highest PMI.
PROGRAM:
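Note: the original program listing is not reproduced in this copy of the manual. The following is a minimal sketch consistent with the algorithm above; the example sentence is an assumption, so its top bigrams may differ slightly from the recorded output.

import nltk
nltk.download('punkt')
from nltk.tokenize import word_tokenize
from nltk.collocations import BigramCollocationFinder, BigramAssocMeasures

# Example text (assumed for illustration)
text = "Natural language processing is an exciting field with many applications."
tokens = word_tokenize(text.lower())

# Create a Bigram Collocation Finder from the tokenized words
finder = BigramCollocationFinder.from_words(tokens)
bigram_measures = BigramAssocMeasures()

# Top 5 bigrams ranked by Pointwise Mutual Information (PMI)
top_bigrams = finder.nbest(bigram_measures.pmi, 5)
print(top_bigrams)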
OUTPUT:
[('an', 'exciting'), ('applications', '.'), ('exciting', 'field'), ('field', 'with'), ('is', 'an')]
RESULT:
The Collocations program executed successfully, and the output was verified.
EXP NO:2e
Bigrams
Date:
AIM:
To create a python program for Bigrams.
SOFTWARE SPECIFICATIONS:
• Anaconda Navigator
• Jupyter Notebook
• Google Colab
HARDWARE SPECIFICATIONS:
• Windows 10/11
• RAM - 16 GB
• Hard-disk - 1 TB
• Processor - Intel i5/i7
ALGORITHM:
• Import NLTK and download the necessary resources (e.g., punkt tokenizer).
• Import the ngrams function from NLTK's util module and the word_tokenize function
from the tokenize module.
• Define the example text.
• Tokenize the text into words using the word_tokenize function and convert them to
lowercase.
• Generate bigrams (sequences of two consecutive words) from the tokenized words
using the ngrams function.
• Convert the bigrams generator into a list to display the bigrams.
• Print the list of generated bigrams.
PROGRAM:
import nltk
nltk.download('punkt')
from nltk.util import ngrams
from nltk.tokenize import word_tokenize
text = "This is a sample text to demonstrate bigrams."
tokens = word_tokenize(text.lower())
# Generate bigrams (sequences of two consecutive words)
bigrams = ngrams(tokens, 2)
print(list(bigrams))
OUTPUT:
[('this', 'is'), ('is', 'a'), ('a', 'sample'), ('sample', 'text'), ('text', 'to'), ('to', 'demonstrate'),
('demonstrate', 'bigrams'), ('bigrams', '.')]
[nltk_data] Downloading package punkt to /root/nltk_data...
[nltk_data] Package punkt is already up-to-date!
RESULT:
The Bigrams program is executed successfully and the output was verified.
EXP NO:3
Accessing Text Corpora using NLTK in Python
Date:
AIM:
To create a python program for Accessing Text Corpora using NLTK in Python
SOFTWARE SPECIFICATIONS:
• Anaconda Navigator
• Jupyter Notebook
• Google Colab
HARDWARE SPECIFICATIONS:
• Windows 10/11
• RAM - 16 GB
• Hard-disk - 1 TB
• Processor - Intel i5/i7
ALGORITHM:
• Import NLTK and download the Gutenberg corpus.
• Define a function access_gutenberg_corpus to access the Gutenberg corpus.
• List available files in the Gutenberg corpus using the fileids method.
• Access and print the text of a specific document in the corpus (in this case, "shakespeare-hamlet.txt").
• Call the access_gutenberg_corpus function from the main function.
PROGRAM:
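Note: the original program listing is not fully reproduced in this copy of the manual. The following is a minimal sketch consistent with the algorithm above and the recorded output; printing only the opening portion of the document is an assumption.

import nltk
nltk.download('gutenberg')
from nltk.corpus import gutenberg

def access_gutenberg_corpus():
    # List available files in the Gutenberg corpus
    print(gutenberg.fileids())
    # Access and print (part of) the text of "shakespeare-hamlet.txt"
    hamlet_text = gutenberg.raw('shakespeare-hamlet.txt')
    print(hamlet_text[:1000] + "...")

def main():
    access_gutenberg_corpus()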
if __name__ == "__main__":
    main()
OUTPUT:
Cornelius, courtier.
Rosencrantz, courtier.
Guildenstern, courtier.
Osric, courtier.
A Gentleman, courtier.
A Priest.
Marcellus, officer.
Bernardo, off...
[nltk_data] Downloading package gutenberg to /root/nltk_data...
[nltk_data] Package gutenberg is already up-to-date!
RESULT:
The Accessing Text Corpora using NLTK in Python program is executed successfully and the output was verified.
EXP NO:4
Write a Function that Finds the 50 Most Frequently Occurring Words of a Text
Date:
AIM:
To create a Python program with a function that finds the 50 most frequently occurring words of a text.
SOFTWARE SPECIFICATIONS:
• Anaconda Navigator
• Jupyter Notebook
• Google Colab
HARDWARE SPECIFICATIONS:
• Windows 10/11
• RAM - 16 GB
• Hard-disk - 1 TB
• Processor - Intel i5/i7
ALGORITHM:
• Import NLTK and download the necessary resources (e.g., punkt tokenizer and stopwords).
• Import the necessary modules (stopwords, FreqDist) from NLTK.
• Define the find_frequent_words function to find the most frequent words in the text.
• Load English stopwords using stopwords.words('english').
• Tokenize the text into words using nltk.word_tokenize and convert them to lowercase.
• Filter out stopwords from the tokenized words, create a frequency distribution using
FreqDist, and return the most common words.
PROGRAM:
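Note: the definition of find_frequent_words is not reproduced in this copy of the manual. The following is a minimal sketch consistent with the algorithm above and the recorded output; the default limit of 50 words follows the title of the experiment.

import nltk
nltk.download('punkt')
nltk.download('stopwords')
from nltk.corpus import stopwords
from nltk.probability import FreqDist

def find_frequent_words(text, n=50):
    # Load English stopwords
    stop_words = set(stopwords.words('english'))
    # Tokenize and lowercase the text
    tokens = nltk.word_tokenize(text.lower())
    # Filter out stopwords and build a frequency distribution
    filtered_words = [word for word in tokens if word not in stop_words]
    fdist = FreqDist(filtered_words)
    # Return the n most common words
    return fdist.most_common(n)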
# Example usage:
text = "This is a sample text with some common words and some less common words."
frequent_words = find_frequent_words(text)
print(frequent_words)
OUTPUT:
[('common', 2), ('words', 2), ('sample', 1), ('text', 1), ('less', 1), ('.', 1)]
[nltk_data] Downloading package stopwords to /root/nltk_data...
[nltk_data] Unzipping corpora/stopwords.zip.
RESULT:
The program with a function that finds the 50 most frequently occurring words of a text is executed successfully and the output was verified.
EXP NO:5
Implement the Word2Vec Model
Date:
AIM:
To create a Python program to implement the Word2Vec model.
SOFTWARE SPECIFICATIONS:
• Anaconda Navigator
• Jupyter Notebook
• Google Colab
HARDWARE SPECIFICATIONS:
• Windows 10/11
• RAM - 16 GB
• Hard-disk - 1 TB
• Processor - Intel i5/i7
ALGORITHM:
• Import NLTK, download the punkt tokenizer, and import word_tokenize.
• Import the Word2Vec class from gensim.models.
• Define the sample sentences.
• Tokenize the sentences into words using word_tokenize, and convert them to lowercase.
• Set up and train the Word2Vec model with specified parameters (vector_size=100,
window=5, min_count=1, workers=4).
• Save the trained Word2Vec model to a file named "word2vec_model_sentences.bin", and
load the saved model.
PROGRAM:
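Note: the first part of the program listing is not reproduced in this copy of the manual. The following is a minimal sketch consistent with the algorithm above; the sample sentences are assumptions, while the training parameters are taken from the algorithm.

import nltk
nltk.download('punkt')
from nltk.tokenize import word_tokenize
from gensim.models import Word2Vec

# Sample sentences (assumed for illustration)
sentences = [
    "Rajalakshmi Institute of Technology offers engineering programmes.",
    "Word2Vec learns vector representations of words from text.",
    "Engineering students study text and speech analysis.",
]

# Tokenize the sentences into lowercase words
tokenized_sentences = [word_tokenize(sentence.lower()) for sentence in sentences]

# Set up and train the Word2Vec model
model = Word2Vec(tokenized_sentences, vector_size=100, window=5, min_count=1, workers=4)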
# Save the trained model to a file
model.save("word2vec_model_sentences.bin")
loaded_model = Word2Vec.load("word2vec_model_sentences.bin")
word_embedding = loaded_model.wv['engineering']
print(word_embedding)
OUTPUT:
-6.7476230e-04 2.9767421e-03 -6.1080824e-03 1.6988355e-03
-6.9264821e-03 -8.6941104e-03 -5.9012529e-03 -8.9566577e-03
7.2771502e-03 -5.7719820e-03 8.2766823e-03 -7.2425883e-03
3.4219360e-03 9.6746497e-03 -7.7848872e-03 -9.9454839e-03
-4.3290602e-03 -2.6821969e-03 -2.7132613e-04 -8.8319331e-03
-8.6167511e-03 2.7997096e-03 -8.2072075e-03 -9.0692798e-03
-2.3409016e-03 -8.6309426e-03 -7.0565986e-03 -8.4008174e-03
-3.0119700e-04 -4.5645908e-03 6.6272104e-03 1.5276786e-03
-3.3420518e-03 6.1100693e-03 -6.0128779e-03 -4.6551023e-03
-7.2083715e-03 -4.3364055e-03 -1.8094820e-03 6.4903200e-03
-2.7698609e-03 4.9190638e-03 6.9043743e-03 -7.4632545e-03
4.5653125e-03 6.1272969e-03 -2.9546837e-03 6.6242618e-03
6.1250199e-03 -6.4425734e-03 -6.7656934e-03 2.5390687e-03
-1.6231104e-03 -6.0651163e-03 9.4992034e-03 -5.1304861e-03
-6.5529565e-03 -1.1961181e-04 -2.7010120e-03 4.4384925e-04
-3.5381056e-03 -4.1872010e-04 -7.0809841e-04 8.2218763e-04
8.1943199e-03 -5.7367464e-03 -1.6597145e-03 5.5715367e-03]
[nltk_data] Downloading package punkt to /root/nltk_data...
[nltk_data] Package punkt is already up-to-date!
RESULT:
The implementation of the Word2Vec model program is executed successfully and the output was verified.
EXP NO:6
Use a Transformer for Implementing Classification
Date:
AIM:
To create a Python program that uses a transformer for implementing classification.
SOFTWARE SPECIFICATIONS:
• Anaconda Navigator
• Jupyter Notebook
• Google Colab
HARDWARE SPECIFICATIONS:
• Windows 10/11
• RAM - 16 GB
• Hard-disk - 1 TB
• Processor - Intel i5/i7
ALGORITHM:
• Import the required libraries (torch, transformers, scikit-learn, tqdm) and define the sample texts with their labels.
• Split the data into training and testing sets using train_test_split from
sklearn.model_selection.
• Load the pre-trained BERT model and tokenizer using BertTokenizer.from_pretrained
and BertForSequenceClassification.from_pretrained from transformers.
• Tokenize and encode the training and testing data using the BERT tokenizer.
• Define the optimizer, loss function, and set up the training loop using PyTorch.
PROGRAM:
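Note: the first part of the program listing is not reproduced in this copy of the manual. The following is a minimal sketch of the missing setup, consistent with the algorithm above; the sample texts, labels and split ratio are assumptions.

import torch
from torch.utils.data import TensorDataset, DataLoader
from torch.optim import AdamW
from transformers import BertTokenizer, BertForSequenceClassification
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
from tqdm import tqdm

# Sample data (assumed for illustration)
texts = ["I love this movie", "This film was terrible", "Great acting and story", "Worst movie ever"]
labels = [1, 0, 1, 0]

# Split the data into training and testing sets
train_texts, test_texts, train_labels, test_labels = train_test_split(texts, labels, test_size=0.25, random_state=42)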
# Load pre-trained BERT model and tokenizer
tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
model = BertForSequenceClassification.from_pretrained('bert-base-uncased', num_labels=2)
# Tokenize and encode the training data
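# The encoding, data-loader and optimizer setup below is a hedged reconstruction
# of the portion of the listing that is missing from this copy of the manual;
# the batch size, learning rate and number of epochs are assumptions.
train_encodings = tokenizer(train_texts, truncation=True, padding=True, return_tensors='pt')
test_encodings = tokenizer(test_texts, truncation=True, padding=True, return_tensors='pt')
train_dataset = TensorDataset(train_encodings['input_ids'], train_encodings['attention_mask'], torch.tensor(train_labels))
test_dataset = TensorDataset(test_encodings['input_ids'], test_encodings['attention_mask'], torch.tensor(test_labels))
train_dataloader = DataLoader(train_dataset, batch_size=2, shuffle=True)
test_dataloader = DataLoader(test_dataset, batch_size=2)
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
model.to(device)
optimizer = AdamW(model.parameters(), lr=2e-5)
# Training loop
for epoch in range(3):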
    model.train()
    for batch in tqdm(train_dataloader, desc=f"Epoch {epoch + 1}"):
        input_ids, attention_mask, labels = batch
        input_ids, attention_mask, labels = input_ids.to(device), attention_mask.to(device), labels.to(device)
        optimizer.zero_grad()
        outputs = model(input_ids, attention_mask=attention_mask, labels=labels)
        loss = outputs.loss
        loss.backward()
        optimizer.step()
# Evaluation
model.eval()
predictions = []
true_labels = []
with torch.no_grad():
    for batch in tqdm(test_dataloader, desc="Evaluating"):
        input_ids, attention_mask, labels = batch
        input_ids, attention_mask, labels = input_ids.to(device), attention_mask.to(device), labels.to(device)
        outputs = model(input_ids, attention_mask=attention_mask)
        logits = outputs.logits
        predicted_labels = torch.argmax(logits, dim=1).cpu().numpy()
        predictions.extend(predicted_labels)
        true_labels.extend(labels.cpu().numpy())
# Calculate accuracy
accuracy = accuracy_score(true_labels, predictions)
print(f"Accuracy: {accuracy * 100:.2f}%")
OUTPUT:
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 731.7/731.7 MB 987.8 kB/s eta
0:00:00
Collecting nvidia-cublas-cu12==12.1.3.1 (from torch)
Downloading nvidia_cublas_cu12-12.1.3.1-py3-none-manylinux1_x86_64.whl (410.6 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 410.6/410.6 MB 1.2 MB/s eta
0:00:00
Collecting nvidia-cufft-cu12==11.0.2.54 (from torch)
Downloading nvidia_cufft_cu12-11.0.2.54-py3-none-manylinux1_x86_64.whl (121.6 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 121.6/121.6 MB 8.3 MB/s eta
0:00:00
Collecting nvidia-curand-cu12==10.3.2.106 (from torch)
Downloading nvidia_curand_cu12-10.3.2.106-py3-none-manylinux1_x86_64.whl (56.5 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 56.5/56.5 MB 11.4 MB/s eta 0:00:00
Collecting nvidia-cusolver-cu12==11.4.5.107 (from torch)
Downloading nvidia_cusolver_cu12-11.4.5.107-py3-none-manylinux1_x86_64.whl (124.2 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 124.2/124.2 MB 6.2 MB/s eta
0:00:00
Collecting nvidia-cusparse-cu12==12.1.0.106 (from torch)
Downloading nvidia_cusparse_cu12-12.1.0.106-py3-none-manylinux1_x86_64.whl (196.0 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 196.0/196.0 MB 2.4 MB/s eta
0:00:00
Collecting nvidia-nccl-cu12==2.19.3 (from torch)
Downloading nvidia_nccl_cu12-2.19.3-py3-none-manylinux1_x86_64.whl (166.0 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 166.0/166.0 MB 2.3 MB/s eta
0:00:00
Collecting nvidia-nvtx-cu12==12.1.105 (from torch)
Downloading nvidia_nvtx_cu12-12.1.105-py3-none-manylinux1_x86_64.whl (99 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 99.1/99.1 kB 11.1 MB/s eta 0:00:00
Requirement already satisfied: triton==2.2.0 in /usr/local/lib/python3.10/dist-packages (from torch)
(2.2.0)
Collecting nvidia-nvjitlink-cu12 (from nvidia-cusolver-cu12==11.4.5.107->torch)
Downloading nvidia_nvjitlink_cu12-12.4.127-py3-none-manylinux2014_x86_64.whl (21.1 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 21.1/21.1 MB 45.3 MB/s eta 0:00:00
Requirement already satisfied: MarkupSafe>=2.0 in /usr/local/lib/python3.10/dist-packages (from
jinja2->torch) (2.1.5)
Requirement already satisfied: mpmath>=0.19 in /usr/local/lib/python3.10/dist-packages (from
sympy->torch) (1.3.0)
Installing collected packages: nvidia-nvtx-cu12, nvidia-nvjitlink-cu12, nvidia-nccl-cu12, nvidia-
curand-cu12, nvidia-cufft-cu12, nvidia-cuda-runtime-cu12, nvidia-cuda-nvrtc-cu12, nvidia-cuda-
cupti-cu12, nvidia-cublas-cu12, nvidia-cusparse-cu12, nvidia-cudnn-cu12, nvidia-cusolver-cu12
Successfully installed nvidia-cublas-cu12-12.1.3.1 nvidia-cuda-cupti-cu12-12.1.105 nvidia-cuda-
nvrtc-cu12-12.1.105 nvidia-cuda-runtime-cu12-12.1.105 nvidia-cudnn-cu12-8.9.2.26 nvidia-cufft-
cu12-11.0.2.54 nvidia-curand-cu12-10.3.2.106 nvidia-cusolver-cu12-11.4.5.107 nvidia-cusparse-
cu12-12.1.0.106 nvidia-nccl-cu12-2.19.3 nvidia-nvjitlink-cu12-12.4.127 nvidia-nvtx-cu12-12.1.105
Requirement already satisfied: transformers in /usr/local/lib/python3.10/dist-packages (4.38.2)
Requirement already satisfied: filelock in /usr/local/lib/python3.10/dist-packages (from
transformers) (3.13.3)
Requirement already satisfied: huggingface-hub<1.0,>=0.19.3 in /usr/local/lib/python3.10/dist-
packages (from transformers) (0.20.3)
Requirement already satisfied: numpy>=1.17 in /usr/local/lib/python3.10/dist-packages (from
transformers) (1.25.2)
Requirement already satisfied: packaging>=20.0 in /usr/local/lib/python3.10/dist-packages (from
transformers) (24.0)
Requirement already satisfied: pyyaml>=5.1 in /usr/local/lib/python3.10/dist-packages (from
transformers) (6.0.1)
Requirement already satisfied: regex!=2019.12.17 in /usr/local/lib/python3.10/dist-packages (from
transformers) (2023.12.25)
Requirement already satisfied: requests in /usr/local/lib/python3.10/dist-packages (from
transformers) (2.31.0)
Requirement already satisfied: tokenizers<0.19,>=0.14 in /usr/local/lib/python3.10/dist-packages
(from transformers) (0.15.2)
Requirement already satisfied: safetensors>=0.4.1 in /usr/local/lib/python3.10/dist-packages (from
transformers) (0.4.2)
Requirement already satisfied: tqdm>=4.27 in /usr/local/lib/python3.10/dist-packages (from
transformers) (4.66.2)
Requirement already satisfied: fsspec>=2023.5.0 in /usr/local/lib/python3.10/dist-packages (from
huggingface-hub<1.0,>=0.19.3->transformers) (2023.6.0)
Requirement already satisfied: typing-extensions>=3.7.4.3 in /usr/local/lib/python3.10/dist-packages
(from huggingface-hub<1.0,>=0.19.3->transformers) (4.10.0)
Requirement already satisfied: charset-normalizer<4,>=2 in /usr/local/lib/python3.10/dist-packages
(from requests->transformers) (3.3.2)
Requirement already satisfied: idna<4,>=2.5 in /usr/local/lib/python3.10/dist-packages (from
requests->transformers) (3.6)
Requirement already satisfied: urllib3<3,>=1.21.1 in /usr/local/lib/python3.10/dist-packages (from
requests->transformers) (2.0.7)
Requirement already satisfied: certifi>=2017.4.17 in /usr/local/lib/python3.10/dist-packages (from
requests->transformers) (2024.2.2)
Collecting sklearn
Downloading sklearn-0.0.post12.tar.gz (2.6 kB)
error: subprocess-exited-with-error
note: This error originates from a subprocess, and is likely not a problem with pip.
Preparing metadata (setup.py) ... error
error: metadata-generation-failed
╰─> See above for output.
note: This is an issue with the package mentioned above, not pip.
hint: See above for details.
Requirement already satisfied: tqdm in /usr/local/lib/python3.10/dist-packages (4.66.2)
/usr/local/lib/python3.10/dist-packages/huggingface_hub/utils/_token.py:88: UserWarning:
The secret `HF_TOKEN` does not exist in your Colab secrets.
To authenticate with the Hugging Face Hub, create a token in your settings tab
(https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your
session.
You will be able to reuse this secret in all of your notebooks.
Please note that authentication is recommended but still optional to access public models or datasets.
warnings.warn(
tokenizer_config.json: 100%
48.0/48.0 [00:00<00:00, 1.74kB/s]
vocab.txt: 100%
232k/232k [00:00<00:00, 6.39MB/s]
tokenizer.json: 100%
466k/466k [00:00<00:00, 19.9MB/s]
config.json: 100%
570/570 [00:00<00:00, 21.2kB/s]
model.safetensors: 100%
440M/440M [00:01<00:00, 240MB/s]
Some weights of BertForSequenceClassification were not initialized from the model checkpoint at
bert-base-uncased and are newly initialized: ['classifier.bias', 'classifier.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions
and inference.
Epoch 1: 100%|██████████| 2/2 [00:04<00:00, 2.31s/it]
Epoch 2: 100%|██████████| 2/2 [00:02<00:00, 1.38s/it]
Epoch 3: 100%|██████████| 2/2 [00:02<00:00, 1.40s/it]
Evaluating: 100%|██████████| 1/1 [00:00<00:00, 7.83it/s]Accuracy: 0.00%
RESULT:
The transformer-based classification program is executed successfully and the output was verified.
EXP NO:7 Design a Chatbot with a Simple Dialog System
Date:
AIM:
To create a Python program to design a chatbot with a simple dialog system.
SOFTWARE SPECIFICATIONS:
• Anaconda Navigator
• Jupyter Notebook
• Google Colab
HARDWARE SPECIFICATIONS:
• Windows 10/11
• RAM - 16 GB
• Hard-disk - 1 TB
• Processor - Intel i5/i7
ALGORITHM:
• Define a SimpleChatbot class whose constructor initializes lists of greetings and goodbyes and a dictionary of predefined responses.
• Define a method get_response to generate responses based on user input.
• Define a main function to interact with the chatbot.
• Create an instance of the SimpleChatbot class.
• Start a conversation loop where the user can input questions or statements.
• Provide appropriate responses based on user input, including handling greetings,
goodbyes, and predefined queries.
PROGRAM:
import random

class SimpleChatbot:
    def __init__(self):
        self.greetings = ['hello', 'hi', 'hey', 'greetings', 'howdy']
        self.goodbyes = ['bye', 'goodbye', 'see you', 'farewell']
        self.responses = {
            'tell me a joke': 'Why did the chicken cross the road? To get to the other side!',
            'how are you': 'I am just a computer program, but thanks for asking!',
            'default': 'I\'m sorry, I don\'t understand that. Can you ask me something else?'
        }

    def get_response(self, user_input):
        user_input = user_input.lower()
        if any(greeting in user_input for greeting in self.greetings):
            return 'Hello! How can I help you today?'
        elif any(goodbye in user_input for goodbye in self.goodbyes):
            return 'Goodbye! Have a great day.'
        else:
            for key in self.responses:
                if key in user_input:
                    return self.responses[key]
            return self.responses['default']

def main():
    chatbot = SimpleChatbot()
    print("Simple Chatbot: Hello! Ask me anything or say goodbye to end the conversation.")
    while True:
        user_input = input("You: ")
        if user_input.lower() in ['bye', 'goodbye', 'exit']:
            print("Simple Chatbot: Goodbye! Have a great day.")
            break
        response = chatbot.get_response(user_input)
        print("Simple Chatbot:", response)

if __name__ == "__main__":
    main()
OUTPUT:
RESULT:
The design of a chatbot with a simple dialog system program is executed successfully and the output was verified.
EXP NO:8
Convert Text to Speech and Find Accuracy
Date:
AIM:
To create a Python program for converting text to speech and finding the accuracy.
SOFTWARE SPECIFICATIONS:
• Anaconda Navigator
• Jupyter Notebook
• Google Colab
HARDWARE SPECIFICATIONS:
• Windows 10/11
• RAM - 16 GB
• Hard-disk - 1 TB
• Processor - Intel i5/i7
ALGORITHM:
• Define a function text_to_speech to convert text to speech using gTTS and save the audio to a file.
• Define a function speech_to_text to recognize speech using the microphone and Google's
speech recognition service.
• Define a function evaluate_accuracy to compare the original text with the recognized
text and calculate accuracy.
• Execute the text-to-speech conversion for the original text.
• Use the microphone to capture speech input, recognize it, and evaluate the
accuracy of the recognition.
PROGRAM:
import speech_recognition as sr
from gtts import gTTS
import os
def text_to_speech(text, language='en'):
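    # Hedged reconstruction of the body of text_to_speech and the beginning of
    # speech_to_text, which are missing from this copy of the manual; the
    # output file name "output.mp3" is an assumption.
    tts = gTTS(text=text, lang=language)
    tts.save("output.mp3")
    print("Text converted to speech and saved as output.mp3")

def speech_to_text():
    recognizer = sr.Recognizer()
    with sr.Microphone() as source:
        print("Please speak now...")
        audio = recognizer.listen(source)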
    try:
        text = recognizer.recognize_google(audio)
        return text
    except sr.UnknownValueError:
        print("Sorry, could not understand audio.")
        return None
    except sr.RequestError as e:
        print(f"Could not request results from Google Speech Recognition service; {e}")
        return None
def evaluate_accuracy(original_text, recognized_text):
    if recognized_text:
        print(f"Original Text: {original_text}")
        print(f"Recognized Text: {recognized_text}")
        original_words = set(original_text.lower().split())
        recognized_words = set(recognized_text.lower().split())
        common_words = original_words.intersection(recognized_words)
        accuracy = len(common_words) / len(original_words)
        print(f"Accuracy: {accuracy * 100:.2f}%")
    else:
        print("No text recognized. Accuracy cannot be calculated.")
if __name__ == "__main__":
    # Text to speech (the original text below is taken from the recorded output)
    original_text = "Hello, how are you today?"
    text_to_speech(original_text)
    # Speech to text
    recognized_text = speech_to_text()
    # Evaluate accuracy
    evaluate_accuracy(original_text, recognized_text)
OUTPUT:
Downloading click-8.1.7-py3-none-any.whl (97 kB)
0.0/97.9 kB ? eta -:--:--
------------------------- 61.4/97.9 kB 1.1 MB/s eta 0:00:01
- ----------------------------------- - 97.9/97.9 kB 933.0 kB/s eta 0:00:00
Requirement already satisfied: colorama in c:\users\user\appdata\local\programs\python\
python311\lib\site-packages (from click<8.2,>=7.1->gTTS) (0.4.6)
Requirement already satisfied: charset-normalizer<4,>=2 in c:\users\user\appdata\local\programs\
python\python311\lib\site-packages (from requests<3,>=2.27->gTTS) (3.3.2)
Requirement already satisfied: idna<4,>=2.5 in c:\users\user\appdata\local\programs\python\
python311\lib\site-packages (from requests<3,>=2.27->gTTS) (3.4)
Requirement already satisfied: urllib3<3,>=1.21.1 in c:\users\user\appdata\local\programs\python\
python311\lib\site-packages (from requests<3,>=2.27->gTTS) (2.1.0)
Requirement already satisfied: certifi>=2017.4.17 in c:\users\user\appdata\local\programs\python\
python311\lib\site-packages (from requests<3,>=2.27->gTTS) (2023.7.22)
Installing collected packages: click, gTTS
Successfully installed click-8.1.7 gTTS-2.5.0
Original Text: Hello, how are you today?
Recognized Text: hello how are you today
Accuracy: 60.00%
RESULT:
The conversion of text to speech and accuracy evaluation program is executed successfully and the output was verified.
EXP NO:9
Design a speech recognition system and find the error rate
Date:
AIM:
To create a Python program to design a speech recognition system and find the error rate.
SOFTWARE SPECIFICATIONS:
• Anaconda Navigator
• Jupyter Notebook
• Google Colab
HARDWARE SPECIFICATIONS:
• Windows 10/11
• RAM - 16 GB
• Hard-disk - 1 TB
• Processor - Intel i5/i7
ALGORITHM:
• Define a function recognize_speech to transcribe an audio file using Google's speech recognition service.
• Define a function calculate_word_error_rate to calculate the WER between a reference
text and recognized text.
• Simulate a reference text and provide the path to the audio file containing the recognized
speech.
• Use the recognize_speech function to get the recognized text from the audio file.
• Calculate the Word Error Rate (WER) between the reference text and the recognized
text using the calculate_word_error_rate function.
PROGRAM:
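Note: the first part of the program listing is not reproduced in this copy of the manual. The following is a minimal sketch of the missing recognize_speech function, consistent with the algorithm above; reading the speech from an audio file (rather than a microphone) follows the algorithm.

import speech_recognition as sr
import jiwer

def recognize_speech(audio_file_path):
    recognizer = sr.Recognizer()
    # Read the audio file containing the recognized speech
    with sr.AudioFile(audio_file_path) as source:
        audio = recognizer.record(source)
    try:
        # Transcribe the audio using Google's speech recognition service
        return recognizer.recognize_google(audio)
    except sr.UnknownValueError:
        print("Sorry, could not understand audio.")
        return ""
    except sr.RequestError as e:
        print(f"Could not request results from Google Speech Recognition service; {e}")
        return ""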
def calculate_word_error_rate(reference_text, recognized_text):
    wer = jiwer.wer(reference_text, recognized_text)
    return wer
if __name__ == "__main__":
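    # Hedged reconstruction of the missing driver code; the reference text and
    # the audio file path are assumptions, not the original values.
    reference_text = "hello how are you today"
    audio_path = "speech_sample.wav"
    recognized_text = recognize_speech(audio_path)
    wer = calculate_word_error_rate(reference_text, recognized_text)
    print(f"Word Error Rate: {wer * 100:.2f}%")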
OUTPUT:
RESULT:
The design of a speech recognition system and error rate calculation program is executed successfully and the output was verified.