
EXP NO: 1
Create Regular Expressions in Python for Detecting Word Patterns and Tokenizing Text
Date:

AIM:
To create regular expressions in Python for detecting word patterns and tokenizing text.

SOFTWARE AND HARDWARE SPECIFICATIONS:

SOFTWARE SPECIFICATIONS:

• Anaconda Navigator
• Jupyter Notebook
• Google Colab

HARDWARE SPECIFICATIONS:

• Windows 10/11
• RAM - 16 GB
• Hard-disk - 1 TB
• Processor - Intel i5/i7

ALGORITHM:

• Import the Regular Expression module (re).


• Define the detect_word_patterns function to find all word patterns in the text using a
regular expression.
• Define tokenize_text function to tokenize the text using a regular expression pattern.
• Define the main function to demonstrate the functionality of the above two functions.
• Check if the script is being run as the main program.

• Execute the main function if the script is being run directly.
• Display the detected words and tokens in the example text.

PROGRAM:

import re

def detect_word_patterns(text):
    # Regular expression pattern for detecting words
    word_pattern = re.compile(r'\b\w+\b')
    # Find all matches for the word pattern in the text
    words = word_pattern.findall(text)
    return words

def tokenize_text(text):
    # Regular expression pattern for tokenizing text
    token_pattern = re.compile(r'\b\w+\b|\s|[^\w\s]')
    # Find all matches for the token pattern in the text
    tokens = token_pattern.findall(text)
    return tokens

def main():
    # Example text
    text = ("Rajalakshmi Institute of Technology was established in 2008. "
            "RIT is accredited with highest grade of A++ by NAAC. "
            "RIT is affiliated with Anna University Chennai. ")
    # Detect word patterns
    words = detect_word_patterns(text)
    print("Words:", words)

    # Tokenize text
    tokens = tokenize_text(text)
    print("Tokens:", tokens)

if __name__ == "__main__":
    main()

OUTPUT:

Words: ['Rajalakshmi', 'Institute', 'of', 'Technology', 'was', 'established', 'in', '2008', 'RIT', 'is',
'accredited', 'with', 'highest', 'grade', 'of', 'A', 'by', 'NAAC', 'RIT', 'is', 'affiliated', 'with', 'Anna',
'University', 'Chennai']

Tokens: ['Rajalakshmi', ' ', 'Institute', ' ', 'of', ' ', 'Technology', ' ', 'was', ' ', 'established', ' ', 'in', ' ',
'2008', '.', ' ', 'RIT', ' ', 'is', ' ', 'accredited', ' ', 'with', ' ', 'highest', ' ', 'grade', ' ', 'of', ' ', 'A', '+', '+', ' ',
'by', ' ', 'NAAC', '.', ' ', 'RIT', ' ', 'is', ' ', 'affiliated', ' ', 'with', ' ', 'Anna', ' ', 'University', ' ',
'Chennai', '.', ' ']
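
Note that the word pattern \b\w+\b keeps only word characters, which is why the '++' in 'A++' is absent from the word list and appears as two separate '+' tokens in the token list. A minimal variant, assuming runs of punctuation should stay together (the pattern below is illustrative and not part of the original program):

import re
alt_pattern = re.compile(r'\w+|[^\w\s]+')  # words, or runs of punctuation
print(alt_pattern.findall("RIT is accredited with highest grade of A++ by NAAC."))
# ['RIT', 'is', 'accredited', 'with', 'highest', 'grade', 'of', 'A', '++', 'by', 'NAAC', '.']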

RESULT:

The Python program that uses regular expressions to detect word patterns and tokenize text was
executed successfully and the output was verified.

EXP NO: 2a
Searching Text
Date:

AIM:

To create a Python program for searching text.

SOFTWARE AND HARDWARE SPECIFICATIONS:

SOFTWARE SPECIFICATIONS:

• Anaconda Navigator
• Jupyter Notebook
• Google Colab

HARDWARE SPECIFICATIONS:

• Windows 10/11
• RAM - 16 GB
• Hard-disk - 1 TB
• Processor - Intel i5/i7

ALGORITHM:

• Import the NLTK module (nltk) and install it if not already installed.
• Define the example text.
• Use NLTK's regular expression (nltk.re) module to find all occurrences of the word
"text" in the example text.
• Tokenize the text into words using NLTK's word_tokenize function and convert
them to lowercase.

• Create an NLTK Text object from the tokenized text.
• Use the similar method of the Text object to find similar words to "text".
• Print the occurrences of "text" and the similar words.

PROGRAM:

!pip install nltk


import nltk
nltk.download('punkt')
text = "This is a sample text to demonstrate text searching in NLTK."
# Find all occurrences of the word "text"
text_occurrences = nltk.re.findall(r"\btext\b", text)
print(text_occurrences) # Output: ['text', 'text']
# Find similar words using a different approach
tokenized_text = nltk.word_tokenize(text.lower())
text_index = nltk.Text(tokenized_text)
similar_words = text_index.similar("text")
print(similar_words)

OUTPUT:

Requirement already satisfied: nltk in


c:\users\user\appdata\local\programs\python\python311\lib\ site-packages (3.8.1)

Requirement already satisfied: click in


c:\users\user\appdata\local\programs\python\python311\ lib\site-packages (from nltk) (8.1.7)

Requirement already satisfied: joblib in c:\users\user\appdata\local\programs\python\python311\
lib\site-packages (from nltk) (1.2.0)

Requirement already satisfied: regex>=2021.8.3 in


c:\users\user\appdata\local\programs\python\ python311\lib\site-packages (from nltk)
(2023.12.25)

Requirement already satisfied: tqdm in


c:\users\user\appdata\local\programs\python\python311\ lib\site-packages (from nltk) (4.66.1)

Requirement already satisfied: colorama in


c:\users\user\appdata\local\programs\python\ python311\lib\site-packages (from click-
>nltk) (0.4.6)

[nltk_data] Downloading package punkt to

[nltk_data] C:\Users\User\AppData\Roaming\nltk_data...

[notice] A new release of pip is available: 23.3.1 -> 23.3.2

[notice] To update, run: python.exe -m pip install --upgrade pip

[nltk_data] Unzipping tokenizers\punkt.zip.

['text', 'text']

None
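
The final None appears because the similar method of nltk.Text prints its matches directly to the console and returns None, so printing its return value adds nothing. A minimal sketch, assuming we only want the printed side-effect output:

# similar() and concordance() print their results themselves; no print() needed
text_index.similar("text")
text_index.concordance("text")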

RESULT:
The text searching program was executed successfully and the output was verified.

EXP NO:2b
Counting Vocabulary
Date:

AIM:
To create a Python program for counting vocabulary.

SOFTWARE AND HARDWARE SPECIFICATIONS:

SOFTWARE SPECIFICATIONS:

• Anaconda Navigator
• Jupyter Notebook
• Google Colab

HARDWARE SPECIFICATIONS:

• Windows 10/11
• RAM - 16 GB
• Hard-disk - 1 TB
• Processor - Intel i5/i7

ALGORITHM:

• Import the word_tokenize function from NLTK's tokenize module.


• Define the example text.
• Tokenize the text into words using the word_tokenize function and convert them to
lowercase.
• Create a set of unique words (vocabulary) from the tokenized text.
• Calculate the length of the vocabulary set.

• Print the length of the vocabulary set.
• Display the number of unique words in the example text.

PROGRAM:

from nltk.tokenize import word_tokenize


text = "This is a sample text with some repeated words."
# Tokenize the text into words
tokens = word_tokenize(text.lower())
# Count the unique words (vocabulary)
vocabulary = set(tokens)
print(len(vocabulary))
# Output: 10

OUTPUT:

10
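
The count is 10 rather than 9 because word_tokenize also returns the final period as a token. A small sketch, assuming punctuation should be excluded from the vocabulary:

alpha_vocabulary = {token for token in tokens if token.isalpha()}
print(len(alpha_vocabulary))  # 9 once the '.' token is dropped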

RESULT:
The counting vocabulary program was executed successfully and the output was verified.

EXP NO:2c
Frequency Distribution
Date:

AIM:
To create a Python program for frequency distribution.

SOFTWARE AND HARDWARE SPECIFICATIONS:

SOFTWARE SPECIFICATIONS:

• Anaconda Navigator
• Jupyter Notebook
• Google Colab

HARDWARE SPECIFICATIONS:

• Windows 10/11
• RAM - 16 GB
• Hard-disk - 1 TB
• Processor - Intel i5/i7

ALGORITHM:

• Import NLTK and download the necessary resources (e.g., punkt tokenizer).
• Import the FreqDist class from NLTK's probability module and the word_tokenize
function from the tokenize module.
• Define the example text.
• Tokenize the text into words using the word_tokenize function and convert them to
lowercase.
• Create a frequency distribution (FreqDist) for the tokens.

• Print the most common n words, where n is the desired number of most common words.
• Plot the frequency distribution of the tokens.

PROGRAM:
import nltk
nltk.download('punkt')
from nltk.probability import FreqDist
from nltk.tokenize import word_tokenize
text = "This is a sample text with some repeated words."
tokens = word_tokenize(text.lower())
# Create a frequency distribution for the words
fdist = FreqDist(tokens)
# Print the most frequent words
print(fdist.most_common(3)) # Output: [('this', 1), ('is', 1), ('a', 1)]
# Plot the frequency distribution
fdist.plot(cumulative=False)

OUTPUT:
[nltk_data] Downloading package punkt to /root/nltk_data...

[nltk_data] Unzipping tokenizers/punkt.zip.

[('this', 1), ('is', 1), ('a', 1)]

<Axes: xlabel='Samples', ylabel='Counts'>
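
In this sample every token occurs exactly once, so most_common(3) simply returns the first three tokens with a count of 1. A short sketch, assuming a sentence with real repetition, to make the counts more informative:

repeat_tokens = word_tokenize("the cat sat on the mat and the cat slept".lower())
repeat_fdist = FreqDist(repeat_tokens)
print(repeat_fdist.most_common(3))  # [('the', 3), ('cat', 2), ('sat', 1)]
print(repeat_fdist.hapaxes())       # tokens that occur only once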

RESULT:

The frequency distribution program was executed successfully and the output was verified.

EXP NO:2d
Collocations
Date:

AIM:

To create a Python program for collocations.

SOFTWARE AND HARDWARE SPECIFICATIONS:

SOFTWARE SPECIFICATIONS:

• Anaconda Navigator
• Jupyter Notebook
• Google Colab

HARDWARE SPECIFICATIONS:

• Windows 10/11
• RAM - 16 GB
• Hard-disk - 1 TB
• Processor - Intel i5/i7

ALGORITHM:

• Import NLTK and the necessary modules (collocations, tokenize).


• Define the example text.
• Tokenize the text into words using the word_tokenize function and convert them to
lowercase.
• Initialize a BigramAssocMeasures object to measure association between bigrams.
• Create a BigramCollocationFinder from the tokenized words.
• Use the nbest method to find the top n bigrams with the highest Pointwise Mutual
Information (PMI), where n is the desired number of top bigrams.
• Print the top n bigrams with the highest PMI.

PROGRAM:

from nltk.collocations import *


import nltk
from nltk.tokenize import word_tokenize
text = "Natural language processing is an exciting field with many applications."
tokens = word_tokenize(text.lower())
bigram_measures = nltk.collocations.BigramAssocMeasures()
finder = BigramCollocationFinder.from_words(tokens)
# Find the top 5 bigrams with the highest pointwise mutual information (PMI)
print(finder.nbest(bigram_measures.pmi, 5))

OUTPUT:

[('an', 'exciting'), ('applications', '.'), ('exciting', 'field'), ('field', 'with'), ('is', 'an')]
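
With a single sentence every bigram occurs once, so the PMI ranking mostly reflects word frequency. On a larger corpus it is common to discard rare bigrams before ranking; a minimal sketch (on this one-sentence text the filter removes everything, so it is purely illustrative):

finder.apply_freq_filter(2)  # keep only bigrams seen at least twice
print(finder.nbest(bigram_measures.pmi, 5))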

RESULT:
The Collocations program executed successfully, and the output was verified.

EXP NO:2e
Bigrams
Date:

AIM:
To create a Python program for bigrams.

SOFTWARE AND HARDWARE SPECIFICATIONS:

SOFTWARE SPECIFICATIONS:

• Anaconda Navigator
• Jupyter Notebook
• Google Colab

HARDWARE SPECIFICATIONS:

• Windows 10/11
• RAM - 16 GB
• Hard-disk - 1 TB
• Processor - Intel i5/i7

ALGORITHM:
• Import NLTK and download the necessary resources (e.g., punkt tokenizer).
• Import the ngrams function from NLTK's util module and the word_tokenize function
from the tokenize module.
• Define the example text.
• Tokenize the text into words using the word_tokenize function and convert them to
lowercase.
• Generate bigrams (sequences of two consecutive words) from the tokenized words
using the ngrams function.

• Convert the bigrams generator into a list to display the bigrams.
• Print the list of generated bigrams.

PROGRAM:

import nltk
nltk.download('punkt')
from nltk.util import ngrams
from nltk.tokenize import word_tokenize
text = "This is a sample text to demonstrate bigrams."
tokens = word_tokenize(text.lower())
# Generate bigrams (sequences of two consecutive words)
bigrams = ngrams(tokens, 2)
print(list(bigrams))

OUTPUT:

[('this', 'is'), ('is', 'a'), ('a', 'sample'), ('sample', 'text'), ('text', 'to'), ('to', 'demonstrate'),
('demonstrate', 'bigrams'), ('bigrams', '.')]
[nltk_data] Downloading package punkt to /root/nltk_data...
[nltk_data] Package punkt is already up-to-date!
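
The same ngrams helper works for any n, and it returns a generator that can be consumed only once. A brief sketch, assuming trigrams are also wanted:

trigrams = ngrams(tokens, 3)  # convert to a list before reusing the generator
print(list(trigrams))  # [('this', 'is', 'a'), ('is', 'a', 'sample'), ...]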

RESULT:

The bigrams program was executed successfully and the output was verified.

EXP NO:3
Accessing Text Corpora using NLTK in Python
Date:

AIM:

To create a Python program for accessing text corpora using NLTK.

SOFTWARE AND HARDWARE SPECIFICATIONS:

SOFTWARE SPECIFICATIONS:

• Anaconda Navigator
• Jupyter Notebook
• Google Colab

HARDWARE SPECIFICATIONS:

• Windows 10/11
• RAM - 16 GB
• Hard-disk - 1 TB
• Processor - Intel i5/i7

ALGORITHM:

• Import the nltk module and install it if not already installed.


• Import the gutenberg corpus from NLTK's corpus module.
• Download the Gutenberg corpus if not already downloaded.
• Define a function access_gutenberg_corpus to access and display information about the
Gutenberg corpus.
• List available files in the Gutenberg corpus using the fileids method.
• Access and print the text of a specific document in the corpus (in this case,
"shakespeare-hamlet.txt").
• Call the access_gutenberg_corpus function from the main function.

PROGRAM:

# !pip install nltk


import nltk

from nltk.corpus import gutenberg


# Download the Gutenberg corpus (if not already downloaded)
nltk.download('gutenberg')
def access_gutenberg_corpus():
    # List available files in the Gutenberg corpus
    print("Available files in Gutenberg Corpus:")
    print(gutenberg.fileids())
    # Access and print the text of a specific document in the corpus
    document_name = 'shakespeare-hamlet.txt'
    document_text = gutenberg.raw(document_name)
    print(f"\nText of '{document_name}':\n{document_text[:500]}...")

def main():
    # Access the Gutenberg corpus
    access_gutenberg_corpus()

if __name__ == "__main__":
    main()

OUTPUT:

Available files in Gutenberg Corpus:


['austen-emma.txt', 'austen-persuasion.txt', 'austen-sense.txt', 'bible-kjv.txt', 'blake-poems.txt',
'bryant-stories.txt', 'burgess-busterbrown.txt', 'carroll-alice.txt', 'chesterton-ball.txt',
'chesterton-brown.txt', 'chesterton-thursday.txt', 'edgeworth-parents.txt', 'melville-moby_dick.txt',
'milton-paradise.txt', 'shakespeare-caesar.txt', 'shakespeare-hamlet.txt', 'shakespeare-macbeth.txt',
'whitman-leaves.txt']
Text of 'shakespeare-hamlet.txt':
THE TRAGEDY OF HAMLET, PRINCE OF DENMARK
by William Shakespeare
Dramatis Personae
Claudius, King of Denmark.
Marcellus, Officer.
Hamlet, son to the former, and nephew to the present king.
Polonius, Lord Chamberlain.
Horatio, friend to Hamlet.
Laertes, son to Polonius.
Voltemand, courtier.

Cornelius, courtier.
Rosencrantz, courtier.
Guildenstern, courtier.
Osric, courtier.
A Gentleman, courtier.
A Priest.
Marcellus, officer.
Bernardo, off...
[nltk_data] Downloading package gutenberg to /root/nltk_data...
[nltk_data] Package gutenberg is already up-to-date!
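
Besides raw text, the corpus reader also exposes pre-tokenized views of the same files; a short sketch, assuming the standard corpus file id for Hamlet:

hamlet_words = gutenberg.words('shakespeare-hamlet.txt')
hamlet_sents = gutenberg.sents('shakespeare-hamlet.txt')
print(len(hamlet_words), len(hamlet_sents))  # token and sentence counts for the play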

RESULT:

The accessing text corpora using NLTK in Python program was executed successfully and the
output was verified.

EXP NO:4
Write a Function that Finds the 50 Most Frequently Occurring Words of a Text
Date:

AIM:

To create a Python program with a function that finds the 50 most frequently occurring words
of a text.

SOFTWARE AND HARDWARE SPECIFICATIONS:

SOFTWARE SPECIFICATIONS:

• Anaconda Navigator
• Jupyter Notebook
• Google Colab

HARDWARE SPECIFICATIONS:

• Windows 10/11
• RAM - 16 GB
• Hard-disk - 1 TB
• Processor - Intel i5/i7

ALGORITHM:

• Import the nltk module and install it if not already installed.


• Download the stopwords corpus from NLTK.

• Import the necessary modules (stopwords, FreqDist) from NLTK.
• Define the find_frequent_words function to find the most frequent words in the text.
• Load English stopwords using stopwords.words('english').
• Tokenize the text into words using nltk.word_tokenize and convert them to lowercase.
• Filter out stopwords from the tokenized words, create a frequency distribution using
FreqDist, and return the most common words.

PROGRAM:

# !pip install nltk


import nltk
nltk.download('stopwords')
nltk.download('punkt')  # needed by nltk.word_tokenize
from nltk.corpus import stopwords
from nltk.probability import FreqDist

def find_frequent_words(text, num_words=50):
    stop_words = set(stopwords.words('english'))  # Load English stop words
    words = nltk.word_tokenize(text.lower())  # Tokenize text and lowercase words
    filtered_words = [word for word in words if word not in stop_words]  # Filter stop words
    fdist = FreqDist(filtered_words)  # Create frequency distribution
    return fdist.most_common(num_words)  # Return the most common words

# Example usage:
text = "This is a sample text with some common words and some less common words."
frequent_words = find_frequent_words(text)
print(frequent_words)

OUTPUT:

[('common', 2), ('words', 2), ('sample', 1), ('text', 1), ('less', 1), ('.', 1)]
[nltk_data] Downloading package stopwords to /root/nltk_data...
[nltk_data] Unzipping corpora/stopwords.zip.
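
Stopword filtering removes words such as 'this' and 'with', but punctuation is not a stopword, which is why ('.', 1) still appears in the result. A small sketch, assuming punctuation should also be excluded, of how the filtering line inside find_frequent_words could look:

filtered_words = [word for word in words
                  if word.isalpha() and word not in stop_words]  # drop punctuation too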

RESULT:

The program with a function that finds the 50 most frequently occurring words of a text was
executed successfully and the output was verified.

EXP NO:5
Implement the Word2Vec Model
Date:

AIM:

To create a Python program to implement the Word2Vec model.

SOFTWARE AND HARDWARE SPECIFICATIONS:

SOFTWARE SPECIFICATIONS:

• Anaconda Navigator
• Jupyter Notebook
• Google Colab

HARDWARE SPECIFICATIONS:

• Windows 10/11
• RAM - 16 GB
• Hard-disk - 1 TB
• Processor - Intel i5/i7

ALGORITHM:

• Install the gensim library if not already installed.


• Import the Word2Vec class from the models module of gensim, and import the
word_tokenize function from nltk.tokenize.
• Download the Punkt tokenizer from NLTK.

• Define the sample sentences.
• Tokenize the sentences into words using word_tokenize, and convert them to lowercase.
• Set up and train the Word2Vec model with specified parameters (vector_size=100,
window=5, min_count=1, workers=4).
• Save the trained Word2Vec model to a file named "word2vec_model_sentences.bin", and
load the saved model.

PROGRAM:

!pip install gensim

# !pip install nltk


from gensim.models import Word2Vec
from nltk.tokenize import word_tokenize
import nltk
nltk.download('punkt') # Download the Punkt tokenizer
# Sample sentences
sentences = [
    "Rajalakshmi Institute of Technology (An Autonomous Institution) is one of the best "
    "engineering colleges in Chennai and is part of Rajalakshmi Institution.",
]

# Tokenize the sentences into words
tokenized_sentences = [word_tokenize(sentence.lower()) for sentence in sentences]

# Set up and train the Word2Vec model
model = Word2Vec(sentences=tokenized_sentences, vector_size=100, window=5,
                 min_count=1, workers=4)

# Save the trained model to a file
model.save("word2vec_model_sentences.bin")

# Load the saved model
loaded_model = Word2Vec.load("word2vec_model_sentences.bin")

# Example of accessing word embeddings
word_embedding = loaded_model.wv['engineering']
print("Word embedding for 'engineering':", word_embedding)

OUTPUT:

Requirement already satisfied: gensim in /usr/local/lib/python3.10/dist-packages (4.3.2)


Requirement already satisfied: numpy>=1.18.5 in /usr/local/lib/python3.10/dist-packages
(from gensim) (1.25.2)
Requirement already satisfied: scipy>=1.7.0 in /usr/local/lib/python3.10/dist-packages (from
gensim) (1.11.4)
Requirement already satisfied: smart-open>=1.8.1 in /usr/local/lib/python3.10/dist-packages
(from gensim) (6.4.0)
Word embedding for 'engineering': [ 8.1337420e-03 -4.4567669e-03 -1.0677723e-03 ...
 (100-dimensional vector, remaining values abridged) ... -1.6597145e-03 5.5715367e-03]
[nltk_data] Downloading package punkt to /root/nltk_data...
[nltk_data] Package punkt is already up-to-date!
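
Beyond looking up a single vector, the trained model's KeyedVectors can rank nearest neighbours; a short sketch (with a one-sentence corpus the similarities are essentially random, so this is only illustrative):

print(loaded_model.wv.most_similar('engineering', topn=3))
print(loaded_model.wv.similarity('engineering', 'colleges'))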

RESULT:

The Word2Vec model implementation program was executed successfully and the output was
verified.

EXP NO:6
Use a Transformer for Implementing Classification
Date:

AIM:

To create a Python program that uses a transformer for implementing classification.

SOFTWARE AND HARDWARE SPECIFICATIONS:

SOFTWARE SPECIFICATIONS:

• Anaconda Navigator
• Jupyter Notebook
• Google Colab

HARDWARE SPECIFICATIONS:

• Windows 10/11
• RAM - 16 GB
• Hard-disk - 1 TB
• Processor - Intel i5/i7

ALGORITHM:

• Install the required libraries: torch, transformers, scikit-learn, and tqdm.


• Import the necessary modules from the installed libraries.
• Define sample data for text classification (texts and corresponding labels).

• Split the data into training and testing sets using train_test_split from
sklearn.model_selection.
• Load the pre-trained BERT model and tokenizer using BertTokenizer.from_pretrained
and BertForSequenceClassification.from_pretrained from transformers.
• Tokenize and encode the training and testing data using the BERT tokenizer.
• Define the optimizer, loss function, and set up the training loop using PyTorch.

PROGRAM:

!pip install torch
!pip install transformers
!pip install scikit-learn
!pip install tqdm

import torch

from transformers import BertTokenizer, BertForSequenceClassification


from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
from torch.utils.data import DataLoader, TensorDataset
from tqdm import tqdm
# Sample data for text classification
texts = ["This is a positive example.", "This is a negative example.", "Another positive one.",
"Negative text here."]
labels = [1, 0, 1, 0]
# Split data into training and testing sets
train_texts, test_texts, train_labels, test_labels = train_test_split(
    texts, labels, test_size=0.2, random_state=42)

# Load pre-trained BERT model and tokenizer
tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
model = BertForSequenceClassification.from_pretrained('bert-base-uncased', num_labels=2)
# Tokenize and encode the training data

train_encodings = tokenizer(train_texts, truncation=True, padding=True, return_tensors='pt')


train_labels = torch.tensor(train_labels)
# Tokenize and encode the testing data
test_encodings = tokenizer(test_texts, truncation=True, padding=True, return_tensors='pt')
test_labels = torch.tensor(test_labels)
# Create DataLoader for training and testing data
train_dataset = TensorDataset(train_encodings['input_ids'], train_encodings['attention_mask'],
train_labels)
test_dataset = TensorDataset(test_encodings['input_ids'], test_encodings['attention_mask'],
test_labels)
train_dataloader = DataLoader(train_dataset, batch_size=2, shuffle=True)
test_dataloader = DataLoader(test_dataset, batch_size=2, shuffle=False)
# Define optimizer and loss function
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)
criterion = torch.nn.CrossEntropyLoss()
# Training loop
device = torch.device('cuda') if torch.cuda.is_available() else torch.device('cpu')
model.to(device)
for epoch in range(3):
    model.train()
    for batch in tqdm(train_dataloader, desc=f"Epoch {epoch + 1}"):
        input_ids, attention_mask, labels = batch
        input_ids, attention_mask, labels = (
            input_ids.to(device), attention_mask.to(device), labels.to(device))
        optimizer.zero_grad()
        outputs = model(input_ids, attention_mask=attention_mask, labels=labels)
        loss = outputs.loss
        loss.backward()
        optimizer.step()

# Evaluation
model.eval()
predictions = []
true_labels = []
with torch.no_grad():
    for batch in tqdm(test_dataloader, desc="Evaluating"):
        input_ids, attention_mask, labels = batch
        input_ids, attention_mask, labels = (
            input_ids.to(device), attention_mask.to(device), labels.to(device))
        outputs = model(input_ids, attention_mask=attention_mask)
        logits = outputs.logits
        predicted_labels = torch.argmax(logits, dim=1).cpu().numpy()
        predictions.extend(predicted_labels)
        true_labels.extend(labels.cpu().numpy())

# Calculate accuracy
accuracy = accuracy_score(true_labels, predictions)
print(f"Accuracy: {accuracy * 100:.2f}%")

OUTPUT:

Requirement already satisfied: torch in /usr/local/lib/python3.10/dist-packages (2.2.1+cu121)


Requirement already satisfied: filelock in /usr/local/lib/python3.10/dist-packages (from torch)
(3.13.3)
Requirement already satisfied: typing-extensions>=4.8.0 in /usr/local/lib/python3.10/dist-packages
(from torch) (4.10.0)
Requirement already satisfied: sympy in /usr/local/lib/python3.10/dist-packages (from torch) (1.12)
Requirement already satisfied: networkx in /usr/local/lib/python3.10/dist-packages (from torch)
(3.2.1)
Requirement already satisfied: jinja2 in /usr/local/lib/python3.10/dist-packages (from torch) (3.1.3)
Requirement already satisfied: fsspec in /usr/local/lib/python3.10/dist-packages (from torch)
(2023.6.0)
Collecting nvidia-cuda-nvrtc-cu12==12.1.105 (from torch)
Downloading nvidia_cuda_nvrtc_cu12-12.1.105-py3-none-manylinux1_x86_64.whl (23.7 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 23.7/23.7 MB 28.0 MB/s eta 0:00:00
Collecting nvidia-cuda-runtime-cu12==12.1.105 (from torch)
Downloading nvidia_cuda_runtime_cu12-12.1.105-py3-none-manylinux1_x86_64.whl (823 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 823.6/823.6 kB 29.2 MB/s eta
0:00:00
Collecting nvidia-cuda-cupti-cu12==12.1.105 (from torch)
Downloading nvidia_cuda_cupti_cu12-12.1.105-py3-none-manylinux1_x86_64.whl (14.1 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 14.1/14.1 MB 33.2 MB/s eta 0:00:00
Collecting nvidia-cudnn-cu12==8.9.2.26 (from torch)
Downloading nvidia_cudnn_cu12-8.9.2.26-py3-none-manylinux1_x86_64.whl (731.7 MB)

━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 731.7/731.7 MB 987.8 kB/s eta
0:00:00
Collecting nvidia-cublas-cu12==12.1.3.1 (from torch)
Downloading nvidia_cublas_cu12-12.1.3.1-py3-none-manylinux1_x86_64.whl (410.6 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 410.6/410.6 MB 1.2 MB/s eta
0:00:00
Collecting nvidia-cufft-cu12==11.0.2.54 (from torch)
Downloading nvidia_cufft_cu12-11.0.2.54-py3-none-manylinux1_x86_64.whl (121.6 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 121.6/121.6 MB 8.3 MB/s eta
0:00:00
Collecting nvidia-curand-cu12==10.3.2.106 (from torch)
Downloading nvidia_curand_cu12-10.3.2.106-py3-none-manylinux1_x86_64.whl (56.5 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 56.5/56.5 MB 11.4 MB/s eta 0:00:00
Collecting nvidia-cusolver-cu12==11.4.5.107 (from torch)
Downloading nvidia_cusolver_cu12-11.4.5.107-py3-none-manylinux1_x86_64.whl (124.2 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 124.2/124.2 MB 6.2 MB/s eta
0:00:00
Collecting nvidia-cusparse-cu12==12.1.0.106 (from torch)
Downloading nvidia_cusparse_cu12-12.1.0.106-py3-none-manylinux1_x86_64.whl (196.0 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 196.0/196.0 MB 2.4 MB/s eta
0:00:00
Collecting nvidia-nccl-cu12==2.19.3 (from torch)
Downloading nvidia_nccl_cu12-2.19.3-py3-none-manylinux1_x86_64.whl (166.0 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 166.0/166.0 MB 2.3 MB/s eta
0:00:00
Collecting nvidia-nvtx-cu12==12.1.105 (from torch)
Downloading nvidia_nvtx_cu12-12.1.105-py3-none-manylinux1_x86_64.whl (99 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 99.1/99.1 kB 11.1 MB/s eta 0:00:00
Requirement already satisfied: triton==2.2.0 in /usr/local/lib/python3.10/dist-packages (from torch)
(2.2.0)

Collecting nvidia-nvjitlink-cu12 (from nvidia-cusolver-cu12==11.4.5.107->torch)
Downloading nvidia_nvjitlink_cu12-12.4.127-py3-none-manylinux2014_x86_64.whl (21.1 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 21.1/21.1 MB 45.3 MB/s eta 0:00:00
Requirement already satisfied: MarkupSafe>=2.0 in /usr/local/lib/python3.10/dist-packages (from
jinja2->torch) (2.1.5)
Requirement already satisfied: mpmath>=0.19 in /usr/local/lib/python3.10/dist-packages (from
sympy->torch) (1.3.0)
Installing collected packages: nvidia-nvtx-cu12, nvidia-nvjitlink-cu12, nvidia-nccl-cu12, nvidia-
curand-cu12, nvidia-cufft-cu12, nvidia-cuda-runtime-cu12, nvidia-cuda-nvrtc-cu12, nvidia-cuda-
cupti-cu12, nvidia-cublas-cu12, nvidia-cusparse-cu12, nvidia-cudnn-cu12, nvidia-cusolver-cu12
Successfully installed nvidia-cublas-cu12-12.1.3.1 nvidia-cuda-cupti-cu12-12.1.105 nvidia-cuda-
nvrtc-cu12-12.1.105 nvidia-cuda-runtime-cu12-12.1.105 nvidia-cudnn-cu12-8.9.2.26 nvidia-cufft-
cu12-11.0.2.54 nvidia-curand-cu12-10.3.2.106 nvidia-cusolver-cu12-11.4.5.107 nvidia-cusparse-
cu12-12.1.0.106 nvidia-nccl-cu12-2.19.3 nvidia-nvjitlink-cu12-12.4.127 nvidia-nvtx-cu12-12.1.105
Requirement already satisfied: transformers in /usr/local/lib/python3.10/dist-packages (4.38.2)
Requirement already satisfied: filelock in /usr/local/lib/python3.10/dist-packages (from
transformers) (3.13.3)
Requirement already satisfied: huggingface-hub<1.0,>=0.19.3 in /usr/local/lib/python3.10/dist-
packages (from transformers) (0.20.3)
Requirement already satisfied: numpy>=1.17 in /usr/local/lib/python3.10/dist-packages (from
transformers) (1.25.2)
Requirement already satisfied: packaging>=20.0 in /usr/local/lib/python3.10/dist-packages (from
transformers) (24.0)
Requirement already satisfied: pyyaml>=5.1 in /usr/local/lib/python3.10/dist-packages (from
transformers) (6.0.1)
Requirement already satisfied: regex!=2019.12.17 in /usr/local/lib/python3.10/dist-packages (from
transformers) (2023.12.25)
Requirement already satisfied: requests in /usr/local/lib/python3.10/dist-packages (from
transformers) (2.31.0)
Requirement already satisfied: tokenizers<0.19,>=0.14 in /usr/local/lib/python3.10/dist-packages

(from transformers) (0.15.2)
Requirement already satisfied: safetensors>=0.4.1 in /usr/local/lib/python3.10/dist-packages (from
transformers) (0.4.2)
Requirement already satisfied: tqdm>=4.27 in /usr/local/lib/python3.10/dist-packages (from
transformers) (4.66.2)
Requirement already satisfied: fsspec>=2023.5.0 in /usr/local/lib/python3.10/dist-packages (from
huggingface-hub<1.0,>=0.19.3->transformers) (2023.6.0)
Requirement already satisfied: typing-extensions>=3.7.4.3 in /usr/local/lib/python3.10/dist-packages
(from huggingface-hub<1.0,>=0.19.3->transformers) (4.10.0)
Requirement already satisfied: charset-normalizer<4,>=2 in /usr/local/lib/python3.10/dist-packages
(from requests->transformers) (3.3.2)
Requirement already satisfied: idna<4,>=2.5 in /usr/local/lib/python3.10/dist-packages (from
requests->transformers) (3.6)
Requirement already satisfied: urllib3<3,>=1.21.1 in /usr/local/lib/python3.10/dist-packages (from
requests->transformers) (2.0.7)
Requirement already satisfied: certifi>=2017.4.17 in /usr/local/lib/python3.10/dist-packages (from
requests->transformers) (2024.2.2)
Collecting sklearn
Downloading sklearn-0.0.post12.tar.gz (2.6 kB)
error: subprocess-exited-with-error

× python setup.py egg_info did not run successfully.


│ exit code: 1
╰─> See above for output.

note: This error originates from a subprocess, and is likely not a problem with pip.
Preparing metadata (setup.py) ... error
error: metadata-generation-failed

× Encountered error while generating package metadata.

╰─> See above for output.

note: This is an issue with the package mentioned above, not pip.
hint: See above for details.
Requirement already satisfied: tqdm in /usr/local/lib/python3.10/dist-packages (4.66.2)
/usr/local/lib/python3.10/dist-packages/huggingface_hub/utils/_token.py:88: UserWarning:
The secret `HF_TOKEN` does not exist in your Colab secrets.
To authenticate with the Hugging Face Hub, create a token in your settings tab
(https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your
session.
You will be able to reuse this secret in all of your notebooks.
Please note that authentication is recommended but still optional to access public models or datasets.
warnings.warn(
tokenizer_config.json: 100%
48.0/48.0 [00:00<00:00, 1.74kB/s]
vocab.txt: 100%
232k/232k [00:00<00:00, 6.39MB/s]
tokenizer.json: 100%
466k/466k [00:00<00:00, 19.9MB/s]
config.json: 100%
570/570 [00:00<00:00, 21.2kB/s]
model.safetensors: 100%
440M/440M [00:01<00:00, 240MB/s]
Some weights of BertForSequenceClassification were not initialized from the model checkpoint at
bert-base-uncased and are newly initialized: ['classifier.bias', 'classifier.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions
and inference.
Epoch 1: 100%|██████████| 2/2 [00:04<00:00, 2.31s/it]
Epoch 2: 100%|██████████| 2/2 [00:02<00:00, 1.38s/it]
Epoch 3: 100%|██████████| 2/2 [00:02<00:00, 1.40s/it]

Evaluating: 100%|██████████| 1/1 [00:00<00:00, 7.83it/s]
Accuracy: 0.00%
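
With only four sentences and test_size=0.2 the test set contains a single example, so the reported accuracy can only be 0% or 100%, and three epochs over three training sentences are far too little for the newly initialized classification head to learn anything reliable. A minimal sketch, assuming we just want to inspect the fine-tuned model's prediction for one new sentence:

model.eval()
sample = tokenizer("Another positive example.", return_tensors='pt').to(device)
with torch.no_grad():
    logits = model(**sample).logits
print(torch.argmax(logits, dim=1).item())  # predicted class index (0 or 1)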

RESULT:

The program using a transformer for implementing classification was executed successfully and
the output was verified.

EXP NO:7
Design a Chatbot with a Simple Dialog System
Date:

AIM:

To create a Python program to design a chatbot with a simple dialog system.

SOFTWARE AND HARDWARE SPECIFICATIONS:

SOFTWARE SPECIFICATIONS:

• Anaconda Navigator
• Jupyter Notebook
• Google Colab

HARDWARE SPECIFICATIONS:

• Windows 10/11
• RAM - 16 GB
• Hard-disk - 1 TB
• Processor - Intel i5/i7

ALGORITHM:

• Define a class named SimpleChatbot.


• Initialize the chatbot with lists of greetings, goodbyes, and a dictionary of responses.

• Define a method get_response to generate responses based on user input.
• Define a main function to interact with the chatbot.
• Create an instance of the SimpleChatbot class.
• Start a conversation loop where the user can input questions or statements.
• Provide appropriate responses based on user input, including handling greetings,
goodbyes, and predefined queries.

PROGRAM:

import random

class SimpleChatbot:
    def __init__(self):
        self.greetings = ['hello', 'hi', 'hey', 'greetings', 'howdy']
        self.goodbyes = ['bye', 'goodbye', 'see you', 'farewell']
        self.responses = {
            'tell me a joke': 'Why did the chicken cross the road? To get to the other side!',
            'how are you': 'I am just a computer program, but thanks for asking!',
            'default': 'I\'m sorry, I don\'t understand that. Can you ask me something else?'
        }

    def get_response(self, user_input):
        user_input = user_input.lower()
        if any(greeting in user_input for greeting in self.greetings):
            return 'Hello! How can I help you today?'
        elif any(goodbye in user_input for goodbye in self.goodbyes):
            return 'Goodbye! Have a great day.'
        else:
            for key in self.responses:
                if key in user_input:
                    return self.responses[key]
            return self.responses['default']

def main():
    chatbot = SimpleChatbot()
    print("Simple Chatbot: Hello! Ask me anything or say goodbye to end the conversation.")
    while True:
        user_input = input("You: ")
        if user_input.lower() in ['bye', 'goodbye', 'exit']:
            print("Simple Chatbot: Goodbye! Have a great day.")
            break
        response = chatbot.get_response(user_input)
        print("Simple Chatbot:", response)

if __name__ == "__main__":
    main()

OUTPUT:

Simple Chatbot: Hello! Ask me anything or say goodbye to end the conversation.
You: hi
Simple Chatbot: Hello! How can I help you today?
You: how are you!?
Simple Chatbot: I am just a computer program, but thanks for asking!
You: bye
Simple Chatbot: Goodbye! Have a great day.

RESULT:

The program for designing a chatbot with a simple dialog system was executed successfully and
the output was verified.

EXP NO:8
Convert Text to Speech and Find accuracy
Date:

AIM:

To create a Python program for converting text to speech and finding the accuracy.

SOFTWARE AND HARDWARE SPECIFICATIONS:

SOFTWARE SPECIFICATIONS:

• Anaconda Navigator
• Jupyter Notebook
• Google Colab

HARDWARE SPECIFICATIONS:

• Windows 10/11
• RAM - 16 GB
• Hard-disk - 1 TB
• Processor - Intel i5/i7

ALGORITHM:

• Install the required libraries: SpeechRecognition, gTTS, and pyaudio.


• Import the necessary modules (speech_recognition, gTTS, os) for speech recognition
and text- to-speech conversion.
• Define a function text_to_speech to convert text to speech using gTTS.

• Define a function speech_to_text to recognize speech using the microphone and Google's
speech recognition service.
• Define a function evaluate_accuracy to compare the original text with the recognized
text and calculate accuracy.
• Execute the text-to-speech conversion for the original text.
• Use the microphone to capture speech input, recognize it, and evaluate the
accuracy of the recognition.

PROGRAM:

!pip install SpeechRecognition


!pip install gTTS
!pip install pyaudio

import speech_recognition as sr
from gtts import gTTS
import os
def text_to_speech(text, language='en'):
    tts = gTTS(text=text, lang=language, slow=False)
    tts.save("output.mp3")
    os.system("start output.mp3")  # This opens the file using the default media player

def speech_to_text():
    recognizer = sr.Recognizer()
    with sr.Microphone() as source:
        print("Say something:")
        audio = recognizer.listen(source)
    try:
        text = recognizer.recognize_google(audio)
        return text
    except sr.UnknownValueError:
        print("Sorry, could not understand audio.")
        return None
    except sr.RequestError as e:
        print(f"Could not request results from Google Speech Recognition service; {e}")
        return None

def evaluate_accuracy(original_text, recognized_text):
    if recognized_text:
        print(f"Original Text: {original_text}")
        print(f"Recognized Text: {recognized_text}")
        original_words = set(original_text.lower().split())
        recognized_words = set(recognized_text.lower().split())
        common_words = original_words.intersection(recognized_words)
        accuracy = len(common_words) / len(original_words)
        print(f"Accuracy: {accuracy * 100:.2f}%")
    else:
        print("No text recognized. Accuracy cannot be calculated.")

if __name__ == "__main__":
    original_text = "Hello, how are you today?"

    # Convert text to speech
    text_to_speech(original_text)

    # Speech to text
    recognized_text = speech_to_text()

    # Evaluate accuracy
    evaluate_accuracy(original_text, recognized_text)

OUTPUT:

Requirement already satisfied: SpeechRecognition in c:\users\user\appdata\local\programs\python\


python311\lib\site-packages (3.10.0)
Requirement already satisfied: requests>=2.26.0 in c:\users\user\appdata\local\programs\python\
python311\lib\site-packages (from SpeechRecognition) (2.31.0)
Requirement already satisfied: charset-normalizer<4,>=2 in c:\users\user\appdata\local\programs\
python\python311\lib\site-packages (from requests>=2.26.0->SpeechRecognition) (3.3.2)
Requirement already satisfied: idna<4,>=2.5 in c:\users\user\appdata\local\programs\python\
python311\lib\site-packages (from requests>=2.26.0->SpeechRecognition) (3.4)
Requirement already satisfied: urllib3<3,>=1.21.1 in c:\users\user\appdata\local\programs\python\
python311\lib\site-packages (from requests>=2.26.0->SpeechRecognition) (2.1.0)
Requirement already satisfied: certifi>=2017.4.17 in c:\users\user\appdata\local\programs\python\
python311\lib\site-packages (from requests>=2.26.0->SpeechRecognition) (2023.7.22)

[notice] A new release of pip is available: 23.1.2 -> 23.3.2


[notice] To update, run: python.exe -m pip install --upgrade pip
Collecting gTTS
Downloading gTTS-2.5.0-py3-none-any.whl (29 kB)
Requirement already satisfied: requests<3,>=2.27 in c:\users\user\appdata\local\programs\python\
python311\lib\site-packages (from gTTS) (2.31.0)
Collecting click<8.2,>=7.1 (from gTTS)

Downloading click-8.1.7-py3-none-any.whl (97 kB)
0.0/97.9 kB ? eta -:--:--
------------------------- 61.4/97.9 kB 1.1 MB/s eta 0:00:01
- ----------------------------------- - 97.9/97.9 kB 933.0 kB/s eta 0:00:00
Requirement already satisfied: colorama in c:\users\user\appdata\local\programs\python\
python311\lib\site-packages (from click<8.2,>=7.1->gTTS) (0.4.6)
Requirement already satisfied: charset-normalizer<4,>=2 in c:\users\user\appdata\local\programs\
python\python311\lib\site-packages (from requests<3,>=2.27->gTTS) (3.3.2)
Requirement already satisfied: idna<4,>=2.5 in c:\users\user\appdata\local\programs\python\
python311\lib\site-packages (from requests<3,>=2.27->gTTS) (3.4)
Requirement already satisfied: urllib3<3,>=1.21.1 in c:\users\user\appdata\local\programs\python\
python311\lib\site-packages (from requests<3,>=2.27->gTTS) (2.1.0)
Requirement already satisfied: certifi>=2017.4.17 in c:\users\user\appdata\local\programs\python\
python311\lib\site-packages (from requests<3,>=2.27->gTTS) (2023.7.22)
Installing collected packages: click, gTTS
Successfully installed click-8.1.7 gTTS-2.5.0

[notice] A new release of pip is available: 23.1.2 -> 23.3.2


[notice] To update, run: python.exe -m pip install --upgrade pip

[notice] A new release of pip is available: 23.1.2 -> 23.3.2


[notice] To update, run: python.exe -m pip install --upgrade pip
Collecting pyaudio
Downloading PyAudio-0.2.14-cp311-cp311-win_amd64.whl (164 kB)
0.0/164.1 kB ? eta -:--:--
------------------------------------- 163.8/164.1 kB 5.0 MB/s eta 0:00:01
-------------------------------------- 164.1/164.1 kB 3.3 MB/s eta 0:00:00
Installing collected packages: pyaudio
Successfully installed pyaudio-0.2.14
Say something:

Original Text: Hello, how are you today?
Recognized Text: hello how are you today
Accuracy: 60.00%
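
The accuracy is 60% only because the comparison keeps punctuation: 'hello,' and 'today?' in the original do not match 'hello' and 'today' in the recognized text. A small sketch, assuming punctuation should be stripped before comparing:

import string

def normalize(text):
    # Lowercase and remove punctuation before splitting into a set of words
    return set(text.lower().translate(str.maketrans('', '', string.punctuation)).split())

orig = normalize("Hello, how are you today?")
recog = normalize("hello how are you today")
print(len(orig & recog) / len(orig))  # 1.0 once punctuation is ignored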

RESULT:

The program for converting text to speech and finding the accuracy was executed successfully
and the output was verified.

EXP NO:9
Design a speech recognition system and find the error rate
Date:

AIM:

To create a Python program to design a speech recognition system and find the error rate.

SOFTWARE AND HARDWARE SPECIFICATIONS:

SOFTWARE SPECIFICATIONS:

• Anaconda Navigator
• Jupyter Notebook
• Google Colab

HARDWARE SPECIFICATIONS:

• Windows 10/11
• RAM - 16 GB
• Hard-disk - 1 TB
• Processor - Intel i5/i7

ALGORITHM:

• Install the jiwer library using pip.


• Import the required modules: speech_recognition for audio processing and jiwer for
calculating Word Error Rate (WER).
• Define a function recognize_speech to recognize speech from an audio file using Google's
speech recognition service.
• Define a function calculate_word_error_rate to calculate the WER between a reference
text and recognized text.
• Simulate a reference text and provide the path to the audio file containing the recognized
speech.
• Use the recognize_speech function to get the recognized text from the audio file.
• Calculate the Word Error Rate (WER) between the reference text and the recognized
text using the calculate_word_error_rate function.

PROGRAM:

!pip install jiwer

import speech_recognition as sr
import jiwer

def recognize_speech(audio_file, language='en-US'):
    recognizer = sr.Recognizer()
    with sr.AudioFile(audio_file) as source:
        audio = recognizer.record(source)
    try:
        recognized_text = recognizer.recognize_google(audio, language=language)
        return recognized_text
    except sr.UnknownValueError:
        print("Speech recognition could not understand the audio.")
        return None
    except sr.RequestError as e:
        print(f"Could not request results from Google Speech Recognition service; {e}")
        return None

def calculate_word_error_rate(reference_text, recognized_text):
    wer = jiwer.wer(reference_text, recognized_text)
    return wer

if __name__ == "__main__":
    # Simulating a reference text
    reference_text = "hello how are you"
    # Simulating a recognized text (replace 'audio_file.wav' with the path to your actual audio file)
    audio_file_path = 'audio_file.wav'
    recognized_text = recognize_speech(audio_file_path)
    if recognized_text:
        print(f"Reference Text: {reference_text}")
        print(f"Recognized Text: {recognized_text}")
        # Calculate Word Error Rate (WER)
        wer = calculate_word_error_rate(reference_text, recognized_text)
        print(f"Word Error Rate (WER): {wer * 100:.2f}%")
    else:
        print("No text recognized.")

OUTPUT:

Reference Text: hello how are you
Recognized Text: hello how are you
Word Error Rate (WER): 0.00%
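
jiwer.wer computes (substitutions + deletions + insertions) divided by the number of words in the reference, so identical strings give 0.00%. A quick worked sketch, assuming the recognizer had dropped one word:

import jiwer
print(jiwer.wer("hello how are you", "hello are you"))  # 1 deletion / 4 words = 0.25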

RESULT:

The program to design a speech recognition system and find the error rate was executed
successfully and the output was verified.
