NLP (TB)—Unit-1
Difference between Natural Language and Computer Language
Natural languages are inherently ambiguous and uncertain, whereas computer languages are designed to be precise and unambiguous. This ambiguity and uncertainty are what make NLP difficult.
Ambiguity
There are the following three types of ambiguity:
○ Lexical Ambiguity
Lexical Ambiguity exists when a single word has two or more possible meanings.
Example:
Manya is looking for a match.
In the above example, the word match could mean either that Manya is looking for a partner or that Manya is looking for a match in the sporting sense (a cricket match or some other game).
○ Syntactic Ambiguity
Syntactic Ambiguity exists when a sentence can be parsed in two or more ways, so the sentence as a whole has more than one possible meaning.
Example:
I saw the girl with the binoculars.
In the above example, did I have the binoculars? Or did the girl have the binoculars?
○ Referential Ambiguity
Referential Ambiguity exists when a pronoun could refer to more than one possible antecedent.
Example: Kiran went to Sunita and she said "I am hungry."
In the above sentence, we do not know who is hungry: Kiran or Sunita.
NLP stands for Natural Language Processing, a field at the intersection of Computer Science, Linguistics (human language), and Artificial Intelligence. It is the technology used by machines to understand, analyse, manipulate, and interpret human languages.
Components of NLP
1. Natural Language Understanding (NLU)
Natural Language Understanding (NLU) helps the machine to understand and analyse human language by
extracting the metadata from content such as concepts, entities, keywords, emotion, relations, and semantic
roles.
2. Natural Language Generation (NLG)
Natural Language Generation (NLG) acts as a translator that converts the computerized data into natural
language representation. It mainly involves Text planning, Sentence planning, and Text Realization.
Difference between NLU and NLG
In short, NLU is the reading side: it maps natural language input into a structured, machine-understandable representation. NLG is the writing side: it maps structured, computerized data back into natural language.
Phases of NLP
There are five phases of NLP:
1. Lexical Analysis
The first phase of NLP is Lexical Analysis. This phase scans the source text as a stream of characters and converts it into meaningful lexemes. It divides the whole text into paragraphs, sentences, and words.
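A minimal tokenization sketch using NLTK (the library choice is an assumption; any tokenizer would do the same job):

    import nltk
    from nltk.tokenize import sent_tokenize, word_tokenize

    nltk.download("punkt")  # tokenizer models required by NLTK

    text = "NLP is fun. It divides raw text into sentences and words!"
    sentences = sent_tokenize(text)                 # split the text into sentences
    words = [word_tokenize(s) for s in sentences]   # split each sentence into word tokens
    print(sentences)
    print(words)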
2. Syntactic Analysis (Parsing)
Syntactic Analysis is used to check grammar and word arrangement, and it shows the relationships among the words.
Example: Agra goes to the Poonam
In the real world, "Agra goes to the Poonam" does not make any sense, so this sentence is rejected by the syntactic analyzer.
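A minimal parsing sketch using spaCy (an assumed library choice; the small English model must be downloaded first):

    import spacy  # pip install spacy; python -m spacy download en_core_web_sm

    nlp = spacy.load("en_core_web_sm")
    doc = nlp("I saw the girl with the binoculars")
    for token in doc:
        # each word, its grammatical relation, and the word it attaches to
        print(token.text, token.dep_, token.head.text)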
3. Semantic Analysis
Semantic analysis is concerned with the meaning representation. It mainly focuses on the literal meaning
of words, phrases, and sentences.
4. Discourse Integration
Discourse Integration means that the meaning of any sentence depends upon the sentences that precede it and may also invoke the meaning of the sentences that follow it.
5. Pragmatic Analysis
Pragmatic Analysis is the fifth and last phase of NLP. It helps you discover the intended effect of an utterance by applying a set of rules that characterize cooperative dialogues.
For Example: "Open the door" is interpreted as a request instead of an order.
Applications of NLP:
Natural Language Processing (NLP) has a wide range of applications across various fields.
● Sentiment Analysis: NLP can be used to analyze the sentiment of text data, which is valuable for
understanding public opinion, customer feedback, and social media sentiment.
● Machine Translation: NLP powers machine translation systems that translate text from one
language to another, facilitating communication across language barriers.
● Chatbots and Virtual Assistants: NLP enables the development of chatbots and virtual assistants
that can understand and respond to user queries, providing customer support, information retrieval,
and task automation.
● Information Extraction: NLP techniques can extract structured information from unstructured
text data, such as named entity recognition, relation extraction, and event extraction.
● Text Summarization: NLP algorithms can summarize large volumes of text into concise
summaries, which is useful for quickly understanding the key points of lengthy documents or
articles.
● Question Answering Systems: NLP powers question answering systems that can understand
natural language questions and provide relevant answers by extracting information from structured
or unstructured data sources.
● Text Classification: NLP techniques are used for text classification tasks such as spam detection,
sentiment analysis, topic classification, and document categorization.
● Named Entity Recognition (NER): NLP models can identify and classify named entities
mentioned in text, such as names of people, organizations, locations, dates, and numerical
expressions.
● Language Generation: NLP can generate human-like text, including generating creative content,
writing articles, composing poetry, and generating code snippets.
● Information Retrieval: NLP is used in search engines to understand user queries and retrieve
relevant documents or web pages from large collections of text data.
● Speech Recognition and Speech-to-Text: NLP techniques are employed in speech recognition
systems to convert spoken language into text, enabling applications such as virtual assistants, voice-
controlled devices, and dictation software.
● Text Mining: NLP facilitates the mining of valuable insights and patterns from large volumes of
text data, including social media data, customer reviews, scientific literature, and financial reports.
These applications demonstrate the versatility and importance of NLP in various domains, ranging from
customer service and healthcare to finance and entertainment.
Approaches to NLP
DL Approaches
Artificial Neural Networks (ANNs) are computational networks that are able to solve complex, nonlinear mathematical problems.
The field of ANN has been inspired by the ambition to model biological neural systems.
Neural networks are modelled as collections of layers of neurons that are connected in an acyclic graph.
The output of such an ANN could be a predicted numerical value, but in many cases it is taken to represent class scores (e.g., in text classification).
In most cases of NLP, we are interested in multinomial classification, such as part-of-speech tagging. Hence, the
output layer yields a probability distribution across the output nodes.
DNNs stack up several hidden layers, with each layer acting as the input to the next layer.
DL allows a computer to build complex concepts out of simpler concepts. Another perspective on deep learning
is that the depth allows the computer to learn a multi-step program.
Each layer can be interpreted as the state of the computer’s memory after executing a set of instructions.
Overview of different DL techniques.
• Multilayer Perceptron (MLP) is a feed-forward neural network with multiple (one or more) hidden layers
between the input layer and output layer.
• Autoencoder (AE) is an unsupervised model attempting to reconstruct its input data in the output layer.
• Convolutional Neural Network (CNN) is a special kind of feed-forward neural network with convolution
layers and pooling operations.
• Recurrent Neural Networks (RNNs) use loops and memory to remember former computations.
• Deep Reinforcement Learning (DRL) operates on a trial-and-error paradigm. The whole framework mainly consists of the following components: agents, environments, states, actions, and rewards.
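As a small illustration of the first technique above, here is a sketch of an MLP whose softmax output layer yields a probability distribution across classes (built with Keras; the feature size and number of classes are hypothetical):

    import tensorflow as tf

    # Hypothetical setup: 100 input features per document, 3 output classes
    model = tf.keras.Sequential([
        tf.keras.layers.Input(shape=(100,)),
        tf.keras.layers.Dense(64, activation="relu"),    # hidden layer
        tf.keras.layers.Dense(3, activation="softmax"),  # probability distribution over the classes
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    model.summary()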
RNN (For Further Reading in later units)
● A sentence in any language flows from one direction to another.
● Thus, a model that can progressively read an input text from one end to another can be very useful for
language understanding.
● Recurrent neural networks (RNNs) are specially designed to keep such sequential processing and learning
in mind.
● RNNs have neural units that are capable of remembering what they have processed so far.
● This memory is temporal, and the information is stored and updated with every time step as the RNN reads
the next word in the input.
LSTM
● RNNs suffer from the problem of forgetful memory: they cannot remember longer contexts and therefore do not perform well when the input text is long, which is typically the case with text inputs.
● Long short-term memory networks (LSTMs), a type of RNN, were invented to mitigate this shortcoming of RNNs.
● LSTMs circumvent this problem by letting go of the irrelevant context and only remembering the part of
the context that is needed to solve the task at hand.
● Gated recurrent units (GRUs) are another variant of RNNs that are used mostly in language generation.
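A minimal LSTM sentiment-classifier sketch in Keras (the vocabulary size, dimensions, and the binary label are assumptions for illustration):

    import tensorflow as tf

    # Hypothetical setup: vocabulary of 10,000 words, binary sentiment label
    model = tf.keras.Sequential([
        tf.keras.layers.Embedding(input_dim=10000, output_dim=64),  # word IDs -> word vectors
        tf.keras.layers.LSTM(64),                                   # keeps only the relevant context
        tf.keras.layers.Dense(1, activation="sigmoid"),
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])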
CNN
● A word in a sentence can be replaced with its corresponding word vector, and all vectors are of the same
size (d).
● Thus, they can be stacked one over another to form a matrix or 2D array of dimension n × d, where n is the number of words in the sentence and d is the size of the word vectors.
● This matrix can now be treated like an image and can be modeled by a CNN.
● The main advantage CNNs have is their ability to look at a group of words together using a context window.
● For example, suppose we are doing sentiment classification and we get a sentence like, "I like this movie very much!" To make sense of this sentence, it is better to look at individual words and at different sets of contiguous words.
● A CNN uses a collection of convolution and pooling layers to achieve this condensed representation of the text, which is then fed as input to a fully connected layer to learn an NLP task such as text classification.
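A minimal CNN-for-text sketch in Keras (the vocabulary size, vector size d, and filter settings are hypothetical):

    import tensorflow as tf

    # Hypothetical setup: vocabulary of 10,000 words, word vectors of size d = 100
    model = tf.keras.Sequential([
        tf.keras.layers.Embedding(input_dim=10000, output_dim=100),             # builds the n x d matrix
        tf.keras.layers.Conv1D(filters=128, kernel_size=3, activation="relu"),  # context window of 3 words
        tf.keras.layers.GlobalMaxPooling1D(),                                   # condensed representation
        tf.keras.layers.Dense(1, activation="sigmoid"),                         # fully connected classifier
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])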
Transformers
● Transformer models are the most recent advancement, having emerged over the past few years.
● They model the textual context but not in a sequential manner.
● Given a word in the input, it prefers to look at all the words around it (known as self-attention) and represent each word with respect to its context.
● For example, the word “bank” can have different meanings depending on the context.
● With transformers, a very large transformer model is trained in an unsupervised manner (known as pre-training) to predict a part of a sentence given the rest of the content.
● These models are trained on more than 40 GB of textual data, scraped from the whole internet.
● An example of a large transformer is BERT (Bidirectional Encoder Representations from Transformers),
which is pre-trained on massive data and open sourced by Google.
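A minimal sketch of using a pre-trained BERT through the Hugging Face transformers library; the fill-mask task mirrors the pre-training objective of predicting a masked part of a sentence (the example sentence is made up):

    from transformers import pipeline  # pip install transformers

    fill_mask = pipeline("fill-mask", model="bert-base-uncased")
    for prediction in fill_mask("I deposited the cheque at the [MASK]."):
        print(prediction["token_str"], round(prediction["score"], 3))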
NLP Pipeline
The key stages in the pipeline are as follows:
1. Data acquisition
2. Text cleaning
3. Pre-processing
4. Feature engineering
5. Modeling
6. Evaluation
7. Deployment
8. Monitoring and model updating
1. Data acquisition
● Use a public dataset
We could see if there are any public datasets available that we can leverage.
● Scrape data
We could find a source of relevant data on the internet—for example, a consumer or discussion forum
where people have posted queries (sales or support). Scrape the data from there and get it labeled by
human annotators.
● Product intervention
The AI team should work with the product team to collect more and richer data by developing better
instrumentation in the product.
● Data augmentation
We can take a small dataset and use some tricks to create more data. These tricks are collectively called data augmentation, and they try to exploit language properties to create text that is syntactically similar to the source text data. Some common tricks are listed below (a small sketch of two of them appears after this list):
⮚ Synonym replacement
⮚ Back translation
Say we have a sentence, S1, in English. We use a machine-translation library like Google Translate to translate it into some other language, say, German. Let the corresponding sentence in German be S2. Now, we use the machine-translation library again to translate S2 back to English. Let the output sentence be S3. S3 will usually differ from S1 in wording but carry the same meaning, so it can be added to the dataset as a new example.
⮚ TF-IDF–based word replacement
⮚ Bigram flipping
Divide the sentence into bigrams. Take one bigram at random and flip it.
⮚ Replacing entities
Replace entities like person name, location, organization, etc., with other entities
in the same category.
⮚ Adding noise to data
In many NLP applications, the incoming data contains spelling mistakes (for example, tweets). In such cases, we can add a bit of noise to the data to train robust models. For example, randomly choose a word in a sentence and replace it with another word that is closer in spelling to the first word, simulating "fat finger" typos on mobile (QWERTY) keyboards.
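A minimal sketch of two of these tricks, bigram flipping and adding typo noise (the example sentence and the single-character typo scheme are illustrative assumptions):

    import random

    def bigram_flip(sentence):
        # Bigram flipping: swap one randomly chosen pair of adjacent words
        words = sentence.split()
        if len(words) < 2:
            return sentence
        i = random.randrange(len(words) - 1)
        words[i], words[i + 1] = words[i + 1], words[i]
        return " ".join(words)

    def add_typo_noise(sentence):
        # Adding noise: replace one character of a random word to simulate a "fat finger" typo
        words = sentence.split()
        i = random.randrange(len(words))
        w = words[i]
        if len(w) > 1:
            j = random.randrange(len(w))
            words[i] = w[:j] + random.choice("abcdefghijklmnopqrstuvwxyz") + w[j + 1:]
        return " ".join(words)

    print(bigram_flip("I like this movie very much"))
    print(add_typo_noise("I like this movie very much"))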
There are other advanced techniques and systems that can augment text data.
● Snorkel
● Easy Data Augmentation (EDA)
● Active learning
2. Text Extraction and Cleanup
Text extraction and cleanup refers to the process of extracting raw text from the input data by removing
all the other non-textual information, such as markup, metadata, etc., and converting the text to the
required encoding format.
Typical inputs from which raw text must be extracted include (a) a PDF invoice, (b) HTML text, and (c) text embedded in an image.
● HTML Parsing and Cleanup
⮚ Say we’re working on a project where we’re building a forum search engine for programming questions, using Stack Overflow as a source, and we decide to extract question and best-answer pairs from the website. We notice that questions and answers have special tags associated with them, and we can utilize this information while extracting text from the HTML page.
⮚ It’s more feasible to utilize existing libraries such as Beautiful Soup and Scrapy,
which provide a range of utilities to parse web pages.
Extracting a question and its best-answer pair from a Stack Overflow web page:
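A minimal sketch using Beautiful Soup (the question URL and the CSS class names "question" and "answercell" are assumptions and may differ on the live site):

    from urllib.request import urlopen
    from bs4 import BeautifulSoup  # pip install beautifulsoup4

    url = "https://stackoverflow.com/questions/415511/how-do-i-get-the-current-time-in-python"
    html = urlopen(url).read()
    soup = BeautifulSoup(html, "html.parser")

    question = soup.find("div", {"class": "question"})
    answer = soup.find("div", {"class": "answercell"})
    if question and answer:
        print("Question:\n", question.get_text().strip()[:300])
        print("Best answer:\n", answer.get_text().strip()[:300])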
⮚ Unicode Normalization
As we develop code for cleaning up HTML tags, we may also encounter various Unicode characters,
including symbols, emojis, and other graphic characters.
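A minimal normalization sketch using Python's built-in unicodedata module (the example string is made up):

    import unicodedata

    text = "I love 🍕! Ｌｅｔ'ｓ grab a café ☕ later."
    normalized = unicodedata.normalize("NFKC", text)   # fold compatibility forms (e.g., full-width letters)
    ascii_only = normalized.encode("ascii", "ignore").decode("ascii")  # optionally drop emojis and symbols
    print(normalized)
    print(ascii_only)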
⮚ Spelling Correction
Shorthand text messages in social microblogs often hinder language processing and context
understanding. Two such examples follow:
Shorthand typing: Hllo world! I am back!
Fat finger problem: I pronise that I will not bresk the silence again!
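A minimal spelling-correction sketch; the pyspellchecker package is an assumed choice here (any dictionary-based corrector would do):

    from spellchecker import SpellChecker  # pip install pyspellchecker

    spell = SpellChecker()
    sentence = "I pronise that I will not bresk the silence again"
    corrected = " ".join(spell.correction(word) or word for word in sentence.split())
    print(corrected)  # expected: "I promise that I will not break the silence again"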
⮚ System-Specific Error Correction
The pipeline in this case starts with extraction of plain text from PDF documents.
However, different PDF documents are encoded differently, and sometimes, we may not be able to extract
the full text, or the structure of the text may get messed up.
While there are several libraries, such as PyPDF, PDFMiner etc., to extract text from PDF documents, they
are far from perfect, and it’s common to encounter PDF documents that can’t be processed by such libraries.
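A minimal extraction sketch using the pypdf library (the file name is hypothetical; as noted above, extraction may fail or come out garbled for some documents):

    from pypdf import PdfReader  # pip install pypdf

    reader = PdfReader("invoice.pdf")   # hypothetical file name
    full_text = ""
    for page in reader.pages:
        page_text = page.extract_text()
        if page_text:                   # extraction can fail for scanned or oddly encoded PDFs
            full_text += page_text + "\n"
    print(full_text[:500])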
Pre-processing
Other Pre-Processing Steps
⮚ Text normalization
⮚ A word can be spelled in different ways, including in shortened forms, a phone number can be
written in different formats (e.g., with and without hyphens), names are sometimes in
lowercase, and so on.
⮚ Bringing all such variations to a single canonical form is known as text normalization. Some common steps for text normalization are to convert all text to lowercase or uppercase, convert digits to text (e.g., 9 to nine), expand abbreviations, and so on.
⮚ Language detection
⮚ A lot of web content is in non-English languages. For example, say we’re asked to collect
all reviews about our product on the web.
⮚ In such cases, language detection is performed as the first step in an NLP pipeline. We can use libraries like Polyglot [36] for language detection (see the sketch after this list).
⮚ Code mixing and transliteration
⮚ Many people across the world speak more than one language in their day-to-day lives.
Thus, it’s common to see them using multiple languages in their social media posts, and a
single post may contain many languages.
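A minimal language-detection sketch using Polyglot, as mentioned above (the review strings are made up, and Polyglot's extra system dependencies must be installed; treat the exact API usage as an assumption):

    from polyglot.detect import Detector  # pip install polyglot (plus pyicu/pycld2 dependencies)

    reviews = [
        "This product is great, totally worth the price!",
        "Dieses Produkt ist großartig, den Preis absolut wert!",
    ]
    for review in reviews:
        language = Detector(review).language   # most probable language of the text
        print(language.code, language.name, "->", review)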
Advanced Processing
⮚ POS tagging
⮚ Coreference resolution, etc.
Feature Engineering
⮚ The goal of feature engineering is to capture the characteristics of the text into a numeric vector
that can be understood by the ML algorithms. We refer to this step as “text representation”.
⮚ We’ll briefly touch on two different approaches taken in practice for feature engineering in (1) a
classical NLP and traditional ML pipeline and (2) a DL pipeline.
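A minimal feature-engineering sketch for the classical pipeline, using scikit-learn's TF-IDF vectorizer (the two documents are made up):

    from sklearn.feature_extraction.text import TfidfVectorizer

    docs = ["I like this movie very much", "I did not like this movie"]
    vectorizer = TfidfVectorizer()
    X = vectorizer.fit_transform(docs)            # each document becomes a numeric vector
    print(vectorizer.get_feature_names_out())     # the vocabulary behind each vector dimension
    print(X.toarray())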
Modeling
The next step is about how to build a useful solution out of this. At the start, when we have limited data,
we can use simpler methods and rules.
⮚ Start with Simple Heuristics
At this early stage, ML may not play a major role. Part of that could be due to a lack of data, but human-built heuristics can also provide a great start in some ways.
Heuristics may already be part of your system, either implicitly or explicitly.
For instance, in email spam-classification tasks, we may have a blacklist of domains that are used
exclusively to send spam. This information can be used to filter emails from those domains. Similarly, a
blacklist of words in an email that denote a high chance of spam could also be used for this classification.
Building Your Model
⮚ Create a feature from the heuristic for your ML model
When there are many heuristics where the behavior of a single heuristic is deterministic but their combined
behavior is fuzzy in terms of how they predict, it’s best to use these heuristics as features to train your ML
model.
E.g., in the email spam-classification example, we can add features, such as the number of words from the blacklist in a given email or the email bounce rate, to the ML model.
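A minimal sketch of turning the blacklist heuristic into a numeric feature column (the blacklist and example emails are made up):

    import re

    BLACKLIST = {"lottery", "winner", "prize", "viagra"}   # hypothetical blacklist of spam words

    def blacklist_count(email_text):
        # Heuristic turned into a feature: how many blacklisted words the email contains
        words = re.findall(r"[a-z]+", email_text.lower())
        return sum(1 for word in words if word in BLACKLIST)

    emails = [
        "Congratulations, you are a lottery winner! Claim your prize now.",
        "Meeting moved to 3pm, see you there.",
    ]
    features = [[blacklist_count(e)] for e in emails]   # one feature column to feed into the ML model
    print(features)   # [[3], [0]]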
Building the Model
● Ensemble and stacking
o Model stacking
We can feed one model’s output as input for another model, thus sequentially going from one model to
another and obtaining a final output. This is called model stacking.
o Model ensembling
We can also pool predictions from multiple models and make a final prediction. This is called model
ensembling.
Stacking (Stacked Generalization) is an ensemble learning technique that aims to combine multiple models
to improve predictive performance. Steps are:
1. Base Models: Training multiple models (level-0 models) on the same dataset.
2. Meta-Model: Training a new model (level-1 or meta-model) to combine the predictions of the
base models. Using the predictions of the base models as input features for the meta-model.
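A minimal stacking sketch using scikit-learn's StackingClassifier (the toy data and the particular base models are assumptions):

    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier, StackingClassifier
    from sklearn.linear_model import LogisticRegression
    from sklearn.naive_bayes import GaussianNB

    X, y = make_classification(n_samples=200, n_features=10, random_state=0)  # toy dataset

    base_models = [                      # level-0 models trained on the same dataset
        ("nb", GaussianNB()),
        ("rf", RandomForestClassifier(n_estimators=100, random_state=0)),
    ]
    meta_model = LogisticRegression()    # level-1 model combines the base models' predictions
    stack = StackingClassifier(estimators=base_models, final_estimator=meta_model)
    stack.fit(X, y)
    print(stack.predict(X[:5]))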
Evaluation
Evaluations are of two types: intrinsic and extrinsic.
Intrinsic focuses on intermediary objectives, while extrinsic focuses on evaluating performance on the final
objective.
● Intrinsic Evaluation
For most metrics in this category, we assume a test set where we have the ground truth or labels (human
annotated, correct answers).
Labels could be binary (e.g., 0/1 for text classification), one or two words (e.g., names for named entity recognition), or larger text itself (e.g., text translated by machine translation).
The output of the NLP model on a data point is compared against the corresponding label for that data
point, and metrics are calculated based on the match (or mismatch) between the output and label.
For most NLP tasks, the comparison can be automated, hence intrinsic evaluation can be automated.
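A minimal sketch of automated intrinsic evaluation, comparing model outputs against ground-truth labels with scikit-learn metrics (the label lists are made up):

    from sklearn.metrics import accuracy_score, f1_score

    y_true = [1, 0, 1, 1, 0, 1]   # ground-truth labels (human-annotated)
    y_pred = [1, 0, 0, 1, 0, 1]   # model outputs on the same test set
    print("Accuracy:", accuracy_score(y_true, y_pred))   # fraction of exact matches
    print("F1 score:", f1_score(y_true, y_pred))         # balance of precision and recall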