What is NLP?
NLP stands for Natural Language Processing, a field at the intersection
of Computer Science, Linguistics (human language), and Artificial Intelligence. It
is the technology used by machines to understand, analyse,
manipulate, and interpret human languages. It helps developers
organize knowledge for performing tasks such as translation, automatic
summarization, Named Entity Recognition (NER), speech
recognition, relationship extraction, and topic segmentation.
History of NLP
(1940-1960) - Focused on Machine Translation (MT)
Work on Natural Language Processing began in the 1940s.
1948 - The first recognisable NLP application was introduced at Birkbeck College, London.
1950s - There were conflicting views between linguistics and computer science. During this period, Chomsky published his first book, Syntactic Structures, and claimed that language is generative in nature.
In 1957, Chomsky also introduced the idea of Generative Grammar, which consists of rule-based descriptions of syntactic structures.
(1960-1980) - Flavored with Artificial Intelligence (AI)
Advantages of NLP
o NLP helps users to ask questions about any subject and get a direct
response within seconds.
o NLP offers exact answers to a question, which means it does not return
unnecessary or unwanted information.
o NLP helps computers to communicate with humans in their languages.
o It is very time efficient.
o Many companies use NLP to improve the efficiency and accuracy of
documentation processes and to identify information in large databases.
Disadvantages of NLP
A list of disadvantages of NLP is given below:
o NLP may not show context.
o NLP is unpredictable.
o NLP may require more keystrokes.
o NLP systems often cannot adapt to a new domain and have limited functionality,
which is why an NLP system is typically built for a single, specific task.
Components of NLP
There are the following two components of NLP -
1. Natural Language Understanding (NLU)
Natural Language Understanding (NLU) helps the machine to understand
and analyse human language by extracting the metadata from content
such as concepts, entities, keywords, emotion, relations, and semantic
roles.
NLU is mainly used in business applications to understand the customer's
problem in both spoken and written language.
NLU involves the following tasks -
o It is used to map the given input into a useful representation.
o It is used to analyze different aspects of the language.
2. Natural Language Generation (NLG)
Natural Language Generation (NLG) acts as a translator that converts the
computerized data into natural language representation. It mainly involves
Text planning, Sentence planning, and Text Realization.
Difference between NLU and NLG
NLU:
o NLU is the process of reading and interpreting language.
o It produces non-linguistic outputs from natural language inputs.
NLG:
o NLG is the process of writing or generating language.
o It constructs natural language outputs from non-linguistic inputs.
Applications of NLP
There are the following applications of NLP -
1. Question Answering
Question Answering focuses on building systems that automatically
answer the questions asked by humans in a natural language.
2. Spam Detection
Spam detection is used to detect unwanted e-mails getting to a user's
inbox.
3. Sentiment Analysis
Sentiment Analysis is also known as opinion mining. It is used on the
web to analyse the attitude, behaviour, and emotional state of the sender.
This application is implemented through a combination of NLP (Natural
Language Processing) and statistics by assigning values to the text
(positive, negative, or neutral) and identifying the mood of the context
(happy, sad, angry, etc.).
4. Machine Translation
Machine translation is used to translate text or speech from one natural
language to another natural language.
Example: Google Translator
5. Spelling correction
Word-processing software such as Microsoft Word and PowerPoint uses NLP
for spelling correction.
6. Speech Recognition
Speech recognition is used for converting spoken words into text. It is used
in applications, such as mobile, home automation, video recovery,
dictating to Microsoft Word, voice biometrics, voice user interface, and so
on.
7. Chatbot
Implementing a chatbot is one of the important applications of NLP. Chatbots
are used by many companies to provide chat-based customer services.
8. Information extraction
Information extraction is one of the most important applications of NLP. It
is used for extracting structured information from unstructured or semi-
structured machine-readable documents.
9. Natural Language Understanding (NLU)
It converts large sets of text into more formal representations, such as
first-order logic structures, that are easier for computer programs to
manipulate.
Phases of NLP
There are the following five phases of NLP:
1. Lexical and Morphological Analysis
The first phase of NLP is Lexical Analysis. This phase scans the source
text as a stream of characters and converts it into meaningful lexemes.
It divides the whole text into paragraphs, sentences, and words.
2. Syntactic Analysis (Parsing)
Syntactic Analysis is used to check grammar and word arrangement, and it
shows the relationships among the words.
Example: Agra goes to the Poonam
In the real world, "Agra goes to the Poonam" does not make any sense, so
this sentence is rejected by the syntactic analyser.
3. Semantic Analysis
Semantic analysis is concerned with the meaning representation. It mainly
focuses on the literal meaning of words, phrases, and sentences.
4. Discourse Integration
Discourse Integration depends upon the sentences that precede it and
also invokes the meaning of the sentences that follow it.
5. Pragmatic Analysis
Pragmatic Analysis is the fifth and last phase of NLP. It helps you to discover the
intended effect by applying a set of rules that characterize cooperative
dialogues.
For Example: "Open the door" is interpreted as a request instead of an
order.
Why is NLP difficult?
NLP is difficult because ambiguity and uncertainty exist in language.
Ambiguity
There are the following three ambiguities -
o Lexical Ambiguity
Lexical ambiguity exists when a single word has two or more possible
meanings.
Example:
Manya is looking for a match.
In the above example, the word "match" could mean either that Manya is
looking for a partner or that Manya is looking for a match (a cricket match
or some other game).
o Syntactic Ambiguity
Syntactic ambiguity exists when a sentence can be interpreted in two or
more ways because of its structure.
Example:
I saw the girl with the binoculars.
In the above example, did I have the binoculars? Or did the girl have the
binoculars?
o Referential Ambiguity
Referential ambiguity exists when a pronoun could refer to more than one
possible antecedent.
Example: Kiran went to Sunita. She said, "I am hungry."
In the above sentence, it is not clear who is hungry: Kiran or Sunita.
Difference between Natural Language and Computer Language
Natural Language:
o Natural language has a very large vocabulary.
o Natural language is easily understood by humans.
o Natural language is ambiguous in nature.
Computer Language:
o Computer language has a very limited vocabulary.
o Computer language is easily understood by machines.
o Computer language is unambiguous.
Q1. What is NLP? Describe Levels/Phases of NLP.
Natural Language Processing (NLP) refers to the AI method of communicating with an intelligent
system using a natural language such as English. Natural language processing helps computers
communicate with humans in their own language and scales other language-related tasks. For
example, NLP makes it possible for computers to read text, hear speech, interpret it, measure
sentiment, and determine which parts are important.
The field of NLP involves making computers perform useful tasks with the natural
languages humans use. The input and output of an NLP system can be −
Speech
Written Text
Morphological Analysis (the study of the forms of things):
It is the first phase of NLP. The purpose of this phase is to break chunks of language input
into sets of tokens corresponding to paragraphs, sentences and words. For example, a word
like "uneasy" can be broken into two sub-word tokens, "un" and "easy".
The morphological level of linguistic processing deals with the study of word
structures and word formation, focusing on the analysis of the individual
components of words. The most important unit of morphology, defined as having the
“minimal unit of meaning”, is referred to as the morpheme.
Lexical Analysis − It involves identifying and analyzing the structure of words. Lexicon of a
language means the collection of words and phrases in a language. Lexical analysis is dividing
the whole chunk of text into paragraphs, sentences, and words.
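As a small illustration of this step, here is a minimal sketch (assuming the NLTK library and its 'punkt' tokenizer data are installed) that splits raw text into sentences and word tokens:

```python
# A minimal sketch of lexical analysis: splitting raw text into sentences
# and word tokens. Assumes the NLTK library and its 'punkt' tokenizer data
# are installed.
import nltk

text = "NLP helps machines understand language. It divides text into sentences and words."

sentences = nltk.sent_tokenize(text)                  # sentence segmentation
words = [nltk.word_tokenize(s) for s in sentences]    # word tokenization per sentence

print(sentences)
print(words)
```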
Syntactic Analysis (Parsing) − It is the second phase of NLP. The purpose of this phase is
twofold: to check whether a sentence is well formed and to break it up into a structure that
shows the syntactic relationships between the different words.
For example, the sentence like “The school goes to the boy” would be rejected by syntax
analyser or parser.
Words are commonly accepted as the smallest units of syntax. Syntax refers to
the principles and rules that govern the sentence structure of any individual
language.
Syntax focuses on the proper ordering of words, which can affect meaning. This involves
analysis of the words in a sentence by following the grammatical structure of the sentence.
The words are transformed into a structure that shows how they are related to each other.
Semantic Analysis − It is the third phase of NLP. The purpose of this phase is to draw exact
meaning, or you can say dictionary meaning from the text. The text is checked for
meaningfulness. For example, semantic analyser would reject a sentence like “Hot ice-cream”.
Semantic analysis assigns meanings to the structure created by the syntactic analyser.
This component transfers linear sequences of words into structures and shows how the words are
associated with each other. Semantics focuses only on the literal meaning of words, phrases,
and sentences. It only abstracts the dictionary meaning, or the real meaning, from the given
context. The structures assigned by the syntactic analyser always have meaning assigned to them.
E.g., "colourless green idea" would be rejected by the semantic analysis because "colourless
green" does not make any sense.
Pragmatic Analysis - It is the fourth phase of NLP. Pragmatic analysis simply fits the actual
objects/events, which exist in a given context with object references obtained during the last
phase (semantic analysis). For example, the sentence “Put the banana in the basket on the
shelf” can have two semantic interpretations and pragmatic analyser will choose between
these two possibilities. Pragmatic Analysis deals with the overall communicative and social
content and its effect on interpretation. It means abstracting or deriving the meaningful use of
language in situations. In this analysis, the main focus is always on what was said being
reinterpreted as what is meant. Pragmatic analysis helps users to discover this intended effect
by applying a set of rules that characterize cooperative dialogues. E.g., "Close the window"
should be interpreted as a request instead of an order.
Q2. Explain different ambiguities faced by NLP.
Ambiguity is a common challenge in natural language processing (NLP) due to the inherent
complexity and richness of human language. Ambiguity arises when a word, phrase, or
sentence has multiple possible interpretations or meanings within a given context.
Lexical Ambiguity: Lexical ambiguity occurs when a single word has multiple meanings. This
is one of the most common forms of ambiguity in language. For example, the word "bank"
can refer to a financial institution or the side of a river.
Syntactic Ambiguity: This kind of ambiguity occurs when a sentence can be parsed in different
ways. For example, in the sentence "The man saw the girl with the telescope", it is ambiguous
whether the man saw the girl carrying a telescope or saw her through his telescope.
Semantic Ambiguity: This kind of ambiguity occurs when the meaning of the words
themselves can be misinterpreted. In other words, semantic ambiguity happens when a
sentence contains an ambiguous word or phrase. For example, the sentence "The car hit the
pole while it was moving" has semantic ambiguity because the interpretations can be
"The car, while moving, hit the pole" and "The car hit the pole while the pole was moving".
Anaphoric Ambiguity: Anaphoric ambiguity arises when a pronoun in a sentence refers to
more than one possible antecedent. For instance, "She saw her brother and hugged him." It's
unclear who "her" and "him" refer to.
Pragmatic Ambiguity: This kind of ambiguity arises when the context of a phrase gives it
multiple interpretations, that is, when the statement is not specific. For example, the sentence
"I like you too" can have multiple interpretations, such as "I like you (just like you like me)"
and "I like you (just like someone else does)".
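To make the syntactic-ambiguity example above concrete, here is a small sketch (a toy grammar written for this note; assuming the NLTK library is installed) in which a chart parser returns two parse trees for the same sentence, one attaching the prepositional phrase to the verb and one to the noun:

```python
# A toy grammar illustrating syntactic (PP-attachment) ambiguity.
# Assumes the NLTK library is installed.
import nltk

grammar = nltk.CFG.fromstring("""
S -> NP VP
NP -> Pronoun | Det N | NP PP
VP -> V NP | VP PP
PP -> P NP
Pronoun -> 'I'
Det -> 'the'
N -> 'man' | 'girl' | 'telescope'
V -> 'saw'
P -> 'with'
""")

parser = nltk.ChartParser(grammar)
sentence = "I saw the girl with the telescope".split()
for tree in parser.parse(sentence):
    print(tree)   # two trees: PP attached to the VP vs. to the NP "the girl"
```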
Q3. State and Explain different challenges faced by NLP.
Natural Language Processing (NLP) Challenges: NLP is a powerful tool with huge benefits,
but there are still a number of Natural Language Processing limitations and problems:
•Contextual words and phrases and homonyms: The same words and phrases can have
different meanings depending on the context.
•Synonyms: Synonyms can lead to issues similar to contextual understanding because we use
many different words to express the same idea.
•Ambiguity: As discussed earlier, words, phrases, and sentences can have multiple interpretations.
•Errors in text or speech: Misspelled or misused words can create problems for text analysis.
With spoken language, mispronunciations, different accents, stutters, etc., can be difficult for
a machine to understand.
•Colloquialisms and slang: Colloquialisms may have no "dictionary definition" at all, and
these expressions may even have different meanings in different geographic areas.
Furthermore, cultural slang is constantly morphing and expanding, so new words pop up
every day.
•Domain-specific language: Different businesses and industries often use very different
language.
•Low-resource languages: AI machine learning NLP applications have been largely built for
the most common, widely used languages and not for rarely used languages.
•Lack of research and development: More research needs to be done on new machine learning
techniques and custom algorithms.
Q4. Describe different applications of NLP.
Natural Language Processing (NLP) has a wide range of applications that span across various
domains and industries. NLP technology enables computers to understand, interpret, and
generate human language, making it useful in numerous contexts. Here are some key
applications of NLP:
1. **Sentiment Analysis**: NLP is used to analyse text data and determine the sentiment
expressed, whether it's positive, negative, or neutral. This is valuable for understanding
customer opinions, social media trends, and brand perception.
2. **Text Classification**: NLP can categorize text documents into predefined classes or
topics, aiding in tasks like spam detection, news categorization, and content recommendation.
3. **Named Entity Recognition (NER)**: NER identifies and classifies named entities (e.g.,
names, locations, organizations) in text, useful for information extraction, data enrichment,
and summarization.
4. **Machine Translation**: NLP powers machine translation systems that automatically
translate text from one language to another, enabling cross-lingual communication and content
localization.
5. **Speech Recognition**: NLP technology converts spoken language into text, used in
voice assistants, transcription services, and accessibility tools for the hearing impaired.
6. **Text Generation**: NLP can generate coherent and contextually relevant text, used in
chatbots, content creation, and text completion suggestions.
7. **Question Answering**: NLP-based question-answering systems can retrieve relevant
information from large datasets to answer user queries, such as those used in search engines or
AI assistants.
8. **Chatbots and Virtual Assistants**: NLP powers conversational agents that can engage
in natural language conversations, assist users, answer questions, and perform tasks.
9. **Information Retrieval**: NLP improves search engine results by understanding the
user's query and retrieving relevant documents or information.
10. **Language Translation**: Beyond machine translation, NLP is used for specialized
translation tasks like legal or medical translation, where domain knowledge is essential.
11. **Text Summarization**: NLP can automatically generate concise summaries of longer
texts, making it useful for news articles, research papers, and content curation.
12. **Emotion Analysis**: NLP can detect emotions and emotional tones in text, assisting
in market research, customer feedback analysis, and mental health applications.
13. **Language Understanding Interfaces**: NLP enables voice-controlled interfaces,
allowing users to interact with devices and systems using natural language commands.
14. **Legal Document Analysis**: NLP can process and extract information from legal
documents, contracts, and regulations, assisting in legal research and compliance.
15. **Clinical Text Analysis**: In healthcare, NLP can analyse medical records, clinical
notes, and research papers to extract insights and improve patient care.
16. **Social Media Analysis**: NLP can analyse social media posts, comments, and trends
to understand public sentiment, track social campaigns, and monitor brand reputation.
17. **Financial Analysis**: NLP can analyse financial news, reports, and earnings calls to
provide insights for investment decisions and market predictions.
18. **Content Recommendation**: NLP-powered recommendation systems analyse user
preferences and behaviours to suggest relevant content, products, or services.
19. **Academic Research**: NLP aids researchers in analysing and summarizing academic
literature, finding relevant papers, and exploring relationships between concepts.
These applications showcase the versatility and impact of NLP across diverse industries,
improving communication, decision-making, automation, and user experience. NLP continues
to evolve, enabling more advanced and specialized applications in the future.
Introduction to Grammar in NLP
Grammar in NLP is a set of rules for constructing sentences in a language used to understand
and analyze the structure of sentences in text data.
This includes identifying parts of speech such as nouns, verbs, and adjectives, determining the
subject and predicate of a sentence, and identifying the relationships between words and
phrases.
As humans, we talk in a language that is easily understandable to other humans and not
computers. To make computers understand language, they must have a structure to
follow. Syntax describes a language's regularity and productivity, making sentences' structure
explicit.
The word syntax here refers to the way the words are arranged together. Regular languages
and parts of speech describe how words are arranged together but cannot easily capture
notions such as constituency or grammatical relations.
What is Grammar?
Grammar is defined as the rules for forming well-structured sentences. Grammar also plays an
essential role in describing the syntactic structure of well-formed programs, like denoting the
syntactical rules used for conversation in natural languages.
In the theory of formal languages, grammar is also applicable in Computer Science,
mainly in programming languages and data structures. Example - In the C
programming language, the precise grammar rules state how functions are made with
the help of lists and statements.
Mathematically, a grammar G can be written as a 4-tuple (N, T, S, P) where:
o N or VN = set of non-terminal symbols or variables.
o S = Start symbol where S ∈ N
o T or ∑ = set of terminal symbols.
o P = Production rules for Terminals as well as Non-terminals.
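As a concrete illustration (a toy grammar made up for this note, not taken from the text), the 4-tuple G = (N, T, S, P) can be written directly as Python data structures:

```python
# A toy grammar G = (N, T, S, P) expressed as plain Python data structures.
# The specific symbols and rules are an illustrative assumption.
N = {"S", "NP", "VP", "Det", "Noun", "Verb"}          # non-terminal symbols (VN)
T = {"the", "a", "giraffe", "flight", "dreams"}       # terminal symbols (∑)
S = "S"                                               # start symbol, S ∈ N
P = {                                                 # production rules
    "S":    [["NP", "VP"]],
    "NP":   [["Det", "Noun"]],
    "VP":   [["Verb"]],
    "Det":  [["the"], ["a"]],
    "Noun": [["giraffe"], ["flight"]],
    "Verb": [["dreams"]],
}
```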
Syntax
Each natural language has an underlying structure usually referred to under Syntax. The
fundamental idea of syntax is that words group together to form the constituents like groups of
words or phrases which behave as a single unit. These constituents can combine to form
bigger constituents and, eventually, sentences.
Syntax describes the regularity and productivity of a language making explicit the
structure of sentences, and the goal of syntactic analysis or parsing is to detect if a
sentence is correct and provide a syntactic structure of a sentence.
Syntax also refers to the way words are arranged together. Let us see some basic ideas related
to syntax:
Constituency: Groups of words may behave as a single unit or phrase, called a constituent -
for example, a noun phrase.
Grammatical relations: These are the formalization of ideas from traditional
grammar. Examples include - subjects and objects.
Subcategorization and dependency relations: These are the relations between words
and phrases, for example, a Verb followed by an infinitive verb.
Regular languages and parts of speech: Refer to the way words are arranged
together but cannot easily capture relations such as constituency, grammatical relations,
and subcategorization and dependency relations.
Syntactic categories and their common denotations in NLP: np - noun phrase, vp -
verb phrase, s - sentence, det - determiner (article), n - noun, tv - transitive verb (takes
an object), iv - intransitive verb, prep - preposition, pp - prepositional phrase, adj -
adjective
Context-Free Grammars (CFGs) are like the building blocks of sentences in natural
language processing (NLP). They help us understand how words can be put together
to form sentences by defining rules and structures.
Imagine you're playing with LEGO blocks. Each block represents a different part of
a sentence, like nouns, verbs, or adjectives. CFGs use symbols to represent these
parts of speech.
Now, let's say we want to create a sentence. CFGs provide rules that tell you how
these blocks (symbols) can be combined. For example, a rule might say that a
sentence can start with a noun followed by a verb and then another noun. These
rules guide the construction of valid sentences.
In NLP, CFGs are used in various ways:
Parsing: CFGs help break down sentences into their grammatical components. It's
like taking apart a LEGO structure to understand how each piece fits.
Generating Sentences: They're used to create sentences following the grammar rules.
It's akin to using a set of LEGO blocks to build something new.
Language Understanding: CFGs aid in understanding the structure of sentences,
which is crucial for tasks like language translation or text generation.
Remember, just as LEGO instructions guide you on how to build something specific,
CFGs provide the rules to structure sentences in a language. They're a fundamental
tool in NLP for understanding, creating, and processing human language.
Context Free Grammar
Context-free grammar consists of a set of rules expressing how symbols of the language can
be grouped and ordered together and a lexicon of words and symbols.
One example rule expresses that an NP (noun phrase) can be composed of either a
ProperNoun or a determiner (Det) followed by a Nominal; a Nominal in turn can consist of
one or more Nouns: NP → Det Nominal, NP → ProperNoun; Nominal → Noun | Nominal Noun
Context-free rules can also be hierarchically embedded, so we can combine the previous
rules with others, like the following, that express facts about the lexicon: Det → a, Det → the,
Noun → flight
Context-free grammar is a formalism powerful enough to represent complex relations and can
be efficiently implemented. Context-free grammar is integrated into many language
applications.
A context-free grammar consists of a set of rules or productions, each expressing the ways
the symbols of the language can be grouped, and a lexicon of words.
Context-free grammar (CFG) can also be seen as the list of rules that define the set of all well-
formed sentences in a language. Each rule has a left-hand side that identifies a syntactic
category and a right-hand side that defines its alternative parts reading from left to right.
- Example: The rule s --> np vp means that "a sentence is defined as a noun phrase followed
by a verb phrase."
Formalism in rules for context-free grammar: A sentence in the language defined by a CFG is
a series of words that can be derived by systematically applying the rules, beginning with a
rule that has s on its left-hand side.
o Use of parse tree in context-free grammar: A convenient way to describe a parse is to
show its parse tree, simply a graphical display of the parse.
o A parse of the sentence is a series of rule applications in which a syntactic category is
replaced by the right-hand side of a rule that has that category on its left-hand side,
and the final rule application yields the sentence itself.
Example: A parse of the sentence "the giraffe dreams" is: s => np vp => det n vp => the n vp
=> the giraffe vp => the giraffe iv => the giraffe dreams
If we look at the example parse tree for the sample sentence "the giraffe dreams",
we can see that the root of every subtree has a grammatical category that appears on the
left-hand side of a rule, and the children of that root are identical to the elements on the
right-hand side of that rule.
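As a runnable sketch (assuming the NLTK library is installed), the grammar and parse above can be reproduced with a chart parser; the category names follow the text (s, np, vp, det, n, iv):

```python
# A minimal sketch reproducing the "the giraffe dreams" example with NLTK.
# Assumes the NLTK library is installed.
import nltk

grammar = nltk.CFG.fromstring("""
s -> np vp
np -> det n
vp -> iv
det -> 'the'
n -> 'giraffe'
iv -> 'dreams'
""")

parser = nltk.ChartParser(grammar)
for tree in parser.parse("the giraffe dreams".split()):
    print(tree)   # (s (np (det the) (n giraffe)) (vp (iv dreams)))
```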
English Grammar Rules
English grammar rules help us understand how words fit together to form meaningful
sentences. Here are some basic grammar rules often used in NLP:
Subject-Verb Agreement: This rule means that the form of the verb must agree with the
subject of the sentence. For instance, "He runs" is correct, while "He run" is not.
Pronoun Usage: Pronouns like "he," "she," or "it" should match the gender or number
of the noun they represent. For example, "She loves ice cream" matches the singular
"she," while "They love ice cream" matches the plural "they."
Sentence Structure: Sentences typically follow a structure with a subject, verb, and
sometimes an object. For example, "The dog (subject) chased (verb) the ball (object)."
Tense and Verb Forms: Verbs change their form to indicate when an action happens.
For instance, "I walk" (present tense) versus "I walked" (past tense).
Word Order: English usually follows a specific word order: Subject-Verb-Object (SVO).
For example, "She (subject) eats (verb) apples (object)."
Articles (a, an, the): Articles are used to specify nouns. "A" and "an" are used for non-
specific nouns, while "the" is used for specific nouns.
Punctuation: Punctuation marks like periods, commas, question marks, and exclamation
points help structure sentences and convey meaning.
Understanding these rules helps NLP models make sense of text, generate coherent
sentences, and analyze language patterns. They're like the grammar guidelines that keep
our sentences clear and understandable!
Transition Networks
Transition networks are a way of representing and understanding language
structures or patterns. They're often used in computational linguistics and
natural language processing (NLP) to describe how sentences or phrases are
formed.
At its core, a transition network consists of nodes and arcs. Nodes represent
different states or elements of a language (like words or parts of speech),
while arcs depict the transitions between these states based on specific
conditions or rules.
These networks are particularly helpful for parsing sentences or
understanding the syntactic structure of language. As a sentence is
analyzed, the network progresses through different states (nodes) following
the arcs based on grammar rules or linguistic constraints until it reaches a
final state that represents a valid sentence or phrase.
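A minimal sketch of this idea (the state names, lexicon, and arcs below are hypothetical, invented for illustration) is a small network that accepts simple noun phrases by moving from node to node as words match the required categories:

```python
# A toy transition network for simple noun phrases such as "the giraffe".
# States (nodes) and arcs are hypothetical examples, not a standard definition.

LEXICON = {
    "the": "det", "a": "det",
    "giraffe": "noun", "dog": "noun", "telescope": "noun",
}

# Arcs: current state -> list of (required word category, next state)
NP_NETWORK = {
    "NP0": [("det", "NP1"), ("noun", "NP_FINAL")],
    "NP1": [("noun", "NP_FINAL")],
}

def accepts_np(words, state="NP0"):
    """Return True if the word sequence reaches the final state of the network."""
    if not words:
        return state == "NP_FINAL"
    category = LEXICON.get(words[0])
    for required, next_state in NP_NETWORK.get(state, []):
        if category == required and accepts_np(words[1:], next_state):
            return True
    return False

print(accepts_np("the giraffe".split()))   # True: det -> noun reaches the final state
print(accepts_np("giraffe the".split()))   # False: no arc matches after the noun
```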
Transition networks are flexible and can accommodate various complexities
of language. They allow for the representation of context-sensitive rules,
making them suitable for analyzing more intricate linguistic structures
beyond what simple grammars like context-free grammars can handle.
In summary, transition networks serve as a graphical representation or
model that helps computers understand and process the rules and
structures of natural language, making them valuable tools in NLP and
computational linguistics.
Semantic Attachment - Word Senses
Relations between senses
In the realm of natural language processing and computational linguistics,
dealing with word senses and their relations is crucial for understanding
meaning. Here's a breakdown:
Word Senses:
**Polysemy**: Words often have multiple meanings. For instance, "bank"
can refer to a financial institution or the edge of a river. Each meaning
represents a different sense of the word.
**WordNet**: It's a lexical database that organizes words into sets of
synonyms called synsets and links them by semantic relations. For example,
it categorizes different senses of a word and shows how they relate to each
other.
Let's take the word "bank" as an example and see how WordNet
categorizes its different senses and their relations:
Word: "Bank"
Sense 1: Financial Institution
Synonyms: bank, banking concern, depository financial institution
Hypernym (General Category): Financial Institution
Hyponyms (Specific Types): Central Bank, Commercial Bank, Investment
Bank, etc.
Example Sentence Relation: "I need to deposit money in the bank."
Sense 2: Sloping Land by a Body of Water
Synonyms: bank, mound
Hypernym (General Category): Landform
Hyponyms (Specific Types): Riverbank, Embankment, Shore, etc.
Example Sentence Relation: "They sat on the bank of the river and enjoyed
the view."
WordNet distinguishes these two senses of "bank" and organizes them into
separate synsets. It shows how these senses relate to broader categories
(hypernyms), specific types (hyponyms), and their synonyms. This
organization helps in understanding the different meanings of the word and
how they are semantically connected to other related words or concepts.
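A small sketch (assuming NLTK and its WordNet corpus data are installed) of how such a lookup can be done programmatically for the word "bank":

```python
# A minimal sketch of listing WordNet senses of "bank" with their relations.
# Assumes NLTK and its 'wordnet' corpus data are installed.
from nltk.corpus import wordnet as wn

for synset in wn.synsets("bank"):
    print(synset.name(), "-", synset.definition())
    print("  synonyms :", synset.lemma_names())
    print("  hypernyms:", [h.name() for h in synset.hypernyms()])
    print("  hyponyms :", [h.name() for h in synset.hyponyms()][:3])  # first few only
```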
Relations between Senses:
**Hyponymy/Hypernymy**: These relations show the hierarchical
structure of words. For instance, 'rose' is a hyponym (specific type) of
'flower,' which is its hypernym (general category).
**Antonymy**: Words with opposite meanings, like 'hot' and 'cold,' are
antonyms. They provide contrasting senses.
**Meronymy/Holonymy**: These relations show part-whole connections.
For example, 'finger' is a meronym of 'hand,' while 'hand' is a holonym of
'finger.'
**Synonymy**: Words with similar meanings are synonymous. WordNet,
for instance, groups synonyms into synsets.
**Word Sense Disambiguation (WSD)**: This task involves determining
the correct sense of a word in a given context. For instance, in the sentence
"She sat by the bank," 'bank' could mean a financial institution or a riverbank,
and WSD helps choose the right sense based on the surrounding words.
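A classic baseline for WSD is the Lesk algorithm; a minimal sketch (assuming NLTK and its WordNet data are installed) is shown below:

```python
# A minimal sketch of word sense disambiguation using NLTK's Lesk implementation.
# Assumes NLTK and its 'wordnet' corpus data are installed.
from nltk.wsd import lesk

context = "I went to the bank to deposit my money".split()
sense = lesk(context, "bank")          # picks the sense best matching the context
print(sense, "-", sense.definition())
```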
Semantic Attachment:
In NLP, semantic attachment refers to associating the correct meaning
(sense) of a word to its context within a sentence or a larger linguistic
structure. Understanding the intended sense is crucial for accurate
language processing tasks like translation, summarization, or sentiment
analysis.
By analyzing the relationships between word senses and how they connect
within sentences or texts, NLP systems can better understand the shades of
language and provide more accurate interpretations or translations.
These concepts and relations help NLP systems navigate the complexities of
language by understanding not just words but their meanings and how they
interact within different contexts.