
NLP

One mark Questions with answers


UNIT - I

1.) List out a few applications of NLP.

• Question answering
• Spam detection
• Machine translation
• Spelling correction
• Chatbots
• Speech recognition

2.) Components of NLP

• NLU (Natural Language Understanding)
• NLG (Natural Language Generation)

3.) Name five phases involved in NLP.

• Lexical and Morphological Analysis
• Syntactic Analysis
• Semantic Analysis
• Discourse Integration
• Pragmatic Analysis

4.) Differentiate lexeme and lemma

Definition:
Lexeme - the base unit of meaning; an abstract unit representing all inflected forms of a word.
Lemma - the dictionary form or canonical form of a lexeme.

Represents:
Lexeme - all the inflectional variants (e.g., walk, walks, walked, walking).
Lemma - a single standard form, typically used as a headword in dictionaries.

Example:
Lexeme - RUN → run, runs, ran, running.
Lemma - run.

Used in:
Lexeme - linguistic analysis, corpus studies, NLP.
Lemma - dictionaries, NLP, morphological parsing.

Nature:
Lexeme - abstract and general.
Lemma - specific and representative.

5.) Define Morphology

Morphology is the branch of linguistics that studies the structure and formation of words. It
analyzes how morphemes (the smallest units of meaning) combine to form words, including roots,
prefixes, and suffixes.

Example: In the word “unhappiness”, un- (prefix), happy (root), and -ness (suffix) are all morphemes.

6.) What is typology?


Typology in linguistics is the study and classification of languages based on their structural features,
such as word order, sentence structure, or morphological patterns. It helps identify similarities and
differences among languages, regardless of their historical or genetic relationships.

Example: English follows SVO (Subject-Verb-Object) word order, while Hindi follows SOV (Subject-
Object-Verb).

7.) Mention about Fusional languages

Fusional languages are defined by a feature-per-morpheme ratio higher than one: a single morpheme fuses several grammatical features at once (as in Arabic, Czech, Latin, Sanskrit, German, etc.).

Example: In Latin, the verb amō (“I love”) packs person (first), number (singular), and tense (present) into the single ending -ō.

8.) Features of NLTK

• Tokenization
• Lowercasing
• Removing stopwords
• Punctuation removal
• Stemming
• Lemmatization
• POS tagging
• Named Entity Recognition (NER)

9.) Define stemming

Stemming is the process of reducing a word to its base or root form, called a "stem."

It helps group related words together so they can be analyzed as a single item, regardless of tense or form.

Ex: helping → help

studying → studi

flying → fli

helper → help
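The examples above can be reproduced with a tiny suffix stripper in plain Python. This is an illustrative sketch only (the name `toy_stem` and its rules are assumptions), not NLTK's full Porter algorithm:

```python
def toy_stem(word):
    """Illustrative suffix-stripping stemmer (not the full Porter algorithm)."""
    w = word.lower()
    for suffix in ("ing", "er", "ed"):
        # Strip a common suffix, keeping at least 3 characters of stem
        if w.endswith(suffix) and len(w) - len(suffix) >= 3:
            w = w[: -len(suffix)]
            break
    # Porter-style step: a trailing "y" becomes "i" (study -> studi)
    if w.endswith("y"):
        w = w[:-1] + "i"
    return w

for word in ("helping", "studying", "flying", "helper"):
    print(word, "->", toy_stem(word))
```

Note how the stems `studi` and `fli` are not valid English words; stemming only normalizes form, it does not guarantee a dictionary word.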

10.) Define Lemmatizing

Lemmatizing is the process of reducing a word to its lemma, or base form. Unlike stemming, it
produces a valid English word that makes sense on its own.

Stemming:
→ "caring" → "car" (not meaningful in context)

→ Fast but less accurate.

Lemmatizing:

→ "caring" → "care" (meaningful root word)

→ Slower but context-aware and grammatically correct.
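The contrast can be sketched with a hand-made lookup table (`LEMMAS` below is a hypothetical mini-lexicon; a real lemmatizer such as NLTK's WordNetLemmatizer consults the full WordNet lexicon):

```python
# Hypothetical mini-lexicon mapping inflected forms to lemmas.
LEMMAS = {"caring": "care", "ran": "run", "studies": "study"}

def crude_stem(word):
    # Blind suffix stripping: "caring" loses "ing" and becomes "car"
    return word[:-3] if word.endswith("ing") else word

def lemmatize(word):
    # Lexicon lookup returns a valid dictionary word
    return LEMMAS.get(word, word)

print(crude_stem("caring"))   # stemming: "car"  (not meaningful in context)
print(lemmatize("caring"))    # lemmatizing: "care" (valid English word)
```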

11.) List out the libraries that are imported with respect to NLTK

import contractions

import nltk

import re

from nltk.tokenize import word_tokenize

from nltk.corpus import stopwords

from nltk.stem import PorterStemmer, WordNetLemmatizer

from nltk import pos_tag

12.) Differentiate chunking and chinking

Chunking: the process of identifying and grouping phrases in a sentence — like noun phrases,
verb phrases, etc.

Chinking: Removes specific patterns within a chunk (like verbs or adverbs that don't belong)
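The two steps can be sketched without a trained tagger by working over hand-tagged tokens (the tags below are supplied by hand, not produced by nltk.pos_tag). In NLTK the same idea is written as a RegexpParser grammar, with {...} rules for chunking and }...{ rules for chinking:

```python
# Hand-tagged sentence: "the quick fox jumped over the dog"
tagged = [("the", "DT"), ("quick", "JJ"), ("fox", "NN"),
          ("jumped", "VBD"), ("over", "IN"), ("the", "DT"), ("dog", "NN")]

NP_TAGS = {"DT", "JJ", "NN"}   # tags allowed inside a noun-phrase chunk
BAD_TAGS = {"VB", "VBD"}       # tags to chink (remove) from inside chunks

def chunk_np(tokens):
    """Chunking: group maximal runs of NP_TAGS tokens into NP chunks."""
    chunks, current = [], []
    for word, tag in tokens:
        if tag in NP_TAGS:
            current.append((word, tag))
        elif current:
            chunks.append(current)
            current = []
    if current:
        chunks.append(current)
    return chunks

def chink(chunk):
    """Chinking: drop unwanted tags from within a chunk."""
    return [(w, t) for w, t in chunk if t not in BAD_TAGS]

print([[w for w, _ in chink(c)] for c in chunk_np(tagged)])
```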

13.) Define NER (Named Entity Recognition)

The process of identifying and classifying named entities (persons, organizations, locations, dates, etc.) in a given sentence.

Ex: Person names (e.g., Mahatma Gandhi)

Organizations (e.g., MRCET)

Locations (e.g., Hyderabad, India)

Dates (e.g., 20 June 2025)

Monetary values (e.g., ₹500, $1000)

Time, Percentages, Events, etc.

UNIT-II

1.) Define Parsing/Syntax Analysis.

A. The process of analyzing a sentence's grammatical structure according to the rules of a formal
grammar. It identifies the syntactic structure of a sentence and determines how the words relate to
each other.

2.) Applications of Syntactic analysis


• Grammar checking (e.g., Grammarly)

• Question answering systems

• Chatbots

• Machine translation

• Text summarization

3.) List out Approaches to Syntax Analysis.

• Top-Down Parsing – Starts from the start symbol and tries to derive the sentence.
• Bottom-Up Parsing – Builds the parse tree from the input up to the start symbol.
• Chart Parsing – Uses dynamic programming to store intermediate parsing results.
• Shift-Reduce Parsing – A bottom-up method using a stack to shift and reduce tokens.
• Recursive Descent Parsing – A top-down parser using recursive functions for grammar rules.
• Dependency Parsing – Focuses on word-to-word relations (head-dependent).
• Constituency Parsing – Breaks sentences into phrase structures (like NP, VP).
• Probabilistic Parsing – Uses probabilities to select the most likely parse tree.
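The top-down (recursive descent) approach can be sketched for a toy grammar S → NP VP, NP → DT N, VP → V NP. The grammar and the small lexicon below are illustrative assumptions, not taken from the text:

```python
# Toy lexicon mapping words to part-of-speech tags (illustrative only).
LEXICON = {"the": "DT", "a": "DT", "cat": "N", "mat": "N", "saw": "V"}

def parse(words):
    """Recursive-descent recognizer for: S -> NP VP, NP -> DT N, VP -> V NP."""
    tags = [LEXICON[w] for w in words]

    def match(pos, tag):
        # Consume one token if its tag matches; otherwise fail (None)
        return pos + 1 if pos is not None and pos < len(tags) and tags[pos] == tag else None

    def parse_np(pos):
        return match(match(pos, "DT"), "N")

    def parse_vp(pos):
        return parse_np(match(pos, "V"))

    def parse_s(pos):
        return parse_vp(parse_np(pos))

    # A sentence is grammatical if S consumes every word
    return parse_s(0) == len(words)

print(parse("the cat saw a mat".split()))   # True
print(parse("the cat saw".split()))         # False (VP here requires an object NP)
```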

4.) Define Treebanks

Treebanks are annotated text corpora that include syntactic or grammatical structure (usually in
the form of parse trees) for each sentence. They are used in Natural Language Processing (NLP) and
linguistics to train and evaluate parsers and grammar models.

Example: A sentence like "The cat sat on the mat." would be annotated to show how words group
into phrases (like noun phrases and verb phrases).

5.) Types of Syntax trees and what are they?

There are two main types of syntax trees in linguistics:

1. Constituency Tree (Phrase Structure Tree):


Shows how words group into phrases (like noun phrases or verb phrases) based on grammar
rules.
Example: [NP The cat] [VP sat [PP on [NP the mat]]]
2. Dependency Tree:
Shows word-to-word relationships, where one word (the "head") governs the others (its
"dependents").
Example: In "The cat sat," "sat" is the main verb, and "cat" is its subject dependent.

These trees help analyze sentence structure and grammatical relationships.

6.) Uses of Treebanks.

• Training parsers (e.g., probabilistic context-free grammar parsers, neural parsers)

• Evaluating parsing algorithms

• Linguistic research
• Building tools for translation

• sentiment analysis, etc.

7.) Write about data driven approach

A data-driven approach in linguistics and NLP relies on large annotated datasets (corpora) to learn
patterns and make predictions. Instead of using fixed grammar rules, this approach uses statistical
models or machine learning algorithms trained on real language data.

Example: A machine translation system trained on parallel corpora learns how to translate based on
patterns in the data, not predefined rules.

8.) Define dependency graph

A. A dependency graph shows how the words in a sentence are connected based on their grammatical roles.

Ex: "Don't drink and drive."
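A dependency graph can be stored as (head, relation, dependent) triples. The sentence and the relation labels below follow common Universal Dependencies-style conventions and are an illustrative assumption:

```python
# Dependency graph for "The cat sat" as (head, relation, dependent) triples.
edges = [
    ("sat", "nsubj", "cat"),   # "cat" is the subject of "sat"
    ("cat", "det", "the"),     # "the" modifies "cat"
]

def dependents(word):
    """Return all words governed by the given head."""
    return [d for h, _, d in edges if h == word]

print(dependents("sat"))   # ['cat']
print(dependents("cat"))   # ['the']
```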

9.) Where do dependency graph is used.

• A. NLP parsers (like spaCy, Stanford NLP)

• Grammar checking tools

• Machine translation

• Information extraction

10.) List out the tools used to build Phrase structure trees.

• NLTK (Natural Language Toolkit) — Python

• Stanford Parser / CoreNLP

• spaCy + Benepar (Berkeley Neural Parser)

• RSyntaxTree (Web GUI Tool)

• SyntaxNet

11.) Write about types of Parsing algorithms.

• Shift-Reduce Parsing

• Chart Parsing (CYK Algorithm)

• Hypergraph-based Parsing

12.) Define Hypergraph.


A hypergraph is a type of graph in which an edge, called a hyperedge, can connect more than two vertices. It is used to represent multi-way relationships between elements.

Example: Vertices A, B, C, D with a single hyperedge E1 = {A, B, C, D} connecting all four at once.
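Since a hyperedge is simply a set of vertices, a hypergraph can be held as a dict mapping hyperedge names to frozensets (a minimal sketch; the edge names are illustrative):

```python
# A hypergraph as {edge name: set of vertices it connects}.
hypergraph = {
    "E1": frozenset({"A", "B", "C", "D"}),  # one hyperedge joining four vertices
    "E2": frozenset({"C", "D"}),            # an ordinary two-vertex edge
}

def edges_containing(vertex):
    """All hyperedges that include the given vertex."""
    return sorted(name for name, vs in hypergraph.items() if vertex in vs)

print(edges_containing("C"))   # ['E1', 'E2']
print(edges_containing("A"))   # ['E1']
```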

13.) Write about Probabilistic Context-Free Grammar.

A. Probabilistic Context-Free Grammar (PCFG) is an extension of CFG (Context-Free Grammar) where:

• Each production rule has an associated probability.

• These probabilities help choose the most likely parse tree when a sentence has multiple
possible meanings.
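In a PCFG, the probability of a parse tree is the product of the probabilities of the rules it uses; the tree with the highest product wins. The rule probabilities below are made up for illustration:

```python
# Rule probabilities (illustrative): note the alternatives for NP and VP
# each sum to 1, as PCFG rules with the same left-hand side must.
RULE_PROB = {
    ("S", ("NP", "VP")): 1.0,
    ("NP", ("DT", "N")): 0.6,
    ("NP", ("N",)): 0.4,
    ("VP", ("V", "NP")): 0.7,
    ("VP", ("V",)): 0.3,
}

def tree_probability(rules_used):
    """Multiply the probabilities of every rule application in the tree."""
    p = 1.0
    for rule in rules_used:
        p *= RULE_PROB[rule]
    return p

# Parse of "the cat sleeps": S -> NP VP, NP -> DT N, VP -> V
parse_rules = [("S", ("NP", "VP")), ("NP", ("DT", "N")), ("VP", ("V",))]
print(tree_probability(parse_rules))   # 1.0 * 0.6 * 0.3 = 0.18
```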

14.) List out Types of Generative models.

• PCFG (Probabilistic Context-Free Grammar)


• Lexicalized PCFG
• Generative Neural Parsers
• Data-Oriented Parsing (DOP)
• Bayesian Generative Models
• Stochastic Tree-Substitution Grammars (STSG)
• Generative Dependency Parsers
• Minimalist Grammars (generative, theoretical)

15.) What are the advantages of Discriminative models for parsing.

• Can use rich and overlapping features (lexical, syntactic, semantic).

• Model the correct parse directly from the input, without having to model how sentences are generated.

• Provide higher parsing accuracy.

UNIT-III

1.) How many types of n-gram models are there? What are they?

Types of N-Gram Model

• Unigram
• Bigram

• Trigram

• Higher-order N-gram Models

2.) What is the purpose of language model evaluation?

It measures:

• The accuracy of word predictions

• The fluency and naturalness of generated text

• How well the model captures language structure and meaning

3.) Define perplexity.

Perplexity is a measurement of how well a language model predicts a sequence of words.


It tells the user how “confused” or “surprised” the model is when it sees the actual text.
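Formally, for a sequence of N words with per-word probabilities p_i, perplexity is PP(W) = P(w1..wN)^(-1/N) = exp(-(1/N) · Σ log p_i). A quick sketch (the probabilities below are illustrative, not from a trained model):

```python
import math

def perplexity(word_probs):
    """PP(W) = exp(-(1/N) * sum(log p_i)); lower means a better model."""
    n = len(word_probs)
    log_sum = sum(math.log(p) for p in word_probs)
    return math.exp(-log_sum / n)

confident = [0.5, 0.5, 0.5]    # model predicts each word fairly well
surprised = [0.1, 0.1, 0.1]    # model is "surprised" by every word

print(perplexity(confident))   # ≈ 2.0
print(perplexity(surprised))   # ≈ 10.0
```

A perplexity of k roughly means the model is as uncertain as if it were choosing uniformly among k words at each step.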

4.) Types of Smoothing techniques.

• Add-One (Laplace) Smoothing


• Add-k Smoothing
• Good-Turing Discounting
• Backoff and Interpolation

5.) Describe the role of smoothing in N-gram models. Why is it necessary?

Answer:
Smoothing helps when some N-grams in the test sentence do not appear in the training corpus,
resulting in zero probabilities.

Example: If "I enjoy mango" never appeared in training, then:


P("mango" | "enjoy") = 0 → Whole sentence probability = 0

Solution:

• Laplace Smoothing: Adds 1 to all counts to avoid zeros.

• Backoff Models: Fall back to smaller N-grams if higher ones are missing.

Smoothing ensures the model assigns non-zero probabilities to unseen sequences.
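Add-one (Laplace) smoothing changes the bigram estimate to (count(w1 w2) + 1) / (count(w1) + V), where V is the vocabulary size. A minimal sketch on a toy corpus (illustrative data):

```python
from collections import Counter

corpus = "i enjoy tea i enjoy coffee".split()
unigrams = Counter(corpus)
bigrams = Counter(zip(corpus, corpus[1:]))
V = len(unigrams)  # vocabulary size

def laplace_prob(w1, w2):
    """Add-one smoothed bigram probability: never zero, even if unseen."""
    return (bigrams[(w1, w2)] + 1) / (unigrams[w1] + V)

print(laplace_prob("enjoy", "tea"))     # seen bigram
print(laplace_prob("enjoy", "mango"))   # unseen bigram, but not zero
```

Without the +1, P("mango" | "enjoy") would be 0 and the whole sentence probability would collapse to 0, which is exactly the failure mode described above.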

6.) What are the limitations of N-gram models and how can they be addressed?

Limitations:

• Data sparsity: Many possible word sequences may not appear in training data.
• Limited context: N-gram models only look at a few previous words.
• High memory: Storing large N-gram tables is resource-heavy.

Solutions:

• Smoothing: Adjusts probabilities of unseen N-grams (e.g., Laplace Smoothing).

• Backoff and Interpolation: Uses lower-order N-grams when higher-order ones are unavailable.
