
Natural Language Processing: Topics, Question Papers, and Important Topics :-

Unit 1: Natural Language Processing (NLP) Basics

 Introduction to NLP:
 Applications of NLP and key challenges.
 Understanding the lexicon and morphology in natural language.

Unit 2: Phrase Structure Grammars and English Syntax

 Phrase Structure Grammars:
 Study of the structure of phrases in natural language.
 Understanding English syntax using grammatical rules.
 Part of Speech Tagging:
 Assigning grammatical categories to words in a sentence.
 Techniques and algorithms for part-of-speech tagging.

Unit 3: Syntactic Parsing

 Syntactic Parsing:
 Analysis of the grammatical structure of sentences.
 Top-down and bottom-up parsing strategies for understanding sentence structure.

Unit 4: Semantics and Discourse Analysis

 Semantics:
 Understanding the meaning of words and sentences.
 Word Sense Disambiguation techniques.
 Semantic parsing for extracting meaning from sentences.
 Subjectivity and sentiment analysis in text data.

Unit 5: Information Extraction and Summarization

 Information Extraction:
 Techniques for extracting structured information from unstructured text.
 Automatic summarization of documents for condensing information.
Unit 6: Information Retrieval and Question Answering

 Information Retrieval:
 Retrieval of relevant documents or passages from a large text corpus.
 Techniques for indexing, ranking, and retrieving information.
 Question Answering:
 Methods for automatically generating answers to user queries based on text data.

Additional Topics:

 Machine Translation:
 Translation of text from one language to another using computational methods.

Question paper :-
a) Explain Natural Language Processing.

Natural Language Processing (NLP) is a field of computer science, artificial intelligence, and
linguistics concerned with the interactions between computers and human (natural) languages. It
involves programming computers to process and analyze large amounts of natural language data.
NLP enables machines to read, understand, and derive meaning from human languages. Key
applications include machine translation, speech recognition, sentiment analysis, and chatbots.

b) List and explain different phases of analysis in Natural Language Processing with an
example for each.

1. Lexical Analysis: This phase involves identifying and analyzing the structure of words.
For example, breaking down "unhappiness" into "un-", "happy", and "-ness".
2. Syntactic Analysis (Parsing): This involves analyzing the grammatical structure of
sentences. For example, parsing "The cat sat on the mat" to identify "The cat" as the noun
phrase and "sat on the mat" as the verb phrase.
3. Semantic Analysis: This phase determines the meaning of words and sentences. For
example, understanding that "bat" can mean both an animal and a piece of sports
equipment.
4. Pragmatic Analysis: This involves understanding the context in which a sentence is
used. For example, "Can you pass the salt?" is understood as a request, not a question
about ability.
5. Discourse Analysis: This looks at the structure of texts and conversations. For example,
understanding reference and coherence in a multi-sentence text.

c) How does stemming affect the performance of IR systems?

Stemming is the process of reducing words to their base or root form. In Information Retrieval
(IR) systems, stemming helps in matching documents with queries by reducing words to a
common form. For example, "running", "runner", and "ran" might all be reduced to "run". This
increases the recall of the system by retrieving documents that contain variations of the query
term, but it may decrease precision as sometimes different words are stemmed to the same root
(e.g., "universe" and "university").

d) Explain different types of inferences.

1. Deductive Inference: Derives specific conclusions from general rules. For example, "All
men are mortal. Socrates is a man. Therefore, Socrates is mortal."
2. Inductive Inference: Generalizes from specific instances to broader generalizations. For
example, "All observed swans are white; therefore, all swans are white."
3. Abductive Inference: Involves reasoning from effects to causes. For example, "The
grass is wet; therefore, it probably rained."

e) Distinguish between semantics, pragmatics, and discourse.

 Semantics: Studies the meaning of words and sentences. For example, understanding that
"bank" can mean the side of a river or a financial institution.
 Pragmatics: Studies how context influences the interpretation of meaning. For example,
understanding that "Can you pass the salt?" is a request, not a question about capability.
 Discourse: Studies how sentences connect and flow in larger texts and conversations. For
example, ensuring coherence and reference across multiple sentences in a paragraph.

f) Explain Sentiment Analysis.

Sentiment Analysis is a subfield of NLP that focuses on determining the emotional tone behind a
body of text. It involves classifying text into categories such as positive, negative, or neutral.
Applications include analyzing customer feedback, monitoring social media for public sentiment,
and improving customer service.

Q 2. Answer any four parts of the following.

5x4=20

a) State the difference between homonymy and polysemy along with examples of each.

 Homonymy: Words that sound alike but have different meanings and origins. Example:
"bat" (a flying mammal) and "bat" (a piece of sports equipment).
 Polysemy: A single word that has multiple related meanings. Example: "bank" can mean
the side of a river and a financial institution, with both meanings related to a place where
something is stored or managed.

b) Explain lexicon.

A lexicon is a database of words and their meanings, along with other information such as
pronunciation, part of speech, and syntactic properties. In NLP, a lexicon serves as a reference
for various linguistic tasks, providing essential data for understanding and processing language.

c) Explain the different parts of speech. Differentiate between open class and closed class of
words.

 Parts of Speech:
o Nouns: Name people, places, things, or ideas (e.g., cat, London).
o Verbs: Describe actions or states (e.g., run, is).
o Adjectives: Describe or modify nouns (e.g., happy, blue).
o Adverbs: Modify verbs, adjectives, or other adverbs (e.g., quickly, very).
o Pronouns: Replace nouns (e.g., he, it).
o Prepositions: Show relationships between nouns and other words (e.g., in, on).
o Conjunctions: Connect words, phrases, or clauses (e.g., and, but).
o Interjections: Express strong emotion (e.g., wow, ouch).
 Open Class Words: Categories that frequently add new words (e.g., nouns, verbs,
adjectives, adverbs).
 Closed Class Words: Categories that rarely change or add new words (e.g., pronouns,
prepositions, conjunctions).

d) Explain phonology.

Phonology is the study of the sound system of a language, including the organization and
patterning of sounds. It examines how sounds function in particular languages, how they interact
with each other, and how they are used to convey meaning.

e) Explain the key issues faced in natural language processing.

1. Ambiguity: Words and sentences can have multiple meanings.
2. Complexity of Human Language: Variability in syntax, semantics, and pragmatics.
3. Context Understanding: The need to understand context to interpret meaning correctly.
4. Domain-Specific Language: Language varies across different domains (e.g., medical vs.
legal).
5. Resource Limitations: Lack of annotated data for training NLP models.

f) Explain Phrase structure grammar.


Phrase structure grammar (PSG) is a type of generative grammar that represents the syntactic
structure of sentences using hierarchically nested phrases. PSG consists of rules that define how
words and phrases combine to form sentences. For example:

 S → NP VP
 NP → Det N
 VP → V NP

Given a suitable lexicon, these rules can generate sentences such as "The cat chased the mouse" by combining words and phrases according to the rules.
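A hedged sketch of how such rules generate sentences; the small lexicon here is an assumed toy addition, not part of the grammar given above:

```python
import itertools

# The toy PSG from the text, plus an assumed mini-lexicon.
rules = {
    "S":   [["NP", "VP"]],
    "NP":  [["Det", "N"]],
    "VP":  [["V", "NP"]],
    "Det": [["the"]],
    "N":   [["cat"], ["mouse"]],
    "V":   [["chased"]],
}

def expand(symbol):
    """Yield every terminal string derivable from `symbol`."""
    if symbol not in rules:            # terminal word
        yield [symbol]
        return
    for production in rules[symbol]:
        for parts in itertools.product(*(expand(s) for s in production)):
            yield [w for part in parts for w in part]

sentences = {" ".join(words) for words in expand("S")}
```

With two nouns and one verb, the grammar generates exactly four sentences, including "the cat chased the mouse".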

Q 3. Answer any two parts of the following.

10x2=20

a) Explain in detail Automatic summarization.

Automatic summarization is the process of shortening a set of data computationally to create a summary that retains the most important points. There are two main types:

 Extractive Summarization: Selects key sentences or phrases directly from the original
text to form a summary.
 Abstractive Summarization: Generates new sentences that convey the main points of
the original text, often rephrasing the information.

The process involves several steps:

1. Text Preprocessing: Tokenization, removing stop words, stemming, etc.
2. Feature Extraction: Identifying features such as term frequency, sentence position, and
similarity to the title.
3. Scoring and Selection: Assigning scores to sentences based on extracted features and
selecting the top-scoring sentences.
4. Output Generation: Combining selected sentences to form the summary, ensuring
coherence and readability.
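The scoring-and-selection steps can be sketched with a simple frequency-based extractive summarizer; this is a toy illustration of the idea, not a production method:

```python
import re
from collections import Counter

def extractive_summary(text, n=1):
    """Score each sentence by the average frequency of its words; keep top n."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    freq = Counter(re.findall(r"[a-z']+", text.lower()))

    def score(sentence):
        toks = re.findall(r"[a-z']+", sentence.lower())
        return sum(freq[t] for t in toks) / max(len(toks), 1)

    ranked = sorted(sentences, key=score, reverse=True)[:n]
    # Re-emit selected sentences in their original order for coherence.
    return " ".join(s for s in sentences if s in ranked)
```

Sentences containing frequent corpus terms score highest and survive into the summary; real systems add features such as sentence position and title similarity, as listed above.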

b) What are feature sets? How are they represented?

Feature sets are collections of measurable properties or characteristics used to represent data for
machine learning tasks. In NLP, features might include word frequencies, part-of-speech tags,
syntactic dependencies, and more. They are represented as vectors, where each element
corresponds to a specific feature.

For example, a feature vector for text classification might look like:

[term1_frequency, term2_frequency, ..., termN_frequency, avg_sentence_length, num_of_nouns, ...]
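Such a vector can be assembled directly. The sketch below uses an assumed three-word vocabulary and a placeholder noun heuristic in place of a real POS tagger:

```python
from collections import Counter

VOCAB = ["nlp", "parsing", "semantics"]       # assumed tiny vocabulary

def feature_vector(text):
    """[term frequencies..., avg sentence length, crude noun count]."""
    tokens = text.lower().replace(".", " .").split()
    words = [t for t in tokens if t != "."]
    counts = Counter(words)
    sentences = max(text.count("."), 1)
    avg_len = len(words) / sentences
    nouns = sum(1 for t in words if t in VOCAB)   # placeholder noun heuristic
    return [counts[t] for t in VOCAB] + [avg_len, nouns]
```

Each position in the returned list corresponds to one named feature, exactly as in the schematic vector above.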
c) Explain in detail the application of Natural Language Processing.

NLP has a wide range of applications, including but not limited to:

1. Machine Translation: Automatically translating text from one language to another, such
as Google Translate.
2. Speech Recognition: Converting spoken language into text, as used in virtual assistants
like Siri and Alexa.
3. Text Summarization: Producing concise summaries of larger texts, useful for news
aggregation and academic research.
4. Sentiment Analysis: Determining the sentiment expressed in a text, commonly used in
social media monitoring and customer feedback analysis.
5. Chatbots: Enabling conversational agents to interact with users, used in customer service
and support.
6. Information Retrieval: Improving search engines by understanding and retrieving
relevant documents based on user queries.
7. Named Entity Recognition: Identifying and classifying entities (e.g., people,
organizations, locations) in text.
8. Part-of-Speech Tagging: Assigning parts of speech to each word in a text, aiding in
syntactic parsing and text analysis.

Q 4. Answer any two parts of the following.

10x2=20

a) What is Parsing? For the given CFG, illustrate the steps to draw the Top-down parse
tree for the sentence: "The large can can hold the water."

Parsing is the process of analyzing the syntactic structure of a sentence according to a given
grammar. It involves breaking down the sentence into its constituent parts and identifying their
grammatical relationships.

For the sentence "The large can can hold the water," using the given CFG:

 CFG:
o S → NP VP
o DT → the
o NP → DT ADJ N
o ADJ → large
o NP → DT N
o N → can | hold | water
o V → hold
o VP → Aux VP
o Aux → can
o VP → V NP
Top-down Parsing Steps:

1. Start with the start symbol: S
2. S → NP VP
3. NP → DT ADJ N
4. DT → the
5. ADJ → large
6. N → can
7. VP → Aux VP
8. Aux → can
9. VP → V NP
10. V → hold
11. NP → DT N
12. DT → the
13. N → water

Parse Tree:

S
├── NP
│   ├── DT → the
│   ├── ADJ → large
│   └── N → can
└── VP
    ├── Aux → can
    └── VP
        ├── V → hold
        └── NP
            ├── DT → the
            └── N → water
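The derivation above can be executed as a small top-down, backtracking recognizer. This is a hypothetical sketch, not part of the question paper; it encodes the given CFG and accepts the example sentence:

```python
GRAMMAR = {
    "S":   [["NP", "VP"]],
    "NP":  [["DT", "ADJ", "N"], ["DT", "N"]],
    "VP":  [["Aux", "VP"], ["V", "NP"]],
    "DT":  [["the"]],
    "ADJ": [["large"]],
    "N":   [["can"], ["hold"], ["water"]],
    "V":   [["hold"]],
    "Aux": [["can"]],
}

def parse(symbol, tokens, pos):
    """Yield every end position reachable by deriving `symbol` from tokens[pos:]."""
    if symbol not in GRAMMAR:                     # terminal word
        if pos < len(tokens) and tokens[pos] == symbol:
            yield pos + 1
        return
    for production in GRAMMAR[symbol]:            # try each rule, backtracking
        ends = [pos]
        for part in production:
            ends = [e2 for e in ends for e2 in parse(part, tokens, e)]
        yield from ends

def accepts(sentence):
    tokens = sentence.lower().split()
    return len(tokens) in parse("S", tokens, 0)
```

Trying NP → DT ADJ N before NP → DT N mirrors the step order listed above; when a rule fails to consume the input, the generator simply falls through to the next alternative.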

b) Explain Machine Translation.

Machine Translation (MT) is the process of automatically translating text from one language to
another using computational methods. There are several approaches to MT:

 Rule-based MT: Uses linguistic rules and dictionaries to translate text.
 Statistical MT: Uses statistical models based on bilingual text corpora to predict
translations.
 Neural MT: Uses neural networks, particularly sequence-to-sequence models, to learn
and generate translations. This approach has significantly improved translation quality.

Key components of MT systems include:

1. Preprocessing: Tokenization, normalization, and part-of-speech tagging.
2. Translation Model: Determines the best translation for each input segment.
3. Language Model: Ensures grammaticality and fluency in the target language.
4. Postprocessing: Detokenization and formatting to produce the final translated text.
c) Explain the Earley Algorithm.

The Earley Algorithm is a dynamic programming algorithm for parsing sentences in context-free
grammars (CFGs). It can parse all CFGs, including ambiguous and left-recursive grammars. The
algorithm consists of three main steps: prediction, scanning, and completion.

Steps:

1. Prediction: Adds new states based on the grammar rules. If a state predicts a non-
terminal symbol, new states are added for each production of that non-terminal.
2. Scanning: Reads the next input symbol and adds new states for matching terminals.
3. Completion: When a state is complete (i.e., all symbols on the right-hand side of the
production have been parsed), it finds and completes previous states that were waiting for
this non-terminal.

The algorithm uses an Earley table with entries corresponding to input positions, where each
entry contains states representing partial parses.
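As a rough illustration, the three operations can be sketched as a compact recognizer. The grammar and lexicon below are assumed toy examples, and the chart is a list of state sets rather than a full parse-forest table:

```python
GRAMMAR = {
    "S":  [("NP", "VP")],
    "NP": [("DT", "N")],
    "VP": [("V", "NP")],
}
LEXICON = {"the": "DT", "cat": "N", "mat": "N", "saw": "V"}

def earley(tokens, start="S"):
    """Earley recognizer; chart[i] holds states (lhs, rhs, dot, origin)."""
    chart = [set() for _ in range(len(tokens) + 1)]
    chart[0] = {(start, rhs, 0, 0) for rhs in GRAMMAR[start]}
    for i in range(len(tokens) + 1):
        agenda = list(chart[i])
        while agenda:
            lhs, rhs, dot, origin = agenda.pop()
            if dot < len(rhs):
                nxt = rhs[dot]
                if nxt in GRAMMAR:                        # prediction
                    for prod in GRAMMAR[nxt]:
                        new = (nxt, prod, 0, i)
                        if new not in chart[i]:
                            chart[i].add(new)
                            agenda.append(new)
                elif i < len(tokens) and LEXICON.get(tokens[i]) == nxt:
                    chart[i + 1].add((lhs, rhs, dot + 1, origin))   # scanning
            else:                                         # completion
                for l2, r2, d2, o2 in list(chart[origin]):
                    if d2 < len(r2) and r2[d2] == lhs:
                        new = (l2, r2, d2 + 1, o2)
                        if new not in chart[i]:
                            chart[i].add(new)
                            agenda.append(new)
    return any(dot == len(rhs) and lhs == start and origin == 0
               for lhs, rhs, dot, origin in chart[len(tokens)])
```

A completed start-symbol state spanning the whole input in the final chart entry means the sentence is grammatical; a full parser would additionally store back-pointers to recover the tree.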

Q 5. Answer any two parts of the following.

10x2=20

a) Explain in detail the POS tagging with examples of each.

Part-of-Speech (POS) tagging involves assigning a part of speech to each word in a sentence,
such as noun, verb, adjective, etc. It is crucial for understanding the syntactic structure of
sentences.

Examples of POS Tagging:

1. He/PRP is/VBZ running/VBG fast/RB.
2. The/DT quick/JJ brown/JJ fox/NN jumps/VBZ over/IN the/DT lazy/JJ dog/NN.

Approaches to POS Tagging:

 Rule-based Tagging: Uses hand-written rules to identify the POS tags. Example: If a
word ends in "ing," tag it as a verb (VBG).
 Statistical Tagging: Uses machine learning models trained on annotated corpora to
predict POS tags. Example: Hidden Markov Models (HMM), Conditional Random Fields
(CRF).
 Neural Tagging: Uses neural networks, such as recurrent neural networks (RNNs) or
transformers, to predict POS tags based on the context provided by surrounding words.
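A toy rule-based tagger in the spirit of the first approach might look like the following sketch; the mini-lexicon and suffix rules are illustrative assumptions, not a complete tag set:

```python
LEXICON = {"the": "DT", "he": "PRP", "is": "VBZ", "over": "IN"}

def rule_based_tag(tokens):
    """Tag by dictionary lookup first, then fall back to suffix rules."""
    tags = []
    for word in tokens:
        w = word.lower()
        if w in LEXICON:
            tags.append(LEXICON[w])
        elif w.endswith("ing"):
            tags.append("VBG")      # gerund / present participle
        elif w.endswith("ly"):
            tags.append("RB")       # adverb
        elif w.endswith("s"):
            tags.append("VBZ")      # crude: 3rd-person verbs like "jumps"
        else:
            tags.append("NN")       # default to noun
    return list(zip(tokens, tags))
```

Hand-written rules like these are transparent but brittle ("news" would be mis-tagged VBZ), which is exactly why statistical and neural taggers dominate in practice.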

b) Explain Word Sense Disambiguation.


Word Sense Disambiguation (WSD) is the process of identifying which sense of a word is used
in a given context. For example, in the sentence "The bank can hold the water," the word "bank"
most likely means the side of a river rather than a financial institution.

Approaches to WSD:

 Knowledge-based Methods: Use dictionaries, thesauri, and lexical databases like WordNet to determine word senses.
 Supervised Learning: Uses annotated corpora to train machine learning models that can
disambiguate words based on context features.
 Unsupervised Learning: Clusters word contexts and assigns senses based on the
similarity of contexts.
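The simplified Lesk algorithm is one classic knowledge-based method: pick the sense whose dictionary gloss shares the most words with the context. The sense inventory below is a hypothetical stand-in for WordNet glosses:

```python
GLOSSES = {   # hypothetical mini sense inventory for "bank"
    "river_bank": "sloping land beside a body of water",
    "financial_bank": "institution that accepts deposits and lends money",
}

def simplified_lesk(context, sense_glosses):
    """Return the sense whose gloss overlaps most with the context words."""
    context_words = set(context.lower().split())

    def overlap(gloss):
        return len(set(gloss.lower().split()) & context_words)

    return max(sense_glosses, key=lambda sense: overlap(sense_glosses[sense]))
```

For the context "the bank can hold the water", the shared word "water" tips the decision toward the river-bank sense.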

Question paper 2:-


a) Discuss the importance of data pre-processing in NLP.

Data pre-processing is crucial in NLP as it prepares raw text data for further analysis and
modeling, ensuring that the data is clean, consistent, and structured. Key benefits include:

 Improved Accuracy: Cleaning and normalizing text data helps in reducing noise and
variability, leading to better model performance.
 Reduced Complexity: Simplifying text data by removing irrelevant parts (e.g.,
stopwords) reduces dimensionality and computational load.
 Enhanced Interpretability: Pre-processing steps like tokenization and lemmatization
make the data more understandable for both humans and machines.
 Consistency: Standardizing text data (e.g., lowercasing) ensures uniformity across the
dataset, crucial for reliable analysis.

b) What are the main types of phrases and what are their roles in forming sentences?

The main types of phrases in English grammar include:

 Noun Phrase (NP): Contains a noun and its modifiers (e.g., "the big dog"). It functions
as a subject, object, or complement.
 Verb Phrase (VP): Contains a main verb and its auxiliaries, objects, or complements
(e.g., "is running quickly"). It expresses action or state.
 Adjective Phrase (AdjP): Contains an adjective and its modifiers (e.g., "very tall"). It
describes nouns or pronouns.
 Adverb Phrase (AdvP): Contains an adverb and its modifiers (e.g., "quite slowly"). It
modifies verbs, adjectives, or other adverbs.
 Prepositional Phrase (PP): Contains a preposition and its object (e.g., "in the park"). It
functions as an adjective or adverb, providing additional information.
These phrases are building blocks of sentences, each contributing to the overall syntactic and
semantic structure.

c) Explain the concept of context-free grammars and how they relate to phrase structure
grammars.

A Context-Free Grammar (CFG) is a type of formal grammar used to define the syntactic
structure of languages. CFGs consist of a set of production rules that specify how symbols (non-
terminals) can be expanded into sequences of other symbols (terminals and non-terminals). Each
rule takes the form: A→αA \rightarrow \alphaA→α, where AAA is a non-terminal and α\alphaα
is a sequence of terminals and non-terminals.

Phrase Structure Grammars (PSG) are a type of CFG specifically used to describe the
hierarchical structure of phrases in a language. PSGs define how words and phrases combine to
form larger syntactic units (e.g., sentences), emphasizing the nested structure of language.

d) What is syntactic parsing and why is it important in natural language processing?

Syntactic parsing is the process of analyzing the grammatical structure of a sentence to identify
its constituent parts and their relationships. It involves breaking down a sentence into its parts of
speech (e.g., nouns, verbs) and determining how these parts are connected (e.g., subject, object).

Importance in NLP:

 Understanding Structure: Parsing helps in understanding the syntactic structure of sentences, which is essential for tasks like translation, summarization, and information extraction.
 Disambiguation: Helps resolve ambiguities by clarifying grammatical relationships,
improving the accuracy of subsequent NLP tasks.
 Foundation for Semantics: Provides a basis for semantic analysis, enabling deeper
understanding of sentence meaning.

e) Discuss briefly the concept of semantic parsing and its relationship to natural language
processing.

Semantic parsing involves converting natural language into a structured representation that
captures the meaning of the text. This structured representation can be in the form of logical
forms, semantic graphs, or other formal structures that facilitate understanding and manipulation
by machines.

Relationship to NLP:

 Core Task: Essential for applications like question answering, machine translation, and
dialogue systems, where understanding meaning is crucial.
 Integration: Builds on syntactic parsing by adding layers of meaning, enabling more
sophisticated text analysis and interpretation.
 Applications: Used in tasks requiring precise understanding of user intents and the
relationships between different entities in the text.

Q 2. Answer any four parts of the following.

5x4=20

a) What are the main techniques used for tokenization, stemming, and stopword removal
in NLP?

1. Tokenization:
o Whitespace Tokenization: Splits text based on spaces.
o Punctuation-based Tokenization: Uses punctuation marks to define token
boundaries.
o Regex Tokenization: Employs regular expressions to identify tokens based on
patterns.
2. Stemming:
o Porter Stemmer: Uses a series of rules to iteratively remove suffixes.
o Lancaster Stemmer: A more aggressive version of the Porter Stemmer.
o Snowball Stemmer: An improved version of the Porter Stemmer, offering better
performance.
3. Stopword Removal:
o Predefined Lists: Uses standard lists of stopwords (e.g., "the", "is", "in").
o Frequency-based Methods: Removes the most frequent words in a corpus.
o Customized Lists: Tailors stopwords based on specific application needs.
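A minimal pipeline combining regex tokenization, lowercasing, and predefined-list stopword removal might look like the following sketch; the stopword list is an assumed toy example:

```python
import re

STOPWORDS = {"the", "is", "in", "a", "of"}        # tiny predefined list

def preprocess(text):
    """Regex tokenization with lowercasing, then stopword removal."""
    tokens = re.findall(r"[A-Za-z]+", text.lower())
    return [t for t in tokens if t not in STOPWORDS]
```

The regex keeps only alphabetic runs, so punctuation is discarded during tokenization rather than in a separate pass.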

b) Describe various techniques used for text classification in NLP.

1. Naive Bayes Classifier: Uses probability theory to classify text based on the frequency
of words.
2. Support Vector Machines (SVM): Finds the optimal hyperplane to separate different
classes in a high-dimensional space.
3. Decision Trees: Classifies text by making a series of decisions based on feature values.
4. Neural Networks: Uses layers of interconnected nodes to learn complex patterns in text
data.
5. Ensemble Methods: Combines multiple classifiers (e.g., Random Forests) to improve
accuracy.
6. Deep Learning Models: Includes architectures like Convolutional Neural Networks
(CNNs) and Recurrent Neural Networks (RNNs) for handling text classification tasks.

c) Describe the main components of an English sentence according to phrase structure grammar.

1. Noun Phrase (NP): Functions as the subject or object (e.g., "The quick brown fox").
2. Verb Phrase (VP): Contains the main verb and its objects or complements (e.g., "jumps
over the lazy dog").
3. Prepositional Phrase (PP): Provides additional context or information (e.g., "over the
lazy dog").
4. Adjective Phrase (AdjP): Modifies nouns (e.g., "very quick").
5. Adverb Phrase (AdvP): Modifies verbs, adjectives, or other adverbs (e.g., "extremely
quickly").

These components are structured hierarchically to form complete sentences.

d) Describe the top-down parsing strategy in syntactic parsing.

Top-down parsing starts from the highest-level rule and recursively breaks it down into its
constituent parts until reaching the terminal symbols (words) of the sentence.

Steps:

1. Start with the start symbol (e.g., S for sentence).
2. Expand the non-terminal using production rules.
3. Match the terminals with the input sentence.
4. Backtrack if a rule does not match, trying alternative rules.

Advantages:

 Simple and intuitive.
 Easy to implement with recursive methods.

Disadvantages:

 Inefficient for left-recursive grammars.
 Can lead to significant backtracking.

e) What are some common approaches to semantic parsing?

1. Grammar-based Approaches: Use predefined rules and grammars to convert text into
logical forms.
2. Machine Learning-based Approaches: Train models on annotated datasets to learn
mappings from text to semantic representations.
3. Neural Network-based Approaches: Utilize deep learning models, such as sequence-to-
sequence architectures, to generate semantic parses directly from text.
4. Hybrid Approaches: Combine rule-based and statistical methods to leverage the
strengths of both.

f) What is information retrieval, and how does it differ from traditional database systems?

Information Retrieval (IR) involves finding relevant documents or information within large
datasets based on user queries. It focuses on unstructured data (e.g., text, multimedia).

Differences from Traditional Database Systems:

 Data Type: IR deals with unstructured or semi-structured data, while databases handle
structured data.
 Querying: IR uses keyword-based or natural language queries, whereas databases use
structured query languages like SQL.
 Indexing and Searching: IR uses inverted indexes and relevance scoring, while
databases rely on primary and secondary indexes for exact matches.
 Flexibility: IR systems handle ambiguity and partial matches better, providing ranked
results based on relevance.
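As a minimal sketch of the indexing-and-ranking difference (assuming whitespace tokenization and raw term-frequency scoring rather than a real scheme like TF-IDF or BM25), an inverted index might look like:

```python
from collections import defaultdict, Counter

def build_index(docs):
    """Inverted index: term -> {doc_id: term frequency}."""
    index = defaultdict(Counter)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term][doc_id] += 1
    return index

def search(index, query):
    """Rank documents by total frequency of matched query terms."""
    scores = Counter()
    for term in query.lower().split():
        for doc_id, tf in index.get(term, {}).items():
            scores[doc_id] += tf
    return [doc_id for doc_id, _ in scores.most_common()]
```

Unlike a SQL query, a partial match still returns ranked results: a document matching only some query terms simply scores lower.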

Q 3. Answer any two parts of the following.

10x2= 20

a) What is Natural Language Processing and what are its main applications? Explain the
difference between rule-based and statistical approaches to NLP.

Natural Language Processing (NLP) is the field of AI that focuses on the interaction between
computers and humans through natural language. It involves processing and analyzing large
amounts of natural language data to enable machines to understand, interpret, and generate
human language.

Main Applications:

 Machine Translation: Automatically translating text from one language to another.
 Speech Recognition: Converting spoken language into text.
 Text Summarization: Creating concise summaries of longer texts.
 Sentiment Analysis: Determining the sentiment expressed in text (e.g., positive,
negative).
 Chatbots: Automated conversational agents for customer service.
 Information Retrieval: Finding relevant documents based on user queries.
 Named Entity Recognition: Identifying and classifying entities such as people, organizations, and locations in text.

Rule-based approaches rely on hand-crafted linguistic rules and dictionaries; they are transparent and predictable but expensive to build and brittle outside their intended domain. Statistical approaches instead learn patterns from large annotated corpora, which makes them more robust to variation and easier to scale, at the cost of requiring training data and being less interpretable.
Model papers (all questions carry 10 marks):-

Model Paper 1
Unit 1: Natural Language Processing: applications and key issues, The lexicon and
morphology

1. Explain the various applications of Natural Language Processing (NLP) in today's technology landscape. (10 marks)
2. Discuss the key challenges faced in NLP. How can these challenges be mitigated? (10
marks)
3. Define morphology in the context of NLP. Differentiate between inflectional and
derivational morphology with examples. (10 marks)

Unit 2: Phrase structure grammars and English syntax, Part of speech tagging

4. Describe the role of phrase structure grammars in syntactic analysis. Provide an example to illustrate your explanation. (10 marks)
5. What is part of speech tagging? Explain any one algorithm used for part of speech
tagging. (10 marks)

Unit 3: Syntactic parsing, top-down and bottom-up parsing strategies

6. Compare and contrast top-down and bottom-up parsing strategies. (10 marks)

Model Paper 2

Unit 4: Semantics, Word Sense Disambiguation, Semantic parsing, Subjectivity and sentiment analysis

1. What is semantics in NLP? Discuss the importance of word sense disambiguation in semantic analysis. (10 marks)
2. Explain the process of semantic parsing with an example. (10 marks)
3. How is subjectivity and sentiment analysis performed in NLP? Discuss its
applications. (10 marks)

Unit 5: Information extraction, Automatic summarization

4. Define information extraction in NLP. Describe the main techniques used for
information extraction. (10 marks)
5. What is automatic summarization? Differentiate between extractive and abstractive
summarization. (10 marks)

Unit 6: Information retrieval and Question answering, Machine translation

6. Explain the basic concepts of information retrieval. How does it differ from
information extraction? (10 marks)

Model Paper 3
Unit 1: Natural Language Processing: applications and key issues, The lexicon and
morphology

1. Outline the historical evolution of NLP and its key milestones. (10 marks)
2. Describe the structure and function of a lexicon in NLP. (10 marks)
3. Explain the concept of morphological analysis with suitable examples. (10 marks)

Unit 2: Phrase structure grammars and English syntax, Part of speech tagging

4. Discuss the significance of English syntax in NLP. Provide examples of different syntactic structures. (10 marks)
5. Evaluate the performance of Hidden Markov Model (HMM) in part of speech
tagging. (10 marks)

Unit 3: Syntactic parsing, top-down and bottom-up parsing strategies

6. Describe the algorithm for a top-down parsing strategy. Provide an example. (10
marks)

Model Paper 4

Unit 4: Semantics, Word Sense Disambiguation, Semantic parsing, Subjectivity and sentiment analysis

1. Discuss different approaches to word sense disambiguation. (10 marks)
2. Explain the role of semantic parsing in understanding natural language. (10 marks)
3. Describe techniques used for sentiment analysis in social media. (10 marks)

Unit 5: Information extraction, Automatic summarization

4. Illustrate the process of named entity recognition in information extraction. (10 marks)
5. Discuss the challenges involved in automatic summarization. (10 marks)

Unit 6: Information retrieval and Question answering, Machine translation

6. What are the main components of a question answering system? How does it work?
(10 marks)

Model Paper 5

Unit 1: Natural Language Processing: applications and key issues, The lexicon and
morphology

1. Highlight the current trends and future directions in NLP. (10 marks)
2. Discuss the role of morphology in text normalization. (10 marks)
3. Explain the concept of a morphological analyzer with an example. (10 marks)

Unit 2: Phrase structure grammars and English syntax, Part of speech tagging

4. What are context-free grammars? Explain their relevance in NLP. (10 marks)
5. Describe the process of developing a part of speech tagger. (10 marks)

Unit 3: Syntactic parsing, top-down and bottom-up parsing strategies

6. Compare Earley parser and CYK parser. (10 marks)

Model Paper 6

Unit 4: Semantics, Word Sense Disambiguation, Semantic parsing, Subjectivity and sentiment analysis

1. Explain the principle of compositional semantics. (10 marks)
2. How is semantic parsing different from syntactic parsing? (10 marks)
3. Describe a real-world application of sentiment analysis. (10 marks)

Unit 5: Information extraction, Automatic summarization

4. Discuss the role of machine learning in information extraction. (10 marks)
5. Evaluate the performance metrics used in automatic summarization. (10 marks)

Unit 6: Information retrieval and Question answering, Machine translation

6. Outline the challenges faced in machine translation. (10 marks)

Model Paper 7

Unit 1: Natural Language Processing: applications and key issues, The lexicon and
morphology

1. Discuss the role of NLP in the development of intelligent personal assistants. (10
marks)
2. Explain the importance of lexicons in machine translation systems. (10 marks)
3. Describe different types of morphemes with examples. (10 marks)

Unit 2: Phrase structure grammars and English syntax, Part of speech tagging

4. Illustrate the use of dependency grammars in syntactic analysis. (10 marks)
5. How does a rule-based part of speech tagger work? (10 marks)

Unit 3: Syntactic parsing, top-down and bottom-up parsing strategies

6. Explain shift-reduce parsing strategy with an example. (10 marks)

Model Paper 8

Unit 4: Semantics, Word Sense Disambiguation, Semantic parsing, Subjectivity and sentiment analysis

1. Discuss the significance of lexical semantics in NLP. (10 marks)
2. What are the main challenges in word sense disambiguation? (10 marks)
3. Describe the methods used for detecting subjectivity in text. (10 marks)

Unit 5: Information extraction, Automatic summarization

4. Explain the process of relation extraction in information extraction. (10 marks)
5. Discuss the role of deep learning in automatic summarization. (10 marks)

Unit 6: Information retrieval and Question answering, Machine translation

6. How does an information retrieval system rank documents? (10 marks)

Model Paper 9

Unit 1: Natural Language Processing: applications and key issues, The lexicon and
morphology

1. Describe the applications of NLP in healthcare. (10 marks)
2. What is the role of lexicons in sentiment analysis? (10 marks)
3. Explain the difference between compounding and affixation in morphology. (10
marks)

Unit 2: Phrase structure grammars and English syntax, Part of speech tagging

4. How do phrase structure grammars help in natural language understanding? (10 marks)
5. Discuss the limitations of statistical part of speech tagging. (10 marks)

Unit 3: Syntactic parsing, top-down and bottom-up parsing strategies

6. Provide an example of a bottom-up parsing strategy and explain its steps. (10 marks)

Model Paper 10

Unit 4: Semantics, Word Sense Disambiguation, Semantic parsing, Subjectivity and sentiment analysis

1. Describe the process of creating a semantic network for NLP. (10 marks)
2. What techniques are used for automatic word sense disambiguation? (10 marks)
3. Explain the concept of sentiment polarity and its determination. (10 marks)

Unit 5: Information extraction, Automatic summarization

4. Illustrate the challenges in extracting structured information from unstructured text. (10 marks)
5. Discuss the evaluation methods for summarization algorithms. (10 marks)

Unit 6: Information retrieval and Question answering, Machine translation

6. Explain the architecture of a machine translation system. (10 marks)
