Lecture 2.1 - L1 Token, Pattern and Lexemes

The lexical analyzer's primary role is to scan source programs and break them into tokens, while also removing comments, converting cases, and eliminating whitespace. It differentiates between tokens, lexemes, and patterns, and handles lexical errors through various strategies. Error recovery strategies include panic mode, statement mode, error productions, and global correction, each with its own approach to managing errors during parsing.

Uploaded by

shahinsimo6242s

The Role of the Lexical Analyzer

• Roles
  - Primary role: scan a source program (a string) and break it up into small, meaningful units, called tokens.
  - Example: position := initial + rate * 60;
  - Transform the input into meaningful units: identifiers, constants, operators, and punctuation.

• Other roles:
  - Removal of comments
  - Case conversion
  - Removal of white space

Why separate the lexical analyzer (LA) from the parser?

• Simpler design of both the LA and the parser
• More efficient compiler
• More portable compiler
Tokens
• Examples of tokens
  - Operators:          = + - > ( { := == <>
  - Keywords:           if while for int double
  - Numeric literals:   43 6.035 -3.6e10 0x13F3A
  - Character literals: 'a' '~' '\''
  - String literals:    "3.142" "aBcDe" "\""
• Examples of non-tokens
  - White space: space (' '), tab ('\t'), end of line ('\n')
  - Comments:    /* this is not a token */
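Interaction of Lexical Analyzer and Parser
[Figure: the source program is read by the lexical analyzer. The parser requests tokens by calling Nexttoken(), and the lexical analyzer returns the next token. Both are connected to the symbol table.]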
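The interaction above can be sketched in a few lines of Python. This is a minimal illustration, not the lecture's implementation: the class names Token and Lexer, the tiny token specification, and the drain function standing in for the parser are all assumptions made for the example. The method next_token plays the role of Nexttoken(): the lexer does no work until the parser asks for a token, and identifiers are recorded in a symbol table as they are seen.

```python
import re
from typing import NamedTuple, Optional


class Token(NamedTuple):
    kind: str    # token class, e.g. "id" or "num"
    lexeme: str  # the matched source text


# Hypothetical, deliberately tiny token specification.
TOKEN_RE = re.compile(r"\s*(?:(?P<id>[A-Za-z_]\w*)|(?P<num>\d+)|(?P<op>:=|[+*;]))")


class Lexer:
    def __init__(self, source):
        self.source = source
        self.pos = 0
        self.symbol_table = {}  # identifier lexeme -> entry index

    def next_token(self) -> Optional[Token]:
        """Return the next token, or None once the input is exhausted."""
        if not self.source[self.pos:].strip():
            return None
        match = TOKEN_RE.match(self.source, self.pos)
        if match is None:
            raise SyntaxError(f"unexpected character at offset {self.pos}")
        self.pos = match.end()
        kind = match.lastgroup
        lexeme = match.group(kind)
        if kind == "id":
            # Record each identifier the first time it is seen.
            self.symbol_table.setdefault(lexeme, len(self.symbol_table))
        return Token(kind, lexeme)


def drain(lexer):
    """Stand-in for a parser: it simply pulls tokens on demand."""
    tokens = []
    while (tok := lexer.next_token()) is not None:
        tokens.append(tok)
    return tokens


lexer = Lexer("position := initial + rate * 60;")
for tok in drain(lexer):
    print(tok.kind, tok.lexeme)
print(lexer.symbol_table)
```

Pulling tokens one at a time keeps the two components decoupled: the parser never sees characters, and the lexer never needs to know the grammar.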
How it works
• The lexical analyzer performs certain other tasks besides identifying tokens. One such task is stripping out comments and whitespace (blank, newline, tab, and perhaps other characters that are used to separate tokens in the input).

Sometimes, lexical analyzers are divided into two processes:

a) Scanning consists of the simple processes that do not require tokenization of the input, such as deletion of comments and compaction of consecutive whitespace characters into one.

b) Lexical analysis proper is the more complex portion, where the scanner produces the sequence of tokens as output.
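The two processes can be sketched as two small Python functions. This is a simplified illustration under stated assumptions: the function names scan and tokenize are made up for the example, comments are assumed to use the /* ... */ form shown earlier, and string literals (which could legitimately contain comment markers or runs of spaces) are ignored.

```python
import re
from typing import List


def scan(source: str) -> str:
    """Process (a): delete /* ... */ comments and compact whitespace runs."""
    without_comments = re.sub(r"/\*.*?\*/", " ", source, flags=re.DOTALL)
    return re.sub(r"\s+", " ", without_comments).strip()


def tokenize(cleaned: str) -> List[str]:
    """Process (b): split the cleaned text into a sequence of lexemes."""
    return re.findall(r"[A-Za-z_]\w*|\d+|:=|[^\sA-Za-z_0-9]", cleaned)


raw = "position := /* start value */ initial\n\t+ rate   * 60;"
cleaned = scan(raw)
print(cleaned)
print(tokenize(cleaned))
```

Keeping the first process free of any token knowledge is what makes it simple: it only needs to recognize comment delimiters and whitespace.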
• Types of tokens in C++, illustrated with the program below:
  - Constants:
    - char constants: 'a'
    - string constants: "i = %d"
    - int constants: 50
    - floating-point constants
  - Identifiers: main, i, j, counter, …
  - Reserved words: int, for, …
  - Operators: +, =, ++, /, …
  - Misc. symbols: (, ), {, }, …

    #include <cstdio>

    int main() {
        int i, j;
        for (i = 0; i < 50; i++) {
            printf("i = %d", i);
        }
    }
Tokens, Patterns, and Lexemes
• Token: a class of program entities that the parser treats alike.
  - The previous example has four kinds of tokens: identifiers, operators, constants, and punctuation.

• Lexeme: a specific instance of a token, that is, the actual character sequence in the source program. For instance, both position and initial belong to the identifier class, but each is a different lexeme.

• Pattern: a rule describing the form that the lexemes of a token may take. For instance, the pattern for an identifier is typically a letter followed by letters and digits.
Example (contd.)
printf("Total = %d\n", score);
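The relationship between the three terms can be made concrete with a short Python sketch. The token names and regular expressions below are assumptions chosen for the two example statements, not a full C or Pascal specification. Each entry pairs a token (the class name) with its pattern (a regular expression), and the function returns the lexemes it finds together with their tokens.

```python
import re

# Each token (class name) is paired with its pattern (a regular expression).
# The names and patterns are hypothetical and cover only the examples here.
PATTERNS = [
    ("string", r'"(?:\\.|[^"\\])*"'),
    ("num", r"\d+"),
    ("id", r"[A-Za-z_]\w*"),
    ("op", r":=|[+\-*/=]"),
    ("punct", r"[();,]"),
    ("ws", r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in PATTERNS))


def lex(source):
    """Return (token, lexeme) pairs, skipping whitespace."""
    pairs = []
    pos = 0
    while pos < len(source):
        match = MASTER.match(source, pos)
        if match is None:
            raise ValueError(f"no pattern matches at offset {pos}")
        if match.lastgroup != "ws":
            pairs.append((match.lastgroup, match.group()))
        pos = match.end()
    return pairs


for token, lexeme in lex(r'printf ("Total = %d\n", score) ;'):
    print(f"{token:7} {lexeme}")
```

Here printf and score are two different lexemes of the same token id, while the whole quoted text is a single lexeme of the token string.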
Lexical Errors
• fi (a == f(x)): is fi a misspelling of the keyword if, or an undeclared function identifier?
• Since fi is a valid lexeme for the token id, the lexical analyzer must return the token id to the parser and let some other phase of the compiler handle the error.

How can the lexical analyzer recover when no pattern matches the remaining input? Possible repair actions:
1. Delete one character from the remaining input.
2. Insert a missing character into the remaining input.
3. Replace a character by another character.
4. Transpose two adjacent characters.
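The four single-character actions can be enumerated mechanically. The sketch below is an illustration only: the function names are hypothetical, the keyword set is taken from the keyword examples earlier in these slides, and a real lexical analyzer would still return the token id for fi, as noted above. It shows which keywords are reachable from a lexeme by exactly one of the four actions.

```python
import string

# Keyword set borrowed from the token examples; an assumption for this sketch.
KEYWORDS = {"if", "while", "for", "int", "double"}


def single_edit_repairs(lexeme, alphabet=string.ascii_lowercase):
    """All strings reachable from lexeme by one of the four repair actions."""
    splits = [(lexeme[:i], lexeme[i:]) for i in range(len(lexeme) + 1)]
    deletes = {a + b[1:] for a, b in splits if b}                        # action 1
    inserts = {a + c + b for a, b in splits for c in alphabet}           # action 2
    replaces = {a + c + b[1:] for a, b in splits if b for c in alphabet}  # action 3
    transposes = {a + b[1] + b[0] + b[2:] for a, b in splits if len(b) > 1}  # action 4
    return deletes | inserts | replaces | transposes


def keyword_repairs(lexeme):
    """Keywords that a single repair action would turn the lexeme into."""
    return sorted(single_edit_repairs(lexeme) & KEYWORDS)


print(keyword_repairs("fi"))    # transposition yields "if"
print(keyword_repairs("whle"))  # insertion yields "while"
```

The candidate set grows quickly with the length of the lexeme and the size of the alphabet, which is one reason such repairs are usually limited to a single edit.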
Types of Errors
• Lexical: the name of an identifier is typed incorrectly
• Syntactic: a missing semicolon or unbalanced parentheses
• Semantic: an incompatible value assignment
• Logical: unreachable code, an infinite loop
Error Recovery Strategies
• Panic mode
• Statement mode
• Error productions
• Global correction
Error Recovery Strategies (Cont.)
Panic Mode:
When a parser encounters an error anywhere in a statement, it ignores the rest of the statement: it discards the input from the erroneous token up to a delimiter, such as a semicolon. This is the easiest method of error recovery, and it also prevents the parser from entering an infinite loop.
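A minimal sketch of panic mode, assuming a toy language whose only statement has the form id = num ; (the grammar, the helper to_tokens, and the function name parse_statements are all made up for this example). On an error the parser records the position, discards tokens up to and including the next semicolon, and resumes with the following statement.

```python
def to_tokens(text):
    """Hypothetical helper: classify whitespace-separated lexemes."""
    def kind(lexeme):
        if lexeme.isalpha():
            return "id"
        if lexeme.isdigit():
            return "num"
        return lexeme  # operators and delimiters are their own kind
    return [(kind(lexeme), lexeme) for lexeme in text.split()]


def parse_statements(tokens):
    """Parse statements of the form  id = num ;  with panic-mode recovery."""
    parsed, errors = [], []
    i = 0
    while i < len(tokens):
        stmt = tokens[i:i + 4]
        if [kind for kind, _ in stmt] == ["id", "=", "num", ";"]:
            parsed.append((stmt[0][1], stmt[2][1]))
            i += 4
            continue
        errors.append(i)  # index of the token where the bad statement starts
        # Panic mode: discard input up to and including the next delimiter.
        while i < len(tokens) and tokens[i][0] != ";":
            i += 1
        i += 1
    return parsed, errors


parsed, errors = parse_statements(to_tokens("a = 1 ; b = = 2 ; c = 3 ;"))
print(parsed)
print(errors)
```

Every pass through the loop advances i by at least one token, which is why this strategy cannot loop forever. The cost is that everything between the error and the delimiter goes unchecked.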

Statement Mode:
When a parser encounters an error, it tries to take corrective measures so that the rest of the statement's input allows the parser to continue. Examples include inserting a missing semicolon or replacing a comma with a semicolon. Parser designers have to be careful here, because one wrong correction may lead to an infinite loop.
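One such correction, inserting a missing semicolon, can be sketched as follows. The toy statement form id = num ;, the helper to_tokens, and the function name parse_with_repair are assumptions for illustration. The repair is applied only after three tokens have been consumed, so each correction is paired with real progress through the input, which is how this sketch avoids the infinite loop mentioned above.

```python
def to_tokens(text):
    """Hypothetical helper: classify whitespace-separated lexemes."""
    def kind(lexeme):
        if lexeme.isalpha():
            return "id"
        if lexeme.isdigit():
            return "num"
        return lexeme  # operators and delimiters are their own kind
    return [(kind(lexeme), lexeme) for lexeme in text.split()]


def parse_with_repair(tokens):
    """Parse  id = num ;  statements, supplying a missing ';' where needed."""
    parsed, repairs = [], []
    i = 0
    while i < len(tokens):
        head = tokens[i:i + 3]
        if [kind for kind, _ in head] != ["id", "=", "num"]:
            raise SyntaxError(f"cannot repair statement at token {i}")
        parsed.append((head[0][1], head[2][1]))
        i += 3
        if i < len(tokens) and tokens[i][0] == ";":
            i += 1             # delimiter present, consume it
        else:
            repairs.append(i)  # behave as if ';' had been written here
    return parsed, repairs


parsed, repairs = parse_with_repair(to_tokens("a = 1 b = 2 ;"))
print(parsed)
print(repairs)
```

A repair that consumed no input and could fire repeatedly at the same position would never terminate, so corrections of this kind need a guarantee of progress.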
Error Recovery Strategies (Cont.)
Error productions:
Compiler designers know some common errors that may occur in code. They can augment the grammar with productions that generate these erroneous constructs, so that the parser recognizes a known error whenever it uses such a production.

Global correction:
The parser considers the program as a whole, tries to figure out what it is intended to do, and looks for the closest error-free match. When an erroneous input (statement) X is fed, it creates a parse tree for some closest error-free statement Y. This lets the parser make minimal changes to the source code, but because of its time and space complexity this strategy has not been adopted in practice.
