Lexical Analysis
Dr. Alok Kumar
Department of Computer Science
and Engineering
UIET, CSJM University, Kanpur
Topics Covered
Role of Lexical Analyzer
Tokens, Patterns, Lexemes
Lexical Errors and Recovery
Specification of Tokens
Recognition of Tokens
Finite Automata
Tool lex
Conclusion
2 Lexical Analysis- Dr. Alok Kumar
Lexical analyzer
• The main task of the lexical analyzer is to read the input characters of the source program and produce tokens.
• "Get next token" is a command sent from the parser to the lexical analyzer.
• On receiving this command, the lexical analyzer scans the input until it finds the next token.
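This pull model can be sketched in a few lines of C. The token names and the toy scanner below are illustrative assumptions, not the slides' code; the point is only that the parser repeatedly asks for the next token and the scanner advances through the input:

```c
#include <ctype.h>

/* Hypothetical sketch: the parser pulls tokens one at a time. */
typedef enum { TOK_ID, TOK_NUM, TOK_EOF } TokenName;

typedef struct {
    TokenName name;
    const char *start;  /* first character of the lexeme */
    int length;         /* lexeme length */
} Token;

/* get_next_token scans *src past blanks and returns the next token,
   advancing *src so the parser can call it again. */
Token get_next_token(const char **src) {
    const char *p = *src;
    while (*p == ' ') p++;                      /* skip whitespace */
    Token t = { TOK_EOF, p, 0 };
    if (isdigit((unsigned char)*p)) {
        t.name = TOK_NUM;
        while (isdigit((unsigned char)p[t.length])) t.length++;
    } else if (isalpha((unsigned char)*p)) {
        t.name = TOK_ID;
        while (isalnum((unsigned char)p[t.length])) t.length++;
    }
    *src = p + t.length;
    return t;
}
```

Each call consumes exactly one lexeme, which is precisely the "get next token" interaction described above.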
Role of lexical analyzer
(figure: the lexical analyzer reads the source program and returns tokens to the parser on "get next token" requests)
Why separate lexical analysis from parsing?
Simplicity of design
Improving compiler efficiency
Enhancing compiler portability
Tokens, Patterns and Lexemes
• A token is a pair – a token name and an optional token
value
• A pattern is a description of the form that the
lexemes of a token may take
• A lexeme is a sequence of characters in the source
program that matches the pattern for a token
Example
(figure: a table of example tokens with their patterns and sample lexemes)
Attributes for tokens
• E = M * C ** 2
– <id, pointer to symbol table entry for E>
– <assign-op>
– <id, pointer to symbol table entry for M>
– <mult-op>
– <id, pointer to symbol table entry for C>
– <exp-op>
– <number, integer value 2>
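The token/attribute pairs above can be written down directly as data. A minimal sketch in C (the type names and the symbol-table slots 0, 1, 2 for E, M, C are assumptions for illustration):

```c
/* A token is a pair <name, attribute>: identifiers carry a
   symbol-table index, numbers carry their integer value. */
typedef enum { ID, ASSIGN_OP, MULT_OP, EXP_OP, NUMBER } TokenName;

typedef struct {
    TokenName name;
    int attr;   /* symbol-table index for ID, value for NUMBER, unused otherwise */
} Token;

/* The stream the lexer would hand the parser for  E = M * C ** 2,
   assuming E, M, C occupy symbol-table slots 0, 1, 2. */
static const Token stream[] = {
    { ID, 0 }, { ASSIGN_OP, 0 }, { ID, 1 }, { MULT_OP, 0 },
    { ID, 2 }, { EXP_OP, 0 }, { NUMBER, 2 }
};
static const int stream_len = sizeof stream / sizeof stream[0];
```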
Lexical errors
• Some errors are beyond the power of the lexical analyzer to recognize:
– fi (a == f(x)) …   (fi is a valid lexeme for an identifier, so the lexical analyzer cannot tell whether it is a misspelled if or an undeclared function name)
• However, it may be able to recognize errors like:
– d = 2r
• Such errors are recognized when no pattern for tokens matches a character sequence
Error recovery
• Panic mode: successive characters are ignored until we reach a well-formed token
• Delete one character from the remaining input
• Insert a missing character into the remaining
input
• Replace a character by another character
• Transpose two adjacent characters
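Panic mode is the simplest of these strategies. A minimal sketch in C (the choice of which characters can start a token is an assumption for illustration):

```c
#include <ctype.h>

/* Panic-mode recovery: starting at pos, discard characters that no
   token pattern can match, and return the position where scanning
   may resume. Here we assume tokens start with letters, digits,
   or a blank. */
int panic_skip(const char *input, int pos) {
    while (input[pos] != '\0' &&
           !isalnum((unsigned char)input[pos]) &&
           input[pos] != ' ')
        pos++;          /* ignore the offending character */
    return pos;         /* scanning resumes here */
}
```

The other strategies (delete, insert, replace, transpose) each try a single-character repair instead of discarding input.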
Input Buffering
• Sometimes the lexical analyzer needs to look ahead some symbols to decide which token to return
– In C: after seeing -, =, or <, we must examine the next character to decide what token to return
– In Fortran: DO 5 I = 1.25 (blanks are insignificant, so this reads as the assignment DO5I = 1.25; it cannot be distinguished from a DO-loop header until the . or , is seen)
• We need to introduce a two-buffer scheme to handle large look-aheads safely
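A minimal sketch of the two-buffer scheme in C, under stated assumptions: each half holds N characters plus a sentinel slot, and the sentinel is '\0' rather than a real EOF character, so it doubles as the end-of-input mark. The struct and function names are illustrative:

```c
#include <string.h>

#define N 4             /* tiny half-buffer, for illustration only */
#define SENTINEL '\0'

typedef struct {
    char buf[2 * (N + 1)];  /* two halves, each with a sentinel slot */
    const char *src;        /* remaining input (stands in for a file) */
    int forward;            /* the forward (scanning) pointer */
} DoubleBuffer;

/* Fill one half from the input and terminate it with the sentinel. */
static void load_half(DoubleBuffer *db, int half) {
    int i, base = half * (N + 1);
    for (i = 0; i < N && *db->src; i++)
        db->buf[base + i] = *db->src++;
    db->buf[base + i] = SENTINEL;
}

void db_init(DoubleBuffer *db, const char *src) {
    db->src = src;
    db->forward = 0;
    load_half(db, 0);
}

/* Advance forward one character. Hitting a sentinel at a half
   boundary triggers a reload of the other half; a sentinel anywhere
   else means true end of input. */
char db_next(DoubleBuffer *db) {
    char c = db->buf[db->forward];
    if (c != SENTINEL) { db->forward++; return c; }
    if (db->forward == N) {                 /* end of first half */
        load_half(db, 1);
        db->forward = N + 1;
    } else if (db->forward == 2 * N + 1) {  /* end of second half */
        load_half(db, 0);
        db->forward = 0;
    } else {
        return SENTINEL;                    /* real end of input */
    }
    return db_next(db);
}
```

The sentinel is the point of the scheme: the end-of-half test and the end-of-input test collapse into a single comparison per character.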
Specification of tokens
• In the theory of compilation, regular expressions are used to formalize the specification of tokens
• Regular expressions are a means of specifying regular languages
• Example: letter(letter | digit)*
• Each regular expression is a pattern specifying the form of strings
Regular Expressions
• Ɛ is a regular expression denoting the language L(Ɛ) = {Ɛ}, containing only the empty string
• If a is a symbol in ∑, then a is a regular expression with L(a) = {a}
• If r and s are two regular expressions with languages L(r) and L(s), then
– r|s is a regular expression denoting the language L(r) ∪ L(s), containing all strings of L(r) and of L(s)
– rs is a regular expression denoting the language L(r)L(s), formed by concatenating each string of L(r) with each string of L(s)
– r* is a regular expression denoting (L(r))*, the set containing zero or more concatenated occurrences of strings of L(r)
– (r) is a regular expression denoting the same language L(r)
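The three operators can be exercised concretely with POSIX extended regular expressions (regcomp/regexec from <regex.h>). The helper below is a sketch; its name and the anchoring with ^ and $ (to force a whole-string match) are choices made for this example:

```c
#include <regex.h>
#include <stddef.h>

/* Return 1 if the whole string s belongs to the language of the
   POSIX extended regular expression pattern, 0 otherwise. */
int matches(const char *pattern, const char *s) {
    regex_t re;
    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return 0;                            /* bad pattern */
    int ok = (regexec(&re, s, 0, NULL, 0) == 0);
    regfree(&re);
    return ok;
}
```

Union, concatenation, and closure then behave exactly as the definitions above say: a|b accepts a or b, ab accepts only ab, and (ab)* accepts zero or more copies of ab, including the empty string.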
Regular definitions
d1 -> r1
d2 -> r2
…
dn -> rn
• Example:
letter_ -> A | B | … | Z | a | b | … | z | _
digit -> 0 | 1 | … | 9
id -> letter_ (letter_ | digit)*
Extensions
• One or more instances: (r)+
• Zero or one instance: r?
• Character classes: [abc]
• Example:
letter_ -> [A-Za-z_]
digit -> [0-9]
id -> letter_ (letter_ | digit)*
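The identifier definition letter_ (letter_ | digit)* — the class form [A-Za-z_][A-Za-z0-9_]* — is simple enough to hand-code directly. A sketch (the function name is an assumption):

```c
#include <ctype.h>

/* Return 1 if s is a non-empty identifier matching
   [A-Za-z_][A-Za-z0-9_]*, 0 otherwise. */
int is_identifier(const char *s) {
    if (!(isalpha((unsigned char)s[0]) || s[0] == '_'))
        return 0;                       /* must start with letter_ */
    for (int i = 1; s[i]; i++)
        if (!(isalnum((unsigned char)s[i]) || s[i] == '_'))
            return 0;                   /* rest must be letter_ or digit */
    return 1;
}
```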
Examples with ∑= {0,1}
• (0|1)*: All binary strings including the empty
string
• (0|1)(0|1)*: All nonempty binary strings
• 0(0|1)*0: All binary strings of length at least 2,
starting and ending with 0s
• (0|1)*0(0|1)(0|1)(0|1): All binary strings of length at least 4 in which the fourth symbol from the right is 0
• 0*10*10*10*: All binary strings possessing
exactly three 1s
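The last pattern has a direct procedural reading: a string belongs to 0*10*10*10* exactly when it is binary and contains exactly three 1s. A sketch of that equivalent check (function name assumed for illustration):

```c
/* Return 1 if s is a binary string with exactly three 1s —
   i.e. s is in the language of 0*10*10*10* — and 0 otherwise. */
int exactly_three_ones(const char *s) {
    int ones = 0;
    for (; *s; s++) {
        if (*s == '1') ones++;
        else if (*s != '0') return 0;   /* not over the alphabet {0,1} */
    }
    return ones == 3;
}
```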
Recognition of tokens
• Starting point is the language grammar to understand the
tokens:
stmt -> if expr then stmt
      | if expr then stmt else stmt
      | Ɛ
expr -> term relop term
      | term
term -> id
      | number
Recognition of tokens (cont.)
• The next step is to formalize the patterns:
digit -> [0-9]
digits -> digit+
number -> digits (. digits)? (E [+-]? digits)?
letter -> [A-Za-z_]
id -> letter (letter | digit)*
if -> if
then -> then
else -> else
relop -> < | > | <= | >= | = | <>
• We also need to handle whitespaces:
ws -> (blank | tab | newline)+
Transition diagrams
Transition diagram for relop
Transition diagrams (cont.)
Transition diagram for reserved words and
identifiers
Transition diagram for unsigned numbers
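The unsigned-number diagram for digits (. digits)? (E [+-]? digits)? can be coded as a straight-line scanning function, each loop corresponding to a digit-consuming state. A sketch (function name assumed):

```c
#include <ctype.h>

/* Return 1 if the whole string s is an unsigned number matching
   digit+ (. digit+)? (E [+-]? digit+)?, 0 otherwise. */
int is_number(const char *s) {
    int i = 0;
    if (!isdigit((unsigned char)s[i])) return 0;     /* need digit+ */
    while (isdigit((unsigned char)s[i])) i++;
    if (s[i] == '.') {                               /* optional fraction */
        i++;
        if (!isdigit((unsigned char)s[i])) return 0; /* . needs digits */
        while (isdigit((unsigned char)s[i])) i++;
    }
    if (s[i] == 'E') {                               /* optional exponent */
        i++;
        if (s[i] == '+' || s[i] == '-') i++;
        if (!isdigit((unsigned char)s[i])) return 0; /* E needs digits */
        while (isdigit((unsigned char)s[i])) i++;
    }
    return s[i] == '\0';                             /* accepting state */
}
```

In a real scanner the function would stop at the longest matching prefix and retract, rather than require the whole string to match.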
Architecture of a transition-diagram-based lexical analyzer
TOKEN getRelop()
{
    TOKEN retToken = new(RELOP);
    while (1) { /* repeat character processing until a
                   return or failure occurs */
        switch (state) {
        case 0:
            c = nextchar();
            if (c == '<') state = 1;
            else if (c == '=') state = 5;
            else if (c == '>') state = 6;
            else fail(); /* lexeme is not a relop */
            break;
        case 1: …
        …
        case 8:
            retract();
            retToken.attribute = GT;
            return(retToken);
        }
    }
}
Finite Automata
• Regular expressions = specification
• Finite automata = implementation
• A finite automaton consists of
– An input alphabet ∑
– A set of states S
– A start state n ∈ S
– A set of accepting states F ⊆ S
– A set of transitions state → state, each labeled with an input symbol
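These components map directly onto code: the transitions become a table indexed by state and input symbol. A sketch of one concrete DFA (the machine chosen here, accepting binary strings ending in 0, i.e. (0|1)*0, is an illustration and not from the slides):

```c
/* Transition table delta[state][symbol] for the DFA accepting (0|1)*0.
   States: 0 = start, 1 = accepting (last symbol read was 0). */
static const int delta[2][2] = {
    /* on '0'  on '1' */
    {     1,      0 },   /* from state 0 */
    {     1,      0 },   /* from state 1 */
};

/* Run the DFA over s; return 1 if it halts in an accepting state. */
int dfa_accepts(const char *s) {
    int state = 0;                    /* start state n */
    for (; *s; s++) {
        if (*s != '0' && *s != '1')
            return 0;                 /* symbol outside the alphabet */
        state = delta[state][*s - '0'];
    }
    return state == 1;                /* accepting states F = {1} */
}
```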
Lexical Analyzer Generator - Lex
Structure of Lex programs
declarations
%%
translation rules (each of the form: pattern {action})
%%
auxiliary functions
Example
%{
/* definitions of manifest constants
   LT, LE, EQ, NE, GT, GE,
   IF, THEN, ELSE, ID, NUMBER, RELOP */
%}

/* regular definitions */
delim    [ \t\n]
ws       {delim}+
letter   [A-Za-z]
digit    [0-9]
id       {letter}({letter}|{digit})*
number   {digit}+(\.{digit}+)?(E[+-]?{digit}+)?

%%

{ws}      {/* no action and no return */}
if        {return(IF);}
then      {return(THEN);}
else      {return(ELSE);}
{id}      {yylval = (int) installID(); return(ID);}
{number}  {yylval = (int) installNum(); return(NUMBER);}
…
Conclusion
• Words of a language can be specified using regular
expressions
• NFA and DFA can act as acceptors
• Regular expressions can be converted to NFA
• NFA can be converted to DFA
• The automated tool lex can be used to generate a lexical analyser for a language