Compiler
Construction
Sohail Aslam
Lecture 9
DFA Minimization
The generated DFA may
have a large number of
states.
Hopcroft’s algorithm:
minimizes DFA states
DFA Minimization
Idea: find groups of
equivalent states.
For each input symbol, all
transitions from states in one
group G1 go to states in the
same group G2
DFA Minimization
Construct the minimized
DFA such that there is
one state for each group
of states from the initial
DFA.
DFA Minimization
[Figure: DFA for (a | b)*abb, with states A, B, C, D, E and transitions labeled a and b]
DFA Minimization
[Figure: Minimized DFA for (a | b)*abb, with states {A,C}, B, D, E and transitions labeled a and b]
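The grouping idea can be sketched in C++ as a simple iterative partition refinement. This is not Hopcroft's faster worklist formulation, only a minimal illustration of splitting groups until every state in a group agrees on its successor groups. The transition table is an assumption: it is the usual textbook DFA for (a | b)*abb, encoded with the hypothetical numbering A..E = 0..4 and a, b = 0, 1. The names kDelta, kAccepting and minimize_groups are illustrative.

```cpp
#include <array>
#include <cstddef>
#include <map>
#include <vector>

// Assumed encoding: states 0..4 stand for A..E, symbols 0 and 1 for a and b.
constexpr int kStates = 5;
constexpr int kSymbols = 2;

// Assumed transition table of the textbook DFA for (a|b)*abb.
const int kDelta[kStates][kSymbols] = {
    {1, 2},  // A: a->B, b->C
    {1, 3},  // B: a->B, b->D
    {1, 2},  // C: a->B, b->C
    {1, 4},  // D: a->B, b->E
    {1, 2},  // E: a->B, b->C
};
const bool kAccepting[kStates] = {false, false, false, false, true};

// Returns group[s], the index of the group that state s ends up in.
std::vector<int> minimize_groups() {
    // Start with two groups: non-accepting (0) and accepting (1).
    std::vector<int> group(kStates);
    for (int s = 0; s < kStates; ++s) group[s] = kAccepting[s] ? 1 : 0;

    std::size_t count = 0;  // number of groups found in the previous round
    for (;;) {
        // Signature of a state: its own group plus the group reached on each symbol.
        std::map<std::array<int, kSymbols + 1>, int> ids;
        std::vector<int> next(kStates);
        for (int s = 0; s < kStates; ++s) {
            std::array<int, kSymbols + 1> sig{};
            sig[0] = group[s];
            for (int c = 0; c < kSymbols; ++c) sig[c + 1] = group[kDelta[s][c]];
            auto it = ids.find(sig);
            if (it == ids.end()) it = ids.emplace(sig, static_cast<int>(ids.size())).first;
            next[s] = it->second;
        }
        group = next;
        if (ids.size() == count) break;  // no group was split: partition is stable
        count = ids.size();
    }
    return group;
}
```

For this table, minimize_groups() puts A and C in the same group and leaves B, D and E in groups of their own, which matches the four-state minimized DFA.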
Optimized Acceptor
RE R → RE=>NFA → NFA=>DFA → Min. DFA
input string w → Simulate DFA → yes, if w ∈ L(R); no, if w ∉ L(R)
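The "Simulate DFA" step can be sketched as a table-driven loop. The table below is an assumption: it encodes the minimized DFA for (a | b)*abb with the hypothetical numbering {A,C} = 0, B = 1, D = 2, E = 3, where E is the only accepting state. The names kMinDelta and accepts are illustrative.

```cpp
#include <string>

// Assumed minimized DFA for (a|b)*abb.
// Rows: 0 = {A,C}, 1 = B, 2 = D, 3 = E. Columns: 0 = a, 1 = b.
const int kMinDelta[4][2] = {
    {1, 0},  // {A,C}: a->B, b->{A,C}
    {1, 2},  // B:     a->B, b->D
    {1, 3},  // D:     a->B, b->E
    {1, 0},  // E:     a->B, b->{A,C}
};

// Returns true if w is in L((a|b)*abb), false otherwise.
bool accepts(const std::string& w) {
    int state = 0;  // start state {A,C}
    for (char ch : w) {
        if (ch != 'a' && ch != 'b') return false;  // symbol outside the alphabet
        state = kMinDelta[state][ch == 'b' ? 1 : 0];
    }
    return state == 3;  // E is the accepting state
}
```

Each input character costs one table lookup, so the acceptor runs in time linear in the length of w regardless of how complex the regular expression was.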
Lexical Analyzers
Lexical analyzers (scanners)
use the same mechanism
but they:
• Have multiple RE descriptions
for multiple tokens
• Have a character stream at the
input
Lexical Analyzers
• Return a sequence of matching
tokens at the output (or an
error)
• Always return the longest
matching token
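The longest-match rule can be sketched by running a DFA as far as it will go while remembering the last accepting state seen. The token set here (<, <=, =, ==, and unsigned integers) and all names (Kind, step, accepting, next_token, scan) are hypothetical, chosen only for illustration; whitespace is not handled.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

enum Kind { NONE, LT, LE, ASSIGN, EQ, NUM };  // hypothetical token kinds

// Hand-written DFA transition function; returns -1 when there is no transition.
int step(int state, char ch) {
    bool digit = std::isdigit(static_cast<unsigned char>(ch)) != 0;
    switch (state) {
        case 0:  // start state
            if (ch == '<') return 1;
            if (ch == '=') return 3;
            return digit ? 5 : -1;
        case 1: return ch == '=' ? 2 : -1;  // "<" seen, "<=" possible
        case 3: return ch == '=' ? 4 : -1;  // "=" seen, "==" possible
        case 5: return digit ? 5 : -1;      // inside a number
        default: return -1;
    }
}

// Token kind accepted in a state, or NONE if the state is not accepting.
Kind accepting(int state) {
    switch (state) {
        case 1: return LT;
        case 2: return LE;
        case 3: return ASSIGN;
        case 4: return EQ;
        case 5: return NUM;
        default: return NONE;
    }
}

struct Token { Kind kind; std::string text; };

// Scans one token starting at pos and advances pos past it.
// Returns kind NONE (and leaves pos unchanged) if no rule matches.
Token next_token(const std::string& in, std::size_t& pos) {
    int state = 0;
    Kind last = NONE;
    std::size_t last_end = pos;
    for (std::size_t i = pos; i < in.size(); ++i) {
        state = step(state, in[i]);
        if (state < 0) break;  // DFA is stuck: fall back to the last accept
        if (accepting(state) != NONE) {
            last = accepting(state);
            last_end = i + 1;
        }
    }
    Token t{last, in.substr(pos, last_end - pos)};
    pos = last_end;
    return t;
}

// Returns the token sequence; stops at the first position where no rule matches.
std::vector<Token> scan(const std::string& in) {
    std::vector<Token> out;
    std::size_t pos = 0;
    while (pos < in.size()) {
        Token t = next_token(in, pos);
        if (t.kind == NONE) break;  // error: no token matches here
        out.push_back(t);
    }
    return out;
}
```

On the input <=12==3< this sketch returns LE, NUM, EQ, NUM, LT: the scanner prefers <= over < followed by =, and == over two = tokens, because it always extends the match as far as the DFA allows.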
Lexical Analyzers
R1…R2 → RE=>NFA → NFA=>DFA → Min. DFA
character stream → Simulate DFA → Token stream
Lexical Analyzer Generators
The lexical analysis process
can be automated.
We only need to specify:
• Regular expressions for tokens
• Rule priorities for multiple
longest match cases
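Rule priority only matters when two rules match the same longest lexeme. A minimal sketch of that tie-break, assuming a hypothetical rule list in which earlier rules have higher priority, and using std::regex in place of a generated DFA purely for brevity (the names Rule, rules and match_prefix are illustrative):

```cpp
#include <cstddef>
#include <regex>
#include <string>
#include <utility>
#include <vector>

struct Rule {
    std::string name;
    std::regex pattern;
};

// Hypothetical rule list; earlier rules have higher priority.
const std::vector<Rule>& rules() {
    static const std::vector<Rule> r = {
        {"IF", std::regex("if")},
        {"ID", std::regex("[A-Za-z_][A-Za-z_0-9]*")},
        {"NUM", std::regex("[0-9]+")},
    };
    return r;
}

// Returns (rule name, lexeme) for the longest match at the start of input.
// Ties go to the earlier rule. The name is empty if no rule matches.
std::pair<std::string, std::string> match_prefix(const std::string& input) {
    std::string best_name;
    std::string best_text;
    for (const Rule& rule : rules()) {
        std::smatch m;
        // match_continuous anchors the match at the first character of input.
        bool found = std::regex_search(input, m, rule.pattern,
                                       std::regex_constants::match_continuous);
        // Strictly longer only: an equal-length match never displaces an earlier rule.
        if (found && static_cast<std::size_t>(m.length(0)) > best_text.size()) {
            best_name = rule.name;
            best_text = m.str(0);
        }
    }
    return {best_name, best_text};
}
```

With these rules the input if is reported as IF, because IF and ID both match two characters and IF is listed first, while ifx is reported as ID, because the longer match wins before priority is consulted.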
Lexical Analyzer Generators
Flex:
generates a lexical analyzer in C
or C++
JLex:
written in Java; generates a
lexical analyzer in Java
Using Flex
Provide a specification file
Flex reads this file and
produces a C or C++ output file
that contains the scanner.
The file consists of three
sections
Flex Specification File
1 C or C++ and flex definitions
%%
2 token definitions and actions
%%
3 user code
Specification File lex.l
%{
#include "tokdefs.h"
%}
D [0-9]
L [a-zA-Z_]
id {L}({L}|{D})*
%%
"void" {return(TOK_VOID);}
"int" {return(TOK_INT);}
"if" {return(TOK_IF);}
Specification File lex.l
"else"   {return(TOK_ELSE);}
"while"  {return(TOK_WHILE);}
"<="     {return(TOK_LE);}
">="     {return(TOK_GE);}
"=="     {return(TOK_EQ);}
"!="     {return(TOK_NE);}
{D}+     {return(TOK_NUM);}
{id}     {return(TOK_ID);}
[ \t\n]+ { /* skip whitespace */ }
%%
File tokdefs.h
#define TOK_VOID 1
#define TOK_INT 2 /* keyword int */
#define TOK_IF 3
#define TOK_ELSE 4
#define TOK_WHILE 5
#define TOK_LE 6
#define TOK_GE 7
#define TOK_EQ 8
#define TOK_NE 9
#define TOK_NUM 10 /* integer literal, distinct from TOK_INT */
#define TOK_ID 111
Invoking Flex
lex.l → flex → lex.cpp
Using Generated Scanner
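#include <iostream>
#include <FlexLexer.h>
using namespace std;

int main()
{
    yyFlexLexer lex;        // concrete scanner class; FlexLexer itself is abstract
    int tc = lex.yylex();
    while (tc != 0) {       // yylex() returns 0 at end of input
        cout << tc << "," << lex.YYText() << endl;
        tc = lex.yylex();
    }
    return 0;
}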
Creating Scanner EXE
flex -+ -olex.cpp lex.l
g++ -c lex.cpp
g++ -c main.cpp
g++ -o lex.exe lex.o main.o
lex < main.cpp