2 Lexical Analyser Generator

Lex and flex are lexical analyzer generators that take regular expression definitions as input and output C code for an efficient scanner. The generated C code can then be compiled and linked with other code to perform lexical analysis on an input stream and return a sequence of tokens. A lex specification consists of regular expression definitions, C declarations, translation rules that match patterns and specify actions, and optional user-defined procedures.

Uploaded by insaan

Lexical Analyzer Generator

The Lex
Lex and its newer cousin flex are scanner generators.
They systematically translate regular definitions into C source code for efficient scanning.
The generated code is easy to integrate into C applications.

Lexical Analyzer Generator - Lex

Lex source program (lex.l) -> Lex compiler -> lex.yy.c
lex.yy.c -> C compiler -> a.out
Input stream -> a.out -> Sequence of tokens


Lex Specification
A lex specification consists of three parts:
regular definitions, C declarations in %{ %}
%%
translation rules
%%
user-defined auxiliary procedures
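The translation rules are of the form:
p1 { action1 }
p2 { action2 }
...
pn { actionn }
where each pi is a regular expression and each actioni is a C code fragment that runs when pi matches.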
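
When more than one pattern matches at the current input position, lex fires the rule with the longest match and, on a tie, the rule listed first. The following is a minimal C sketch of that selection step, not the code lex actually generates. The matcher_fn type and the functions match_if, match_lower, and select_rule are hypothetical names introduced for illustration; the two matchers stand in for the patterns if and [a-z]+.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical matcher type: returns the length of the prefix of s that
   the pattern matches, or 0 if the pattern does not match at s. */
typedef size_t (*matcher_fn)(const char *s);

/* Stand-in for the pattern  if  */
size_t match_if(const char *s)
{
    return strncmp(s, "if", 2) == 0 ? 2 : 0;
}

/* Stand-in for the pattern  [a-z]+  */
size_t match_lower(const char *s)
{
    size_t n = 0;
    while (islower((unsigned char)s[n]))
        n++;
    return n;
}

/* Returns the index of the rule to fire at s, or -1 if no rule matches.
   The match length is stored in *len. The strict '>' comparison keeps
   the earliest rule when two rules match the same number of characters. */
int select_rule(const matcher_fn *rules, int nrules, const char *s, size_t *len)
{
    assert(rules != NULL && s != NULL && len != NULL);
    int best = -1;
    *len = 0;
    for (int i = 0; i < nrules; i++) {
        size_t n = rules[i](s);
        if (n > *len) {
            *len = n;
            best = i;
        }
    }
    return best;
}
```

With the rules ordered as { match_if, match_lower }, the input "if x" selects the first rule because both match two characters, while "iffy" selects the second rule because it matches four.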


Regular Expressions in Lex


x        match the character x
\.       match the character .
"string" match the contents of the string literally
.        match any character except newline
^        match the beginning of a line
$        match the end of a line
[xyz]    match one character x, y, or z (use \ to escape -)
[^xyz]   match any character except x, y, and z
[a-z]    match one character in the range a to z
r*       closure (match zero or more occurrences)
r+       positive closure (match one or more occurrences)
r?       optional (match zero or one occurrence)
r1r2     match r1 then r2 (concatenation)
r1|r2    match r1 or r2 (union)
(r)      grouping
r1/r2    match r1 when followed by r2
{d}      match the regular expression defined by d

Example Lex Specification 1

%{
#include <stdio.h>
%}
%%
[0-9]+  { printf("%s\n", yytext); }
.|\n    { }
%%
int main(void)
{ yylex();
  return 0;
}

The lines between the %% markers are the translation rules. yytext contains the matching lexeme, and yylex() invokes the lexical analyzer.

lex spec.l
gcc lex.yy.c -ll
./a.out < spec.l
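
To see what the generated scanner does for this specification, here is a hand-written C sketch with the same behavior on a string: every maximal run of digits is printed on its own line and everything else is discarded. This is not the code lex emits (lex builds a table-driven automaton). The function name print_numbers and its parameters are assumptions made for illustration.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>

/* Hypothetical helper: writes each maximal run of digits in input to out,
   one per line, mirroring the rule  [0-9]+ { printf("%s\n", yytext); }.
   All other characters are skipped, like the empty action for  .|\n.
   Returns the number of digit runs found. */
int print_numbers(const char *input, FILE *out)
{
    assert(input != NULL && out != NULL);
    int count = 0;
    const char *p = input;
    while (*p != '\0') {
        if (isdigit((unsigned char)*p)) {
            const char *start = p;
            /* Extend the match as far as possible (longest match). */
            while (isdigit((unsigned char)*p))
                p++;
            fprintf(out, "%.*s\n", (int)(p - start), start);
            count++;
        } else {
            p++;
        }
    }
    return count;
}
```

Calling print_numbers("ab12cd345", stdout) writes 12 and 345 on separate lines and returns 2.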

Example Lex Specification 2

%{
#include <stdio.h>
int ch = 0, wd = 0, nl = 0;
%}
delim     [ \t]+
%%
\n        { ch++; wd++; nl++; }
^{delim}  { ch+=yyleng; }
{delim}   { ch+=yyleng; wd++; }
.         { ch++; }
%%
int main(void)
{ yylex();
  printf("%8d%8d%8d\n", nl, wd, ch);
  return 0;
}

delim is a regular definition. The lines between the %% markers are the translation rules.
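
The counting logic of these four rules can be traced in plain C. The sketch below is a hand-written equivalent for a string, under the assumption that the start of the input counts as the beginning of a line; the counts struct and the count_text function are hypothetical names, not part of lex.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical result type holding the three counters of the example. */
typedef struct {
    int nl;  /* newlines */
    int wd;  /* words */
    int ch;  /* characters */
} counts;

/* Mirrors the rules of the specification:
     \n        counts a character, a word, and a line
     ^{delim}  blanks at the start of a line count only as characters
     {delim}   blanks elsewhere count as characters and end a word
     .         any other character counts as one character */
counts count_text(const char *s)
{
    assert(s != NULL);
    counts c = {0, 0, 0};
    int at_line_start = 1;
    size_t i = 0;
    while (s[i] != '\0') {
        if (s[i] == '\n') {
            c.ch++; c.wd++; c.nl++;
            at_line_start = 1;
            i++;
        } else if (s[i] == ' ' || s[i] == '\t') {
            size_t n = 0;
            /* Consume the whole run of blanks, like [ \t]+ */
            while (s[i + n] == ' ' || s[i + n] == '\t')
                n++;
            c.ch += (int)n;
            if (!at_line_start)
                c.wd++;
            at_line_start = 0;
            i += n;
        } else {
            c.ch++;
            at_line_start = 0;
            i++;
        }
    }
    return c;
}
```

For "hello world\n" this yields 1 line, 2 words, and 12 characters. Like the rules it mirrors, the sketch counts one word per newline, so blank lines and blanks at the end of a line inflate the word count.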


Example Lex Specification 3

%{
#include <stdio.h>
%}
digit     [0-9]
letter    [A-Za-z]
id        {letter}({letter}|{digit})*
%%
{digit}+  { printf("number: %s\n", yytext); }
{id}      { printf("ident: %s\n", yytext); }
.         { printf("other: %s\n", yytext); }
%%
int main(void)
{ yylex();
  return 0;
}

digit, letter, and id are regular definitions. The lines between the %% markers are the translation rules.
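
The classification these rules perform can be sketched by hand in C. The token_kind enum and the next_token function below are hypothetical names chosen for illustration, and the sketch assumes the default "C" locale so that isalpha and isdigit agree with [A-Za-z] and [0-9]. One simplification: a newline is reported as TOK_OTHER here, whereas in the specification no rule matches it and lex would copy it to the output unchanged.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical token kinds corresponding to the three rules. */
typedef enum { TOK_NUMBER, TOK_IDENT, TOK_OTHER, TOK_END } token_kind;

/* Scans one token starting at *pp, stores its length in *len, and
   advances *pp past it. Returns TOK_END at the end of the string. */
token_kind next_token(const char **pp, size_t *len)
{
    assert(pp != NULL && *pp != NULL && len != NULL);
    const char *s = *pp;
    size_t n = 0;
    token_kind kind;

    if (s[0] == '\0') {
        *len = 0;
        return TOK_END;
    }
    if (isdigit((unsigned char)s[0])) {
        /* {digit}+ */
        while (isdigit((unsigned char)s[n]))
            n++;
        kind = TOK_NUMBER;
    } else if (isalpha((unsigned char)s[0])) {
        /* {letter}({letter}|{digit})* */
        while (isalnum((unsigned char)s[n]))
            n++;
        kind = TOK_IDENT;
    } else {
        /* any other single character */
        n = 1;
        kind = TOK_OTHER;
    }
    *len = n;
    *pp = s + n;
    return kind;
}
```

Repeated calls on "x1 42+" return an identifier of length 2, another character (the blank), a number of length 2, another character (+), and then TOK_END.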


Example Lex Specification 4


%{ /* definitions of manifest constants */
#define LT (256)
...
%}
delim     [ \t\n]
ws        {delim}+
letter    [A-Za-z]
digit     [0-9]
id        {letter}({letter}|{digit})*
number    {digit}+(\.{digit}+)?(E[+\-]?{digit}+)?
%%
{ws}      { }
if        {return IF;}
then      {return THEN;}
else      {return ELSE;}
{id}      {yylval = install_id(); return ID;}
{number}  {yylval = install_num(); return NUMBER;}
"<"       {yylval = LT; return RELOP;}
"<="      {yylval = LE; return RELOP;}
"="       {yylval = EQ; return RELOP;}
"<>"      {yylval = NE; return RELOP;}
">"       {yylval = GT; return RELOP;}
">="      {yylval = GE; return RELOP;}
%%
int install_id()
...

Each return statement returns a token to the parser, and yylval carries the token attribute. install_id() installs yytext as an identifier in the symbol table.
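
The source does not show the body of install_id(). As one possible illustration only, the C sketch below keeps identifiers in a fixed-size array and returns the index of the entry, which could then serve as the attribute value stored in yylval. Everything here is an assumption: the name install_id_sketch, the lexeme parameter (the real function would read yytext), the table layout, and the limits MAX_SYMBOLS and MAX_NAME_LEN.

```c
#include <assert.h>
#include <string.h>

#define MAX_SYMBOLS  128  /* assumed table capacity */
#define MAX_NAME_LEN 63   /* assumed maximum identifier length */

/* Hypothetical symbol table: one fixed-size slot per identifier. */
static char symtab[MAX_SYMBOLS][MAX_NAME_LEN + 1];
static int symcount = 0;

/* Returns the table index of lexeme, inserting it on first sight.
   Returns -1 if the table is full or the name is too long. */
int install_id_sketch(const char *lexeme)
{
    assert(lexeme != NULL);
    /* Linear search: return the existing entry if the name is known. */
    for (int i = 0; i < symcount; i++)
        if (strcmp(symtab[i], lexeme) == 0)
            return i;
    if (symcount == MAX_SYMBOLS || strlen(lexeme) > MAX_NAME_LEN)
        return -1;
    strcpy(symtab[symcount], lexeme);
    return symcount++;
}
```

Installing "x", then "count", then "x" again returns 0, 1, and 0, so repeated occurrences of an identifier share one table entry. A linear search keeps the sketch short; a real symbol table would more likely use hashing.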
