Compiler LAB4 22BPS1073

The document outlines multiple experiments involving the implementation of parsers using Lex and Yacc for various arithmetic and string pattern evaluations. It details the algorithms, source code, and conclusions for each experiment, including a simple calculator, an Abstract Syntax Tree (AST) parser, and checks for specific string patterns. Each experiment demonstrates key concepts in lexical analysis, syntax parsing, and error handling.

Uploaded by

teeshtakparmar

Name: Teeshta Parmar

Registration Number: 22BPS1073

ASSESSMENT 4
EXPERIMENT 1:

Aim/Objective:
To implement a simple calculator using Lex and Yacc that can evaluate
arithmetic expressions involving addition, subtraction, multiplication, and
division, while also handling parentheses and unary minus.

Algorithm:
Lex (calculator.l) - Tokenizer

1. Define token patterns for numbers (NUMBER), arithmetic operators (+, -, *, /), parentheses ((, )), and whitespace.

2. Convert numbers from text to integer values using atoi(yytext).

3. Return corresponding tokens to be used by Yacc.

4. Handle errors for invalid characters.

5. Define yywrap() to indicate the end of input.
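
The tokenizer steps above can also be approximated by a hand-written scanner in plain C, which may help when reading what the Lex-generated yylex() does. This is only a sketch: the Token struct, the TokenKind names, and next_token() are hypothetical and do not appear in the lab code.

```c
#include <ctype.h>
#include <stdio.h>

/* Hypothetical token kinds mirroring the Lex rules (names are assumptions). */
typedef enum {
    TOK_NUMBER, TOK_ADD, TOK_SUB, TOK_MUL, TOK_DIV,
    TOK_LPAREN, TOK_RPAREN, TOK_NEWLINE, TOK_END, TOK_INVALID
} TokenKind;

typedef struct {
    TokenKind kind;
    int value; /* meaningful only for TOK_NUMBER */
} Token;

/* Reads one token starting at *src and advances *src past it. */
Token next_token(const char **src) {
    const char *p = *src;
    Token tok = { TOK_END, 0 };

    while (*p == ' ' || *p == '\t') /* skip blanks and tabs, like the [ \t] rule */
        p++;

    if (*p == '\0') {
        tok.kind = TOK_END;
    } else if (isdigit((unsigned char)*p)) {
        tok.kind = TOK_NUMBER; /* [0-9]+ : accumulate the decimal value */
        while (isdigit((unsigned char)*p))
            tok.value = tok.value * 10 + (*p++ - '0');
    } else {
        switch (*p) {
            case '+':  tok.kind = TOK_ADD;     break;
            case '-':  tok.kind = TOK_SUB;     break;
            case '*':  tok.kind = TOK_MUL;     break;
            case '/':  tok.kind = TOK_DIV;     break;
            case '(':  tok.kind = TOK_LPAREN;  break;
            case ')':  tok.kind = TOK_RPAREN;  break;
            case '\n': tok.kind = TOK_NEWLINE; break;
            default:
                fprintf(stderr, "Invalid character: %c\n", *p);
                tok.kind = TOK_INVALID;
                break;
        }
        p++; /* every non-number token is one character long */
    }

    *src = p;
    return tok;
}
```

Calling next_token() repeatedly on "12 + 3\n" would yield NUMBER(12), ADD, NUMBER(3), NEWLINE, and then END.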

Yacc (calculator.y) - Parser and Evaluator

1. Define token types for numbers and arithmetic operators.

2. Set operator precedence (+, - have lower precedence than *, /).

3. Define grammar rules:

○ input: Accepts multiple lines of expressions.

○ expr: Handles addition and subtraction.

○ term: Handles multiplication and division, with error handling for division by zero.

○ factor: Handles numbers and parentheses.


4. Implement semantic actions to evaluate expressions.

5. Print the result when a valid expression is parsed.

6. Handle syntax errors with yyerror().
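
To see how the expr/term/factor layering produces the right precedence without Yacc, the same grammar can be sketched as a recursive-descent evaluator in C. This is an illustrative assumption, not the parser Yacc generates (Yacc builds a table-driven LALR parser); the Parser struct and the parse_*/evaluate names are hypothetical. As in the grammar described above, unary minus is accepted only at the start of a term.

```c
#include <ctype.h>
#include <stdio.h>

/* Hypothetical parser state: cursor into the input plus an error flag. */
typedef struct {
    const char *p;
    int error;
} Parser;

static void skip_ws(Parser *ps) {
    while (*ps->p == ' ' || *ps->p == '\t') ps->p++;
}

static int parse_expr(Parser *ps);

/* factor: NUMBER | '(' expr ')' */
static int parse_factor(Parser *ps) {
    skip_ws(ps);
    if (isdigit((unsigned char)*ps->p)) {
        int v = 0;
        while (isdigit((unsigned char)*ps->p)) v = v * 10 + (*ps->p++ - '0');
        return v;
    }
    if (*ps->p == '(') {
        ps->p++;
        int v = parse_expr(ps);
        skip_ws(ps);
        if (*ps->p == ')') ps->p++;
        else ps->error = 1; /* missing closing parenthesis */
        return v;
    }
    ps->error = 1; /* neither a number nor a parenthesized expression */
    return 0;
}

/* term: factor | '-' factor | term ('*' | '/') factor */
static int parse_term(Parser *ps) {
    int v;
    skip_ws(ps);
    if (*ps->p == '-') { ps->p++; v = -parse_factor(ps); }
    else v = parse_factor(ps);
    for (;;) {
        skip_ws(ps);
        char op = *ps->p;
        if (op != '*' && op != '/') return v;
        ps->p++;
        int rhs = parse_factor(ps);
        if (ps->error) return 0;
        if (op == '*') {
            v *= rhs;
        } else if (rhs == 0) {
            fprintf(stderr, "Error: Division by zero\n");
            ps->error = 1;
            return 0;
        } else {
            v /= rhs;
        }
    }
}

/* expr: term | expr ('+' | '-') term */
static int parse_expr(Parser *ps) {
    int v = parse_term(ps);
    for (;;) {
        skip_ws(ps);
        char op = *ps->p;
        if (op != '+' && op != '-') return v;
        ps->p++;
        int rhs = parse_term(ps);
        v = (op == '+') ? v + rhs : v - rhs;
    }
}

/* Evaluates one line; returns 0 and stores the value in *out on success, -1 on error. */
int evaluate(const char *text, int *out) {
    Parser ps = { text, 0 };
    int v = parse_expr(&ps);
    skip_ws(&ps);
    if (ps.error || (*ps.p != '\0' && *ps.p != '\n')) return -1;
    *out = v;
    return 0;
}
```

Because parse_expr() only ever combines complete terms, multiplication and division bind tighter than addition and subtraction, and the loops make every binary operator left-associative.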

Source Code:

calculator.l:

%{
#include <stdio.h>
#include <stdlib.h>
#include "y.tab.h"
%}

%%

[0-9]+  { yylval.ival = atoi(yytext); return NUMBER; }
[ \t]   ; // Ignore whitespace
"+"     { return ADD; }
"-"     { return SUB; }
"*"     { return MUL; }
"/"     { return DIV; }
"("     { return '('; }
")"     { return ')'; }
\n      { return '\n'; }
.       { fprintf(stderr, "Invalid character: %s\n", yytext); return yytext[0]; }

%%

int yywrap() {
    return 1;
}

calculator.y:

%{
#include <stdio.h>
#include <stdlib.h>

int yylex(void);
void yyerror(const char *s);
%}

%union {
    int ival;
}

%token <ival> NUMBER
%token ADD SUB MUL DIV
%type <ival> expr term factor

%left ADD SUB
%left MUL DIV

%%

input:
      line
    | input line
    ;

line:
      expr '\n' { printf("Result: %d\n", $1); }
    | '\n'
    ;

expr:
      term
    | expr ADD term { $$ = $1 + $3; }
    | expr SUB term { $$ = $1 - $3; }
    ;

term:
      factor
    | term MUL factor { $$ = $1 * $3; }
    | term DIV factor {
          if ($3 == 0) {
              yyerror("Division by zero");
              YYERROR;
          }
          $$ = $1 / $3;
      }
    | SUB factor { $$ = -$2; } /* Simplified unary minus handling */
    ;

factor:
      NUMBER
    | '(' expr ')' { $$ = $2; }
    ;

%%

int main() {
    printf("Simple Calculator\n");
    printf("Enter expressions (e.g., 2 + 3 * 4). Press Enter for a new line. Ctrl+D to exit.\n");
    if (yyparse() != 0) {
        fprintf(stderr, "Parsing failed\n");
        return 1;
    }
    return 0;
}

void yyerror(const char *s) {
    fprintf(stderr, "Error: %s\n", s);
}


Input/Output:
Conclusion:
The Lex and Yacc-based calculator successfully parses and evaluates
arithmetic expressions by following operator precedence and handling
errors such as division by zero. It serves as a foundational example of
compiler design, demonstrating lexical analysis and syntax parsing in
action.
EXPERIMENT 2:

Aim/Objective:

To implement an Abstract Syntax Tree (AST) based expression parser using
Lex and Yacc, which converts infix expressions into postfix notation and
handles various arithmetic operations including addition, subtraction,
multiplication, division, exponentiation, and unary minus.

Algorithm:

Lexical Analysis (parser_lex.l)

1. Define tokens for numbers (NUMBER), operators (+, -, *, /, ^), and parentheses ((, )).

2. Use regular expressions to identify numbers and whitespace.

3. Convert recognized numbers into integer values and return corresponding tokens to Yacc.

4. Handle unrecognized characters with an error message.

Syntax Parsing and AST Generation (parser_yacc.y)

1. Define tokens and operator precedence (+, -, *, /, ^ with right associativity for exponentiation).

2. Define grammar rules:

○ expression: Handles addition and subtraction.

○ term: Handles multiplication and division.

○ factor: Handles numbers, parentheses, exponentiation, and unary minus.

3. Construct an AST using createNode(), where:

○ Operators form internal nodes.

○ Operands (numbers) form leaf nodes.

4. Implement postfix traversal (generatePostfix()) to convert infix expressions into postfix notation.

5. Free allocated memory using freeAST().

6. Handle syntax errors with yyerror().

7. In main(), read input, parse expressions, generate and print postfix notation.
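
The AST steps in this algorithm can be sketched in C as follows. The listing in this report does not show createNode(), generatePostfix(), or freeAST(), so the Node layout and all three signatures below are assumptions made for illustration. Unary minus is left out of the sketch.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical AST node: op == 0 marks a leaf that holds a number. */
typedef struct Node {
    char op;
    int value;
    struct Node *left, *right;
} Node;

/* Allocates a node; operators are internal nodes, numbers are leaves. */
Node *createNode(char op, int value, Node *left, Node *right) {
    Node *n = malloc(sizeof *n);
    if (!n) {
        fprintf(stderr, "Out of memory\n");
        exit(1);
    }
    n->op = op;
    n->value = value;
    n->left = left;
    n->right = right;
    return n;
}

/* Post-order traversal: left subtree, right subtree, then the node itself.
   Appends "token " to buf, which must be a NUL-terminated string of capacity cap. */
void generatePostfix(const Node *n, char *buf, size_t cap) {
    if (!n) return;
    generatePostfix(n->left, buf, cap);
    generatePostfix(n->right, buf, cap);
    size_t len = strlen(buf);
    if (n->op == 0)
        snprintf(buf + len, cap - len, "%d ", n->value);
    else
        snprintf(buf + len, cap - len, "%c ", n->op);
}

/* Releases the whole tree, children before parent. */
void freeAST(Node *n) {
    if (!n) return;
    freeAST(n->left);
    freeAST(n->right);
    free(n);
}
```

In a Yacc grammar, a semantic action such as `$$ = createNode('+', 0, $1, $3);` would build the tree bottom-up. Traversing the tree for 2 + 3 * 4 then appends "2 3 4 * + " to the buffer.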

Source Code:
%{
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

// Operator stack (simple array implementation)
#define STACK_SIZE 100
char operator_stack[STACK_SIZE];
int stack_top = -1;

// Output buffer (for demonstration - can print directly)
#define OUTPUT_BUFFER_SIZE 200
char output_buffer[OUTPUT_BUFFER_SIZE];
int output_buffer_index = 0;

// Function to get operator precedence
int get_precedence(char operator) {
    switch (operator) {
        case '+':
        case '-':
            return 1;
        case '*':
        case '/':
            return 2;
        case '^':
            return 3; // Exponentiation
        default:
            return 0; // For parentheses or invalid operators
    }
}

// Function to push operator onto stack
void push_operator(char operator) {
    if (stack_top < STACK_SIZE - 1) {
        operator_stack[++stack_top] = operator;
    } else {
        fprintf(stderr, "Operator stack overflow!\n");
        exit(1);
    }
}

// Function to pop operator from stack
char pop_operator() {
    if (stack_top >= 0) {
        return operator_stack[stack_top--];
    } else {
        return '\0'; // Stack is empty
    }
}

// Function to peek at the top operator on stack
char peek_operator() {
    if (stack_top >= 0) {
        return operator_stack[stack_top];
    } else {
        return '\0'; // Stack is empty
    }
}

// Function to append to output buffer (or directly print)
void append_output(const char *token) {
    int token_len = strlen(token);
    if (output_buffer_index + token_len + 1 < OUTPUT_BUFFER_SIZE) {
        strcat(output_buffer + output_buffer_index, token);
        output_buffer_index += token_len;
        output_buffer[output_buffer_index++] = ' '; // Add space
        output_buffer[output_buffer_index] = '\0';
    } else {
        fprintf(stderr, "Output buffer overflow!\n");
        exit(1);
    }
}

void process_operator(char operator);
void flush_operators();
void reset_output_buffer();
%}

%%

[0-9]+      { append_output(yytext); }

[a-zA-Z]+   { append_output(yytext); }

"+"|"-"|"*"|"/"|"^" { process_operator(yytext[0]); }

"("         { push_operator('('); }

")"         {
                char operator;
                /* Pop operators to the output until the matching '(' is found */
                while ((operator = pop_operator()) != '\0' && operator != '(') {
                    char op_str[2] = {operator, '\0'};
                    append_output(op_str);
                }
                if (operator != '(') {
                    /* Stack ran empty without finding '(' */
                    fprintf(stderr, "Mismatched parentheses\n");
                }
            }

[ \t]       { /* Ignore whitespace */ }

\n          {
                flush_operators();
                printf("%s\n", output_buffer);
                reset_output_buffer();
                stack_top = -1; /* Reset operator stack for next line */
            }

.           { fprintf(stderr, "Invalid character: %s\n", yytext); }

%%

void process_operator(char operator) {
    char top_operator = peek_operator();
    while (top_operator != '\0' && top_operator != '(' &&
           ((get_precedence(top_operator) > get_precedence(operator)) ||
            (get_precedence(top_operator) == get_precedence(operator) &&
             operator != '^'))) { // Left associative except for ^
        char op_to_output[2] = {pop_operator(), '\0'}; // Correctly create string from char
        append_output(op_to_output);
        top_operator = peek_operator();
    }
    push_operator(operator);
}

void flush_operators() {
    while (stack_top >= 0) {
        char operator = pop_operator();
        if (operator == '(') {
            fprintf(stderr, "Mismatched parentheses (extra opening parenthesis)\n");
            return; // Or handle error as needed
        }
        char op_str[2] = {operator, '\0'};
        append_output(op_str);
    }
}

void reset_output_buffer() {
    output_buffer[0] = '\0';
    output_buffer_index = 0;
}

int yywrap() {
    return 1;
}

int main() {
    printf("Enter infix expressions (one per line, press Enter to evaluate):\n");
    reset_output_buffer(); // Initialize output buffer
    yylex();
    return 0;
}

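
The loop condition in process_operator() is the core of the conversion, so it may help to view it in isolation. The sketch below restates that condition as a standalone predicate; precedence_of() and should_pop() are hypothetical names introduced here, not functions from the listing.

```c
/* Same precedence levels as get_precedence() in the listing. */
int precedence_of(char op) {
    switch (op) {
        case '+': case '-': return 1;
        case '*': case '/': return 2;
        case '^':           return 3;
        default:            return 0; /* parentheses or unknown characters */
    }
}

/* Returns 1 when the operator on top of the stack must be written to the
   output before 'incoming' is pushed, and 0 otherwise. */
int should_pop(char top, char incoming) {
    if (top == '\0' || top == '(')
        return 0; /* empty stack or an open parenthesis blocks popping */
    int pt = precedence_of(top);
    int pi = precedence_of(incoming);
    /* Higher precedence always pops; equal precedence pops only for
       left-associative operators, which is everything except '^'. */
    return pt > pi || (pt == pi && incoming != '^');
}
```

For 8 - 3 - 2 the first '-' is popped when the second arrives, giving 8 3 - 2 -. For 2 ^ 3 ^ 2 the second '^' is stacked on top of the first, giving 2 3 2 ^ ^, which is the right-associative reading.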

Input/Output:
Conclusion:
The Lex program converts infix expressions into postfix notation using an
operator stack. Operators are written to the output according to their
precedence, with ^ treated as right-associative and all other operators as
left-associative. Mismatched parentheses and invalid characters are
reported as errors.
EXPERIMENT 3(a):

Aim/Objective:

To implement a Lex and Yacc-based parser that checks if an input string
follows the pattern "aⁿ bᵐ where m ≠ n". It counts occurrences of 'a' and 'b'
and determines whether the input follows the valid form.

Algorithm:

Lexical Analysis (Lex Code)

1. Count occurrences of 'a' and 'b' using yyleng:

○ a+ → Counts consecutive 'a' characters and returns token A.

○ b+ → Counts consecutive 'b' characters and returns token B.

2. Ignore other characters.

3. Return 0 when encountering a newline (\n) to indicate the end of input.

Syntax Parsing (Yacc Code)

1. Define tokens A and B representing sequences of 'a' and 'b'.

2. Define a rule:

○ S → A B: If a_count ≠ b_count, print "Valid: aⁿ bᵐ where m ≠ n".

○ Else, print "Invalid: aⁿ bⁿ".

3. Handle errors using yyerror().

4. In main(), read user input and parse it.
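
The counting logic behind the steps above can be sketched in plain C without Lex or Yacc. The function name check_an_bm() and its return codes are assumptions for illustration. Unlike the Lex rules, which skip unrecognized characters, this sketch rejects any input that is not a run of 'a' followed by a run of 'b'.

```c
#include <stddef.h>

/* Returns 1 if s is a^n b^m with n, m >= 1 and m != n,
   0 if the two counts are equal,
   -1 if s does not have the form a+ b+ (an optional trailing newline is allowed). */
int check_an_bm(const char *s) {
    size_t a_count = 0, b_count = 0;

    while (*s == 'a') { a_count++; s++; } /* corresponds to the a+ token */
    while (*s == 'b') { b_count++; s++; } /* corresponds to the b+ token */
    if (*s == '\n') s++;                  /* newline ends the input */

    if (*s != '\0' || a_count == 0 || b_count == 0)
        return -1;
    return a_count != b_count;
}
```

For example, "aabbb" yields 1 (valid), "aabb" yields 0 (equal counts), and "ba" yields -1 because the 'b' run comes first.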

Source Code:

YACC Code:

%{
#include <stdio.h>
#include <stdlib.h>

extern int a_count, b_count; // Access from Lex

void yyerror(const char *s);
int yylex();
%}

%token A B

%%

S : A B {
        if (a_count != b_count)
            printf("Valid: a^n b^m where m ≠ n\n");
        else
            printf("Invalid: a^n b^n\n");
    }
  ;

%%

void yyerror(const char *s) {
    printf("Error: %s\n", s);
}

int main() {
    printf("Enter a string: ");
    yyparse();
    return 0;
}

LEX CODE :

%{
#include "y.tab.h"
#include <string.h>

int a_count = 0, b_count = 0; // To store counts of 'a' and 'b'
%}

%%

a+  { a_count = yyleng; return A; }
b+  { b_count = yyleng; return B; }
\n  { return 0; }
.   { /* Ignore other characters */ }

%%

int yywrap() { return 1; }

Input and Output:


Conclusion:
This Lex and Yacc-based implementation successfully verifies whether an
input string follows the pattern "aⁿ bᵐ where m ≠ n" by counting occurrences
of 'a' and 'b'. It demonstrates lexical analysis, syntax parsing, and token-
based validation.

EXPERIMENT 3(b):

Aim/Objective:

This Lex and Yacc-based parser checks whether an input string follows the
pattern:
S → AB (BBAA)ⁿ BBA (BA)ᵐ
where n, m ≥ 0 are independent of each other and:

● "AB" is a required prefix.

● "BBAA" can repeat zero or more times ((BBAA)ⁿ).

● "BBA" follows after "AB" and optional "BBAA".

● "BA" can repeat zero or more times ((BA)ᵐ).

Algorithm:
Lexical Analysis (Lex Code)

1. Identify specific patterns in the input:

○ "ab" → Token AB

○ "bbaa" → Token BBAA

○ "bba" → Token BBA

○ "ba" → Token BA

2. Ignore other characters.

3. Stop processing on encountering a newline (\n).

Syntax Parsing (Yacc Code)

1. Define tokens AB, BBAA, BBA, and BA.


2. Define grammar rules:

○ S → AB P BBA Q

○ P → (BBAA)* (zero or more repetitions of "bbaa")

○ Q → (BA)* (zero or more repetitions of "ba")

3. If the input matches the grammar, print "Valid string".

4. Handle errors with yyerror().

5. Use yyparse() in main() to parse user input.
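
The grammar above describes the regular pattern ab (bbaa)* bba (ba)*, so it can also be checked with a short hand-written matcher in C. The helper eat() and the function matches_pattern() are hypothetical names used only for this sketch. Unlike the Lex rules, which skip unrecognized characters, the sketch rejects them.

```c
#include <string.h>

/* Consumes the prefix 'tok' at *s if it is present; returns 1 on success. */
static int eat(const char **s, const char *tok) {
    size_t n = strlen(tok);
    if (strncmp(*s, tok, n) != 0)
        return 0;
    *s += n;
    return 1;
}

/* Returns 1 if s matches ab (bbaa)* bba (ba)*, else 0.
   An optional trailing newline is allowed. */
int matches_pattern(const char *s) {
    if (!eat(&s, "ab")) return 0;   /* S -> AB ...            */
    while (eat(&s, "bbaa")) { }     /* P -> (BBAA)*           */
    if (!eat(&s, "bba")) return 0;  /* mandatory BBA          */
    while (eat(&s, "ba")) { }       /* Q -> (BA)*             */
    if (*s == '\n') s++;
    return *s == '\0';              /* nothing may follow     */
}
```

Trying "bbaa" before "bba" mirrors the longest-match rule that Lex applies to overlapping patterns. For example, "abbba" and "abbbaabbababa" are accepted, while "abba" is rejected because the mandatory "bba" is missing.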

Source Code:

YACC CODE:

%{
#include <stdio.h>
#include <stdlib.h>

void yyerror(const char *s);
int yylex();
%}

%token AB BBAA BBA BA

%%

S : AB P BBA Q { printf("Valid string\n"); }
  ;

P : /* Empty */
  | BBAA P /* (bbaa)^n */
  ;

Q : /* Empty */
  | BA Q /* (ba)^m */
  ;

%%

void yyerror(const char *s) {
    printf("Error: %s\n", s);
}

int main() {
    printf("Enter a string: ");
    yyparse();
    return 0;
}

LEX CODE:

%{
#include "y.tab.h"
%}

%%

ab      { return AB; }
bbaa    { return BBAA; }
bba     { return BBA; }
ba      { return BA; }
\n      { return 0; }
.       { /* Ignore other characters */ }

%%

int yywrap() { return 1; }

Input and Output:

Conclusion:

This Lex and Yacc parser validates whether an input string follows the
pattern "AB (BBAA)ⁿ BBA (BA)ᵐ", where the two repetition counts are
independent. It demonstrates lexical tokenization, syntax analysis,
and pattern validation.
