Name: Teeshta Parmar
Registration Number: 22BPS1073
ASSESSMENT 4
EXPERIMENT 1:
Aim/Objective:
To implement a simple calculator using Lex and Yacc that can evaluate
arithmetic expressions involving addition, subtraction, multiplication, and
division, while also handling parentheses and unary minus.
Algorithm:
Lex (calculator.l) - Tokenizer
1. Define token patterns for numbers (NUMBER), arithmetic operators (+,
-, *, /), parentheses ((, )), and whitespace.
2. Convert numbers from text to integer values using atoi(yytext).
3. Return corresponding tokens to be used by Yacc.
4. Handle errors for invalid characters.
5. Define yywrap() to indicate the end of input.
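The tokenizer steps above can be sketched in plain C, without Lex, to show what the generated scanner does for each pattern. This is a minimal sketch: the TokenKind and Token types and the next_token() function are hypothetical names introduced for illustration, and strtol() stands in for atoi() because it also reports where the number ends.

```c
#include <ctype.h>
#include <stdlib.h>

/* Hypothetical token kinds; they mirror the tokens listed in the algorithm. */
typedef enum {
    T_NUMBER, T_ADD, T_SUB, T_MUL, T_DIV,
    T_LPAREN, T_RPAREN, T_NEWLINE, T_INVALID, T_END
} TokenKind;

typedef struct {
    TokenKind kind;
    int value;              /* meaningful only when kind == T_NUMBER */
} Token;

/* Reads one token starting at *cursor and advances *cursor past it. */
Token next_token(const char **cursor) {
    const char *p = *cursor;
    Token tok = { T_END, 0 };

    while (*p == ' ' || *p == '\t') p++;        /* ignore whitespace */

    if (*p == '\0') {
        tok.kind = T_END;                       /* end of input */
    } else if (isdigit((unsigned char)*p)) {
        char *end;
        tok.kind = T_NUMBER;
        tok.value = (int)strtol(p, &end, 10);   /* text to integer */
        p = end;
    } else {
        switch (*p) {
            case '+':  tok.kind = T_ADD;     break;
            case '-':  tok.kind = T_SUB;     break;
            case '*':  tok.kind = T_MUL;     break;
            case '/':  tok.kind = T_DIV;     break;
            case '(':  tok.kind = T_LPAREN;  break;
            case ')':  tok.kind = T_RPAREN;  break;
            case '\n': tok.kind = T_NEWLINE; break;
            default:   tok.kind = T_INVALID; break; /* invalid character */
        }
        p++;
    }
    *cursor = p;
    return tok;
}
```

Calling next_token() repeatedly on the text 12 + 3 would yield a NUMBER with value 12, an ADD, a NUMBER with value 3, and then the end marker.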
Yacc (calculator.y) - Parser and Evaluator
1. Define token types for numbers and arithmetic operators.
2. Set operator precedence (+, - have lower precedence than *, /).
3. Define grammar rules:
○ input: Accepts multiple lines of expressions.
○ expr: Handles addition and subtraction.
○ term: Handles multiplication and division (with a check for
division by zero) and unary minus.
○ factor: Handles numbers and parentheses.
4. Implement semantic actions to evaluate expressions.
5. Print the result when a valid expression is parsed.
6. Handle syntax errors with yyerror().
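The layering of expr, term, and factor is what gives * and / higher precedence than + and -. The same layering can be sketched as a hand-written recursive-descent evaluator in C, with one function per grammar rule. This is only an illustration of the idea, not the Yacc-generated parser: the Parser struct and the evaluate() function are hypothetical names, and unary minus is placed in factor here (the Yacc source places it in term) so that inputs such as 2 * -3 are also accepted.

```c
#include <ctype.h>
#include <stdlib.h>

/* Hypothetical parser state for this sketch. */
typedef struct {
    const char *pos;    /* next unread character */
    int error;          /* set to 1 on a syntax error or division by zero */
} Parser;

static int parse_expr(Parser *p);

static void skip_spaces(Parser *p) {
    while (*p->pos == ' ' || *p->pos == '\t') p->pos++;
}

/* factor: NUMBER | '(' expr ')' | '-' factor */
static int parse_factor(Parser *p) {
    skip_spaces(p);
    if (*p->pos == '-') {
        p->pos++;
        return -parse_factor(p);
    }
    if (*p->pos == '(') {
        p->pos++;
        int value = parse_expr(p);
        skip_spaces(p);
        if (*p->pos == ')') p->pos++;
        else p->error = 1;              /* missing closing parenthesis */
        return value;
    }
    if (isdigit((unsigned char)*p->pos)) {
        char *end;
        int value = (int)strtol(p->pos, &end, 10);
        p->pos = end;
        return value;
    }
    p->error = 1;                       /* unexpected character or end of input */
    return 0;
}

/* term: factor { ('*' | '/') factor } */
static int parse_term(Parser *p) {
    int value = parse_factor(p);
    for (;;) {
        skip_spaces(p);
        char op = *p->pos;
        if (op != '*' && op != '/') return value;
        p->pos++;
        int rhs = parse_factor(p);
        if (op == '*') {
            value *= rhs;
        } else if (rhs == 0) {
            p->error = 1;               /* division by zero */
            return 0;
        } else {
            value /= rhs;
        }
    }
}

/* expr: term { ('+' | '-') term } */
static int parse_expr(Parser *p) {
    int value = parse_term(p);
    for (;;) {
        skip_spaces(p);
        char op = *p->pos;
        if (op != '+' && op != '-') return value;
        p->pos++;
        int rhs = parse_term(p);
        value = (op == '+') ? value + rhs : value - rhs;
    }
}

/* Returns 0 and stores the value in *result on success, -1 on any error. */
int evaluate(const char *text, int *result) {
    Parser p = { text, 0 };
    int value = parse_expr(&p);
    skip_spaces(&p);
    if (p.error || *p.pos != '\0') return -1;
    *result = value;
    return 0;
}
```

Because parse_expr() only ever combines complete terms, the multiplication in 2 + 3 * 4 is evaluated before the addition, which is the same effect the grammar rules achieve in Yacc.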
Source Code:
calculator.l:
%{
#include <stdio.h>
#include <stdlib.h>
#include "y.tab.h"
%}
%%
[0-9]+ { yylval.ival = atoi(yytext); return NUMBER; }
[ \t] ; /* Ignore whitespace */
"+" { return ADD; }
"-" { return SUB; }
"*" { return MUL; }
"/" { return DIV; }
"(" { return '('; }
")" { return ')'; }
\n { return '\n'; }
. { fprintf(stderr, "Invalid character: %s\n", yytext); return yytext[0]; }
%%
int yywrap() {
    return 1;
}
calculator.y:
%{
#include <stdio.h>
#include <stdlib.h>
int yylex(void);
void yyerror(const char *s);
%}
%union {
    int ival;
}
%token <ival> NUMBER
%token ADD SUB MUL DIV
%type <ival> expr term factor
%left ADD SUB
%left MUL DIV
%%
input:
      line
    | input line
    ;
line:
      expr '\n' { printf("Result: %d\n", $1); }
    | '\n'
    ;
expr:
      term
    | expr ADD term { $$ = $1 + $3; }
    | expr SUB term { $$ = $1 - $3; }
    ;
term:
      factor
    | term MUL factor { $$ = $1 * $3; }
    | term DIV factor {
          /* Braces are required so that YYERROR runs only on division by zero */
          if ($3 == 0) {
              yyerror("Division by zero");
              YYERROR;
          }
          $$ = $1 / $3;
      }
    | SUB factor { $$ = -$2; } /* Simplified unary minus handling */
    ;
factor:
      NUMBER
    | '(' expr ')' { $$ = $2; }
    ;
%%
int main() {
    printf("Simple Calculator\n");
    printf("Enter expressions (e.g., 2 + 3 * 4). Press Enter for a new line. Ctrl+D to exit.\n");
    if (yyparse() != 0) {
        fprintf(stderr, "Parsing failed\n");
        return 1;
    }
    return 0;
}
void yyerror(const char *s) {
    fprintf(stderr, "Error: %s\n", s);
}
Input/Output:
Conclusion:
The Lex and Yacc-based calculator parses and evaluates arithmetic
expressions according to operator precedence and reports errors such as
division by zero and invalid characters. It serves as a foundational example
of compiler design, demonstrating lexical analysis and syntax parsing in
action.
EXPERIMENT 2:
Aim/Objective:
To implement an Abstract Syntax Tree (AST) based expression parser using
Lex and Yacc, which converts infix expressions into postfix notation and
handles various arithmetic operations including addition, subtraction,
multiplication, division, exponentiation, and unary minus.
Algorithm:
Lexical Analysis (parser_lex.l)
1. Define tokens for numbers (NUMBER), operators (+, -, *, /,
^), and parentheses ((, )).
2. Use regular expressions to identify numbers and whitespace.
3. Convert recognized numbers into integer values and return
corresponding tokens to Yacc.
4. Handle unrecognized characters with an error message.
Syntax Parsing and AST Generation (parser_yacc.y)
1. Define tokens and operator precedence (+, -, *, /, ^ with right
associativity for exponentiation).
2. Define grammar rules:
○ expression: Handles addition and subtraction.
○ term: Handles multiplication and division.
○ factor: Handles numbers, parentheses, exponentiation,
and unary minus.
3. Construct an AST using createNode(), where:
○ Operators form internal nodes.
○ Operands (numbers) form leaf nodes.
4. Implement postfix traversal (generatePostfix()) to convert
infix expressions into postfix notation.
5. Free allocated memory using freeAST().
6. Handle syntax errors with yyerror().
7. In main(), read input, parse expressions, generate and print postfix
notation.
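The AST steps above can be sketched in C as follows. The function names createNode(), generatePostfix(), and freeAST() come from the algorithm, but the Node layout and all parameter lists shown here are assumptions made for illustration. Operators are internal nodes, numbers are leaves, and a post-order traversal (left subtree, right subtree, then the node) produces the postfix form.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Assumed node layout: op is 0 for a number leaf, else the operator character. */
typedef struct Node {
    char op;
    int value;              /* used only by leaves */
    struct Node *left;
    struct Node *right;
} Node;

/* Allocates a node; the parameter list is an assumption of this sketch. */
Node *createNode(char op, int value, Node *left, Node *right) {
    Node *n = malloc(sizeof *n);
    if (n == NULL) {
        fprintf(stderr, "Out of memory\n");
        exit(1);
    }
    n->op = op;
    n->value = value;
    n->left = left;
    n->right = right;
    return n;
}

/* Post-order traversal that appends "token " to the NUL-terminated buffer out. */
void generatePostfix(const Node *n, char *out, size_t cap) {
    if (n == NULL) return;
    generatePostfix(n->left, out, cap);
    generatePostfix(n->right, out, cap);
    size_t used = strlen(out);
    if (used >= cap) return;            /* defensive: buffer not terminated in range */
    if (n->op == 0)
        snprintf(out + used, cap - used, "%d ", n->value);
    else
        snprintf(out + used, cap - used, "%c ", n->op);
}

/* Frees children before the node itself so no pointer is used after free. */
void freeAST(Node *n) {
    if (n == NULL) return;
    freeAST(n->left);
    freeAST(n->right);
    free(n);
}
```

For a tree built by hand for 2 + 3 * 4, where the * node is the right child of the + node, the traversal writes 2 3 4 * + into the buffer. In a Yacc grammar, each semantic action would call createNode() with the subtrees of its operands.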
Source Code:
%{
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

// Operator stack (simple array implementation)
#define STACK_SIZE 100
char operator_stack[STACK_SIZE];
int stack_top = -1;

// Output buffer (for demonstration - can print directly)
#define OUTPUT_BUFFER_SIZE 200
char output_buffer[OUTPUT_BUFFER_SIZE];
int output_buffer_index = 0;

// Function to get operator precedence
int get_precedence(char operator) {
    switch (operator) {
        case '+':
        case '-':
            return 1;
        case '*':
        case '/':
            return 2;
        case '^':
            return 3; // Exponentiation
        default:
            return 0; // For parentheses or invalid operators
    }
}

// Function to push operator onto stack
void push_operator(char operator) {
    if (stack_top < STACK_SIZE - 1) {
        operator_stack[++stack_top] = operator;
    } else {
        fprintf(stderr, "Operator stack overflow!\n");
        exit(1);
    }
}

// Function to pop operator from stack
char pop_operator() {
    if (stack_top >= 0) {
        return operator_stack[stack_top--];
    } else {
        return '\0'; // Stack is empty
    }
}

// Function to peek at the top operator on stack
char peek_operator() {
    if (stack_top >= 0) {
        return operator_stack[stack_top];
    } else {
        return '\0'; // Stack is empty
    }
}

// Function to append to output buffer (or directly print)
void append_output(const char *token) {
    int token_len = strlen(token);
    if (output_buffer_index + token_len + 1 < OUTPUT_BUFFER_SIZE) {
        strcat(output_buffer + output_buffer_index, token);
        output_buffer_index += token_len;
        output_buffer[output_buffer_index++] = ' '; // Add space
        output_buffer[output_buffer_index] = '\0';
    } else {
        fprintf(stderr, "Output buffer overflow!\n");
        exit(1);
    }
}

void process_operator(char operator);
void flush_operators();
void reset_output_buffer();
%}
%%
[0-9]+ {
    append_output(yytext);
}
[a-zA-Z]+ {
    append_output(yytext);
}
"+"|"-"|"*"|"/"|"^" {
    process_operator(yytext[0]);
}
"(" {
    push_operator('(');
}
")" {
    char operator;
    /* Pop operators to the output until the matching '(' is found */
    while ((operator = pop_operator()) != '\0' && operator != '(') {
        char op_str[2] = {operator, '\0'};
        append_output(op_str);
    }
    /* The loop ends on '(' or on an empty stack; only the latter is an error */
    if (operator != '(') {
        fprintf(stderr, "Mismatched parentheses\n"); // Should not happen in valid expressions
    }
}
[ \t] { /* Ignore whitespace */ }
\n {
    flush_operators();
    printf("%s\n", output_buffer);
    reset_output_buffer();
    stack_top = -1; // Reset operator stack for next line
}
. {
    fprintf(stderr, "Invalid character: %s\n", yytext);
}
%%
void process_operator(char operator) {
    char top_operator = peek_operator();
    while (top_operator != '\0' && top_operator != '(' &&
           ((get_precedence(top_operator) > get_precedence(operator)) ||
            (get_precedence(top_operator) == get_precedence(operator) &&
             operator != '^'))) { // Left associative except for ^
        char op_to_output[2] = {pop_operator(), '\0'}; // Correctly create string from char
        append_output(op_to_output);
        top_operator = peek_operator();
    }
    push_operator(operator);
}

void flush_operators() {
    while (stack_top >= 0) {
        char operator = pop_operator();
        if (operator == '(') {
            fprintf(stderr, "Mismatched parentheses (extra opening parenthesis)\n");
            return; // Or handle error as needed
        }
        char op_str[2] = {operator, '\0'};
        append_output(op_str);
    }
}

void reset_output_buffer() {
    output_buffer[0] = '\0';
    output_buffer_index = 0;
}

int yywrap() {
    return 1;
}

int main() {
    printf("Enter infix expressions (one per line, press Enter to evaluate):\n");
    reset_output_buffer(); // Initialize output buffer
    yylex();
    return 0;
}
Input/Output:
Conclusion:
The program converts infix expressions into postfix notation while
respecting operator precedence, left associativity for +, -, *, and /, and
right associativity for ^. Mismatched parentheses and invalid characters are
reported as errors, and the operator stack and output buffer are reset after
each input line.
EXPERIMENT 3(a):
Aim/Objective:
To implement a Lex and Yacc-based parser that checks if an input string
follows the pattern "aⁿ bᵐ where m ≠ n". It counts occurrences of 'a' and 'b'
and determines whether the input follows the valid form.
Algorithm:
Lexical Analysis (Lex Code)
1. Count occurrences of 'a' and 'b' using yyleng:
○ a+ → Counts consecutive 'a' characters and returns token A.
○ b+ → Counts consecutive 'b' characters and returns token B.
2. Ignore other characters.
3. Return 0 when encountering a newline (\n) to indicate the end of
input.
Syntax Parsing (Yacc Code)
1. Define tokens A and B representing sequences of 'a' and 'b'.
2. Define a rule:
○ S → A B: If a_count ≠ b_count, print "Valid: aⁿ bᵐ where m ≠ n".
○ Else, print "Invalid: aⁿ bⁿ".
3. Handle errors using yyerror().
4. In main(), read user input and parse it.
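The check performed by the lexer and the S → A B rule can be sketched as one plain C function: count the run of 'a' characters, count the run of 'b' characters, and compare the counts. The function name is_an_bm_unequal() is hypothetical. One assumption differs from the Lex code: this sketch rejects any other character instead of ignoring it, and, like the grammar, it requires at least one 'a' and one 'b'.

```c
#include <stddef.h>

/* Returns 1 if s has the form a^n b^m with n >= 1, m >= 1 and m != n, else 0. */
int is_an_bm_unequal(const char *s) {
    size_t a_count = 0, b_count = 0;

    while (*s == 'a') { a_count++; s++; }   /* corresponds to the a+ pattern */
    while (*s == 'b') { b_count++; s++; }   /* corresponds to the b+ pattern */

    if (*s != '\0') return 0;               /* trailing or foreign characters */
    if (a_count == 0 || b_count == 0) return 0; /* S -> A B needs both tokens */
    return a_count != b_count;
}
```

With this function, aab and abb are accepted, while aabb (equal counts) and ba (wrong order) are rejected.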
Source Code:
YACC Code:
%{
#include <stdio.h>
#include <stdlib.h>
extern int a_count, b_count; // Access from Lex
void yyerror(const char *s);
int yylex();
%}
%token A B
%%
S : A B {
        if (a_count != b_count)
            printf("Valid: a^n b^m where m ≠ n\n");
        else
            printf("Invalid: a^n b^n\n");
    }
  ;
%%
void yyerror(const char *s) {
    printf("Error: %s\n", s);
}
int main() {
    printf("Enter a string: ");
    yyparse();
    return 0;
}
LEX CODE :
%{
#include "y.tab.h"
#include <string.h>
int a_count = 0, b_count = 0; // To store counts of 'a' and 'b'
%}
%%
a+ { a_count = yyleng; return A; }
b+ { b_count = yyleng; return B; }
\n { return 0; }
. { /* Ignore other characters */ }
%%
int yywrap() { return 1; }
Input and Output:
Conclusion:
This Lex and Yacc-based implementation successfully verifies whether an
input string follows the pattern "aⁿ bᵐ where m ≠ n" by counting occurrences
of 'a' and 'b'. It demonstrates lexical analysis, syntax parsing, and token-
based validation.
EXPERIMENT 3(b):
Aim/Objective:
This Lex and Yacc-based parser checks whether an input string follows the
pattern:
S → AB (BBAA)ⁿ BBA (BA)ⁿ
where:
● "AB" is a required prefix.
● "BBAA" can repeat zero or more times ((BBAA)ⁿ).
● "BBA" follows after "AB" and optional "BBAA".
● "BA" can repeat zero or more times ((BA)ⁿ).
Algorithm:
Lexical Analysis (Lex Code)
1. Identify specific patterns in the input:
○ "ab" → Token AB
○ "bbaa" → Token BBAA
○ "bba" → Token BBA
○ "ba" → Token BA
2. Ignore other characters.
3. Stop processing on encountering a newline (\n).
Syntax Parsing (Yacc Code)
1. Define tokens AB, BBAA, BBA, and BA.
2. Define grammar rules:
○ S → AB P BBA Q
○ P → (BBAA)* (zero or more repetitions of "bbaa")
○ Q → (BA)* (zero or more repetitions of "ba")
3. If the input matches the grammar, print "Valid string".
4. Handle errors with yyerror().
5. Use yyparse() in main() to parse user input.
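The grammar S → AB P BBA Q can also be checked directly on the string, which makes the token boundaries easy to see. The sketch below is an illustration in plain C, with hypothetical helper names starts_with() and matches_pattern(). Like the rules for P and Q, it lets the two repetition counts vary independently. Consuming "bbaa" greedily is safe here because a "bba" can only be followed by 'b' or the end of the string, so "bbaa" never overlaps a valid "bba".

```c
#include <string.h>

/* Returns 1 if s begins with prefix, else 0. */
static int starts_with(const char *s, const char *prefix) {
    return strncmp(s, prefix, strlen(prefix)) == 0;
}

/* Returns 1 if s matches ab (bbaa)* bba (ba)*, else 0. */
int matches_pattern(const char *s) {
    if (!starts_with(s, "ab")) return 0;    /* required prefix AB */
    s += 2;
    while (starts_with(s, "bbaa")) s += 4;  /* P: zero or more BBAA */
    if (!starts_with(s, "bba")) return 0;   /* required BBA */
    s += 3;
    while (starts_with(s, "ba")) s += 2;    /* Q: zero or more BA */
    return *s == '\0';                      /* nothing may follow */
}
```

Under these rules the shortest accepted string is abbba (AB followed by BBA), and abbbaabbaba splits into AB, BBAA, BBA, BA.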
Source Code:
YACC CODE:
%{
#include <stdio.h>
#include <stdlib.h>
void yyerror(const char *s);
int yylex();
%}
%token AB BBAA BBA BA
%%
S : AB P BBA Q { printf("Valid string\n"); }
  ;
P : /* Empty */
  | BBAA P /* (bbaa)^n */
  ;
Q : /* Empty */
  | BA Q /* (ba)^n */
  ;
%%
void yyerror(const char *s) {
    printf("Error: %s\n", s);
}
int main() {
    printf("Enter a string: ");
    yyparse();
    return 0;
}
LEX CODE:
%{
#include "y.tab.h"
%}
%%
ab { return AB; }
bbaa { return BBAA; }
bba { return BBA; }
ba { return BA; }
\n { return 0; }
. { /* Ignore other characters */ }
%%
int yywrap() { return 1; }
Input and Output:
Conclusion:
This Lex and Yacc parser validates whether an input string follows the
pattern "AB (BBAA)ⁿ BBA (BA)ⁿ". It demonstrates lexical tokenization, syntax
analysis, and pattern validation.