CS341
CONCEPT OF
PROGRAMMING
LANGUAGES
Ch1 : Preliminaries
Dr. Nada Mobark
COURSE GOALS
▪ Survey of language design issues and their implications for
translation and run-time support.
▪ Overview of modern programming languages and their
features including abstract data and control structures,
binding and scope rules, subprograms, parameter passing
mechanisms, Exception Handling, as well as support for
concurrency.
▪ Describe different paradigms of programming languages
such as: Object-oriented, functional, and Logic programming
languages.
Dr. Nada Mobark, 2025 2
TEXTBOOK
Robert W. Sebesta, Concepts
of Programming Languages
(12th edition), Pearson
Education (2019).
ISBN 9780134997186
SOFTWARE
▪ C, C++, Fortran, and Ada: gcc.gnu.org
▪ C# and F#: microsoft.com
▪ Java: java.sun.com
▪ Scheme: www.plt-scheme.org/software/drscheme
▪ Python: www.python.org
▪ Ruby: www.ruby-lang.org
▪ JavaScript is included in virtually all browsers; PHP is included in
virtually all Web servers
ALL IN ONE!
TOPICS
1.1 Reasons for Studying Concepts of Programming Languages
1.2 Programming Domains
1.3 Language Evaluation Criteria
1.4 Influences on Language Design
1.5 Language Categories
1.6 Language Design Trade-Offs
1.7 Language Implementation Methods
WHY CONCEPTS OF PROGRAMMING LANGUAGES??
▪ Better use of languages that are already known
▪ New features and unknown constructs
▪ Increased ability to express ideas
▪ Increased ability to learn new languages
▪ See how concepts are incorporated into the design of a language
▪ Improved background for choosing appropriate languages
▪ Choose based on features rather than familiarity
▪ Better understanding of significance of implementation
▪ Understanding design issues leads to intelligent use
JUST WONDERING!
▪ How Many Computer Programming Languages Are There?
▪ According to Wikipedia, there are about 700 programming
languages
▪ Other sources that only list notable languages still count up to an
impressive 245 languages.
▪ Another list called HOPL, that claims to include every
programming language to ever exist, puts the total number of
programming languages at 8,945.
PROGRAMMING DOMAINS
▪ Scientific applications: large numbers of floating-point
computations; use of arrays (Fortran)
▪ Business applications: produce reports; use decimal numbers and
characters (COBOL)
▪ Artificial intelligence: symbols rather than numbers manipulated;
use of linked lists (LISP)
▪ Systems programming: need efficiency because of continuous use (C)
▪ Web software: eclectic collection of languages: markup (e.g., HTML),
scripting (e.g., PHP)
LANGUAGE EVALUATION CRITERIA
▪ Readability: the ease with which programs can be read and understood
▪ Writability: the ease with which a language can be used to create
programs
▪ Reliability: conformance to specifications
▪ Cost: the ultimate total cost
READABILITY
▪ Ease of maintenance is determined in large part by the
readability of programs
▪ Characteristics that affect readability:
▪ Syntax design
▪ Meaningful keywords that indicate their purpose
▪ Special words and methods of forming compound statements (e.g., end if
in Ada)
▪ Simplicity
▪ A manageable set of features and constructs
▪ Minimal feature multiplicity
▪ Orthogonality
▪ A relatively small set of primitive constructs can be combined in a
relatively small number of ways, where every possible combination is
legal → fewer exceptions
WRITABILITY
▪ Writability must be considered in the context of the target
problem domain of a language
▪ Visual Basic vs. C for GUI applications
▪ Characteristics that affect writability:
▪ Simplicity and orthogonality
▪ Few constructs, a small number of primitives, a small set of rules for
combining them
▪ Expressivity
▪ A set of relatively convenient ways of specifying operations
▪ E.g., for loops simplify the writing of counting loops
RELIABILITY
▪ A program is said to be reliable if it performs to its
specification under all conditions
▪ Related characteristics:
▪ Type checking
▪ Testing for type errors (e.g., in function parameters)
▪ Exception handling
▪ Intercept run-time errors and take corrective measures
▪ Aliasing:
▪ Different names to the same memory cell.
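Aliasing can be illustrated with a minimal Python sketch. The variable names below are hypothetical and chosen only for illustration; the point is that a write through one name silently changes what the other name sees, which is why aliasing is listed as a reliability concern.

```python
# Two names bound to the same list object: an alias.
scores = [70, 85, 90]
alias = scores            # no copy is made; both names refer to one object
alias[0] = 0              # a write through one name...
print(scores[0])          # ...is visible through the other name

# An explicit copy breaks the aliasing.
independent = list(scores)
independent[1] = -1
print(scores[1])          # unchanged by the write to the copy
```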
COST
▪ Training programmers to use the language
▪ Function of simplicity and orthogonality
▪ Writing programs
▪ closeness to particular applications
▪ Reliability: poor reliability leads to high costs
▪ Critical apps → very high
▪ Non-critical → lost future business or lawsuits
▪ Maintaining programs:
▪ Usually done by different programmers → readability is an issue!
▪ For large software systems with relatively long lifetimes, maintenance
costs can be as high as two to four times development costs
EVALUATION CRITERIA: OTHER
▪ Portability
▪ The ease with which programs can be moved from one
implementation to another
▪ Generality
▪ The applicability to a wide range of applications
▪ Well-definedness
▪ The completeness and precision of the language’s official
definition
INFLUENCES ON LANGUAGE DESIGN
▪ Computer Architecture
▪ Languages are developed around the prevalent computer architecture,
known as the von Neumann architecture
▪ Program Design Methodologies
▪ New software development methodologies (e.g., object-oriented
software development) led to new programming paradigms and by
extension, new programming languages
▪ 1950s and early 1960s: focus on machine efficiency
▪ Late 1960s: structured programming (top-down design)
▪ Late 1970s: process-oriented to data-oriented
▪ Middle 1980s: object-oriented programming
LANGUAGE CATEGORIES
LANGUAGE DESIGN TRADE-OFFS
▪ Readability vs. writability
▪ Example: APL provides many powerful operators (and a large
number of new symbols), allowing complex computations to be
written in a compact program but at the cost of poor readability
▪ Reliability vs. cost of execution
▪ Example: Java demands all references to array elements be
checked for proper indexing, which leads to increased execution
cost
▪ Writability (flexibility) vs. reliability
▪ Example: C++ pointers are powerful and very flexible but are
unreliable
▪ The easier a program is to write, the more likely it is to be correct!
IMPLEMENTATION METHODS
▪ Translate high-level program (source language) into machine
code (machine language)
▪ Slow translation, fast execution
▪ Compilation process has several phases
IMPLEMENTATION METHODS
▪ Programs are interpreted by another
program known as an interpreter
▪ No translation
▪ Interpreter is a virtual machine with fetch-
decode-execute cycle
▪ produces a result from a program statement
▪ Easier implementation of programs (run-
time errors can easily and immediately be
displayed)
▪ Slower execution (10 to 100 times)
▪ Due to statement decoding
▪ Now rare for traditional high-level
languages
▪ Significant comeback with some Web scripting
languages (e.g., JavaScript, PHP)
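The fetch-decode-execute cycle of an interpreter can be sketched in a few lines of Python. This is only an illustrative toy: the instruction set (PUSH, ADD, MUL, PRINT) and the function name `run` are assumptions made for this example, not part of any real interpreter.

```python
def run(program):
    """Interpret a list of (opcode, operand) pairs on a tiny stack machine."""
    stack, output, pc = [], [], 0
    while pc < len(program):
        opcode, operand = program[pc]      # fetch the next instruction
        pc += 1
        if opcode == "PUSH":               # decode, then execute
            stack.append(operand)
        elif opcode == "ADD":
            right, left = stack.pop(), stack.pop()
            stack.append(left + right)
        elif opcode == "MUL":
            right, left = stack.pop(), stack.pop()
            stack.append(left * right)
        elif opcode == "PRINT":
            output.append(stack.pop())
        else:
            raise ValueError(f"unknown opcode: {opcode}")
    return output

# Hypothetical program computing (2 + 3) * 4
program = [("PUSH", 2), ("PUSH", 3), ("ADD", None),
           ("PUSH", 4), ("MUL", None), ("PRINT", None)]
result = run(program)
print(result)
```

Every instruction is decoded each time it is executed, which is the source of the slowdown mentioned above.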
COMPARISON
https://www.programiz.com/article/difference-compiler-interpreter
IMPLEMENTATION METHODS
▪ Hybrid Implementation Systems
▪ A compromise between compilers and pure
interpreters
▪ A high-level language program is
translated to an intermediate language
that allows easy interpretation
▪ Faster than pure interpretation
▪ Use: Small and medium systems when
efficiency is not the first concern
https://www.tutorialspoint.com/execute_ruby_online.php
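CPython itself works along these hybrid lines: source text is first translated to bytecode, and the bytecode is then interpreted. The sketch below uses the built-in `compile` and `exec` functions and the standard `dis` module to make the two steps visible; the variable names `total` and `rate` are assumptions for illustration.

```python
import dis

source = "total = (2 + 3) * rate"

# Step 1: translate the source text into an intermediate form (bytecode).
code_object = compile(source, "<example>", "exec")
dis.dis(code_object)            # display the intermediate instructions

# Step 2: interpret the intermediate form.
namespace = {"rate": 4}
exec(code_object, namespace)
print(namespace["total"])
```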
SUMMARY
▪ The study of programming languages is valuable for a
number of reasons:
▪ Increases our capacity to use different constructs
▪ Enables us to choose languages more intelligently
▪ Makes learning new languages easier
▪ Most important criteria for evaluating programming
languages include:
▪ Readability, writability, reliability, cost
▪ Major influences on language design have been machine
architecture and software development methodologies
ANY Q??
Ch2 : Evolution of programming languages
GOAL
Explore the environment and motivation behind the development of a
collection of programming languages.
TOPICS
2.1 Zuse’s Plankalkül
Languages over years:
2.3 Fortran
2.4 ALGOL
2.5 Lisp
2.6 COBOL
2.7 BASIC
2.13 Prolog
2.14 Ada
LANGUAGE EVOLUTION
https://www.youtube.com/watch?v=Og847HVwRSI
MACHINE LANGUAGE
▪ In the late 1940s and early 1950s, machines were:
▪ slow, unreliable, expensive,
▪ with extremely small memories,
▪ no indexing or floating point
▪ difficult to program
▪ What was wrong with using machine code?
▪ Numeric codes → Poor readability
▪ Absolute addressing → Poor modifiability
2.1 ZUSE’S PLANKALKÜL
▪ by the German computer pioneer Konrad Zuse, the
creator of the first relay computer
▪ Designed in 1945 for the Z4, proposed in his PhD
dissertation.
▪ Plankalkül means program calculus.
▪ Not published until 1972
▪ Never implemented
▪ Advanced data structures
▪ floating point, arrays, records
▪ Syntax (this statement assigns A[4] + 1 to A[5]):
  | A + 1 => A
V | 4        5      (subscripts)
S | 1.n      1.n    (data types)
GENEALOGY OF COMMON LANGUAGES
2.3 IBM 704 AND FORTRAN
▪ Fortran Environment of development
▪ Computers were small and unreliable
▪ Applications were scientific
▪ No efficient programming models
▪ Machine efficiency was the most
important concern
▪ Developed by John W. Backus at IBM
for their 704 mainframes.
▪ The 704 provided indexing and floating-point hardware instructions
FORTRAN
▪ Highly optimizing compilers
▪ “A programmer writes only 5 percent of all instructions, and the
program generates (compiles) the remaining 95 percent for the
computer”
▪ helped open the door to modern computing
▪ made code comprehensible to mathematicians and scientists.
25th reunion of the Fortran team, 1982
▪ Fortran 0 (1954): not implemented
▪ Fortran I (1957):
• index registers and floating point hardware
• compiled programming
• Code very fast
• Programs less than 400 lines
▪ Fortran II (1958):
• Independent compilation
• Fixed the bugs
▪ Fortran IV (1960-62):
• Explicit type declarations
• Logical selection statement
• Subprogram names could be parameters
▪ Fortran 77 (1978):
• Character string handling
• Logical loop control statement
• IF-THEN-ELSE statement
▪ Fortran 90/95:
• Modules
• Dynamic arrays, Pointers
• Recursion
• CASE statement
• Parameter type checking
▪ Fortran 2003:
• OOP
• procedure pointers
• interoperability with C
▪ Fortran 2008:
• Concurrent Programming
▪ Fortran 2018:
• Parallel processing
Example code
FORTRAN EVALUATION
▪ The first widely used and accepted programming language
▪ Originally designed to implement a compiler, only for IBM
machines.
▪ Impressive effect on use of computers and design of
programming languages.
▪ Static typing/allocation → simple and efficient, yet not flexible.
2.4 THE FIRST STEP TOWARD SOPHISTICATION:
ALGOL
▪ Environment of development
▪ FORTRAN had (barely) arrived for IBM 70x
▪ Many other languages were being developed, all for specific
machines
▪ No portable language; all were machine-dependent
▪ No universal language for communicating algorithms
▪ ACM and GAMM met for four days for design (May 27 to June
1, 1958). Goals of the language:
▪ Close to mathematical notation
▪ Good for describing algorithms
▪ Must be translatable to machine code
ALGOL EVOLUTION
▪ ALGOL 58
▪ Concept of type formalized
▪ Names could be any length
▪ Arrays could have any number of subscripts
▪ Subscripts were placed in brackets
▪ Parameters were separated by mode (in & out)
▪ Compound statements (begin ... end)
▪ Semicolon as a statement separator
▪ Assignment operator :=
▪ if had an else-if clause
▪ No I/O
▪ ALGOL 60
▪ Block structure (local scope)
▪ Two parameter passing methods
▪ Subprogram recursion
▪ Stack-dynamic arrays
▪ Still no I/O
▪ No string handling
▪ ALGOL 68
▪ Design is based on the concept of orthogonality
▪ User-defined data structures
▪ Reference types
▪ Dynamic arrays (called flex arrays)
▪ New metalanguage (key words and terms)
EXAMPLE CODE
ALGOL EVALUATION
Successes
▪ It was the standard way to publish algorithms for over 20 years
▪ All subsequent imperative languages are based on it
▪ First machine-independent language
▪ First language whose syntax was formally defined (BNF)
Failure
▪ Never widely used, especially in U.S.
▪ Lack of I/O and the character set made programs non-portable
▪ Too flexible--hard to implement
▪ Entrenchment of Fortran
▪ Formal syntax description (BNF)
▪ Lack of support from IBM
2.5 FUNCTIONAL PROGRAMMING: LISP
▪ AI research needed a language to
▪ Process data in lists (rather than arrays)
▪ Symbolic computation (rather than numeric)
▪ LISt Processing language
▪ Designed at MIT by McCarthy
▪ Declarative programming:
▪ What to do not how to do it!!
▪ Only two data types: atoms and lists
▪ Syntax is based on lambda calculus
REPRESENTATION OF TWO LISP LISTS
The lists
(A B C D)
and
(A (B C) D (E (F G)))
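Internally, Lisp lists are chains of cells, each holding an element and a link to the rest of the list. The following Python sketch mimics that representation for the two lists above. The class name `Cons` and the helpers `make_list` and `show` are hypothetical names chosen for this illustration; real Lisp systems differ in detail.

```python
from dataclasses import dataclass
from typing import Any

@dataclass
class Cons:
    first: Any   # element stored in this cell (an atom or another list)
    rest: Any    # next cell, or None at the end of the list

def make_list(*items):
    """Build a linked list of Cons cells; nested tuples become sublists."""
    head = None
    for item in reversed(items):
        if isinstance(item, tuple):
            item = make_list(*item)
        head = Cons(item, head)
    return head

def show(cell):
    """Render a cons-cell list in Lisp notation."""
    parts = []
    while cell is not None:
        inner = cell.first
        parts.append(show(inner) if isinstance(inner, Cons) else str(inner))
        cell = cell.rest
    return "(" + " ".join(parts) + ")"

flat = make_list("A", "B", "C", "D")
nested = make_list("A", ("B", "C"), "D", ("E", ("F", "G")))
print(show(flat))
print(show(nested))
```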
LISP EVALUATION
▪ Pioneered functional programming
▪ No need for variables or assignment
▪ Control via recursion and conditional expressions
▪ Long the dominant language for AI
▪ Common Lisp and Scheme are contemporary dialects of Lisp
▪ ML, Haskell, and F# are also functional programming
languages, but use very different syntax
2.6 COMPUTERIZING BUSINESS RECORDS:
COBOL
▪ Environment of development
▪ UNIVAC was beginning to use FLOW-MATIC
▪ USAF was beginning to use AIMACO
▪ IBM was developing COMTRAN (COMmercial TRANslator)
▪ First Design Meeting (Pentagon) - May 1959
▪ members were all from computer manufacturers and DoD
branches
▪ Design Goals:
▪ Must look like simple English
▪ Must be easy to use, even if that means it will be less powerful
▪ Must broaden the base of computer users
▪ Must not be biased by current compiler problems
RECORDS IN COBOL
▪ Record is a collection of fields that is used to describe an
entity.
▪ Field is used to indicate the data stored about an element.
▪ File is a collection of related records.
▪ Simple text files cannot be used in COBOL; instead, PS (Physical
Sequential) and VSAM files are used.
EXAMPLE CODE
COBOL EVALUATION
▪ Contributions
▪ First macro facility in a high-level language
▪ Hierarchical data structures (records)
▪ Nested selection statements
▪ Long names (up to 30 characters), with hyphens
▪ Separate data division
▪ First language required by DoD
▪ The poor performance of the early compilers made the language
too expensive to use.
▪ Led to the electronic mechanization of accounting.
▪ Still the most widely used business applications language
2.7 THE BEGINNING OF TIMESHARING: BASIC
▪ Specially designed for “liberal arts” students
▪ Design Goals:
▪ Easy to learn and use for non-science students
▪ Must be “pleasant and friendly”
▪ Fast turnaround for homework
▪ Free and private access
▪ User time is more important than computer time
▪ Poorly structured programs
▪ Current popular dialect: Visual Basic (1990s)
▪ First widely used language with time sharing
▪ Terminals connected to a computer
EXAMPLE CODE
2.13 PROGRAMMING BASED ON LOGIC: PROLOG
▪ Developed by Colmerauer and Roussel (University of Aix-
Marseille), with help from Kowalski (University of Edinburgh)
▪ Based on formal logic
▪ Non-procedural
▪ Can be summarized as being an intelligent database system
that uses an inferencing process to infer the truth of given
queries
▪ Comparatively inefficient
▪ Few application areas
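The "intelligent database" view can be approximated in Python as a set of facts plus a rule that infers new truths from them. This is only a loose analogy, not Prolog: the facts, the names, and the `grandparent` rule are all made up for illustration, and real Prolog uses unification and backtracking rather than a hand-written search.

```python
# Facts: parent(Parent, Child)
parents = {("bob", "ann"), ("ann", "sue"), ("ann", "tom")}

def grandparent(x, z):
    """Rule: grandparent(X, Z) :- parent(X, Y), parent(Y, Z)."""
    return any((x, y) in parents and (y, z) in parents
               for (_, y) in parents)

# Queries: the system infers the answers from the facts and the rule.
print(grandparent("bob", "sue"))
print(grandparent("ann", "sue"))
```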
2.14 HISTORY’S LARGEST DESIGN EFFORT: ADA
▪ Huge design effort, involving hundreds of people, much money, and
about eight years
▪ Sequence of requirements (1975-1978)
▪ Named after Augusta Ada Byron, the first programmer
▪ She wrote the first published algorithm ever specifically tailored
for implementation on a computer
Contributions
• Packages - support for data abstraction
• Exception handling - elaborate
• Generic program units
• Concurrency - through the tasking model
Comments
• Competitive design
• Included all that was then known about software engineering and
language design
• First compilers were very difficult; the first really usable
compiler came nearly five years after the language design was
completed
EXAMPLE CODE
SUMMARY
▪ Development, development environment, and evaluation of a
number of important programming languages
▪ Perspective into current issues in language design
ANY Q??
Ch3(a) : Describing Syntax
GOAL
How does a
computer
understand
computer programs’
representation?
Compile == understand !
TOPICS
3.1 Introduction
3.2 The General Problem of Describing Syntax
3.3 Formal Methods of Describing Syntax
▪ Grammar
▪ Rules
▪ Derivation
▪ Parse trees
▪ EBNF
WHAT IS A LANGUAGE?
▪ A natural language is a structured system of communication
used by humans consisting of sounds or gestures.
▪ A programming language is a formal language comprising a
set of instructions for computers.
HOW TO DESCRIBE A LANGUAGE?
▪ The study of programming languages, like the study of natural
languages, can be divided into syntax and semantics, which are
closely related.
▪ Syntax: the form or structure of the expressions, statements, and
program units
▪ Semantics: the meaning of the expressions, statements, and program
units
EXAMPLE : SENTENCE STRUCTURE
LANGUAGE DEFINITIONS
Recognizers
• A recognition device (mechanism):
• reads input strings over the alphabet of the language
• decides whether the input strings belong to the language
• Accept/reject
Generators
• A device that generates sentences of a language
• used to enumerate all of the sentences of a language
• Like a button
• Preferred by programmers for learning a language
WHY TO DESCRIBE A LANGUAGE?
▪ Syntax and semantics provide a language’s definition
▪ Difficult but essential!
▪ Challenge: diversity of users
▪ Initial evaluators, other language designers
▪ Implementers
▪ Users (Programmers)
FORMAL METHODS FOR DESCRIBING SYNTAX
Context-Free Grammars
• Developed by Noam Chomsky in the mid-1950s
• Language generators, meant to describe the syntax of natural
languages
• Two of the grammar classes he defined, context-free and regular,
are useful for describing programming languages
Backus-Naur Form (BNF)
• Invented by John Backus to describe the syntax of ALGOL 58
• Revised by Peter Naur for ALGOL 60
• Concise formal descriptions
• Not easily understandable, new notation
• Not immediately accepted, but later became the standard
Grammars: formal language-generation mechanisms
TERMINOLOGY OF DESCRIBING SYNTAX
A language is a set of sentences
A sentence is a string of characters over
some alphabet
A lexeme is the lowest level syntactic
unit of a language (e.g., *, sum, begin)
A token is a category of lexemes (e.g.,
identifier)
EXAMPLE:
int index = 2 * count + 17;
Lexemes    Tokens
int        keyword
index      identifier
=          equal_sign/assignment_op
2          int_literal
*          mult_op
count      identifier
+          plus_op
17         int_literal
;          semicolon or delimiter
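The lexeme-to-token mapping above can be reproduced with a small regular-expression tokenizer. This is a minimal sketch under stated assumptions: the token names follow the table, while `TOKEN_SPEC`, `PATTERN`, and `tokenize` are hypothetical names, and the patterns cover only the lexemes in this one statement. Characters that match no pattern are silently skipped here, which a real lexer would report as errors.

```python
import re

# Order matters: the keyword pattern must be tried before identifiers.
TOKEN_SPEC = [
    ("keyword",     r"\bint\b"),
    ("identifier",  r"[A-Za-z_]\w*"),
    ("int_literal", r"\d+"),
    ("equal_sign",  r"="),
    ("plus_op",     r"\+"),
    ("mult_op",     r"\*"),
    ("semicolon",   r";"),
    ("skip",        r"\s+"),
]
PATTERN = re.compile("|".join(f"(?P<{name}>{regex})"
                              for name, regex in TOKEN_SPEC))

def tokenize(text):
    """Return (lexeme, token) pairs in source order, dropping whitespace."""
    pairs = []
    for match in PATTERN.finditer(text):
        if match.lastgroup != "skip":
            pairs.append((match.group(), match.lastgroup))
    return pairs

for lexeme, token in tokenize("int index = 2 * count + 17;"):
    print(f"{lexeme:8} {token}")
```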
GRAMMARS FUNDAMENTALS
▪ A metalanguage is a language that is used to describe
another language.
▪ BNF or grammar is a metalanguage for programming languages,
i.e., the language of languages.
▪ Single words correspond to terminals
▪ Keywords: class , public, while, for in Java
▪ Literals : 1234, ‘d’
▪ Separators and delimiters: semicolons, commas, brackets, braces
▪ All the structures built on top of terminals (sentences, periods,
paragraphs, chapters, and entire documents) correspond
to non-terminals.
GRAMMAR, RULES
▪ A rule (production) has a left-hand side (LHS), which is a
nonterminal, and a right-hand side (RHS), which is a string of
terminals and/or non-terminals
▪ nonterminal symbols :
▪ often enclosed in angle brackets
▪ act like syntactic variables
▪ Grammar:
▪ a finite non-empty set of rules
▪ A generative device for defining languages
GRAMMAR, MULTIPLE RULE DEFINITION
▪ Two or more possible syntactic forms in the language:
▪ multiple rules:
▪ Single rule ( | ➔ OR)
GRAMMAR, DESCRIBING LISTS
▪ Variable-length lists in mathematics are written using an
ellipsis (. . .)
▪ Example : 1, 2, . . .
▪ Syntactic lists are described using recursion
EXAMPLE
A start symbol is a special element of the
non-terminals of a grammar
GRAMMAR, DERIVATIONS
▪ A derivation is a repeated application of rules, starting with
the start symbol
▪ The derivation continues until the sentential form contains no non-
terminals.
▪ A derivation may be either leftmost or rightmost
▪ leftmost derivation is one in which the leftmost nonterminal in
each sentential form is the one that is expanded
DERIVATION
begin A = B + C ; B = C end
• The symbol => is read “derives.”
• Sentential form: every string of symbols in a derivation
• Generated sentence: a sentential form consisting of only
terminals, or lexemes
• Objective: recognition
EXAMPLE
A = B * ( A + C )
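A leftmost derivation of this sentence can be traced mechanically. The grammar used on the slide is not reproduced in the text, so the sketch below assumes a small assignment grammar (`<assign>`, `<id>`, `<expr>`) of the usual textbook shape; the `GRAMMAR` table, the function name, and the list of rule choices are all assumptions for illustration.

```python
# Assumed grammar (not taken from the slide text):
#   <assign> -> <id> = <expr>
#   <id>     -> A | B | C
#   <expr>   -> <id> + <expr> | <id> * <expr> | ( <expr> ) | <id>
GRAMMAR = {
    "<assign>": [["<id>", "=", "<expr>"]],
    "<id>":     [["A"], ["B"], ["C"]],
    "<expr>":   [["<id>", "+", "<expr>"],
                 ["<id>", "*", "<expr>"],
                 ["(", "<expr>", ")"],
                 ["<id>"]],
}

def leftmost_derivation(start, choices):
    """Expand the leftmost nonterminal once per entry in `choices`."""
    form = [start]
    steps = [" ".join(form)]
    for choice in choices:
        # Index of the leftmost nonterminal in the current sentential form.
        index = next(i for i, sym in enumerate(form) if sym in GRAMMAR)
        form[index:index + 1] = GRAMMAR[form[index]][choice]
        steps.append(" ".join(form))
    return steps

# Rule choices that derive: A = B * ( A + C )
steps = leftmost_derivation("<assign>", [0, 0, 1, 1, 2, 0, 0, 3, 2])
print("\n   => ".join(steps))
```

Each printed line is a sentential form; the last one contains only terminals and is therefore the generated sentence.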
PARSE TREE
▪ A hierarchical representation of a derivation
▪ Every internal node of a parse tree is labeled with a
nonterminal symbol;
▪ every leaf is labeled with a terminal symbol.
▪ Every subtree of a parse tree describes one instance of an
abstraction in the sentence.
EXAMPLE
A = B * ( A + C )
AMBIGUITY IN GRAMMARS
▪ A grammar is ambiguous if and only if it generates a
sentential form that has two or more distinct parse trees
▪ If a language structure has more than one parse tree, then the
meaning of the structure cannot be determined uniquely.
▪ Reasons:
▪ Operator precedence
▪ Associativity
▪ Normally, an ambiguous grammar can be rewritten into an
unambiguous grammar.
▪ New non-terminals, new rules : to represent operands, and force
different operators to different levels in the parse tree.
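Ambiguity can be checked on small inputs by counting parse trees. The sketch below assumes a deliberately ambiguous grammar, `<expr> -> <expr> + <expr> | <expr> * <expr> | id`, which is a common classroom example and not necessarily the one on the slide; `count_parse_trees` is a hypothetical helper name.

```python
from functools import lru_cache

OPERATORS = {"+", "*"}

def count_parse_trees(tokens):
    """Count parse trees under <expr> -> <expr> op <expr> | id."""
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def count(lo, hi):
        # Number of distinct trees deriving tokens[lo:hi].
        if hi - lo == 1:
            return 0 if tokens[lo] in OPERATORS else 1
        total = 0
        for k in range(lo + 1, hi - 1):      # candidate root operator
            if tokens[k] in OPERATORS:
                total += count(lo, k) * count(k + 1, hi)
        return total

    return count(0, len(tokens))

print(count_parse_trees("A + B * C".split()))
```

A count greater than one for any sentence shows that the grammar is ambiguous. Here the two trees correspond to grouping the sentence as (A + B) * C or as A + (B * C).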
EXAMPLE
EXAMPLE, PARSE TREE
ASSOCIATIVITY OF OPERATORS
▪ Associativity: when an expression contains two operators with the
same precedence, a semantic rule is required to specify which one
is applied first
▪ Example,
A+B+C
▪ For addition in mathematics, left and right associative orders of
evaluation mean the same thing:
(A + B) + C = A + (B + C)
▪ Subtraction and division are not associative, whether in
mathematics or in a computer
EXAMPLE
▪ For + and * in mathematics, left and right associative orders of
evaluation mean the same thing:
(A + B) + C = A + (B + C)
▪ In a grammar, left recursion specifies left associativity
AMBIGUITY EXAMPLE: “DANGLING-ELSE”
▪ Consider the grammar. Ambiguous or not?
<if_stmt> ➔ if <logic_expr> then <stmt>
          | if <logic_expr> then <stmt> else <stmt>
<stmt> ➔ <if_stmt>
▪ How to derive the following statement?
if <logic_expr> then if <logic_expr> then <stmt> else <stmt>
▪ Reading 1 (else belongs to the inner if):
if <logic_expr> then
    if <logic_expr> then
        <stmt>
    else
        <stmt>
▪ Reading 2 (else belongs to the outer if):
if <logic_expr> then
    if <logic_expr> then
        <stmt>
else
    <stmt>
PARSE TREE
▪ Some languages (like Java) match each else with the nearest
preceding elseless if
[Figure: two parse trees for
if <logic_expr> then if <logic_expr> then <stmt> else <stmt>]
EXTENDED BNF
▪ New meta-symbols:
▪ Optional parts are placed in brackets [ ]
▪ Alternative parts of RHSs are placed inside parentheses and
separated via vertical bars
▪ Repetitions (0 or more) are placed inside braces { }
EXAMPLE
▪ If you have a rule such as:
<id> = <letter>
| <id><letter>
| <id><digit>
▪ You can replace it with:
<id> = <letter> {(<letter> | <digit>)}
EXAMPLE
BNF
<signed_int> = + <int>
             | - <int>
<int> = <digit> | <int><digit>
EBNF
<signed_int> = [ +|- ] <digit> {<digit>}
MORE EXAMPLES …
BNF
<expr> → <expr> + <term>
       | <expr> - <term>
       | <term>
<term> → <term> * <factor>
       | <term> / <factor>
       | <factor>
EBNF
<expr> → <term> {(+ | -) <term>}
<term> → <factor> {(* | /) <factor>}
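One reason EBNF is convenient is that each `{ ... }` repetition maps directly onto a loop in a hand-written parser. The sketch below evaluates expressions using the two EBNF rules above. The rule for `<factor>` is not given in the text, so `<factor> -> number | ( <expr> )` is an assumption, as are the function names.

```python
import re

def parse_expr(text):
    """Parse and evaluate text with the EBNF <expr>/<term> rules."""
    tokens = re.findall(r"\d+|[-+*/()]", text)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        if pos >= len(tokens):
            raise SyntaxError("unexpected end of input")
        pos += 1
        return tokens[pos - 1]

    def expr():                  # <expr> -> <term> {(+ | -) <term>}
        value = term()
        while peek() in ("+", "-"):
            op = advance()
            right = term()
            value = value + right if op == "+" else value - right
        return value

    def term():                  # <term> -> <factor> {(* | /) <factor>}
        value = factor()
        while peek() in ("*", "/"):
            op = advance()
            right = factor()
            value = value * right if op == "*" else value / right
        return value

    def factor():                # assumed: <factor> -> number | ( <expr> )
        token = advance()
        if token == "(":
            value = expr()
            if advance() != ")":
                raise SyntaxError("expected ')'")
            return value
        return int(token)

    return expr()

print(parse_expr("2 + 3 * 4"))
print(parse_expr("10 - 4 - 3"))
```

Because each loop folds operands from left to right, the parser gives the operators left associativity, matching the left-recursive BNF version.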
ANY Q??
Ch3(b) : Describing Semantics
GOAL
Briefly discuss formal
methods of describing
semantics
TOPICS
3.4 Attribute Grammars
3.5 Dynamic Semantics
Operational
4.1 Introduction to Compiler Design
WHY DESCRIBE SEMANTICS?
▪ Several needs for a methodology and notation for semantics:
▪ Programmers need to know what the statements of a language do
before developing programs.
▪ Compiler writers must know exactly what language constructs
mean to design implementations for them correctly.
▪ Correctness proofs would be possible without testing.
▪ Compiler generators would be possible
▪ language designers could discover ambiguities and
inconsistencies in their designs.
SEMANTICS DESCRIPTIONS
Static
▪ Extension to BNF grammar
▪ BNF cannot describe all of the syntax of programming languages
▪ Checked at compile time
▪ Example: data-type compatibility
Dynamic
▪ Related to the program meaning during execution
▪ Can be used to prove correctness without testing
▪ Example: loops
3.4 ATTRIBUTE GRAMMARS
ATTRIBUTE GRAMMARS: DEFINITION
▪ An attribute grammar is a context-free grammar with the
following additions:
▪ For each grammar symbol X there is a set A(X) of attribute values
consisting of:
▪ S(X): synthesized attributes
▪ I(X): inherited attributes
▪ intrinsic attributes on the leaves:
▪ symbol table → declaration
▪ Each rule has:
▪ a set of functions that define certain attributes of the non-terminals in
the rule
▪ a (possibly empty) set of predicates (Boolean functions) to check for
attribute consistency
▪ A false predicate function value indicates a violation of the syntax or
static semantics rules of the language.
EXAMPLE
▪ Rule → the name on the end of an Ada procedure must match
the procedure’s name.
EXAMPLE
▪ Syntax
<assign> -> <var> = <expr>
<expr> -> <var> + <var> | <var>
<var> -> A | B | C
EXAMPLE, PARSE TREE
EXAMPLE
▪ Syntax
<assign> -> <var> = <expr>
<expr> -> <var> + <var> | <var>
<var> -> A | B | C
▪ The variables can be one of two types: int or real.
▪ The types of operands in the right side can be mixed, but the
assignment is valid only if the target and the value resulting from
evaluating the right side have the same type.
EXAMPLE, RULES
<assign> -> <var> = <expr>
<expr> -> <var> + <var> | <var>
<var> -> A | B | C
Semantic rules:
1. When there are two variables on the right side of an
assignment:
▪ If they have the same type, the expression type is that of the operands
▪ If the operand types are not the same, the expression type is always
real.
2. The type of the left side of the assignment must match the
type of the right side.
EXAMPLE, ATTRIBUTES
<assign> -> <var> = <expr>
<expr> -> <var> + <var> | <var>
<var> -> A | B | C
▪ actual_type: synthesized for <var> and <expr>
▪ used to store the actual type, int or real, of a variable or
expression
▪ Variable → actual type is intrinsic
▪ Expression → determined from the actual types of the child node or
child nodes of the <expr> nonterminal.
▪ expected_type: inherited for <expr>
▪ used to store the type, int or real, that is expected for the
expression
▪ determined by the type of the variable on the left side of the
assignment statement.
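The attribute computations for this example can be sketched in Python. The symbol table `DECLARED` and its type assignments are hypothetical, and the function names are chosen only for illustration; the logic follows the two semantic rules stated for this grammar.

```python
# Hypothetical symbol table supplying the intrinsic actual_type of each <var>.
DECLARED = {"A": "int", "B": "real", "C": "int"}

def expr_actual_type(operands):
    """Synthesized attribute: actual_type of <expr> from its <var> children."""
    types = [DECLARED[name] for name in operands]
    if len(types) == 1 or types[0] == types[1]:
        return types[0]
    return "real"                       # mixed int/real operands

def check_assignment(target, operands):
    """Predicate: <expr>.actual_type == <expr>.expected_type."""
    expected_type = DECLARED[target]    # inherited from the left-side <var>
    actual_type = expr_actual_type(operands)
    return actual_type == expected_type

print(check_assignment("A", ["A", "C"]))   # A = A + C
print(check_assignment("A", ["A", "B"]))   # A = A + B
```

A false predicate value corresponds to a static semantics violation, here a type mismatch between the target and the expression.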
DECORATING A PARSE TREE
▪ How are attribute values computed?
▪ If all attributes are inherited, the tree is decorated in top-down
order.
▪ If all attributes are synthesized, the tree is decorated in
bottom-up order.
▪ If both kinds of attributes are used, some combination of top-down
and bottom-up must be used.
EXAMPLE, FINAL RESULT
EVALUATION
▪ Static semantics is an essential part of all compilers
▪ Decorating parse trees is an expensive process
▪ When used to describe modern programming languages,
attribute grammar becomes large in size and very complex.
3.5 DYNAMIC SEMANTICS
DYNAMIC SEMANTICS
▪ Dynamic semantics reflects the meaning of the expressions,
statements, and program units of a programming language
➢ There is no single widely acceptable notation or formalism
for describing semantics
➢ Programmers usually rely on language manuals
➢ Imprecise, incomplete
▪ Three methods:
✓ Operational semantics
▪ Denotational semantics
▪ Axiomatic semantics
OPERATIONAL SEMANTICS
▪ Operational Semantics
▪ Describe the meaning of a program by executing its statements on
a machine, either simulated or actual.
▪ The change in the state of the machine (memory, registers, etc.)
defines the meaning of the statement
▪ the concept is frequently used in programming textbooks and
programming language reference manuals
THE BASIC PROCESS
▪ First step : design an appropriate intermediate language,
where the primary desired characteristic of the language is
clarity
▪ Every construct of the intermediate language must have an
obvious and unambiguous meaning
EXAMPLE
JAVA DO-WHILE
do:
…..
……..
while : if condition == 1 goto do
end:
EVALUATION
▪ Good if used informally (language manuals, etc.) or for
teaching programming languages
▪ Extremely complex if used formally
▪ Vienna Definition Language (VDL) was used for describing
semantics of PL/I.
▪ can lead to circularities, in which concepts are indirectly
defined in terms of themselves
4.1 INTRODUCTION
▪ Language implementation systems analyze source code,
regardless of the specific implementation approach
▪ Nearly all syntax analysis is based on a formal description of
the syntax of the source language (BNF)
▪ Advantages of Using BNF to describe Syntax
▪ Provides a clear and concise syntax description
▪ The parser can be built directly based on the BNF
▪ Parsers based on BNF are easy to update
Source Code
position = initial + rate * 60
Symbol Table
i   Lexemes    Tokens
1   position   id …
2   initial    id …
3   rate       id …
4   60         Int_lit
Compiler
▪ Lexical Analyzer
▪ Syntax Analyzer
= ( <id 1>, + ( <id 2>, * ( <id 3>, Int_Lit ) ) )
▪ Semantic Analyzer
▪ Intermediate Code
t1 = inttofloat(60)
t2 = id3 * t1
t3 = id2 + t2
id1 = t3
▪ Optimized Code
t1 = id3 * 60.0
t3 = id2 + t1
id1 = t3
▪ Code Generator
Assembly/machine Code
LDF R2, id3
MULF R2, R2, #60.0
LDF R1, id2
ADDF R1, R1, R2
STF id1, R1
Computer
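The intermediate-code step can be imitated for this one statement with Python's standard `ast` module, which does the lexical and syntax analysis for us. This is a rough sketch, not a compiler: the function name `three_address_code` is hypothetical, only `+ - * /` on names and constants are handled, and there is no semantic analysis, so the int-to-float conversion shown above does not appear.

```python
import ast

def three_address_code(statement):
    """Emit three-address code for one simple assignment statement."""
    ops = {ast.Add: "+", ast.Sub: "-", ast.Mult: "*", ast.Div: "/"}
    code, temp_count = [], 0

    def emit(node):
        # Walk the tree bottom-up, creating one temporary per operator.
        nonlocal temp_count
        if isinstance(node, ast.Name):
            return node.id
        if isinstance(node, ast.Constant):
            return repr(node.value)
        if isinstance(node, ast.BinOp):
            left, right = emit(node.left), emit(node.right)
            temp_count += 1
            temp = f"t{temp_count}"
            code.append(f"{temp} = {left} {ops[type(node.op)]} {right}")
            return temp
        raise ValueError("unsupported construct")

    assign = ast.parse(statement).body[0]
    result = emit(assign.value)
    code.append(f"{assign.targets[0].id} = {result}")
    return code

for line in three_address_code("position = initial + rate * 60"):
    print(line)
```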
SUMMARY
▪ An attribute grammar is a descriptive formalism that can
describe both the syntax and the semantics of a language
▪ Operational semantics describe the meaning of a program by
executing its statements on a machine using an intermediate
language
ANY Q??
Ch4 : Lexical and Syntax Analysis
GOAL
Compile == understand !!
TOPICS
4.1 Introduction
4.2 Lexical Analysis
4.3 The Parsing Problem
▪ Top-down Parsing
▪ Bottom-Up Parsing
4.1 INTRODUCTION
▪ Language implementation systems analyze source code,
regardless of the specific implementation approach
▪ Nearly all syntax analysis is based on a formal description of
the syntax of the source language (BNF)
▪ Advantages of Using BNF to describe Syntax
▪ Provides a clear and concise syntax description
▪ The parser can be built directly from the BNF
▪ Parsers based on BNF are easy to update
SYNTAX ANALYSIS
▪ The syntax analysis portion of a compiler nearly always consists
of two parts:
▪ A low-level part called a lexical analyzer
▪ A high-level part called a syntax analyzer, or parser (based on
BNF)
[Figure: position = initial + rate * 60 → Source Code → Compiler
(Lexical Analyzer → Parser → Semantic Analyzer → Code Generator) →
Machine Code → Computer]
4.2 LEXICAL ANALYSIS
▪ A lexical analyzer is a pattern matcher for character strings
▪ It is the “front-end” of a parser
▪ Identifies substrings of the source program that belong
together - lexemes
▪ Each lexeme matches a character pattern (e.g., while), which is
associated with a category of lexemes (e.g., keyword) called a token
TERMINOLOGY
A language is a set of sentences
A sentence is a string of characters over some
alphabet
A lexeme is the lowest level syntactic unit of a
language (e.g., *, sum, begin)
A token is a category of lexemes (e.g.,
identifier)
EXAMPLE
▪ A lexeme is the lowest level syntactic unit of a language (e.g.,
*, sum, begin, ;, {, }, [, ])
▪ A token is a category of lexemes (e.g., identifier)
index = 2 * count + 17;
Lexeme Token
index identifier
= equal_sign
2 int_literal
* mult_op
count identifier
+ plus_op
17 int_literal
; semicolon
LEXICAL ANALYSIS
▪ A lexical analyzer also . . .
▪ Skips comments
▪ Skips blanks outside lexemes
▪ Inserts lexemes for identifiers and literals into a symbol table
▪ Detects syntactic errors in lexemes
▪ For example, ill-formed floating-point literals, 12,345.21, 12,213, 4us
Symbol Table
i Lexeme Tokens
1 sum IDENT …
2 ( LEFT_PAREN …
3 + ADD_OP …
4 47 INT_LIT
LEXICAL ANALYSIS
▪ A lexical analyzer typically has several instance variables
▪ Character nextChar
▪ int charClass (letter, digit, etc.)
▪ String lexeme
▪ int tokenType
▪ Essential functions of a lexical analyzer:
▪ getChar - gets the next character of input, puts it in nextChar,
determines its class and puts the class in charClass
▪ addChar - puts the character from nextChar into the place the
lexeme is being accumulated, lexeme
▪ lookup - determines whether the string in lexeme is a reserved
word (returns a code)
STATE DIAGRAM
[Figure: state diagram of the lexical analyzer, with these annotations]
▪ getChar - gets the next character of input, and determines its class
▪ addChar - adds the character from nextChar to the lexeme string
▪ lookup - determines whether the string in lexeme is a reserved word
(returns a code)
▪ Single-character tokens are recognized directly (returns a code)
LEXICAL ANALYZER
Implementation:
▪ front.c (Figure 4.1)
//Character classes
#define LETTER 0
#define DIGIT 1
#define UNKNOWN 99
//Token codes
#define INT_LIT 10
#define IDENT 11
#define ASSIGN_OP 20
#define ADD_OP 21
#define SUB_OP 22
#define MULT_OP 23
#define DIV_OP 24
#define LEFT_PAREN 25
#define RIGHT_PAREN 26
Input:
(sum + 47) / total
Output:
Next token is: 25 Next lexeme is (
Next token is: 11 Next lexeme is sum
Next token is: 21 Next lexeme is +
Next token is: 10 Next lexeme is 47
Next token is: 26 Next lexeme is )
Next token is: 24 Next lexeme is /
Next token is: 11 Next lexeme is total
Next token is: -1 Next lexeme is EOF
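The token codes listed for front.c can be exercised with a very small scanner. What follows is a minimal Python sketch, not the front.c program itself: the function name lex, the SINGLE_CHAR_TOKENS table, and the decision to raise ValueError on an unknown character are assumptions made for illustration only.

```python
# Token codes, using the same numbers as the #define list on the slide.
INT_LIT, IDENT = 10, 11
ASSIGN_OP, ADD_OP, SUB_OP, MULT_OP, DIV_OP = 20, 21, 22, 23, 24
LEFT_PAREN, RIGHT_PAREN = 25, 26
EOF = -1

# Hypothetical lookup table for single-character tokens.
SINGLE_CHAR_TOKENS = {
    "=": ASSIGN_OP, "+": ADD_OP, "-": SUB_OP, "*": MULT_OP,
    "/": DIV_OP, "(": LEFT_PAREN, ")": RIGHT_PAREN,
}


def lex(source):
    """Return a list of (token_code, lexeme) pairs for the source string."""
    tokens = []
    i = 0
    while i < len(source):
        ch = source[i]
        if ch.isspace():                      # skip blanks outside lexemes
            i += 1
        elif ch.isalpha():                    # identifier: letter (letter | digit)*
            start = i
            while i < len(source) and source[i].isalnum():
                i += 1
            tokens.append((IDENT, source[start:i]))
        elif ch.isdigit():                    # integer literal: digit+
            start = i
            while i < len(source) and source[i].isdigit():
                i += 1
            tokens.append((INT_LIT, source[start:i]))
        elif ch in SINGLE_CHAR_TOKENS:        # operators and parentheses
            tokens.append((SINGLE_CHAR_TOKENS[ch], ch))
            i += 1
        else:                                 # assumption: report unknown characters
            raise ValueError(f"unknown character {ch!r} at position {i}")
    tokens.append((EOF, "EOF"))
    return tokens


if __name__ == "__main__":
    for code, lexeme in lex("(sum + 47) / total"):
        print(f"Next token is: {code} Next lexeme is {lexeme}")
```

Each branch of the loop plays the role of one path through the state diagram: the character class decides which kind of lexeme is being accumulated.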
4.3 THE PARSING PROBLEM
▪ Goals of the parser
▪ Produce the parse tree
▪ Find all syntax errors
▪ Produce an appropriate diagnostic message and recover quickly
▪ Two categories of parsers
▪ Top-down parser - produces the parse tree, beginning at the root
▪ Bottom-up parser - produces the parse tree, beginning at the
leaves
[Figure: Source Code → Compiler (Lexical Analyzer → Parser →
Semantic Analyzer → Code Generator) → Machine Code → Computer]
THE TOP-DOWN PARSER
▪ An LL parser is a top-down parser.
▪ parses the input from Left to right
▪ performs a Leftmost derivation of the sentence.
▪ A nonterminal symbol, A, can be replaced using any one of a
nonempty set of production rules, namely the A-rules
▪ Given a sentential form, xAα, the parser must choose the correct
A-rule to get the next sentential form
▪ x is a string of terminal symbols
▪ A is a single nonterminal symbol
▪ α is a mixed string of terminals and/or nonterminals
▪ leftmost derivation
▪ keep replacing the leftmost nonterminal A by the appropriate A-rule
▪ look only one token ahead in the input
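TOP-DOWN PARSER EXAMPLE
▪ Look at the following grammar
E → T + E | T
T → int * T | int | (E)
▪ Consider the string:
int * int + int
▪ Leftmost derivation, start from the root!
E => T + E
=> int * T + E
=> int * int + E
=> int * int + T
=> int * int + int
[Figure: parse tree with root E and children T + E; the left T expands
to int * T with the inner T → int; the right E expands to T → int]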
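The top-down strategy for this grammar can be sketched as a recursive-descent parser with one token of lookahead. This is only an illustrative Python sketch: the function names parse, parse_E and parse_T and the tuple-based tree are assumptions, and the two alternatives of each rule are left-factored (E → T [+ E], T → int [* T] | (E)) so that a single lookahead token is enough to choose.

```python
def parse(tokens):
    """Parse a token list with the grammar
         E -> T + E | T
         T -> int * T | int | ( E )
    and return the parse tree as nested tuples, root first."""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def expect(token):
        nonlocal pos
        if peek() != token:
            raise SyntaxError(f"expected {token!r} but found {peek()!r}")
        pos += 1

    def parse_E():
        left = parse_T()                  # every E starts with a T
        if peek() == "+":                 # lookahead selects E -> T + E
            expect("+")
            return ("E", left, "+", parse_E())
        return ("E", left)                # otherwise E -> T

    def parse_T():
        if peek() == "int":
            expect("int")
            if peek() == "*":             # lookahead selects T -> int * T
                expect("*")
                return ("T", "int", "*", parse_T())
            return ("T", "int")           # T -> int
        expect("(")                       # T -> ( E )
        inner = parse_E()
        expect(")")
        return ("T", "(", inner, ")")

    tree = parse_E()
    if peek() is not None:
        raise SyntaxError(f"unexpected token {peek()!r}")
    return tree


tree = parse("int * int + int".split())
```

The tree is built from the root downward: parse_E is entered first, and the leaves are reached only when the int tokens are consumed.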
BOTTOM-UP PARSER
▪ An LR parser is a bottom-up parser.
▪ parses the input from Left to right
▪ produces a Rightmost derivation of the sentence, in reverse.
▪ Steps (reduction process):
▪ Start with the tokens of the program and work back to the start
symbol
▪ repeatedly pick a substring of the current sentential form and
reduce it to a nonterminal.
▪ Match the RHS of some production rule with a substring of
tokens (the handle), and replace the substring with the LHS of the
production rule
BOTTOM-UP PARSER EXAMPLE
E → E + T | T
T → T * int | int | (E)
int * int + int
▪ Apply the rightmost derivation in reverse order → start from the
terminals!
int * int + int
T * int + int
T + int
E + int
E + T
E
[Figure: parse tree built from the leaves int * int + int up to the
root E]
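The reduction process can be sketched as a hand-written shift-reduce loop. This is a simplified Python illustration, not a table-driven LR parser: the RULES list, the function names, and the single hard-coded lookahead check (do not reduce to E while a * is coming, because * binds tighter than +) are assumptions chosen to fit this one grammar.

```python
# (right-hand side, left-hand side) pairs; longer handles are tried first.
RULES = [
    (["T", "*", "int"], "T"),
    (["(", "E", ")"], "T"),
    (["E", "+", "T"], "E"),
    (["int"], "T"),
    (["T"], "E"),
]


def find_handle(stack, lookahead):
    """Return the (rhs, lhs) rule whose RHS is on top of the stack, or None."""
    for rhs, lhs in RULES:
        if stack[-len(rhs):] == rhs:
            if lhs == "E" and lookahead == "*":
                continue              # keep T on the stack: T * int comes next
            return rhs, lhs
    return None


def shift_reduce(tokens):
    """Return (accepted, sentential_forms) for the given token list."""
    stack, rest = [], list(tokens)
    forms = [" ".join(tokens)]
    while True:
        lookahead = rest[0] if rest else None
        handle = find_handle(stack, lookahead)
        if handle:                    # reduce: replace the RHS by the LHS
            rhs, lhs = handle
            del stack[-len(rhs):]
            stack.append(lhs)
            forms.append(" ".join(stack + rest))
        elif rest:                    # shift the next input token
            stack.append(rest.pop(0))
        else:
            break
    return stack == ["E"], forms


accepted, forms = shift_reduce("int * int + int".split())
```

Reading the recorded sentential forms from last to first gives a rightmost derivation of the input.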
PRACTICE
▪ Look at the following grammar
S –> AB
A –> aA | ε
B –> b | bB
▪ Given the string:
aaaεb
▪ Draw top-down and bottom-up parse trees for the string
TOP-DOWN
S –> AB
A –> aA | ε
B –> b | bB
S => A B
=> a A B
=> a a A B
=> a a a A B
=> a a a ε B
=> a a a ε b
[Figure: parse tree with root S and children A and B; the nested A
nodes derive a a a ε, and B derives b]
BOTTOM-UP
S –> AB
A –> aA | ε
B –> b | bB
a a a ε b
=> aaaAb
=> aaAb
=> aAb
=> Ab
=> AB
=> S
[Figure: the same parse tree, built from the leaves a a a ε b up to
the root S]
SEPARATE LEXICAL AND SYNTAX ANALYSIS
▪ Reasons to Separate Lexical and Syntax Analysis
▪ Simplicity - less complex approaches can be used for lexical
analysis; separating them simplifies the parser
▪ Efficiency - separation allows optimization of the lexical analyzer
▪ Portability - parts of the lexical analyzer may not be portable, but
the parser always is portable
position = initial + rate * 60
[Figure: Source Code → Compiler (Lexical Analyzer → Syntax Analyzer →
Semantic Analyzer → Intermediate Code → Optimized Code → Code
Generator) → Assembly/machine Code → Computer]
Symbol Table
i Lexeme Token
1 position id …
2 initial id …
3 rate id …
4 60 Int_lit
Syntax tree (after the Syntax Analyzer)
=(<id 1>, +(<id 2>, *(<id 3>, Int_Lit)))
Intermediate Code
t1 = inttofloat(60)
t2 = id3 * t1
t3 = id2 + t2
id1 = t3
Optimized Code
t1 = id3 * 60.0
t3 = id2 + t1
id1 = t3
Assembly/machine Code
LDF R2, id3
MULF R2, R2, #60.0
LDF R1, id2
ADDF R1, R1, R2
STF id1, R1
SUMMARY
▪ The major methods of implementing programming languages
are: compilation, pure interpretation, and hybrid
implementation
▪ Syntax analysis is a common part of language implementation
▪ A lexical analyzer is a pattern matcher that isolates the
small-scale parts of a program
▪ A syntax analyzer (parser):
▪ Detects syntax errors
▪ Produces a parse tree
▪ Parsing problem for bottom-up parsers: find the substring of the
current sentential form that must be reduced (the handle)
Ch5 : Names, Bindings, and Scopes
GOAL
Discuss the
fundamental semantic
issues of variables
TOPICS
5.1 Introduction
5.3 Variables
▪ Name
▪ Address
▪ Value
▪ Type
▪ Lifetime
5.4 The Concept of Binding
▪ Type binding
▪ Storage binding
5.5 Scope
▪ Static
▪ dynamic
5.6 Scope and Lifetime
INTRODUCTION
▪ Imperative languages are abstractions of the von Neumann
architecture
▪ Processor: executes programs that modify the contents of
memory
▪ Memory: stores data and instructions
▪ Variables are essential!
WHAT IS A VARIABLE?
▪ Machine and assembly
languages
▪ Readability, writability, and
maintainability issues
WHAT IS A VARIABLE?
▪ A variable is an abstraction of a memory cell or collection of
cells
▪ Variables are
▪ referred to by name
▪ stored based on type
▪ used based on scope and lifetime
VARIABLE ATTRIBUTES
▪ Name
▪ Address
▪ Value
▪ Type
▪ Lifetime
▪ Scope
index = 2 * count + 17;
Symbol Table
i | Name (Lexemes) | Token | Addr | Value | Type | Lifetime | Scope
1 | index | id | | | int | | local
2 | 17 | const | | | int | |
VARIABLE ATTRIBUTES, NAME
▪ A name is a string of characters that identifies some entity in a
program
▪ Examples: variable, subprogram, constant, etc.
▪ The term identifier is often used interchangeably with name.
▪ Design issues for names:
▪ What forms are legal?
▪ Are names case sensitive?
▪ Are special words reserved words or keywords?
5.2 DESIGN ISSUES FOR NAMES, FORM
▪ Most programming languages use the same form for names:
▪ a letter followed by a string consisting of letters, digits, and
underscore characters ( _ ).
▪ Special characters may be allowed at the beginning
▪ PHP: all variable names must begin with dollar signs
▪ Perl: all variable names begin with special characters, which
specify the variable’s type
▪ No spaces → A multiple-word naming convention:
▪ camel notation: all of the words except the first are capitalized, as in
myStack
▪ use of underscores and mixed case
▪ A programming style, not a language issue!
DESIGN ISSUES FOR NAMES, LENGTH
▪ If too short, they cannot be descriptive
▪ Language examples:
▪ FORTRAN 95: maximum of 31
▪ C99: no limit but only the first 63 are significant;
▪ C++: no limit, but implementers often impose a limit on name length to
simplify the symbol table
▪ C# and Java: no limit, and all are significant
DESIGN ISSUES FOR NAMES, CASE
SENSITIVITY
▪ Names in the C-based languages are case sensitive
▪ Names in some other languages (e.g., Fortran, Ada) are not
▪ Stick to a convention to avoid confusion
▪ In C, variable names are lowercase by convention.
▪ In C++, Java, and C#, predefined names are mixed case (e.g.
IndexOutOfBoundsException)
▪ Disadvantages of case sensitivity:
▪ Poor readability:
▪ names that look alike are different
▪ Poor writability:
▪ the need to remember specific case usage makes it more difficult to
write correct programs
DESIGN ISSUES FOR NAMES, SPECIAL WORDS
▪ Special words name the actions to be performed and separate the
syntactic parts of statements and programs.
▪ Are they allowed to be used as names?
▪ No → reserved words
▪ cannot be redefined ( used as a user-defined name)
▪ Problem: If there are too many, many collisions occur
▪ Eg. : COBOL (300 reserved words)
▪ LENGTH, BOTTOM, COUNT
▪ Solution : names are visible only when explicitly imported.
▪ Yes → keywords
▪ can be redefined, have special meaning only in certain contexts
▪ e.g., in Fortran
▪ Real VarName (Real is a data type)
▪ Real = 3.4 (Real is a variable)
VARIABLE ATTRIBUTES, NAME
▪ Not all variables have names!
int add(int nX, int nY)
{
return nX + nY;
}
▪ The result of nX + nY is held in an anonymous (temporary) variable
VARIABLE ATTRIBUTES, TYPE
▪ The type of a variable determines :
▪ the range of values of variables
▪ floating point type also determines the precision
▪ the set of operations that are defined for values of that type
VARIABLE ATTRIBUTES, ADDRESS
▪ The memory address with which a variable is associated
▪ sometimes called its l-value
▪ Design issues: Where and When?
WHAT IS BINDING?
▪ A binding is an association (between a name and the thing that
is named )
▪ between an entity and an attribute
▪ between a variable and its type or value,
▪ between an operation and a symbol
BINDING TIME
▪ Binding time is the time at which a binding takes place
▪ Language design time -- bind operator symbols to operations
▪ Language implementation time-- bind a data type to a
representation
▪ Compile time -- bind a variable to a type
▪ Runtime -- bind a local variable to a memory cell
int count = x + 5;
BINDING ATTRIBUTES TO VARIABLES
▪ Static: a binding is static if it first occurs before run time and
remains unchanged throughout program execution.
▪ Dynamic: a binding is dynamic if it first occurs during execution
or can change during execution of the program
▪ A complete understanding of the binding times for program entities
is a prerequisite for understanding the semantics of a
programming language
TYPE BINDINGS
▪ Before the variable is referenced in a program, it must be
bound to a data type
▪ Design issues:
▪ How is a type specified?
▪ When does the binding take place?
STATIC TYPE BINDINGS
▪ If static, the type may be specified by either an explicit or an
implicit declaration
▪ An explicit declaration is a program statement used for declaring
the types of variables
STATIC TYPE BINDINGS
▪ An implicit declaration is a default mechanism for specifying
types of variables through conventions rather than declaration statements
▪ Advantage: writability
▪ Disadvantage: reliability (hard to detect errors)
▪ Naming conventions: requiring names of specific types to begin
with particular special characters
▪ Perl:
$apple String or Numeric
@apple Array
▪ Type inference: using the context of the values assigned to the
variable in a declaration statement
▪ C#:
var sum = 0;
var total = 0.0;
var name = "Fred";
DYNAMIC TYPE BINDING
▪ Variable type is not specified by a declaration statement
▪ A type is determined when a variable is assigned a value
▪ May affect address-binding
▪ Variable can change its type at run time
▪ Example: JavaScript
list = [2, 4.33, 6, 8];
list = 17.3;
▪ Advantage:
▪ flexibility (generic program units)
▪ Disadvantages:
▪ High cost (dynamic allocation/de-allocation)
▪ Type error detection by the compiler is difficult
▪ Usually implemented by interpreter (slow)
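Dynamic type binding can also be observed in Python, which binds a type to a name at each assignment. The sketch below mirrors the JavaScript list example; the helper names describe and double are hypothetical and exist only for this illustration.

```python
def describe(value):
    """Return the name of the type currently bound to the value."""
    return type(value).__name__


values = [2, 4.33, 6, 8]
type_before = describe(values)     # the name is bound to a list
values = 17.3
type_after = describe(values)      # the same name is now bound to a float


def double(x):
    """A generic unit: works for any type that supports +."""
    return x + x


results = [double(21), double("ab"), double([1])]

# The price of the flexibility: a type error is found only at run time,
# when the offending statement is actually executed.
error_found_at_run_time = False
try:
    double(None)
except TypeError:
    error_found_at_run_time = True
```

The same function body serves integers, strings and lists, which is the flexibility advantage; the failing call shows why error detection moves from the compiler to run time.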
STORAGE BINDING
▪ The variable is bound to a specific memory location.
VARIABLE ATTRIBUTES, LIFETIME
▪ The lifetime of a variable is the time during which it is bound
to a particular memory cell
▪ Allocation - getting a cell from some pool of available cells
▪ Deallocation - putting a cell back into the pool
▪ Categories:
▪ Static variables
▪ Stack-dynamic variables
▪ heap-dynamic variables
▪ Explicit
▪ Implicit
STATIC VARIABLES
▪ A variable is bound to memory cells before execution begins
and remains bound to the same memory cell until program
execution terminates.
▪ e.g., C and C++ static variables in functions
▪ Advantages:
▪ efficiency (direct addressing),
▪ history-sensitive subprogram support
▪ Disadvantage:
▪ lack of flexibility (no recursion)
EXAMPLE
#include <stdio.h>
#define MAX 5
void sumIt(void); /* prototype: sumIt is defined after main */
int main(){
    int i = 0;
    printf("Enter 5 numbers to be summed\n");
    for(i = 0; i<MAX; ++i)
        sumIt();
    printf("\nProgram completed\n");
    getchar();
    return 0;
}
void sumIt(void){
    static int sum = 0; /* bound once; keeps its value between calls */
    int num;
    printf("\nEnter a number: ");
    scanf("%d", &num);
    sum+=num;
    printf("The current sum is: %d",sum);
}
Output:
C:\>test_static.o
Enter 5 numbers to be summed
Enter a number: 1
The current sum is: 1
Enter a number: 2
The current sum is: 3
Enter a number: 3
The current sum is: 6
Enter a number: 4
The current sum is: 10
Enter a number: 5
The current sum is: 15
Program completed
STACK-DYNAMIC VARIABLES
▪ Storage bindings are created for variables when their
declaration statements are elaborated.
▪ A declaration is elaborated when the executable code associated
with it is executed
▪ If scalar, all attributes except address are
statically bound
▪ local variables in C subprograms (not declared
static) and Java methods
▪ Advantage: allows recursion; conserves
storage
▪ Disadvantages:
▪ Overhead of allocation and deallocation
▪ Subprograms cannot be history sensitive
▪ Inefficient references (indirect addressing)
HEAP-DYNAMIC VARIABLES
▪ What is heap?
▪ Allocation specified by the
programmer
▪ takes effect during execution
▪ Advantage:
▪ dynamic storage management
▪ Flexibility
EXPLICIT HEAP-DYNAMIC VARIABLES
▪ Allocated and deallocated by
explicit directives
▪ Referenced only through pointers
or references
▪ e.g.
▪ all objects in Java
▪ dynamic objects in C++ (via new
and delete)
▪ Disadvantages:
▪ Inefficient: cost of allocation, references, and deallocation
▪ Unreliable: explicit deallocation is error-prone (dangling
pointers, memory leaks)
IMPLICIT HEAP-DYNAMIC VARIABLES
▪ Allocation and deallocation caused by assignment statements
▪ All attributes are bound every time they are assigned
▪ e.g.
▪ all strings and arrays in Perl, JavaScript, and PHP
Highs = 5
Highs = [1, 2, 3, 4, 5]
▪ Advantage:
▪ flexibility (generic code)
▪ Disadvantages:
▪ Inefficient: run-time overhead of maintaining all the
dynamic attributes.
▪ Loss of error detection by the compiler
STORAGE BINDING TIMES
▪ In C++, the same name x can denote different variables (referring to
different memory cells).
…
int x = 1; //global variable
int main( )
{ cout << "global x in main is " << x << endl; // 1
int x = 5; //local variable to main
cout << "local x in main's outer scope is " << x << endl; //5
{ //start new scope
int x = 7; //hides both x in outer scope and global x
cout << "local x in main's inner scope is " << x << endl; //7
} //end new scope
cout << "local x in main's outer scope is " << x << endl; //5
} //end of main
VARIABLE ATTRIBUTES, VALUE
▪ The contents of the location with which the variable is
associated
▪ The address of a variable → the l-value of the variable
▪ The value of a variable → the r-value of the variable
▪ To access r-value, the l-value needs to be evaluated first.
Sum = Sum + age ;
▪ Sum on the left-hand side denotes an address (l-value); Sum and age
on the right-hand side denote contents (r-values)
VARIABLE ATTRIBUTES, SCOPE
▪ The scope of a variable is the range of statements over which
it is visible
▪ The scope rules of a language determine how references to names
are associated with variables
▪ Scope and lifetime are sometimes closely related, but are
different concepts
▪ scope is a textual, or spatial, concept whereas lifetime is a
temporal concept
STATIC SCOPE
▪ Scope can be statically determined – prior to execution
▪ To connect a name reference to a variable, you (or the
compiler) must find the declaration of that variable
▪ Scope-based variable categories:
▪ The local variables of a program unit are those that are declared in
that unit
▪ The nonlocal variables of a program unit are those that are visible
in the unit but not declared there
▪ Global variables are a special category of nonlocal variables
GLOBAL SCOPE
▪ C, C++, PHP, JavaScript, and Python support a program
structure that consists of a sequence of function definitions in
a file
▪ These languages allow variable declarations to appear outside
function definitions
▪ C and C++ :
▪ Implicitly visible in all subsequent
functions in the file,
▪ except where the variable is redeclared
locally, which hides the global one.
BLOCKS
▪ Some languages allow a section of code to have its own local
variables, whose scope is limited to that section
▪ Defined by blocks, can be nested
▪ Treated like subprograms → variables are stack dynamic
▪ Storage is allocated when the block is entered and deallocated
when the block is exited.
▪ The scope of loop counter variables is restricted to the for
construct
NESTED BLOCKS
▪ A variable that is defined in an outer scope is accessible in all
(following) inner scopes.
EXAMPLE
▪ Variables with the same name in nested scopes
▪ Is this allowed?
▪ Legal in C and C++, but not in Java and C#, because it is
error-prone
VARIABLE SHADOWING
▪ Block scoping may hide another
variable in a larger enclosing scope
▪ Variables can be hidden from a unit
by having a "closer" variable with
the same name
▪ can be accessed with selective
references
▪ C++ uses the scope resolution
operator (:: )
NESTED FUNCTIONS
▪ A function defined inside another function is called a nested
function.
▪ Static scoping
▪ Once your program finds a name reference, the search goes as
follows:
▪ search declarations, first locally, then in increasingly larger
enclosing scopes, until one is found for the given name
▪ Enclosing static scopes (to a specific scope) are called its
static ancestors; the nearest static ancestor is called a static
parent
function big() {
    function sub1() {
        var x = 7;
        sub2();
        print(x);
    }
    function sub2() {
        var y = x;
        print(x);
    }
    var x = 3;
    sub1();
}
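Python also resolves names in nested functions by static scoping, so the big/sub1/sub2 structure can be tried directly. In this sketch the print calls of the slide are replaced by appending to a list named output, which is an assumption made so that the result can be inspected.

```python
def big():
    output = []                 # collects what the slide's print(x) would show

    def sub1():
        x = 7                   # local to sub1; invisible to sub2
        sub2()
        output.append(x)        # sub1's own x

    def sub2():
        y = x                   # x is not local here: the search moves to the
        output.append(y)        # static parent, big, and finds big's x

    x = 3
    sub1()
    return output


printed = big()
```

Although sub2 is called from sub1, its reference to x is resolved by where sub2 is written, not by who called it: the static parent of sub2 is big, so sub1's x = 7 is never considered.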
EVALUATION OF STATIC SCOPING
▪ Works well in many situations
▪ Problems:
▪ In most cases, too much access is possible
▪ As a program evolves, the initial structure is destroyed and local
variables often become global; subprograms also gravitate toward
becoming global, rather than nested
DYNAMIC SCOPE
▪ Based on calling sequences of program units, not their textual
layout (temporal versus spatial)
▪ References to variables are connected to declarations by
searching back through the chain of subprogram calls that
forced execution to this point
EXAMPLE
▪ Dynamic scoping
▪ Reference to x in sub2 is to sub1's x
▪ big calls sub1
▪ sub1 calls sub2
▪ sub2 uses x
function big() {
    function sub1() {
        var x = 7;
        sub2();
        print(x);
    }
    function sub2() {
        var y = x;
        print(x);
    }
    var x = 3;
    sub1();
}
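Python itself is statically scoped, so dynamic scoping has to be simulated. The sketch below keeps an explicit stack of frames and resolves a name by searching back through the chain of active calls; call_stack, lookup, printed, and the call_sub2_directly flag are all hypothetical names introduced for this illustration.

```python
call_stack = []     # one dict of local variables per active subprogram call
printed = []        # stands in for the slide's print(x)


def lookup(name):
    """Dynamic scoping: search the most recent call first, then its callers."""
    for frame in reversed(call_stack):
        if name in frame:
            return frame[name]
    raise NameError(name)


def sub2():
    call_stack.append({})                    # sub2 declares no x of its own
    call_stack[-1]["y"] = lookup("x")        # var y = x
    printed.append(lookup("x"))              # print(x)
    call_stack.pop()


def sub1():
    call_stack.append({"x": 7})              # var x = 7
    sub2()
    printed.append(lookup("x"))              # print(x)
    call_stack.pop()


def big(call_sub2_directly=False):
    call_stack.append({"x": 3})              # var x = 3
    if call_sub2_directly:
        sub2()
    else:
        sub1()
    call_stack.pop()


big()
via_sub1 = list(printed)                     # sub2 was reached through sub1
printed.clear()
big(call_sub2_directly=True)
direct = list(printed)                       # sub2 was called by big itself
```

The same reference to x in sub2 resolves to sub1's x in the first run and to big's x in the second, which is why the meaning of a nonlocal reference cannot be determined statically under dynamic scoping.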
EVALUATION OF DYNAMIC SCOPING
▪ Advantage:
▪ No need to pass arguments → convenience
▪ Disadvantages:
▪ While a subprogram is executing, its variables are visible to all
subprograms it calls → less reliable
▪ A statement in a subprogram that contains a reference to a
nonlocal variable can refer to different nonlocal variables during
different executions of the sub-programs → Impossible to
statically determine attributes
▪ Takes longer time to resolve → inefficient
▪ You need to know the sequence of subprogram calls to understand
a reference to a variable → Poor readability
EXAMPLE
#include <iostream>
using namespace std;
int i = 5;
void p(){
    int i = -1;
    i = i + 1;
    cout << i << endl;
}
int main(){
    cout << i << endl;
    char ch;
    int i = 6;
    i = i + 1;
    p();
    cout << i << endl;
    return 0;
}
✓ Trace the program by hand, and predict the output of the program.
✓ What happens if we remove the line:
using namespace std;
EXAMPLE
#include <iostream>
using namespace std;
int i;
int main(){
    int i;
    i = 5;
    for(
        int i = 1;
        i<10 && cout << i << ' ';
        ++i )
    {
        int i = -1;
        cout << i << ' ';
    }
    cout << i << endl;
    return 0;
}
✓ Trace the program by hand, and predict the output of the program.
✓ Does it compile correctly or not? Explain.
SUMMARY
▪ Case sensitivity and the relationship of names to special
words represent design issues of names
▪ Variables are characterized by a sextuple: name, address,
value, type, lifetime, scope
▪ Binding is the association of attributes with program entities
▪ Scalar variables are categorized as: static, stack dynamic,
explicit heap dynamic, implicit heap dynamic
▪ The scope of a variable can be determined either statically or
dynamically
Ch6 : Data types
GOAL
Explore categories,
characteristics, design
issues, and
implementation of the
common data types in
different programming
languages.
TOPICS
6.1 Introduction
6.2 Primitive Data Types
6.3 String Types
6.4 Enumeration Types
6.5 Array Types
6.6 Associative Arrays
6.11 Pointer and Reference Types
6.1 INTRODUCTION
▪ A data type defines a collection of data objects and a set of
predefined operations on those objects
▪ Earlier programming languages offered only a limited set of data
structures (types)
▪ The concept evolved over time:
▪ FORTRAN → arrays
▪ COBOL → decimal data, records
▪ ALGOL → user-defined data types
INTRODUCTION
▪ Uses of type system:
▪ Type-checking
▪ ensuring that the operands of an operator are of compatible types
▪ Program modularization
▪ Proper calling of Methods and interfaces
▪ Understanding semantics
▪ Expected program output
▪ One design issue for all data types: What operations are
defined and how are they specified?
6.2 PRIMITIVE DATA TYPES
▪ Almost all programming languages
provide a set of primitive data types
▪ Those not defined in terms of other
data types
▪ Some primitive data types are
merely reflections of the hardware
PRIMITIVE DATA TYPES, INTEGER
▪ A string of bits
▪ one of the bits (typically the leftmost) representing the sign
▪ Different sizes of integers
▪ Java: four signed integer sizes (byte, short, int, long)
▪ Representations of negative values
▪ Signed magnitude
▪ Ones complement
▪ Twos complement
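The three representations of negative values can be compared on a small example. The following Python sketch prints the bit pattern of a value in each notation; the function names and the default width of 8 bits are assumptions, and no range checking is done, so callers are expected to pass values that fit in the chosen width.

```python
def signed_magnitude(value, bits=8):
    """Leftmost bit is the sign; the remaining bits hold the magnitude."""
    sign = "1" if value < 0 else "0"
    return sign + format(abs(value), f"0{bits - 1}b")


def ones_complement(value, bits=8):
    """A negative value is the bitwise inverse of its magnitude."""
    mask = (1 << bits) - 1
    pattern = value if value >= 0 else ~abs(value) & mask
    return format(pattern, f"0{bits}b")


def twos_complement(value, bits=8):
    """A negative value is the ones complement plus one."""
    mask = (1 << bits) - 1
    return format(value & mask, f"0{bits}b")


minus_five = (signed_magnitude(-5), ones_complement(-5), twos_complement(-5))
plus_five = (signed_magnitude(5), ones_complement(5), twos_complement(5))
```

Positive values look the same in all three notations; only the negative patterns differ, and the twos complement pattern is the ones complement pattern plus one.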
PRIMITIVE DATA TYPES, FLOATING POINT
▪ Model real numbers, but only as approximations
▪ Precision
▪ range
▪ Languages for scientific use support at least two floating-point
types (e.g., float and double; sometimes more)
▪ Usually exactly like the hardware, but not always
▪ IEEE Floating-Point Standard 754
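That floating-point values are only approximations of real numbers is easy to observe. This short Python sketch assumes an IEEE 754 platform, which is what CPython uses on common hardware; the variable names are chosen for illustration.

```python
import math
import struct
import sys

total = 0.1 + 0.2
exactly_equal = (total == 0.3)            # neither 0.1 nor 0.2 is stored exactly
close_enough = math.isclose(total, 0.3)   # compare with a tolerance instead

# Round-trip 0.1 through a 32-bit float to see the lower precision of "float"
# compared with the 64-bit "double" that Python uses for its own float type.
single = struct.unpack("<f", struct.pack("<f", 0.1))[0]
single_error = abs(single - 0.1)

# Number of significand (precision) bits in a double on this platform.
double_precision_bits = sys.float_info.mant_dig
```

The comparison with == fails even though the arithmetic looks trivial, which is why equality tests on floating-point values are usually replaced by tolerance checks.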
PRIMITIVE DATA TYPES, COMPLEX
▪ Some languages support a complex type
▪ e.g., C99, Fortran, and Python
▪ Each value consists of two floats, the real part and the
imaginary part
PRIMITIVE DATA TYPES, DECIMAL
▪ For business applications (money)
▪ Essential to COBOL
▪ C# offers a decimal data type
▪ Store a fixed number of decimal digits, in coded form (BCD)
▪ one digit per byte, or packed two digits per byte
▪ Operations are done on H/W or simulated in S/W
▪ Advantage: accuracy
▪ Disadvantages: limited range, wastes memory (6 digits == 24
bits)
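The storage cost of BCD can be checked with a small encoder. This is a Python sketch of the packed form (two digits per byte); the function names to_packed_bcd and from_packed_bcd are hypothetical, and sign and decimal-point handling are deliberately left out.

```python
def to_packed_bcd(digits):
    """Encode a string of decimal digits as packed BCD, two digits per byte."""
    if not (digits.isascii() and digits.isdigit()):
        raise ValueError("expected a non-empty string of decimal digits")
    if len(digits) % 2:                      # pad to an even number of digits
        digits = "0" + digits
    return bytes(
        (int(digits[i]) << 4) | int(digits[i + 1])   # one 4-bit code per digit
        for i in range(0, len(digits), 2)
    )


def from_packed_bcd(data):
    """Decode packed BCD bytes back into a string of decimal digits."""
    return "".join(f"{byte >> 4}{byte & 0x0F}" for byte in data)


packed = to_packed_bcd("123456")
bcd_bits = len(packed) * 8                   # storage used by six BCD digits
binary_bits = (999999).bit_length()          # bits pure binary needs for any
                                             # six-digit value
```

Six digits occupy 24 bits in packed BCD, while a pure binary encoding covers every six-digit value in 20 bits, which is the memory waste the slide refers to.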
PRIMITIVE DATA TYPES, BOOLEAN
▪ Simplest of all
▪ Range of values: two elements
▪ “true”
▪ “false”
▪ C89 doesn’t have a Boolean data type (C99 added _Bool)
▪ Advantage:
▪ Readability over integers to represent switches or flags
▪ Could be implemented as bits, but often as bytes
▪ Why?
PRIMITIVE DATA TYPES, CHARACTER
▪ Stored as numeric coding
▪ Most commonly used coding: ASCII
▪ An alternative, Unicode
▪ Includes characters from most natural
languages
▪ 16-bit coding (UCS-2)
▪ Originally used in Java
▪ Now supported by many languages
▪ 32-bit Unicode (UCS-4)
▪ Supported by Fortran, starting with 2003
6.3 CHARACTER STRING TYPES
▪ Values are sequences of characters
▪ Typical operations:
▪ Assignment and copying
▪ Comparison (=, >, etc.)
▪ Catenation
▪ Substring reference(slice)
▪ Pattern matching
▪ Design issues:
▪ Is it a primitive type or just a special kind of array?
▪ Should the length of strings be static or dynamic?
CHARACTER STRING TYPE IN LANGUAGES
▪ C and C++
▪ Not primitive
▪ Use char arrays and a library of functions that provide operations
▪ Java (and C#, Ruby, and Swift)
▪ Primitive via the String class
▪ Fortran and Python
▪ Primitive type with assignment and several operations
▪ Perl, JavaScript, Ruby, and PHP
▪ built-in pattern matching, using regular expressions
CHARACTER STRING LENGTH OPTIONS
▪ Static:
▪ Java (immutable)
▪ length can’t be changed after string is created
▪ require no special dynamic storage allocation
▪ Limited dynamic:
▪ C and C++
▪ any number of chars 0 – max
▪ maintain the length, or use a special end-of-string character
▪ require no special dynamic storage allocation
▪ Dynamic length
▪ JavaScript, Perl
▪ Variable length with no maximum
▪ Dynamic storage
▪ must grow and shrink dynamically.
▪ Adjacent cells (mostly used)
▪ Overhead in allocation and deallocation
▪ Linked list or array of char pointers
▪ Extra storage, complex operations
▪ A descriptor is the collection of the attributes of a variable:
• Static: built at compilation time as part of the symbol table
• Dynamic: part or all has to be maintained at run time
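The static (immutable) option can be tried in Python, whose str objects also cannot be changed after creation. This is a sketch in Python rather than Java, and the variable names are chosen only for illustration.

```python
greeting = "Hello"
alias = greeting                      # both names refer to the same string object

greeting = greeting + ", world"       # builds a NEW string; the old one is untouched

in_place_edit_allowed = True
try:
    alias[0] = "J"                    # strings do not support item assignment
except TypeError:
    in_place_edit_allowed = False

# When a growing or shrinking sequence of characters is needed, a mutable
# container is used instead and joined into a string at the end.
buffer = list("Hello")
buffer.append("!")
rebuilt = "".join(buffer)
```

Because the length and contents of an existing string never change, "modifying" a string really means binding the name to a newly allocated object.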
CHARACTER STRING TYPE EVALUATION
▪ Aid to writability
▪ As a primitive type with static length, they are inexpensive to
provide--why not have them?
▪ Dynamic length is nice, but is it worth the expense?
IMMUTABLE STRINGS IN RUBY
>> greeting = 'Hello'
=> "Hello"
>> greeting
=> "Hello"
>> greeting.object_id
=> 70101471431160
>> whazzup = greeting
=> "Hello"
>> greeting = 'Dude!'
=> "Dude!"
>> puts whazzup
Hello
=> nil
6.11 POINTER AND REFERENCE TYPES
▪ A pointer type variable has a range of values that consists of
memory addresses and a special value, nil
▪ Provide the power of indirect addressing
▪ Provide a way to manage dynamic memory
▪ storage is allocated from the heap
▪ Design issues:
▪ What are the scope and lifetime of a pointer variable?
▪ What is the lifetime of a heap-dynamic variable?
▪ Are pointers restricted as to the type of value to which they can
point?
▪ Are pointers used for dynamic storage management, indirect
addressing, or both?
▪ Should the language support pointer types, reference types, or
both?
POINTER OPERATIONS
▪ Two fundamental operations: assignment and dereferencing
▪ Assignment is used to set a pointer variable’s value to some
useful address
▪ Dereferencing yields the value stored at the location
represented by the pointer’s value
▪ Dereferencing can be explicit or implicit
EXAMPLE: POINTERS IN C AND C++
▪ Explicit dereferencing using (*) and address-of (&) operators
int* ptr = &x;
j = *ptr;
▪ Extremely flexible but must be used with care
▪ Pointers can point at any variable regardless of when or where it
was allocated
▪ Pointer arithmetic is possible
▪ Domain type need not be fixed
▪ void * can point to any type and can’t be type checked (cannot be
dereferenced)
PROBLEMS WITH POINTERS
▪ Aliasing
▪ Lost heap-dynamic variable (memory leakage)
▪ An allocated heap-dynamic variable that is no longer accessible
to the user program (often called garbage)
▪ Dangling pointers (dangerous)
▪ A pointer points to a heap-dynamic variable that has been
deallocated
float* stuff = new float[100];
float *p;
stuff = new float[1000]; // the first array is now lost (memory leak)
p = stuff;
delete []stuff; // p is now a dangling pointer
REFERENCE TYPES
▪ C++ includes a special kind of pointer type called a reference
type that is used primarily for formal parameters
▪ Advantages of both pass-by-reference and pass-by-value
▪ Java extends C++’s reference variables and allows them to
replace pointers entirely
▪ References are references to objects, rather than being addresses
▪ C# includes both the references of Java and the pointers of
C++
▪ What about Python?
REFERENCES IN RUBY
▪ Pointer arithmetic, as in C, is not possible with Ruby.
▪ Some types are immutable
>> number = 3
=> 3
>> number
=> 3
>> number = 2 * number
=> 6
>> number
=> 6
6.4 ENUMERATION TYPES
▪ All possible values, which are
named constants, are provided in
the definition
▪ Design issues
▪ Is an enumeration constant allowed
to appear in more than one type
definition, and if so, how is the type
of an occurrence of that constant
checked?
▪ Are enumeration values coerced to
integer?
▪ Is any other type coerced to an
enumeration type?
EVALUATION OF ENUMERATED TYPE
▪ Aid to readability,
▪ no need to code a color as a number
▪ Aid to reliability,
▪ compiler can check:
▪ operations (don’t allow colors to be added)
▪ No enumeration variable can be assigned a value outside its defined
range
▪ In C#, F#, Swift, and Java 5.0, enumeration type variables :
▪ are not coerced into integer types
▪ can’t be assigned a value outside the predefined range.
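A hedged Python sketch of the two reliability checks above. Python's `Enum` is a standard-library class rather than a built-in language construct, and the `Color` type and helper names here are assumptions made for illustration:

```python
from enum import Enum


class Color(Enum):
    RED = 1
    GREEN = 2
    BLUE = 3


def try_add():
    """Plain Enum members are not coerced to integers, so + is rejected."""
    try:
        Color.RED + 1
    except TypeError:
        return "TypeError"
    return "allowed"


def try_out_of_range():
    """A value outside the defined set cannot become a Color."""
    try:
        Color(99)
    except ValueError:
        return "ValueError"
    return "allowed"


print(try_add(), try_out_of_range())
```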
Dr. Nada Mobark, 2025 215
6.5 ARRAY TYPES
▪ An array is a homogeneous aggregate of data elements in
which an individual element is identified by its position in the
aggregate, relative to the first element.
Dr. Nada Mobark, 2025 216
ARRAY TYPES
▪ Design issues:
▪ When does allocation take place?
▪ What types are legal for subscripts?
▪ What is the maximum number of subscripts?
▪ When are subscript ranges bound?
▪ Are subscripting expressions in element references range
checked?
▪ Are ragged or rectangular multidimensional arrays allowed, or
both?
▪ Are any kind of slices supported?
Dr. Nada Mobark, 2025 217
ARRAY INDEXING
▪ Indexing (or subscripting) is a mapping from indices to
elements
array_name (index_value_list) → an element
▪ Index Syntax
▪ Fortran and Ada use parentheses
▪ Ada explicitly uses parentheses to show uniformity between array
references and function calls because both are mappings
▪ Most other languages use brackets
▪ In some languages, the lower bound of the subscript range is
implicit
▪ Perl allows negative subscripts
▪ Offset from the end of the array
Dr. Nada Mobark, 2025 218
ARRAY CATEGORIES
▪ When are subscript type/ranges bound?
▪ Static: subscript ranges are statically bound and storage allocation is
static (before run-time)
▪ Advantage: efficiency (no dynamic allocation)
▪ eg. C/C++ static arrays
▪ Fixed stack-dynamic: subscript ranges are statically bound, but the
allocation is done at declaration/elaboration time during execution
▪ Advantage: space efficiency
▪ eg. C/C++ local arrays declared in functions
▪ Fixed heap-dynamic: subscript ranges are statically bound, storage
binding is dynamic but fixed after allocation
▪ binding is done when requested and storage is allocated from heap, not
stack
▪ Advantage: flexibility – allocated space fits the problem
▪ eg. C/C++ arrays allocated with malloc or new
▪ Heap-dynamic: binding of subscript ranges and storage allocation is
dynamic and can change any number of times
▪ Advantage: flexibility (arrays can grow or shrink during program execution)
▪ eg. Java ArrayList
Dr. Nada Mobark, 2025 219
ARRAY INITIALIZATION
▪ Some languages allow initialization at the time of storage
allocation
▪ C# example:
int[] list = {4, 5, 7, 83};
▪ C and C++ examples
char name [] = "freddie";
char *names [] = {"Bob", "Jake", "Joe"};
▪ Java example
String[] names = {"Bob", "Jake", "Joe"};
▪ Python
list = [x ** 2 for x in range(12) if x % 3 == 0]
Dr. Nada Mobark, 2025 220
SLICES
▪ A slice is some substructure of an array; nothing more than a
referencing mechanism
▪ Slices are only useful in languages that have array operations
▪ Python
vector = [2, 4, 6, 8, 10, 12, 14, 16]
mat = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
▪ vector[3:6] is a three-element array (the elements at indices 3, 4, and 5)
▪ mat[0][0:2] is the first and second element of the first row of
mat
▪ Ruby supports slices with the slice method
▪ list.slice(2, 2) returns the third and fourth elements of
list
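The Python slices above can be tried directly. This is a small sketch; note that a slice of a Python list produces a new list, so it is a copy rather than a pure referencing mechanism:

```python
vector = [2, 4, 6, 8, 10, 12, 14, 16]
mat = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

middle = vector[3:6]         # elements at indices 3, 4, 5
first_two = mat[0][0:2]      # first and second element of the first row
last = vector[-1]            # negative subscript: offset from the end
every_other = vector[0:7:2]  # a slice with a stride of 2

print(middle, first_two, last, every_other)
```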
Dr. Nada Mobark, 2025 221
RECTANGULAR AND JAGGED ARRAYS
▪ A rectangular array is a multi-dimensioned array in which all
of the rows have the same number of elements and all
columns have the same number of elements
▪ A jagged matrix has rows with varying number of elements
▪ Possible when multi-dimensioned arrays actually appear as arrays
of arrays
▪ C, C++, C# and Java support jagged arrays
Dr. Nada Mobark, 2025 222
ARRAY IMPLEMENTATION, SINGLE
DIMENSIONED
▪ Addressing array elements
▪ Use information in the descriptor
▪ Static → compile time!
[Figure: compile-time descriptor for single-dimensioned arrays]
Dr. Nada Mobark, 2025 223
ARRAY IMPLEMENTATION, MULTI-
DIMENSIONED
▪ An actual address value requires finding the number of
preceding elements
▪ Two common ways:
▪ Row major order (by rows) – used in most languages
▪ Column major order (by columns) – used in Fortran
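Counting the preceding elements can be sketched as follows. The function names and the zero default lower bounds are assumptions made for this illustration:

```python
def row_major_offset(i, j, n_cols, row_lb=0, col_lb=0):
    """Number of elements stored before [i, j] when rows are stored one after another."""
    return (i - row_lb) * n_cols + (j - col_lb)


def column_major_offset(i, j, n_rows, row_lb=0, col_lb=0):
    """Number of elements stored before [i, j] when columns are stored one after another."""
    return (j - col_lb) * n_rows + (i - row_lb)


def element_address(base, offset, element_size):
    """Address = base address + (preceding elements * size of one element)."""
    return base + offset * element_size


# A 3 x 4 matrix of 4-byte elements stored at (hypothetical) address 1000.
row_addr = element_address(1000, row_major_offset(1, 2, n_cols=4), 4)
col_addr = element_address(1000, column_major_offset(1, 2, n_rows=3), 4)
print(row_addr, col_addr)
```

The same element [1, 2] lands at a different address in each layout, which is why the storage order matters when arrays are passed between languages.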
[Figure: the location of the [i, j] element in a matrix]
[Figure: a compile-time descriptor for a multidimensional array]
Dr. Nada Mobark, 2025 224
HIGHER DIMENSIONAL ARRAYS
▪ Colored Images can be viewed as a 3D array of pixels
Dr. Nada Mobark, 2025 225
ASSOCIATIVE ARRAYS
▪ An associative array is an unordered collection of data
elements that are indexed by an equal number of values
called keys
▪ User-defined keys must be stored
▪ Design issues:
▪ What is the form of references to elements?
▪ Is the size static or dynamic?
▪ Built-in type in Perl, Python, Ruby, and Swift
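A short Python sketch of an associative array (a `dict`). The names and numbers are made up for illustration. The size is dynamic, and elements are referenced by key with bracket syntax:

```python
salaries = {"Gary": 75000, "Perry": 57000}

salaries["Cedric"] = 58850      # adding a key grows the structure
del salaries["Gary"]            # removing a key shrinks it
has_perry = "Perry" in salaries # membership test on keys
perry_pay = salaries["Perry"]   # element reference by key

print(sorted(salaries), has_perry, perry_pay)
```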
Dr. Nada Mobark, 2025 226
SUMMARY
▪ The data types of a language are a large part of what
determines that language’s style and usefulness
▪ The primitive data types of most imperative languages
include numeric, character, and Boolean types
▪ The user-defined enumeration and subrange types are
convenient and add to the readability and reliability of
programs
▪ Arrays are included in most languages
▪ Pointers are used for addressing flexibility and to control
dynamic storage management
Dr. Nada Mobark, 2025 227
ANY Q??
CS341
CONCEPT OF
PROGRAMMING
LANGUAGES
Ch7 : Expressions and Assignment
Dr. Nada Mobark
GOAL
Understand the semantics
of operators, expression
evaluation, type
conversions, and
assignment.
Dr. Nada Mobark, 2025 230
TOPICS
7.1 Introduction
7.2 Arithmetic Expressions
7.3 Overloaded Operators
7.4 Type conversions
7.5 Relational and Boolean Expressions
7.6 Short-Circuit Evaluation
7.7 Assignment Statements
Dr. Nada Mobark, 2025 231
7.1 INTRODUCTION
▪ Arithmetic evaluation was one of the motivations for the
development of the first programming languages
▪ Expressions are the fundamental means of specifying
computations in a programming language
▪ To understand expression evaluation, one needs to be familiar with
the orders of operator and operand evaluation
Dr. Nada Mobark, 2025 232
7.2 ARITHMETIC EXPRESSIONS
▪ Similar to mathematics, arithmetic expressions consist of
operators, operands, parentheses, and function calls
▪ Design issues:
▪ Types of operators?
▪ Operator precedence rules?
▪ Operator associativity rules?
▪ Order of operand evaluation?
▪ Operand evaluation side effects?
▪ Operator overloading?
▪ Type mixing in expressions?
Dr. Nada Mobark, 2025 233
OPERATORS, CATEGORIES
▪ Example, Ruby
Dr. Nada Mobark, 2025 234
OPERATORS, NUMBER OF OPERANDS
▪ A unary operator has one operand
▪ A binary operator has two operands
▪ A ternary operator has three operands
average = (count == 0)? 0 : sum / count
Dr. Nada Mobark, 2025 235
OPERATORS, NOTATION
▪ In most languages, binary operators are infix, except in
Scheme and LISP, in which they are prefix; Perl also has some
prefix binary operators
(Infix) a + b * c → (prefix) ??
▪ Most unary operators are prefix, but the ++ and -- operators
in C-based languages can be either prefix or postfix
Dr. Nada Mobark, 2025 236
OPERATORS, PRECEDENCE
▪ The operator precedence rules for expression evaluation
define the order in which “adjacent” operators of different
precedence levels are evaluated
▪ Example:
What is the relative precedence of the
unary minus ??
Dr. Nada Mobark, 2025 237
EXAMPLES
Dr. Nada Mobark, 2025 238
OPERATORS, ASSOCIATIVITY
▪ The operator associativity rules for expression evaluation
define the order in which adjacent operators with the same
precedence level are evaluated
▪ Typical associativity rules
▪ Left to right, except **, which is right to left
▪ Sometimes unary operators associate right to left
▪ In APL, all operators have equal precedence and all operators
associate right to left
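These rules can be checked in Python, which follows the typical scheme (left to right, with ** right to left). This is a small sketch; the variable names are only labels:

```python
right_assoc = 2 ** 3 ** 2      # parsed as 2 ** (3 ** 2)
left_grouped = (2 ** 3) ** 2   # parentheses force the other grouping
unary_minus = -2 ** 2          # ** binds tighter than unary minus: -(2 ** 2)
left_assoc = 10 - 4 - 3        # parsed as (10 - 4) - 3

print(right_assoc, left_grouped, unary_minus, left_assoc)
```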
Dr. Nada Mobark, 2025 239
OPERATORS, ASSOCIATIVITY
▪ In case of floating point numbers, some associativity options
may cause overflow!!
▪ A & C → very large +ve values, B & D → very large –ve values
A+C+B+D
▪ Programmers can alter the precedence and associativity rules by
placing parentheses
(A + B) + (C + D)
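A sketch of the overflow case in Python, whose floats are IEEE 754 doubles. The specific values are assumptions chosen to sit near the largest representable double:

```python
import math

A = C = 1e308    # very large positive values
B = D = -1e308   # very large negative values

left_to_right = A + C + B + D   # A + C already overflows to infinity
regrouped = (A + B) + (C + D)   # each pair cancels before anything overflows

print(left_to_right, regrouped)
```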
Dr. Nada Mobark, 2025 240
7.5 RELATIONAL AND BOOLEAN EXPRESSIONS
▪ A relational operator is an operator that
compares the values of its two
operands
▪ Relational Expressions
▪ Use relational operators and operands of
various types
▪ Evaluate to some Boolean representation
▪ Operator symbols used vary somewhat
among languages (!=, /=, ~=, .NE., <>,
#)
Dr. Nada Mobark, 2025 241
EXAMPLE, RUBY
Dr. Nada Mobark, 2025 242
BOOLEAN EXPRESSIONS
▪ Boolean Expressions
▪ Operands are Boolean and
the result is Boolean
▪ Example operators : AND,
OR, &&, ||
Dr. Nada Mobark, 2025 243
EXAMPLE, RUBY
Dr. Nada Mobark, 2025 244
BOOLEAN EXPRESSIONS
▪ C89 has no Boolean type -- it uses int type with 0 for false
and nonzero for true
▪ eg,
a < b < c is a legal expression:
▪ Left operator is evaluated, producing 0 or 1
▪ The evaluation result is then compared with the third operand (i.e., c)
▪ b is never compared with c
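The C behavior can be imitated in Python for comparison. Python itself treats `a < b < c` as a chained comparison, so the C-style evaluation is written out explicitly here as an emulation:

```python
a, b, c = 3, 2, 1

# C-style: (a < b) yields 0 or 1, and that result is compared with c.
c_style = int(a < b) < c

# Python chains the comparison: (a < b) and (b < c).
python_chained = a < b < c

print(c_style, python_chained)
```

With these values the C-style reading is true even though the operands are not in increasing order.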
Dr. Nada Mobark, 2025 245
PRECEDENCE
▪ Arithmetic expressions can be the operands of relational
expressions, and relational expressions can be the operands
of Boolean expressions → different precedence levels
Dr. Nada Mobark, 2025 246
SHORT CIRCUIT EVALUATION
▪ An expression in which the result is determined without
evaluating all of the operands and/or operators
▪ Examples;
(13 * a) * (b / 13 - 1)
▪ If a is zero, there is no need to evaluate (b / 13 - 1)
(a > b) || (b++ / 3)
▪ b is incremented only when a > b is false
Dr. Nada Mobark, 2025 247
SHORT CIRCUIT EVALUATION
▪ AND operation does not short circuit in
▪ FORTRAN (1956)
▪ BASIC (1964) and VB
▪ Pascal (1970)
▪ SQL (1974)
▪ Problem with non-short-circuit evaluation
index = 0;
while ((index < length) && (LIST[index] != value))
    index++;
▪ When index == length, a non-short-circuit && still evaluates
LIST[index], which causes an indexing error
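The difference can be sketched in Python. `and` short-circuits, while `&` applied to two Boolean values evaluates both operands, so it stands in here for a non-short-circuit AND. The function names are hypothetical:

```python
def find_short_circuit(items, value):
    """Linear search; the subscript is skipped once index reaches len(items)."""
    index = 0
    while index < len(items) and items[index] != value:
        index += 1
    return index


def find_eager(items, value):
    """Same search, but & always evaluates items[index]."""
    index = 0
    while (index < len(items)) & (items[index] != value):
        index += 1
    return index


def eager_fails_when_absent():
    try:
        find_eager([1, 2, 3], 9)
    except IndexError:
        return True
    return False


print(find_short_circuit([1, 2, 3], 9), eager_fails_when_absent())
```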
Dr. Nada Mobark, 2025 248
OPERAND, EVALUATION ORDER
▪ Variables: fetch the value from memory
▪ Constants: sometimes a fetch from memory; sometimes the
constant is in the machine language instruction
▪ Parenthesized expressions: evaluate all operands and
operators first
▪ The most interesting case is when an operand is a function call
▪ May be subject to side effects!
Dr. Nada Mobark, 2025 249
POTENTIALS FOR SIDE EFFECTS
▪ Functional side effects: when a function changes a two-way
parameter or a non-local variable
▪ Problem with functional side effects:
▪ When a function referenced in an expression alters another
operand of the expression;
a = 10;
b = a + fun(&a);
//assume fun returns 10 and changes
//its parameter to 20
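A Python sketch of the same hazard. Python has no `&a`, so this version assumes the function changes a shared dictionary entry instead of a parameter passed by address. Python evaluates operands left to right, so swapping the operands changes the result:

```python
state = {"a": 10}


def fun():
    """Returns 10 and, as a side effect, changes the shared value to 20."""
    state["a"] = 20
    return 10


left_first = state["a"] + fun()    # a is fetched (10) before fun runs
state["a"] = 10                    # reset for the second experiment
right_first = fun() + state["a"]   # fun runs first, so a is already 20

print(left_first, right_first)
```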
Dr. Nada Mobark, 2025 250
SOLUTIONS TO FUNCTIONAL SIDE EFFECTS
▪ Write the language definition to disallow functional side
effects
▪ No two-way parameters in functions
▪ No non-local (global) references in functions
▪ Advantage: it works!
▪ Disadvantage: inflexibility of one-way parameters and lack of non-
local references
▪ Write the language definition to demand that operand
evaluation order be fixed
▪ Java requires that operands appear to be evaluated in left-to-right
order
▪ Disadvantage: limits some compiler optimizations
Dr. Nada Mobark, 2025 251
REFERENTIAL TRANSPARENCY
▪ A program has the property of referential transparency if any
two expressions in the program that have the same value can
be substituted for one another anywhere in the program,
without affecting the action of the program
▪ Advantage: Semantics of a program is much easier to
understand
▪ eg.
result1 = (fun(a) + b) / (fun(a) - c);
temp = fun(a);
result2 = (temp + b) / (temp - c);
▪ If fun has no side effects, result1 = result2
▪ Otherwise, not, and referential transparency is violated
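A sketch of the violation in Python, assuming a version of `fun` that counts its own calls (the counter is invented for this illustration):

```python
calls = {"count": 0}


def fun(a):
    """Side effect: the result depends on how many times fun has been called."""
    calls["count"] += 1
    return a + calls["count"]


a, b, c = 10, 2, 1

result1 = (fun(a) + b) / (fun(a) - c)   # the two calls return 11 and 12

calls["count"] = 0                      # start again from the same state
temp = fun(a)                           # a single call returns 11
result2 = (temp + b) / (temp - c)

print(result1, result2)
```

Because the two calls to fun(a) return different values, replacing them with temp changes the result.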
Dr. Nada Mobark, 2025 252
7.4 TYPE CONVERSIONS
▪ A narrowing conversion is one that converts an object to a
type that cannot include all of the values of the original type
e.g., float to int
▪ A widening conversion is one in which an object is converted
to a type that can include at least approximations to all of the
values of the original type e.g., int to float
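Both directions can be observed in Python. This is a sketch; Python integers are unbounded, so a large value is used to show that int to float only approximates:

```python
narrowed = int(3.99)          # float to int: the fraction is discarded

big = 2 ** 53 + 1             # too many significant bits for a double
widened = float(big)          # int to float: an approximation is stored
lost_precision = int(widened) != big

print(narrowed, lost_precision)
```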
Dr. Nada Mobark, 2025 253
TYPE CONVERSIONS: IMPLICIT
▪ A mixed-mode expression is one that has operands of
different types
▪ A coercion is an implicit type conversion
▪ In most languages, all numeric types are coerced in expressions,
using widening conversions
int a;
float b, c, d;
. . .
d = b * a;
▪ Disadvantage of coercions:
▪ They decrease the type-error detection ability of the compiler
▪ In ML, Ada, and F#, there are no coercions in expressions →
increased error detection
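A brief Python sketch of where coercion happens and where it does not. Python checks types at run time, so the rejected case appears as an exception rather than a compile-time error:

```python
mixed = 2 * 3.5   # the int operand is coerced to float


def add_str_int():
    """Python does not coerce between str and int in +."""
    try:
        return "3" + 4
    except TypeError:
        return "TypeError"


print(mixed, add_str_int())
```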
Dr. Nada Mobark, 2025 254
TYPE CONVERSIONS : EXPLICIT
▪ Called casting in C-based languages
▪ Examples
▪ C: (int)angle
▪ F#: float(sum)
Note that F#’s syntax is similar to that of function calls
Dr. Nada Mobark, 2025 255
COERCION MADNESS!!
▪ JavaScript Example
Dr. Nada Mobark, 2025 256
7.7 ASSIGNMENT STATEMENTS
▪ The general syntax
<target_var> <assign_operator> <expression>
▪ The assignment operator
▪ = Fortran, BASIC, the C-based languages
▪ := Ada
▪ confusing when = is overloaded for the relational operator for
equality
▪ that’s why the C-based languages use == as the relational
operator
Dr. Nada Mobark, 2025 257
COMPOUND ASSIGNMENT OPERATORS
▪ A shorthand method of specifying a commonly needed form
of assignment
▪ Introduced in ALGOL 68; adopted by C and the C-based
languages
▪ Example
a = a + b
▪ can be written as
a += b
counter = 2
while counter < 68
  puts counter
  counter **= 2
end
Dr. Nada Mobark, 2025 258
UNARY ASSIGNMENT OPERATORS
▪ Unary assignment operators in C-based languages combine
increment and decrement operations with assignment
▪ Examples
sum = ++count //count incremented, then assigned to sum
sum = count++ //count assigned to sum, then incremented
count++ //count incremented
-count++ //count incremented then negated
▪ Ruby does not support ++ operator!
Dr. Nada Mobark, 2025 259
MULTIPLE ASSIGNMENTS
▪ Perl and Ruby allow multiple-target multiple-source
assignments
($first, $second, $third) = (20, 30, 40);
▪ Also, the following is legal and performs an interchange:
($first, $second) = ($second, $first);
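Python offers the same form through tuple assignment. A minimal sketch:

```python
first, second, third = 20, 30, 40

# The right-hand side is evaluated completely before any target is assigned,
# so this interchanges the two values without a temporary variable.
first, second = second, first

print(first, second, third)
```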
Dr. Nada Mobark, 2025 260
ASSIGNMENT, CONDITIONAL TARGETS
▪ In Perl
($flag ? $total : $subtotal) = 0
▪ equivalent to
if ($flag){
$total = 0
} else {
$subtotal = 0
}
Dr. Nada Mobark, 2025 261
ASSIGNMENT AS AN EXPRESSION
▪ In the C-based languages, Perl, and JavaScript, the assignment
statement produces a result and can be used as an operand
Examples:
▪ while ((ch = getchar())!= EOF){…}
▪ a = b + (c = d / b) - 1
▪ sum = count = 0;
▪ if (x = y) ...
▪ Java and C# allow only boolean expressions in their if
statements
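Python takes a different route, sketched here under the assumption of Python 3.8 or later. Plain `=` is a statement, so `if x = y` is a syntax error, and a separate operator `:=` is used when an assignment is needed inside an expression:

```python
data = iter([3, 1, 4])
total = 0

# Similar in spirit to: while ((ch = getchar()) != EOF) { ... }
while (value := next(data, None)) is not None:
    total += value

print(total)
```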
Dr. Nada Mobark, 2025 262
7.3 OVERLOADED OPERATORS
▪ Use of an operator for more than one purpose is called
operator overloading
▪ Some are common (e.g., + for addition and string concatenation)
▪ The compiler will choose the correct meaning based on the types
of the operands
Dr. Nada Mobark, 2025 263
OVERLOADED OPERATORS
▪ C++, C#, and Ruby allow user-defined overloaded operators
▪ Not allowed in Java!
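Python also allows user-defined overloading through special methods. A minimal sketch with a hypothetical `Vector2` class:

```python
class Vector2:
    """A tiny 2-D vector used only to illustrate operator overloading."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __add__(self, other):
        # Called for: self + other
        return Vector2(self.x + other.x, self.y + other.y)

    def __eq__(self, other):
        # Called for: self == other
        return (self.x, self.y) == (other.x, other.y)

    def __repr__(self):
        return f"Vector2({self.x}, {self.y})"


print(Vector2(1, 2) + Vector2(3, 4))
```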
Dr. Nada Mobark, 2025 264
OVERLOADED OPERATORS, PROS
▪ When sensibly used → aid to readability (avoid method calls,
expressions appear natural)
Dr. Nada Mobark, 2025 265
OVERLOADED OPERATORS, CONS
▪ potential troubles:
▪ Loss of compiler error detection
▪ omission of an operand is not an error!
▪ eg.
▪ & → addressOf / Bitwise AND
▪ - → subtraction, unary minus
▪ loss of readability
▪ Users can define nonsense operations
▪ Using a symbol that is unrelated to the operation
▪ Binding modules in a system
▪ Same operators overloaded in different ways
▪ Needs to be eliminated
Dr. Nada Mobark, 2025 266
SUMMARY
▪ Expressions are the basic feature to understand about a
language.
▪ Operator precedence and associativity determine the order of
operator evaluation within expressions.
▪ Operator overloading is a feature that improves writability but
may have a negative effect on code readability.
▪ Various forms of assignment operators are supported in
different languages.
Dr. Nada Mobark, 2025 267
ANY Q??
CS341
CONCEPT OF
PROGRAMMING
LANGUAGES
Ch8 : Statement-Level Control Structures
Dr. Nada Mobark
GOAL
Examine the flow of control
among statements
Dr. Nada Mobark, 2025 270
TOPICS
8.1 Introduction
8.2 Selection Statements
▪ Two-way
▪ Multiple-way
8.3 Iterative Statements
▪ Counter/Logically controlled Loops
▪ User-controlled mechanisms
▪ Iteration based on Data Structures
8.4 Unconditional Branching
Dr. Nada Mobark, 2025 271
8.1 LEVELS OF CONTROL FLOW
▪ Control statements are the statements which alter the flow of
execution and provide better control to the programmer on
the flow of execution
▪ Levels:
▪ Within expressions (Chapter 7):
▪ Associativity, precedence rules
▪ Among program statements (this chapter)
▪ Selection & looping
▪ Among program units (Chapter 9)
▪ Sub-programs
Dr. Nada Mobark, 2025 272
STATEMENT-LEVEL
CONTROL
• Together, they form
flexible, complex, and
powerful program logic.
• More structures → high
writability
• Too-few structures →
may affect readability
Dr. Nada Mobark, 2025 273
CONTROL STATEMENTS: DESIGN
▪ Goal: what is the best collection of control structures that
provides the required capability and the desired writability?
▪ Design issue
▪ Should a control structure have multiple entries?
▪ goto, labels
▪ What about exit?
▪ no explicit exit → all exits from a control structure are restricted to
transferring control to the first statement following the structure → no
harm to readability and also no danger.
Dr. Nada Mobark, 2025 274
8.4 UNCONDITIONAL BRANCHING
▪ Transfers execution control to a
specified place in the program
▪ The subject of one of the most heated
debates of the 1960s and 1970s
▪ Major concerns: Readability,
maintainability
▪ Java : Not supported
▪ C# : goto statement can be used in
switch statements.
▪ Any program that uses a goto can be
rewritten so that it doesn't need the
goto
Dr. Nada Mobark, 2025 275
C++ EXAMPLE
Dr. Nada Mobark, 2025 276
8.2 SELECTION STATEMENTS
▪ A selection statement provides the means of choosing
between two or more paths of execution
▪ Two general categories:
▪ Two-way selectors
▪ Multiple-way selectors
Dr. Nada Mobark, 2025 277
TWO-WAY SELECTION STATEMENTS
▪ General form:
if control_expression
then clause
else clause
▪ Design Issues:
▪ What is the form and type of the control expression?
▪ How are the then and else clauses specified?
▪ How should the meaning of nested selectors be specified?
Dr. Nada Mobark, 2025 278
TWO-WAY SELECTION, THE CONTROL EXPRESSION
▪ If the then reserved word or some other syntactic marker is
to introduce the then clause, no need for parentheses
▪ Expression :
▪ In C89, C99, Python, and C++, the control expression can be
arithmetic
▪ In most other languages, the control expression must be Boolean
Dr. Nada Mobark, 2025 279
TWO-WAY SELECTION, CLAUSE FORM
▪ In many contemporary languages, the then and else clauses
can be single statements or compound statements
▪ In Perl, must be compound → all clauses must be delimited by
braces
▪ In Python and Ruby, clauses are statement sequences → ??
Dr. Nada Mobark, 2025 280
TWO-WAY SELECTION, NESTING SELECTORS
▪ Java example
if (sum == 0)
if (count == 0)
result = 0;
else result = 1;
▪ Which if gets the else?
▪ static semantics rule: else matches with the nearest elseless-if
▪ To force an alternative semantics, compound statements may be
used
Dr. Nada Mobark, 2025 281
NESTING SELECTORS - RUBY
▪ Statement sequences as clauses,
▪ use of a special word resolves ambiguity and adds to the
readability:
if sum == 0 then
if count == 0 then
result = 0
else
result = 1
end
end
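In Python, indentation plays the role of the closing word. The two hypothetical functions below differ only in the indentation of else:

```python
def inner_else(total, count):
    """else belongs to the inner if."""
    result = None
    if total == 0:
        if count == 0:
            result = 0
        else:
            result = 1
    return result


def outer_else(total, count):
    """else belongs to the outer if."""
    result = None
    if total == 0:
        if count == 0:
            result = 0
    else:
        result = 1
    return result


print(inner_else(0, 5), outer_else(0, 5), outer_else(3, 0))
```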
Dr. Nada Mobark, 2025 282
NESTING SELECTORS - RUBY
▪ Ruby : If - elsif Statement:
▪ used to make more complex branching statements.
Dr. Nada Mobark, 2025 283
MULTIPLE-WAY SELECTION STATEMENTS
▪ Allow the selection of one of any number of statements or
statement groups
▪ an n-way branch to statements of code, where n is the number of
selectable segments
Dr. Nada Mobark, 2025 284
MULTIPLE-WAY SELECTION USING IF
▪ Multiple Selectors
can appear as direct
extensions to two-way
selectors, using else-
if clauses → more
flexible
Dr. Nada Mobark, 2025 285
MULTIPLE-WAY SELECTION
▪ Design Issues:
▪ What is the form and type of the control expression?
▪ How are the selectable segments specified?
▪ Is execution flow through the structure restricted to include just a
single selectable segment?
▪ How are case values specified?
▪ What is done about unrepresented expression values?
Dr. Nada Mobark, 2025 286
THE SWITCH STATEMENT
▪ C, C++, Java, and JavaScript.
▪ The control expression and the constant expressions are some
discrete type:
▪ Integer
▪ Characters
▪ enumeration types
▪ Testing for equality.
▪ default clause
▪ for unrepresented values
▪ Optional
Dr. Nada Mobark, 2025 287
THE SWITCH STATEMENT - C
▪ Design choices for C’s switch
statement
▪ Control expression can be only an
integer type
▪ Selectable segments can be
compound statements
▪ no implicit branch at the end of
selectable segments
▪ Any number of segments can be
executed in one execution of the
construct → increase in flexibility,
decrease in reliability
▪ The break statement (restricted
goto) should be used for exiting
Dr. Nada Mobark, 2025 288
THE SWITCH STATEMENT - C#
▪ Design choices for C#’s switch
statement differ from C in:
▪ disallows the implicit execution
of more than one segment
▪ Each selectable segment must end
with an unconditional branch
(goto or break)
▪ the control expression and the
case constants can be string
Dr. Nada Mobark, 2025 289
MULTIPLE-WAY SELECTION - RUBY
▪ Case statement
▪ Allows range checking
▪ Implicit branch at the end
of selectable segments
Dr. Nada Mobark, 2025 290
IMPLEMENTING MULTIPLE SELECTORS
▪ Approaches:
▪ Implement multiple conditional branches using hard-coded labels
          code to evaluate expression into t
          goto branches
label1:   code for statement1
          goto out
          . . .
labeln:   code for statementn
          goto out
default:  code for statementn+1
          goto out
branches: if t = constant_expression1 goto label1
          . . .
          if t = constant_expressionn goto labeln
          goto default
out:
Dr. Nada Mobark, 2025 291
IMPLEMENTING MULTIPLE SELECTORS
▪ Approaches:
▪ Store case values in a table and use a linear search of the table
▪ When there are more than about ten cases, a hash table of case
values can be used instead
▪ Use an array whose indices are the case values and values are the
case labels
▪ Useful when the number of cases is small and more than half of the
whole range of case values is represented
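The table-based approaches can be imitated at source level in Python. This is only an analogy to what a compiler generates, and all handler names are hypothetical:

```python
def handle_red():
    return "stop"


def handle_yellow():
    return "slow"


def handle_green():
    return "go"


def handle_default():
    return "unknown"


# Array-style jump table: the index is the case value, the element is the code to run.
JUMP_TABLE = [handle_red, handle_yellow, handle_green]


def select_by_index(t):
    if 0 <= t < len(JUMP_TABLE):
        return JUMP_TABLE[t]()
    return handle_default()   # unrepresented values go to the default segment


# Hash-table variant for case values that are sparse or not small integers.
HASH_TABLE = {"red": handle_red, "yellow": handle_yellow, "green": handle_green}


def select_by_key(key):
    return HASH_TABLE.get(key, handle_default)()


print(select_by_index(2), select_by_key("red"), select_by_index(7))
```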
Dr. Nada Mobark, 2025 292
8.3 ITERATIVE STATEMENTS
▪ The repeated execution of a statement or compound
statement is accomplished either by iteration or recursion
▪ The body of an iterative statement is the collection of statements
whose execution is controlled by the iteration statement.
▪ General design issues for iteration control statements:
1. How is iteration controlled?
2. Where is the control mechanism in the loop?
Dr. Nada Mobark, 2025 293
LOGICALLY-CONTROLLED LOOPS
▪ Repetition control is based on a Boolean expression
▪ Design issues:
▪ Pretest or posttest?
▪ pretest to mean that the test for loop completion occurs before the loop
body is executed
▪ posttest to mean that it occurs after the loop body is executed.
Dr. Nada Mobark, 2025 294
LOGICALLY-CONTROLLED LOOPS
▪ C and C++ have both pretest
and posttest forms
▪ the control expression can be
arithmetic
▪ it is legal to branch into the
body of a logically-controlled
loop
▪ Java:
▪ the control expression must
be Boolean
▪ the body can only be entered
at the beginning ; Java has no
goto
Dr. Nada Mobark, 2025 295
COUNTER-CONTROLLED LOOPS
▪ A counting iterative statement has a loop variable, and a
means of specifying the initial and terminal, and step size
values
▪ Design Issues:
▪ Should it be a special case of the logically controlled loop or a
separate statement?
▪ The loop variable :
▪ the type and scope
▪ Is it legal to be changed in the loop body, and if so, does the change
affect loop control?
▪ Should the loop parameters be evaluated only once, or once for every iteration?
▪ What is its value after loop termination?
Dr. Nada Mobark, 2025 296
COUNTER-CONTROLLED LOOPS – C-BASED
▪ Loop parameters:
▪ Initial, terminal, step-size specs of a loop variable
▪ Syntax:
for ([expr_1] ; [expr_2] ; [expr_3]) statement
▪ Semantics:
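One way to read the semantics is as an equivalent pretest loop. The sketch below rewrites `for (i = 0; i < limit; i++) total += i;` with a Python while loop; the function name and the loop body are assumptions made for illustration:

```python
def c_style_for_sum(limit):
    total = 0
    i = 0                # expr_1: evaluated once, before the loop starts
    while i < limit:     # expr_2: tested before every iteration
        total += i       # loop body
        i += 1           # expr_3: evaluated after every execution of the body
    return total


print(c_style_for_sum(5))
```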
Dr. Nada Mobark, 2025 297
COUNTER-CONTROLLED LOOPS - C
▪ C Design choices:
▪ There is no explicit loop variable
▪ The first expression is evaluated once, but the other two are
evaluated with each iteration
▪ If the second expression is absent, it is an infinite loop
▪ Everything can be changed in the loop
▪ It is legal to branch into the body of a for loop in C
▪ The expressions can be whole statements, or even statement
sequences, with the statements separated by commas
Dr. Nada Mobark, 2025 298
COUNTER-CONTROLLED LOOPS – C++
▪ C++ differs from C in two ways:
▪ The control expression can also be Boolean
▪ The initial expression can include variable definitions (scope is
from the definition to the end of the loop body)
▪ Java and C#
▪ Differs from C++ in that the control expression must be Boolean
Dr. Nada Mobark, 2025 299
USER-LOCATED LOOP CONTROL MECHANISMS
▪ Sometimes it is convenient for the programmers to decide a
location for loop control (other than top or bottom of the loop)
▪ Simple design for single loops (e.g., break)
▪ Design issues for nested loops
▪ Should the conditional be part of the exit?
▪ Should control be transferable out of more than one loop?
Dr. Nada Mobark, 2025 300
EXAMPLE - BREAK
▪ C, C++, Python, Ruby, C#, and Java have unconditional
unlabeled exits (break), last in Perl
▪ Exit the innermost loop
▪ Can be used to quit infinite loops
Dr. Nada Mobark, 2025 301
EXAMPLE
Dr. Nada Mobark, 2025 302
EXAMPLE - CONTINUE
▪ C, C++, and Python have an unlabeled control statement,
continue, that skips the remainder of the current iteration,
but does not exit the loop
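A small Python sketch that uses both statements in one loop (the values are arbitrary):

```python
printed = []
for i in range(5, 11):
    if i == 7:
        continue          # skip the rest of this iteration only
    if i * 5 >= 50:
        break             # leave the loop entirely
    printed.append(i)

print(printed)
```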
Dr. Nada Mobark, 2025 303
RUBY
i = 1
while true
  if i*5 >= 25
    break
  end
  puts i*5
  i += 1
end

for i in 5...11
  if i == 7 then
    next
  end
  puts i
end
Redo ??
Dr. Nada Mobark, 2025 304
ITERATION BASED ON DATA STRUCTURES
▪ The number of elements in a data structure controls loop
iteration
▪ Mechanism is a call to an iterator function that returns the next
element in some chosen order, if there is one; otherwise the loop
terminates
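A Python sketch of the mechanism with a hypothetical `Countdown` class. The for statement asks the object for an iterator and keeps requesting the next element until none is left:

```python
class Countdown:
    """Iterable that yields start, start - 1, ..., 1."""

    def __init__(self, start):
        self.start = start

    def __iter__(self):
        # A generator method: each yield hands the next element to the loop.
        current = self.start
        while current > 0:
            yield current
            current -= 1


collected = [n for n in Countdown(3)]

# The same protocol driven by hand.
it = iter(Countdown(2))
manual = [next(it), next(it), next(it, "done")]

print(collected, manual)
```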
Dr. Nada Mobark, 2025 305
EXAMPLE, C#
public interface IEnumerable<T>
▪ Implementing this interface allows an object to be the target of the
foreach statement.
Dr. Nada Mobark, 2025 306
SUMMARY
▪ Variety of statement-level structures
▪ Choice of control statements beyond selection and logical
pretest loops is a trade-off between language size and
writability
Dr. Nada Mobark, 2025 307
ANY Q??
CS341
CONCEPT OF
PROGRAMMING
LANGUAGES
Ch9 : Subprograms
Dr. Nada Mobark
GOAL
explore the
design/implementation of
subprograms and
parameter-passing
methods
Dr. Nada Mobark, 2025 310
TOPICS
9.1 Introduction
9.2 Fundamentals of Subprograms
9.3 Design Issues for Subprograms
9.5 Parameter-Passing Methods
9.6 Parameters That Are Subprograms
9.8 Design Issues for Functions
Dr. Nada Mobark, 2025 311
9.1 INTRODUCTION
▪ The first programmable computer, Babbage’s Analytical
Engine
▪ Designed in the 1830s and 1840s, but never completed
▪ Could reuse collections of instruction cards at several different
places in a program
Dr. Nada Mobark, 2025 312
9.1 INTRODUCTION
▪ Two fundamental abstraction facilities
▪ Process abstraction
▪ Discussed in this chapter
▪ Emphasized from early days
▪ Advantages:
▪ abstract details, improve logical structure, better readability
▪ Reuse code, save coding time, save memory!
▪ Data abstraction
▪ OOP - Emphasized in the 1980s
▪ Discussed in depth in Chapter 11
Dr. Nada Mobark, 2025 313
9.2 FUNDAMENTALS OF SUBPROGRAMS
▪ Subprograms are collections of
statements that define
parameterized computations
▪ Each subprogram has a single
entry point
▪ The calling program is
suspended during execution of
the called subprogram
▪ Control always returns to the
caller when the called
subprogram’s execution
terminates
Dr. Nada Mobark, 2025 314
PROCEDURES VS. FUNCTIONS
▪ Two categories of subprograms
▪ Function:
▪ returns value(s)
▪ They are expected to produce no side
effects
▪ no change to the parameters, nor any
variable outside the function body
▪ In practice, functions have side
effects
▪ Procedure:
▪ does not return value
▪ can produce results in the calling
program unit by two methods:
▪ through variables that are not formal
parameters but are still visible in both
the procedure and the calling
program unit,
▪ through formal parameters that allow
the transfer of data to the caller, those
parameters can be changed.
Dr. Nada Mobark, 2025 315
BASIC DEFINITIONS
▪ A subprogram header is the first
part of the definition
▪ including the name, the kind of
subprogram (procedure/function),
and the formal parameters
▪ The parameter profile (signature) is
the number, order, and types of its
parameters
▪ The protocol is a subprogram’s
parameter profile and, if it is a
function, its return type
▪ A subprogram definition describes
the interface to and the actions of
the subprogram abstraction
▪ A subprogram call is an explicit
request that the subprogram be
executed
Dr. Nada Mobark, 2025 316
EXAMPLES, C
▪ Function declarations in C and C++ are often called
prototypes
▪ A subprogram declaration provides the protocol, but not the body,
of the subprogram
▪ Such declarations are often placed in header files
Dr. Nada Mobark, 2025 317
METHODS IN RUBY
Dr. Nada Mobark, 2025 318
PARAMETERS
▪ Subprograms typically describe
computations that need data!
▪ There are two ways that a subprogram
can gain access to the data that it is to
process:
▪ through direct access to nonlocal variables
(declared elsewhere but visible in the
subprogram) → can reduce reliability
▪ through parameter passing → more flexible
▪ A formal parameter is a dummy variable listed
in the subprogram header and used in the
subprogram
▪ An actual parameter represents a value or
address used in the subprogram call
statement
Dr. Nada Mobark, 2025 319
ACTUAL/FORMAL PARAMETER
CORRESPONDENCE
▪ Positional
▪ The binding of actual parameters to formal
parameters is by position
▪ the first actual parameter is bound to the first
formal parameter and so forth
▪ Safe and effective
▪ Keyword
▪ The name of the formal parameter to which an
actual parameter is to be bound is specified
with the actual parameter
▪ Advantage: Parameters can appear in any order,
thereby avoiding parameter correspondence
errors
▪ Disadvantage: User must know the formal
parameter’s names
Dr. Nada Mobark, 2025 320
FORMAL PARAMETER, DEFAULT VALUES
▪ if no actual parameter is
passed, formal parameters
can have default values
▪ Allowed in certain languages
(e.g., C++, Python, PHP).
▪ In C++, default parameters
must appear last because
parameters are positionally
associated (no keyword
parameters)
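Python supports both default values and keyword parameters. The `describe` function below is a made-up example:

```python
def describe(name, greeting="Hello", punctuation="!"):
    """greeting and punctuation take their defaults when no actual parameter is passed."""
    return f"{greeting}, {name}{punctuation}"


positional = describe("Ada")                    # both defaults used
skip_middle = describe("Ada", punctuation="?")  # keyword skips over greeting
any_order = describe(punctuation=".", name="Ada", greeting="Hi")

print(positional, skip_middle, any_order)
```

Keyword parameters allow a default in the middle of the list to be skipped, which is not possible when parameters are associated only by position.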
Dr. Nada Mobark, 2025 321
VARIABLE PARAMETERS
▪ C# methods can accept a variable number of parameters as
long as they are of the same type—the corresponding formal
parameter is an array preceded by params
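Python has a comparable feature, sketched below: a formal parameter preceded by * collects any number of positional actual parameters into a tuple. Unlike C#, no common type is enforced. The function name is hypothetical:

```python
def total(*numbers):
    """numbers is a tuple holding all positional actual parameters."""
    return sum(numbers)


print(total(), total(1, 2, 3))
```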
Dr. Nada Mobark, 2025 322
9.3 DESIGN ISSUES FOR SUBPROGRAMS
▪ Are local variables static or dynamic?
▪ What parameter passing methods are provided?
▪ Are parameter types checked?
▪ Can subprograms be passed as parameters?
▪ Can subprograms be nested? If subprograms can be passed as
parameters and can be nested, what is the referencing
environment of a passed subprogram?
▪ Can subprograms be overloaded?
▪ Can subprograms be generic?
Dr. Nada Mobark, 2025 323
9.8 DESIGN ISSUES FOR FUNCTIONS
▪ Are side effects allowed?
▪ Parameters should always be in-mode to reduce side effect (like
Ada)
▪ What types of return values are allowed?
▪ Most imperative languages restrict the return types
▪ C allows any type except arrays and functions
▪ C++ is like C but also allows user-defined types
▪ Java and C# methods can return any type (but because methods are not
types, they cannot be returned)
▪ Python and Ruby treat methods as first-class objects, so they can be
returned, as well as any other class
▪ What is the max number of Returned Values?
▪ In most languages, only a single value can be returned from a
function
▪ Ruby can return many values by storing them in an array
▪ ML, F#, Python return multiple values as a tuple
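Both points can be sketched in Python with hypothetical helper functions: several values returned as a tuple, and a function returned as a value:

```python
def min_max(values):
    """Returns two values packed into one tuple."""
    return min(values), max(values)


def make_adder(n):
    """Returns a function, since functions are first-class objects."""
    def add(x):
        return x + n
    return add


low, high = min_max([4, 1, 9])   # the tuple is unpacked by the caller
add_five = make_adder(5)

print(low, high, add_five(10))
```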
Dr. Nada Mobark, 2025 324
9.6 PARAMETERS THAT ARE SUBPROGRAMS
▪ In some situations, it is convenient to be able to transmit
computations, rather than data, as parameters to
subprograms.
▪ pass subprogram names as parameters
▪ Example : when a subprogram must sample some
mathematical function
▪ integration by sampling a function at a number of points
EXAMPLE, PYTHON
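A minimal sketch of integration by sampling, assuming the midpoint rule. The function name `integrate` and the sample count are illustrative choices; the point is that the function to be sampled is passed as a parameter.

```python
import math

def integrate(f, a, b, n=1000):
    """Approximates the integral of f over [a, b] by sampling f at n midpoints."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

# The subprogram name itself is the actual parameter.
area_sin = integrate(math.sin, 0.0, math.pi)

# An anonymous function can be passed the same way.
area_square = integrate(lambda x: x * x, 0.0, 3.0)
```

The exact values are 2 and 9, so both results should be close to those.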
9.5 PARAMETER PASSING
▪ Parameter-passing methods are the ways in which parameters
are transmitted to and/or from called subprograms.
▪ In mode: the formal parameter can receive data from the
corresponding actual parameter
▪ Out mode: the formal parameter can transmit data to the actual parameter
▪ In-out mode: the formal parameter can do both
IMPLEMENTATION MODELS OF PARAMETER
PASSING
▪ Two ways to transfer data:
• Physically move a value
• Move an access path to a value
▪ A variety of models developed by language designers:
▪ Pass-by-value
▪ Pass-by-result
▪ Pass-by-value-result
▪ Pass-by-reference
▪ Pass-by-name
PASS-BY-VALUE (IN MODE)
▪ The value of the actual parameter
is used to initialize the
corresponding formal parameter
▪ Normally implemented by copying
▪ Physical move: additional storage is
required (stored twice) and the actual
move can be costly (for large
parameters)
▪ Passing an access path: not
recommended; the value must be
write-protected in the called subprogram,
and accesses cost more (indirect addressing)
PASS-BY-REFERENCE (IN-OUT MODE)
▪ Pass an access path
▪ Also called pass-by-sharing
▪ Advantage:
▪ Passing process is efficient (no
copying and no duplicated storage)
▪ Disadvantages
▪ Slower accesses (compared to pass-
by-value) to formal parameters
▪ Potential for unwanted side effects
(collisions)
▪ Unwanted aliases (access broadened)
fun(total, total);
fun(list[i], list[j]); // aliases when i == j
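A Python sketch of the aliasing problem. Python cannot pass a plain variable by reference, so a one-element list stands in for the shared storage cell; that stand-in and the body of `fun` are assumptions.

```python
def fun(first, second):
    """Both formals may name the same storage if the caller passes it twice."""
    first[0] += 1
    second[0] += 1

total = [10]          # one-element list used as a mutable cell
fun(total, total)     # first and second are aliases of the same cell

separate_a, separate_b = [10], [10]
fun(separate_a, separate_b)   # no aliasing: each cell is incremented once
```

With the aliased call the single cell is incremented twice, which a reader of `fun` alone would not expect.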
PASS-BY-REFERENCE (IN-OUT MODE)
▪ Another issue:
▪ Can the passed reference be changed in the called
subprogram?
▪ In C, it is possible
▪ But in some other languages, such as Pascal and C++, formal
parameters that are addresses are implicitly dereferenced, which
prevents such changes
PASS-BY-RESULT (OUT MODE)
▪ When a parameter is passed by
result:
▪ no value is transmitted to the
subprogram
▪ the corresponding formal
parameter acts as a local variable
▪ its value is transmitted to caller’s
actual parameter when control is
returned to the caller, by physical
move
▪ Requires an extra storage location and
a copy operation
PASS-BY-VALUE-RESULT (IN-OUT MODE)
▪ A combination of pass-by-value and pass-by-result
▪ Sometimes called pass-by-copy
▪ Formal parameters have local storage
▪ Disadvantages:
▪ Those of pass-by-result
▪ Those of pass-by-value
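A sketch that simulates pass-by-value-result next to pass-by-reference, to show that the two can give different answers when the same variable is passed twice. One-element lists stand in for variables, and copying results back left to right is an assumption; a real language defines (or leaves undefined) that order.

```python
def body(a, b):
    """The subprogram body, written against mutable cells."""
    a[0] = a[0] + 1
    b[0] = b[0] * 2

def call_by_reference(cell_x, cell_y):
    body(cell_x, cell_y)                 # formals share the caller's cells

def call_by_value_result(cell_x, cell_y):
    local_a = [cell_x[0]]                # copy in: formals get local storage
    local_b = [cell_y[0]]
    body(local_a, local_b)
    cell_x[0] = local_a[0]               # copy out, left to right (assumed order)
    cell_y[0] = local_b[0]

x = [5]
call_by_reference(x, x)                  # 5 -> 6 -> 12 on the one shared cell
reference_result = x[0]

y = [5]
call_by_value_result(y, y)               # locals become 6 and 10; last copy-out wins
value_result_result = y[0]
```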
IMPLEMENTING PARAMETER-PASSING METHODS
• In most languages, parameter communication takes
place through the run-time stack
• Pass-by-reference is the simplest to implement; only an
address is placed on the stack
Function header: void sub(int a, int b, int c, int d)
Function call in main: sub(w, x, y, z)
(pass w by value, x by result, y by value-result, z by reference)
PARAMETER PASSING METHODS OF MAJOR
LANGUAGES
▪ C
▪ Pass-by-value
▪ Pass-by-reference is achieved by using pointers as parameters
▪ C++
▪ A special pointer type called reference type for pass-by-reference
▪ Java
▪ All non-object parameters are passed by value
▪ no method can change any of these parameters
▪ Object parameters are, in effect, passed by reference (the reference
itself is passed by value)
▪ C#
▪ Default method: pass-by-value
▪ Pass-by-reference is specified by preceding both a formal parameter
and its actual parameter with ref
▪ Python and Ruby
▪ use pass-by-assignment (all data values are objects); the actual is
assigned to the formal
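A short Python sketch of pass-by-assignment. The function names are hypothetical; the behavior shown (rebinding a formal does not affect the caller, while mutating a shared object does) is standard Python.

```python
def rebind(n):
    n = n + 1            # binds the local name n to a new object

def mutate(items):
    items.append(99)     # changes the object the caller also refers to

def rebind_list(items):
    items = [0]          # rebinding again: the caller's list is untouched

count = 10
rebind(count)

data = [1, 2]
mutate(data)
rebind_list(data)
```

After these calls `count` is still 10, and `data` shows only the effect of `mutate`.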
DESIGN CONSIDERATIONS FOR PARAMETER
PASSING
▪ Two important considerations
▪ Efficiency
▪ One-way or two-way data transfer
▪ But the above considerations are in conflict
▪ Good programming practice suggests limited access to variables, which
means one-way whenever possible
▪ But pass-by-reference is more efficient to pass structures of
significant size
TYPE CHECKING PARAMETERS
▪ The types of actual parameters are checked for consistency
with the types of the corresponding formal parameters.
▪ Considered very important for reliability
▪ FORTRAN 77 and original C: none
▪ Pascal and Java: it is always required
▪ ANSI C and C++: choice is made by the user / Prototypes
▪ Relatively new languages Perl, JavaScript, and PHP do not
require type checking
▪ In Python and Ruby, variables do not have types, so parameter
type checking is not possible
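A small Python sketch of what the absence of parameter type checking means in practice: a mismatched actual is accepted at the call and fails only when an operation on it is attempted. The functions `half` and `checked_half` are hypothetical; the manual check is one possible workaround, not a language feature.

```python
def half(n):
    return n / 2          # no declared parameter type to check against

def checked_half(n):
    # Hypothetical manual check: the programmer, not the language, enforces it.
    if not isinstance(n, (int, float)):
        raise TypeError("checked_half expects a number")
    return n / 2

try:
    half("ten")           # accepted as an actual; fails inside the body
    outcome = "no error"
except TypeError:
    outcome = "TypeError raised at run time"
```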
SUMMARY
▪ A subprogram definition describes the actions represented
by the subprogram
▪ Subprograms can be either functions or procedures
▪ Local variables in subprograms can be stack-dynamic or
static
▪ Three models of parameter passing: in mode, out mode, and
in-out mode
▪ (extra) Subprograms can be overloaded
CS341
CONCEPT OF
PROGRAMMING
LANGUAGES
Ch10: Implementing subprograms
Dr. Nada Mobark
GOAL
Explore the implementation
of subprograms
TOPICS
10.1 The General Semantics of Calls and Returns
10.2 Implementing “Simple” Subprograms
10.3 Implementing Subprograms with Stack-Dynamic Local
Variables
THE GENERAL SEMANTICS OF CALLS AND
RETURNS
▪ The subprogram call and return operations of a language are
together called its subprogram linkage
THE GENERAL SEMANTICS OF CALLS AND
RETURNS
▪ General semantics of calls to a subprogram
▪ Parameter passing methods
▪ Stack-dynamic allocation of local variables
▪ Save the execution status of calling program
▪ Transfer of control and arrange for the return
▪ If subprogram nesting is supported, access to nonlocal variables
must be arranged
▪ General semantics of subprogram returns:
▪ Out mode and in-out mode parameters must have their values
returned
▪ Deallocation of stack-dynamic locals
▪ Restore the execution status
▪ Return control to the caller
10.2 IMPLEMENTING “SIMPLE” SUBPROGRAMS
▪ “Simple” subprograms do not support recursion
▪ Call Semantics:
- Save the execution status of the caller
- Pass the parameters
- Pass the return address to the called
- Transfer control to the called
▪ Return Semantics:
▪ If pass-by-value-result or out mode parameters are used, move the
current values of those parameters to their corresponding actual
parameters
▪ If it is a function, move the functional value to a place the caller can
get it
▪ Restore the execution status of the caller
▪ Transfer control back to the caller
IMPLEMENTING “SIMPLE” SUBPROGRAMS
▪ Required storage:
▪ Status information about the caller
▪ parameters,
▪ return address,
▪ return value for functions,
▪ temporaries
IMPLEMENTING “SIMPLE” SUBPROGRAMS
▪ Two separate parts:
▪ the actual code and
▪ the non-code part (local variables and data that can change)
▪ The format, or layout, of the non-code part of an executing
subprogram is called an activation record
▪ Data are relevant only during activation/execution
▪ No recursion → one activation record
▪ Fixed size → statically allocated
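A sketch of why a single statically allocated activation record rules out recursion. The dictionary `FACT_RECORD` is a hypothetical stand-in for that one record; Python itself uses a stack, so the sharing is simulated by keeping all of the subprogram's data in the dictionary.

```python
# The one and only activation record for the subprogram (simulated).
FACT_RECORD = {"n": None, "result": None}

def fact_static(n):
    FACT_RECORD["n"] = n                        # parameter stored in the shared record
    if FACT_RECORD["n"] <= 1:
        FACT_RECORD["result"] = 1
    else:
        fact_static(FACT_RECORD["n"] - 1)       # inner activation overwrites "n"
        FACT_RECORD["result"] = FACT_RECORD["n"] * FACT_RECORD["result"]
    return FACT_RECORD["result"]

wrong = fact_static(3)    # the correct factorial is 6
```

Each inner activation overwrites the caller's `n`, so the multiplication after the return uses the wrong value. One record per activation, as in the stack-dynamic scheme, avoids this.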
EXAMPLE
▪ Three subprograms
▪ Separated code and data segments
▪ The data segment (activation record) can be attached to the code
▪ May be compiled separately, put
together with a linker
▪ The linker does:
▪ Find/Load main and all referenced
subprograms code in memory,
including library calls.
▪ Load all activation records in memory
▪ Patch in the target address of all calls to
subprograms.
10.3 USING STACK-DYNAMIC LOCAL
VARIABLES
▪ More complex activation record, because:
▪ The compiler must generate code to cause implicit allocation and
deallocation of local variables
▪ Recursion must be supported
▪ adds the possibility of multiple simultaneous activations of a
subprogram
ACTIVATION RECORD
▪ An activation record instance is dynamically created when a
subprogram is called
▪ Instances reside on the run-time stack
▪ Last called, first completed
▪ The return address points to the instruction following the call.
▪ The dynamic link points to the base of the activation record instance
of the caller
▪ In static-scoped languages: used to trace back information after a run-time error.
▪ In dynamic-scoped languages: used to access non-local variables
▪ Return address, dynamic link, and parameters are placed first, by the
caller.
▪ Local variables are allocated and initialized by the called subprogram →
placed last
AN EXAMPLE: C FUNCTION
void sub(float total, int part)
{
int list[5];
float sum;
…
}
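A sketch of the activation record layout this function would get under the ordering described earlier (return address, dynamic link, parameters, then locals). The helper `layout` is hypothetical and treats every entry as one cell, which is a simplification.

```python
def layout(params, local_vars):
    """Lists activation record entries from the base upward.

    local_vars is a list of (name, element_count) pairs; arrays get one
    entry per element.
    """
    entries = ["return address", "dynamic link"]
    entries += [f"parameter {p}" for p in params]
    for name, count in local_vars:
        if count == 1:
            entries.append(f"local {name}")
        else:
            entries += [f"local {name}[{i}]" for i in range(count)]
    return entries

# void sub(float total, int part) { int list[5]; float sum; ... }
sub_record = layout(["total", "part"], [("list", 5), ("sum", 1)])
```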
ACTIVATION RECORD
▪ The activation record format is static, but its size may be
dynamic
▪ Local data may not have fixed size
▪ The Environment Pointer (EP) must be maintained by the run-
time system.
▪ It always points at the base of the activation record instance of the
currently executing program unit
▪ Used as the base of the offset addressing of the data contents of the
activation record
REVISED SEMANTIC CALL/RETURN ACTIONS
▪ Caller Actions:
▪ Create an activation record instance
▪ Save the execution status of the current program unit
▪ Compute and pass the parameters
▪ Pass the return address to the called
▪ Transfer control to the called
▪ Prologue (on entry) actions of the called:
▪ Save the old EP as the dynamic link in the activation record
▪ Set the EP to point to the base of the new activation record instance
▪ Allocate local variables
REVISED SEMANTIC CALL/RETURN ACTIONS
▪ Epilogue (at the end of call) actions of the called:
▪ If there are pass-by-value-result or out-mode parameters, the
current values of those parameters are moved to the
corresponding actual parameters
▪ If the subprogram is a function, its value is moved to a place
accessible to the caller
▪ Restore the stack pointer by setting it to the value of the current
EP-1 and set the EP to the old dynamic link
▪ Restore the execution status of the caller
▪ Transfer control back to the caller
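The call, prologue, and epilogue actions can be sketched with a list acting as the run-time stack. The class `RuntimeStack`, the one-cell-per-entry layout, and the string return addresses are all assumptions made for illustration; saving the execution status and moving result values are left out.

```python
class RuntimeStack:
    def __init__(self):
        self.cells = []     # index 0 is the bottom of the stack
        self.ep = None      # environment pointer: base of the current record

    def call(self, return_address, params, n_locals):
        base = len(self.cells)
        self.cells.append(return_address)       # caller: pass the return address
        self.cells.append(self.ep)              # prologue: old EP saved as dynamic link
        self.cells.extend(params)               # caller: pass the parameters
        self.ep = base                          # prologue: EP -> base of new record
        self.cells.extend([None] * n_locals)    # prologue: allocate local variables

    def ret(self):
        return_address = self.cells[self.ep]
        dynamic_link = self.cells[self.ep + 1]
        del self.cells[self.ep:]                # epilogue: stack top becomes EP - 1
        self.ep = dynamic_link                  # epilogue: EP <- old dynamic link
        return return_address

machine = RuntimeStack()
machine.call("exit", [], 1)          # main, with one local variable
machine.call("main+1", [3.5], 2)     # main calls a subprogram with one parameter
ep_in_callee = machine.ep
resume_at = machine.ret()            # control goes back to the caller
```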
AN EXAMPLE WITHOUT RECURSION
void fun1(float r) {
int s, t;
...
fun2(s);
...
}
void fun2(int x) {
int y;
...
fun3(y);
...
}
void fun3(int q) {
...
}
void main() {
float p;
...
fun1(p);
...
}
AN EXAMPLE WITHOUT RECURSION
main calls fun1
fun1 calls fun2
fun2 calls fun3
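A sketch of the stack after this call sequence, with each record holding a dynamic link to its caller's record. Records are simplified to dictionaries and links to list indices; both are assumptions. Following the links from the top record back to main reproduces the sequence of calls in reverse.

```python
records = []    # run-time stack of simplified activation record instances

def push_record(name):
    """The new record's dynamic link is the index of the caller's record."""
    link = len(records) - 1 if records else None
    records.append({"name": name, "dynamic_link": link})

for unit in ["main", "fun1", "fun2", "fun3"]:
    push_record(unit)

def walk_links(stack):
    """Follows dynamic links from the top record down to main."""
    names = []
    index = len(stack) - 1
    while index is not None:
        names.append(stack[index]["name"])
        index = stack[index]["dynamic_link"]
    return names

callers = walk_links(records)
```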
DYNAMIC CHAIN AND LOCAL OFFSET
▪ The collection of dynamic links in the stack at a given time is
called the dynamic chain, or call chain
▪ Local variables can be accessed by their offset from the
beginning of the activation record, whose address is in the EP.
This offset is called the local_offset
▪ The local_offset of a local variable can be determined by the
compiler at compile time
▪ Based on order, type, and size
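A sketch of how local_offset values could be computed, assuming every entry occupies one cell and the record starts with the return address and the dynamic link. The helper `local_offsets` is hypothetical; a real compiler would also account for the size of each type.

```python
def local_offsets(params, local_names):
    """Offsets from the record base: 2 fixed cells, then parameters, then locals."""
    first_local = 2 + len(params)
    return {name: first_local + i for i, name in enumerate(local_names)}

# void fun1(float r) { int s, t; ... }
fun1_offsets = local_offsets(["r"], ["s", "t"])

# void fun2(int x) { int y; ... }
fun2_offsets = local_offsets(["x"], ["y"])
```

At run time a local variable's address is then the EP plus its local_offset.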
AN EXAMPLE WITH RECURSION
▪ The activation record used in the previous example supports
recursion
int factorial (int n) {
<-----------------------------1
if (n <= 1) return 1;
else return (n * factorial(n - 1));
<-----------------------------2
}
void main() {
int value;
value = factorial(3);
<-----------------------------3
}
STACKS FOR CALLS TO FACTORIAL
▪ Each call results in a
fresh copy of the
activation record
placed on the stack.
▪ The functional value
is undefined until the
call returns
STACKS FOR RETURNS FROM FACTORIAL
▪ Functional value is
returned before the
call ends.
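A sketch that replays the calls and returns of factorial with an explicit stack, one record per activation. The record fields and the function name `factorial_with_stack` are assumptions; the functional value stays undefined (`None`) until the matching return.

```python
def factorial_with_stack(n):
    stack = []
    # Call phase: each call pushes a fresh record with an undefined value.
    while n > 1:
        stack.append({"n": n, "functional_value": None})
        n -= 1
    stack.append({"n": n, "functional_value": None})    # the base-case activation
    depth = len(stack)

    # Return phase: records are popped in last-called, first-completed order.
    value = 1
    while stack:
        record = stack.pop()
        if record["n"] > 1:
            value = record["n"] * value
        record["functional_value"] = value              # defined only at return
    return value, depth

result, max_depth = factorial_with_stack(3)
```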
SUMMARY
▪ Subprogram linkage semantics require many actions by the
implementation
▪ Simple subprograms have relatively basic actions
▪ Subprograms with stack-dynamic local variables are more complex to implement
▪ Subprograms with stack-dynamic local variables have two
components
▪ actual code
▪ activation record
▪ Activation record instances contain formal parameters and
local variables among other things