Design and Implementation of
Programming Languages
Imam Khomeini International University
1 شکوه کرمانشاهانی
Chapter Three
Ref. Book: Concepts of Programming Languages by Sebesta
3 Discussion of this chapter
Defining syntax and semantics
The most common method of describing syntax: context-free
grammars (also known as Backus-Naur Form)
Derivations
Parse trees
Ambiguity
Descriptions of operator precedence and associativity
Extended Backus-Naur Form
Attribute grammars, which can be used to describe both the
syntax and static semantics of programming languages
A brief discussion of three formal methods of describing semantics
4 Description of a programming
language is crucial
A concise yet understandable description of a PL is
difficult but essential to the language’s success: a tradeoff.
Some challenges:
Diversity of the people who must understand the
description
Programming language implementors obviously must be
able to determine how the expressions, statements, and
program units of a language are formed, and also their
intended effect when executed. The difficulty of the
implementors’ job is, in part, determined by the
completeness and precision of the language description.
Language users must be able to determine how to
encode software solutions by referring to a language
reference manual.
5 Syntax & Semantics
description
The syntax of a programming language is the form of its
expressions, statements, and program units.
Its semantics is the meaning of those expressions,
statements, and program units.
Ex.
The syntax of a Java while statement is
while (boolean_expr) statement
The semantics of this statement form is that when the
current value of the Boolean expression is true, the
embedded statement is executed. Then control implicitly
returns to the Boolean expression to repeat the process. If
the Boolean expression is false, control transfers to the
statement following the while construct.
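The semantics described above can be sketched as a tiny executable definition. This is only an illustration in Python, not Java's actual implementation: the function name exec_while, the state dictionary, and the decrement helper are all hypothetical names introduced for the example.

```python
def exec_while(condition, body, state):
    """Sketch of the meaning of `while (condition) body` over a state dict."""
    if condition(state):
        # The Boolean expression is true: execute the embedded statement ...
        body(state)
        # ... then control returns to the Boolean expression to repeat.
        return exec_while(condition, body, state)
    # The Boolean expression is false: control moves past the while construct.
    return state


def decrement(state):
    # Hypothetical loop body: count n down and record how many times we ran.
    state["n"] -= 1
    state["steps"] += 1


final = exec_while(lambda s: s["n"] > 0, decrement, {"n": 3, "steps": 0})
print(final)
```

The recursion mirrors the wording of the slide: test, execute, return to the test. A real implementation would use a jump rather than a recursive call.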
6 Syntax & Semantics
description (2)
Although they are often separated for discussion
purposes, syntax and semantics are closely related.
In a well-designed programming language, semantics
should follow directly from syntax; that is, the
appearance of a statement should strongly suggest
what the statement is meant to accomplish.
Describing syntax is easier than describing semantics:
A concise and universally accepted notation is available
for syntax description, but none has yet been developed
for semantics.
Semantics is also inherently more complicated than syntax.
7 Syntax
A language, whether natural (such as English) or
artificial (such as Java), is a set of strings of characters
from some alphabet. The strings of a language are
called sentences or statements.
The syntax rules of a language specify which strings of
characters from the language’s alphabet are in the
language.
Formal descriptions of the syntax of programming
languages, for simplicity’s sake, often do not include
descriptions of the lowest-level syntactic units. These
small units are called lexemes.
8 Syntax; Lexeme
The lexemes of a programming language include its
numeric literals, operators, and special words, among
others.
One can think of programs as strings of lexemes rather
than of characters.
The description of lexemes can be given by a lexical
specification, which is usually separate from the
syntactic description of the language.
Lexemes are partitioned into groups called tokens.
9 Syntax; Token
A token of a language is a category of its lexemes
For example, the names of variables, methods,
classes, and so forth in a programming language form
a group called identifiers.
An identifier is a token that can have lexemes, or
instances, such as sum and total.
In some cases, a token has only a single possible
lexeme.
For example, the token for the arithmetic operator
symbol + has just one possible lexeme.
10 Syntax; Lexeme & Token
Example
index = 2 * count + 17;
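The slide's lexeme/token table is not preserved in this text, so the following is a minimal sketch of how the statement above splits into lexemes and how each lexeme is assigned to a token. The token names (identifier, equal_sign, int_literal, mult_op, plus_op, semicolon) and the regular expressions are assumptions chosen for illustration.

```python
import re

# Each token (a category) is described by a pattern for its lexemes.
TOKEN_SPEC = [
    ("int_literal", r"\d+"),
    ("identifier", r"[A-Za-z_]\w*"),
    ("equal_sign", r"="),
    ("mult_op", r"\*"),
    ("plus_op", r"\+"),
    ("semicolon", r";"),
    ("skip", r"\s+"),  # whitespace separates lexemes but is not a token
]
PATTERN = re.compile("|".join(f"(?P<{name}>{regex})" for name, regex in TOKEN_SPEC))


def tokenize(text):
    """Return a list of (lexeme, token) pairs for the given source text."""
    pairs = []
    pos = 0
    while pos < len(text):
        match = PATTERN.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character {text[pos]!r} at position {pos}")
        if match.lastgroup != "skip":
            pairs.append((match.group(), match.lastgroup))
        pos = match.end()
    return pairs


pairs = tokenize("index = 2 * count + 17;")
for lexeme, token in pairs:
    print(f"{lexeme:<6} {token}")
```

Note that index and count are two different lexemes of the same token, while the token plus_op has only the single lexeme +.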
11 Two Ways of Defining a Language:
Recognition / Generation
Language Recognizer
Consider a language L over an alphabet Σ.
We need to construct a mechanism R, called a
recognition device,
capable of reading strings of characters from the alphabet Σ.
R indicates whether a given input string is or is not in L:
R either accepts or rejects the given string.
Then R is a description of L.
Because most useful languages are, for all practical
purposes, infinite, this might seem like a lengthy and
ineffective process.
However, the syntax analyzer (parser) of a compiler is a
recognizer.
12 Two Ways of Defining a Language:
Recognition / Generation (2)
Language Generator
A device that can be used to generate the sentences
of a language.
We can think of the generator as having a button that
produces a sentence of the language every time it is
pushed.
Because the sentence produced on each push is
unpredictable, a generator seems to be a device of limited
usefulness as a language descriptor.
There is a close connection between formal generation and
recognition devices for the same language.
13 Formal Methods of Describing
Syntax
Context-Free Grammar
Backus-Naur Form (BNF)
Derivation
Parse Tree
Ambiguity
Operator Precedence
Associativity
An Unambiguous Grammar for if-else (page 227 pdf)
Extended BNF
14 Design / Definition of a Language
with BNF
A rule, or production, has a left-hand side (LHS), the abstraction
being defined, and a right-hand side (RHS), its definition.
A metalanguage is a language that is used to describe
another language.
BNF is a metalanguage for programming languages.
BNF uses abstractions for syntactic structures.
15 Formal Methods of Describing
Syntax
6 Rule
16 Formal Methods of Describing
Syntax
generated by the leftmost derivation:
17 BNF: PARSE TREE
18 BNF: Ambiguity
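A grammar is ambiguous if some sentence it generates has two or more distinct parse trees. The slide's figure is not preserved here, so as a minimal sketch, assume the ambiguous expression grammar <expr> → <expr> + <expr> | <expr> * <expr> | id. The hypothetical function count_parse_trees below counts how many parse trees this grammar allows for a token sequence.

```python
from functools import lru_cache


def count_parse_trees(tokens):
    """Count parse trees under <expr> -> <expr> + <expr> | <expr> * <expr> | id."""
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def count(lo, hi):
        # Number of ways tokens[lo:hi] can be derived from <expr>.
        if hi - lo == 1:
            return 1 if tokens[lo] == "id" else 0
        total = 0
        # Try every operator as the root of the tree for this span.
        for k in range(lo + 1, hi - 1):
            if tokens[k] in ("+", "*"):
                total += count(lo, k) * count(k + 1, hi)
        return total

    return count(0, len(tokens))


two_operators = count_parse_trees(["id", "+", "id", "*", "id"])
print(two_operators)
```

A count greater than one for any sentence is enough to show that the grammar is ambiguous: here either + or * can serve as the root operator.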
19 BNF: Ambiguity
Leftmost derivation
Rightmost derivation
Solution:
Make a decision:
Right-recursive rules
or
left-recursive rules
20 BNF: Ambiguity
Solution:
Make a decision:
Right Recursive Rules Using More non-terminals
Or
Left Recursive Rules
21 BNF: NO Ambiguity
22 BNF: Operator Precedence
23 BNF: Associativity of Operators
24 BNF: Associativity of Operators
Left recursion: left associativity
25 Design / Definition of a Language
with BNF
BNF is a metalanguage for programming languages.
BNF uses abstractions for syntactic structures.
How to define a language?
Unambiguous rules:
More nonterminals / more rules
Each rule: right recursion or left recursion
Associativity:
Decide for each rule whether it is right recursive or left recursive
Right recursive: right associativity
Left recursive: left associativity
Operator precedence: ordering of the rules
The deeper an operator appears in the parse tree, the higher its precedence
Other ….Continue
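The points above can be sketched with a small parser that builds parse trees as nested tuples. The grammar is an assumption made for illustration: <expr> → <expr> + <term> | <term>, <term> → <term> * <factor> | <factor>, <factor> → <operand> ** <factor> | <operand>, where an operand is any alphabetic name. The ** operator is included only to show a right-recursive rule. Because a recursive-descent parser cannot call itself on a left-recursive rule directly, the left-recursive rules are implemented as loops that group to the left.

```python
def parse(tokens):
    """Parse a token list into a nested-tuple tree (operator, left, right)."""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        token = tokens[pos]
        pos += 1
        return token

    def expr():
        # <expr> -> <expr> + <term> | <term>   (left recursive: + groups left)
        tree = term()
        while peek() == "+":
            advance()
            tree = ("+", tree, term())
        return tree

    def term():
        # <term> -> <term> * <factor> | <factor>   (one level deeper than +)
        tree = factor()
        while peek() == "*":
            advance()
            tree = ("*", tree, factor())
        return tree

    def factor():
        # <factor> -> <operand> ** <factor> | <operand>   (right recursive)
        left = operand()
        if peek() == "**":
            advance()
            return ("**", left, factor())
        return left

    def operand():
        token = peek()
        if token is None or not token.isalpha():
            raise SyntaxError(f"expected an operand, found {token!r}")
        return advance()

    tree = expr()
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token {tokens[pos]!r}")
    return tree


print(parse("A + B + C".split()))
print(parse("A + B * C".split()))
print(parse("A ** B ** C".split()))
```

In the resulting trees, * sits deeper than + (higher precedence), repeated + nests on the left (left associativity), and repeated ** nests on the right (right associativity).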
26 BNF: Continue
A Grammar for a Small Language
27 BNF: Continue
A Derivation of Grammar for a Small Language
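The slide's figure is not preserved in this text. As a minimal sketch, assume a small-language grammar in the style of the reference book (a program is begin, a statement list, end; statements are assignments over the variables A, B, C). The hypothetical function leftmost_derivation replaces the leftmost nonterminal at each step, using a list of rule choices supplied by the caller.

```python
# Assumed grammar: each nonterminal maps to its list of alternative RHSs.
GRAMMAR = {
    "<program>": [["begin", "<stmt_list>", "end"]],
    "<stmt_list>": [["<stmt>"], ["<stmt>", ";", "<stmt_list>"]],
    "<stmt>": [["<var>", "=", "<expression>"]],
    "<var>": [["A"], ["B"], ["C"]],
    "<expression>": [["<var>", "+", "<var>"], ["<var>", "-", "<var>"], ["<var>"]],
}


def leftmost_derivation(start, choices):
    """Return every sentential form of a leftmost derivation.

    choices[i] is the index of the alternative applied at step i.
    """
    form = [start]
    steps = [" ".join(form)]
    for choice in choices:
        # Locate the leftmost nonterminal in the current sentential form.
        index = next(i for i, symbol in enumerate(form) if symbol in GRAMMAR)
        # Replace it with the chosen right-hand side.
        form[index:index + 1] = GRAMMAR[form[index]][choice]
        steps.append(" ".join(form))
    return steps


steps = leftmost_derivation("<program>", [0, 1, 0, 0, 0, 1, 2, 0, 0, 1, 2, 2])
for step in steps:
    print("=>", step)
```

Each printed line is a sentential form; the last one contains only terminals and is therefore a sentence of the language.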
28 BNF: Continue
A Grammar for Identifier Declaration in C
<declaration> → <id_type> <spaces> <ident_list> ;
<ident_list> → identifier
| identifier , <ident_list>
<spaces> → space
| space <spaces>
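A recognizer for this grammar can be sketched directly from the rules. Two things are assumptions, because the slide does not define them: the type names accepted for the type nonterminal (int, float, char) and the C-style shape of an identifier. The sketch follows the grammar literally, so no spaces are accepted around commas.

```python
import re

ID_TYPES = ("int", "float", "char")  # assumption: the type names are not given on the slide
IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")  # assumption: C-style identifiers


def ident_list(text, pos):
    """Match an identifier list starting at pos; return the end position or None."""
    match = IDENTIFIER.match(text, pos)
    if match is None:
        return None
    pos = match.end()
    if text.startswith(",", pos):
        # Second alternative: identifier , identifier-list (right recursion).
        return ident_list(text, pos + 1)
    return pos


def recognize_declaration(text):
    """Accept (True) or reject (False) text as a declaration."""
    # Type name.
    for type_name in ID_TYPES:
        if text.startswith(type_name):
            pos = len(type_name)
            break
    else:
        return False
    # One or more spaces, as required by the right-recursive spaces rule.
    start = pos
    while pos < len(text) and text[pos] == " ":
        pos += 1
    if pos == start:
        return False
    # Identifier list followed by the closing semicolon.
    pos = ident_list(text, pos)
    return pos is not None and text[pos:] == ";"


print(recognize_declaration("int a,b,c;"))
print(recognize_declaration("int a,;"))
```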
29 BNF: Continue
if Statement
Ambiguity
30 BNF: Continue
An Unambiguous Grammar for if-else
More nonterminals
There is just one possible parse tree, using this
grammar, for the following
sentential form:
31 Extended BNF : EBNF
32 Extended BNF : EBNF
33 BNF & EBNF
last slide
34 Attribute Grammars
An attribute grammar is a device used to describe
more of the structure of a programming language
than can be described with a context-free grammar.
An attribute grammar is an extension to a context-free grammar.
The extension allows certain language rules to be conveniently
described, such as type compatibility,
along with other static semantics rules.
35 Static Semantics
There are some characteristics of programming
languages that are difficult to describe with BNF.
Consider type compatibility rules:
In Java, for example, a floating-point value cannot be
assigned to an integer type variable, although the
opposite is legal.
Although this restriction can be specified in BNF, it
requires additional nonterminal symbols and rules. If all
of the typing rules of Java were specified in BNF, the
grammar would become too large to be useful,
because the size of the grammar determines the size
of the syntax analyzer.
36 Static Semantics (2)
Some characteristics of programming languages are
impossible to describe with BNF:
Consider the common rule that all variables must be
declared before they are referenced. It has been proven
that this rule cannot be specified in BNF.
These problems exemplify the categories of language rules
called static semantics rules.
The static semantics of a language is only indirectly
related to the meaning of programs during execution;
rather, it has to do with the legal forms of programs
(syntax rather than semantics).
Static semantics is so named because the analysis required
to check these specifications can be done at compile time.
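The declared-before-referenced rule is a typical static semantics check: it can be verified by scanning the program text, without executing it. A minimal sketch follows, assuming a hypothetical program representation in which each statement is reduced to a pair of the form ("declare", name) or ("use", name).

```python
def undeclared_uses(statements):
    """Return (line_number, name) for every use that precedes its declaration."""
    declared = set()
    errors = []
    for line_number, (kind, name) in enumerate(statements, start=1):
        if kind == "declare":
            declared.add(name)
        elif kind == "use" and name not in declared:
            errors.append((line_number, name))
    return errors


# Hypothetical program: total is referenced on line 3 but declared on line 4.
program = [
    ("declare", "count"),
    ("use", "count"),
    ("use", "total"),
    ("declare", "total"),
]
errors = undeclared_uses(program)
print(errors)
```

The check needs a memory of earlier declarations (the declared set), which is the kind of context a context-free grammar has no way to carry.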
37 Static Semantics (3)
Because of the problems of describing static
semantics with BNF, a variety of more powerful
mechanisms has been devised for that task.
One such mechanism, attribute grammars, was
designed by Knuth (1968) to describe
both the syntax and the static semantics of programs.
38 Attribute grammars
Attribute grammars are a formal approach both to
describing and checking the correctness of the static
semantics rules of a program.
Attribute grammars are context-free grammars to
which have been added:
Attributes
Attribute computation functions (semantic functions)
Predicate functions
39 Attribute Grammars (2)
Attributes, which are associated with grammar
symbols (the terminal and nonterminal symbols), are
similar to variables in the sense that they can have
values assigned to them.
Attribute computation functions, sometimes called
semantic functions, are associated with grammar
rules.
They are used to specify how attribute values are
computed.
Predicate functions, which state the static semantic
rules of the language, are associated with grammar
rules.
40 Attribute Grammars (3)
Associated with each grammar symbol X is a set of
attributes A(X):
Synthesized attributes, S(X):
used to pass semantic information up a parse tree
Inherited attributes, I(X):
used to pass semantic information down and across a tree
Associated with each grammar rule are:
A set of semantic functions
A possibly empty set of predicate functions
Both are defined over the attributes of the symbols in the grammar rule.
41 Attribute Grammars (4): Ex.
42 Attribute Grammars (5)
Intrinsic Attributes
Intrinsic attributes are synthesized attributes of leaf
nodes whose values are determined outside the parse
tree.
For example, the type of an instance of a variable in a
program could come from the symbol table.
Initially, on an unattributed parse tree, the only attributes
with values are the intrinsic attributes of the leaf nodes.
43 Attribute Grammars (6)
Intrinsic Attributes
Given the intrinsic attribute values on a parse tree, the
semantic functions can be used to compute the
remaining attribute values
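The slides' example figures are not preserved here, so the following is a minimal sketch under assumptions: assignments have the form variable = variable or variable = variable + variable, the attribute names actual_type and expected_type are chosen for illustration, and the symbol table contents are hypothetical. It shows an intrinsic attribute (looked up outside the tree), a synthesized attribute (computed from the children), an inherited attribute (passed from the assignment target to the expression), and a predicate.

```python
SYMBOL_TABLE = {"A": "real", "B": "int", "C": "int"}  # hypothetical declarations


def var_actual_type(name):
    # Intrinsic attribute of a leaf node: its value comes from the symbol table.
    return SYMBOL_TABLE[name]


def expr_actual_type(operands):
    # Synthesized attribute: computed from the children of the expression node.
    types = [var_actual_type(name) for name in operands]
    if len(types) == 1:
        return types[0]
    return "int" if all(t == "int" for t in types) else "real"


def check_assignment(target, operands):
    """Decorate `target = operands[0] (+ operands[1])` and test the predicate."""
    # Inherited attribute: the expression is expected to have the target's type.
    expected_type = var_actual_type(target)
    actual_type = expr_actual_type(operands)
    return {
        "expected_type": expected_type,
        "actual_type": actual_type,
        # Predicate function: the static semantic rule for this grammar rule.
        "predicate_holds": actual_type == expected_type,
    }


print(check_assignment("A", ["A", "B"]))
print(check_assignment("B", ["A", "C"]))
```

A false predicate signals a violation of the static semantics, here a type mismatch between the two sides of the assignment.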
44 Attribute Grammars (7): Ex.
Predicate functions, which state the static semantic rules
of the language, are associated with grammar rules.
45 Dynamic Semantics:
Describing the Meanings of Programs
Dynamic semantics is the meaning of the expressions,
statements, and program
units of a programming language.
Recall:
Because of the power and naturalness of the available
notation, describing syntax is a relatively simple matter.
But:
No universally accepted notation or approach has been
devised for dynamic semantics.
46 Dynamic Semantics:
Describing the Meanings of Programs
Solution?
Several methods have been
developed:
Operational semantics
Denotational semantics
Axiomatic semantics
Others
Pause: we will return to this topic again.
47 Design and design issues
of a Programming Language
48 Chapter 5:
Fundamental semantic issues of variables
Names, Bindings, and Scopes
Two primary components of Von Neumann computer
architecture
Memory
stores both instructions and data
Processor
provides operations for modifying the contents of the
memory
The abstractions in a language for the memory cells of
the machine are variables
49 Abstraction for memory cells:
Variables
A variable has some attributes:
Name
Address
Type
Value
50 Name
& its design issues
A name is a string of characters used to identify some
entity in a program
The following are the primary design issues for names:
Are names case sensitive?
Are the special words of the language reserved words
or keywords?
Special Words
Special words in programming languages are used to
make programs more readable by naming actions to
be performed
51 Name
Special Words
Special words are used:
To make programs more readable by naming actions to
be performed
To separate the syntactic parts of statements and
programs
A reserved word is a special word of a programming
language that cannot be used as a name.
One potential problem with reserved words: if the language
includes a large number of reserved words, the user may
have difficulty making up names that are not reserved. The
best example of this is
COBOL, which has 300 reserved words.
52 Name
Special Words
Reserved word
In most languages, names that are defined in other
program units, such as Java packages and C and C++
libraries, can be made visible to a program. These names
are predefined, but visible only if explicitly imported.
Once imported, they cannot be redefined.
Keyword
A keyword is special only in certain contexts, which means
it can be redefined as a name.
Ex. Fortran
Which meaning is used?
54 Variable
Address
The address of a variable is the machine memory address
with which it is associated.
Not as simple as it may appear
It is possible for the same variable to be associated
with different addresses at different times during the
execution of the program.
Ex. A local variable of a subprogram:
each call creates a different instantiation of the same variable.
Address of a variable: l-value …?
56 Variable
TYPE
The type of a variable determines the range of values
the variable can store and the set of operations that
are defined for values of the type.
Ex. Can a value of type “char” appear in a sum expression?
58 Variable
Value
The value of a variable is the contents of the memory
cell or cells associated with the variable.
It is convenient to think of computer memory in terms
of abstract cells, rather than physical cells.
A single byte is too small for most program variables.
An abstract memory cell has the size required by the
variable with which it is associated.
Henceforth, the term memory cell will mean abstract
memory cell.
Value of a variable: r-value
59 Abstraction for memory cells:
Variables
A variable has some attributes:
Name
Address
Type
Value
Binding
60 Binding
Definition: A binding is an association between an attribute and an entity,
such as between a variable and its attributes.
The time at which a binding takes place is called binding time.
Binding and binding times are prominent concepts in the semantics of
programming languages.
Binding times:
Language design time: e.g., binding * to multiplication
Language implementation time: e.g., binding a size or range of values to a variable type
Compile time: e.g., binding a type to a variable in Java
Load time: e.g., binding a variable to a storage cell
Link time: e.g., binding a call to a library subprogram to the subprogram code
Run time: e.g., some value bindings and some storage bindings
61 Binding
Example
count = count + 5;
Some of the bindings and their binding times
The type of count is bound at compile time.
The set of possible values of count is bound at compiler
design time.
The meaning of the operator symbol + is bound at compile
time, when the types of its operands have been determined.
The internal representation of the literal 5 is bound at compiler
design time.
The value of count is bound at execution time with this
statement.
A complete understanding of the binding times for the
attributes of program entities is a prerequisite for
understanding the semantics of a programming language.
62 Binding of Attributes to
Variables
Static Binding
A binding is static if it first occurs before run time
begins and remains unchanged throughout program
execution.
Dynamic Binding
A binding is dynamic if it first occurs during run time
or can change in the course of program execution.
63 Variables
Type Binding
Static type Binding
Explicit Declaration
Implicit Declaration
Implicit variable type binding is done by the language
processor, either a compiler or an interpreter.
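Binding by naming convention: the type follows from the
syntactic form of the variable’s name
Implicit declarations can be detrimental to reliability
Type inference
For example, in C# a var declaration of a variable must
include an initial value, whose type becomes the type of the variable.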
64 Variables
Type Binding
Dynamic type Binding
The type is not specified by a declaration, explicit or implicit.
Instead, the variable is bound to a type when it is assigned a
value in an assignment statement.
(Such an assignment may also bind the variable to an
address and a memory cell, because different type
values may require different amounts of storage.)
Advantage: more programming flexibility
Generic programs become possible, e.g. a program
dealing with data of any numeric type
65 Variables
Type Binding
Dynamic type Binding
Before the mid-1990s, the most commonly used
programming languages used static type binding, the
primary exceptions being some functional languages
such as Lisp
Since then there has been a significant shift to languages
that use dynamic type binding. In Python, Ruby,
JavaScript, and PHP, type binding is dynamic
Ex. list = [10.2, 3.5]; … list = 47; …
The option of dynamic type binding was included in C#
2010:
dynamic any;
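The list example above can be run directly in Python, one of the dynamically typed languages named on this slide. In this small sketch the variable is called values rather than list, only to avoid shadowing Python's built-in list type; the names first_type and second_type are introduced for the example.

```python
values = [10.2, 3.5]                  # values is bound to the type list
first_type = type(values).__name__

values = 47                           # the same name is rebound to the type int
second_type = type(values).__name__

print(first_type, second_type)
```

The type belongs to the value currently assigned, not to the name, so each assignment may rebind the variable to a different type.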
66 Variables
Type Binding
Dynamic type Binding disadvantage 1
It causes programs to be less reliable:
less error-detection capability
For example, suppose that in a particular JavaScript program, i
and x are currently the names of scalar numeric variables and y is
currently the name of an array. Furthermore, suppose that the
program needs the assignment statement
i = x;
but because of a keying error, it has the assignment statement
i = y;
In JavaScript (or any other language that uses dynamic type
binding), no error is detected in this statement by the
interpreter—the type of the variable named i is simply changed
to an array
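The same situation can be sketched in Python, which also uses dynamic type binding. This is an analogous illustration, not the JavaScript program from the slide; the function name run and the later arithmetic are assumptions added to show where the mistake finally surfaces.

```python
def run():
    x = 3.0          # scalar numeric variable
    y = [1, 2, 3]    # array-like variable
    i = y            # keying error: `i = x` was intended; nothing is reported here
    bound_type = type(i).__name__
    try:
        result = i * 2.5   # a list cannot be multiplied by a float
    except TypeError as error:
        result = f"detected only at run time: {type(error).__name__}"
    return bound_type, result


outcome = run()
print(outcome)
```

The erroneous assignment is accepted silently; the problem shows up only later, when i is used as a number, and possibly far from the line that caused it. With static type binding, a compiler could reject the assignment itself.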
67 Variables
Type Binding
Dynamic type Binding disadvantage 2
COST
The cost of implementing dynamic attribute binding is
considerable, particularly in execution time. Type
checking must be done at run time. Furthermore,
every variable must have a run-time descriptor
associated with it to maintain the current type.
68 Variables
Storage Binding & Lifetime
Allocation
The memory cell to which a variable is bound somehow
must be taken from a pool of available memory
Deallocation
the process of placing a memory cell that has been
unbound from a variable back into the pool of available
memory.
Lifetime
The time during which the variable is bound to a specific
memory location
The lifetime of a variable begins when it is bound to a
specific cell and ends when it is unbound from that cell
69 Variables
Storage Binding & Lifetime
According to their storage bindings and lifetimes,
scalar variables fall into four categories:
Static
Stack-dynamic
Explicit heap-dynamic
Implicit heap-dynamic
70 Variables
Scope
Visibility of a variable