Static and heap allocation
STORAGE ORGANISATION
The executing target program runs in its own logical address
space in which each program value has a location.
• The management and organization of this logical address
space is shared between the compiler, operating system and
target machine. The operating system maps logical addresses
into physical addresses, which are usually spread throughout
memory.
• Run-time storage comes in blocks, where a byte is the smallest
unit of addressable memory. Four bytes form a machine word.
Multibyte objects are stored in consecutive bytes and given the
address of their first byte.
The storage layout for data objects is strongly influenced by the
addressing constraints of the target machine. For example,
although a character array of length 10 needs only enough bytes
to hold 10 characters, a compiler may allocate 12 bytes to satisfy
alignment constraints, leaving 2 bytes unused.
This unused space left over for alignment reasons is referred to
as padding.
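As a rough illustration, the following Python sketch uses the ctypes module to show padding appearing in a record layout; the exact amount of padding is platform- and compiler-dependent, so the printed values are typical rather than guaranteed.

```python
import ctypes

# A record with a 10-byte character field followed by a 4-byte integer.
# Most platforms align the integer on a 4-byte boundary, so 2 bytes of
# padding are inserted after the character array.
class Record(ctypes.Structure):
    _fields_ = [
        ("name", ctypes.c_char * 10),  # 10 bytes
        ("count", ctypes.c_int),       # 4 bytes, aligned to 4
    ]

print(ctypes.sizeof(Record))   # typically 16, not 14
print(Record.count.offset)     # typically 12: 2 bytes of padding before it
```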
Program objects whose size is known at compile time may be
placed in an area called static.
The dynamic areas used to maximize the utilization of space at
run time are the stack and the heap.
STORAGE ALLOCATION STRATEGIES
The different storage allocation strategies are:
1. Static allocation - lays out storage for all data objects at
compile time.
2. Stack allocation - manages the run-time storage as a stack.
3. Heap allocation - allocates and deallocates storage as needed
at run time from a data area known as the heap.
STATIC ALLOCATION
In static allocation, names are bound to storage as the program
is compiled, so there is no need for a run-time support package.
Since the bindings do not change at run time, every time a
procedure is activated, its names are bound to the same storage
locations.
Therefore values of local names are retained across activations
of a procedure. That is, when control returns to a procedure the
values of the locals are the same as they were when control left
the last time.
From the type of a name, the compiler decides the amount of
storage for the name and decides where the activation records
go. At compile time, we can fill in the addresses at which the
target code can find the data it operates on.
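A minimal sketch of the idea, assuming hypothetical declarations and type widths (and ignoring alignment for brevity): each name receives a fixed address at compile time, and those addresses never change at run time.

```python
# Illustrative static layout: every declared name gets a fixed address
# at compile time; the same addresses are reused on every activation.
# The declarations and type widths below are assumptions.
WIDTH = {"int": 4, "float": 8, "char": 1}

def layout_static(declarations, base=0):
    """Assign each (name, type) a fixed address starting at `base`."""
    table, offset = {}, base
    for name, typ in declarations:
        table[name] = offset
        offset += WIDTH[typ]   # alignment/padding ignored for brevity
    return table

addresses = layout_static([("i", "int"), ("x", "float"), ("flag", "char")])
print(addresses)  # {'i': 0, 'x': 4, 'flag': 12} -- fixed for the whole run
```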
HEAP ALLOCATION
Stack allocation cannot be used if either of the following is
possible:
1. The values of local names must be retained when an
activation ends.
2. A called activation outlives the caller.
In such cases heap allocation is used. Heap allocation parcels out
pieces of contiguous storage, as needed for activation records or
other objects. Pieces may be deallocated in any order, so over
time the heap will consist of alternating areas that are free and
in use.
Consider, for example, a procedure s that calls r and then calls
q(1, 9), where the record for the activation of r is retained when
that activation ends.
Therefore, the record for the new activation q(1, 9) cannot
follow that for s physically.
If the retained activation record for r is later deallocated, there
will be free space in the heap between the activation records for
s and q(1, 9).
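A toy first-fit allocator sketch shows how freeing blocks in arbitrary order leaves holes between blocks still in use; the block sizes are arbitrary and, for brevity, adjacent holes are not coalesced.

```python
# A toy first-fit heap: blocks may be freed in any order, so the free
# list ends up interleaved with blocks still in use.
class Heap:
    def __init__(self, size):
        self.free = [(0, size)]          # list of (start, length) holes

    def allocate(self, n):
        for i, (start, length) in enumerate(self.free):
            if length >= n:              # first fit
                if length == n:
                    self.free.pop(i)
                else:
                    self.free[i] = (start + n, length - n)
                return start
        raise MemoryError("no hole large enough")

    def deallocate(self, start, n):
        self.free.append((start, n))     # freed in any order, no coalescing

h = Heap(100)
a = h.allocate(30); b = h.allocate(30); c = h.allocate(30)
h.deallocate(b, 30)                      # free the middle block
print(h.free)   # [(90, 10), (30, 30)] -- a hole between two live blocks
```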
2. What is the need for syntax-directed translation? Distinguish
between synthesized and inherited attributes.
Syntax-directed translation is a technique used in compilers and
interpreters to translate source code from one language to
another, typically from a higher-level programming language to
machine code or an intermediate representation. It involves
associating semantic actions with the grammar rules of the
source language. These actions are performed during parsing
and are directed by the syntax of the language.
### Need for Syntax-Directed Translation:
1. **Integration of Semantics with Syntax:**
- Syntax-directed translation allows the integration of semantic
actions directly within the grammar rules of the source
language. This ensures that the translation process aligns closely
with the syntactic structure of the language.
2. **Automatic Code Generation:**
- By associating semantic actions with grammar rules, syntax-
directed translation facilitates automatic generation of target
code or intermediate representations. This helps in reducing the
manual effort required to write translators or compilers.
3. **Error Detection and Recovery:**
- During the parsing phase, syntax-directed translation can
detect certain semantic errors that arise due to inconsistencies
in the use of variables, types, or operations. This aids in better
error reporting and recovery strategies.
4. **Optimization Possibilities:**
The structured nature of syntax-directed translation allows for
potential optimizations to be applied during the translation
process. This can lead to improved performance and efficiency
of the generated code.
### Inherited Attributes vs. Synthesized Attributes:
In the context of syntax-directed translation, attributes are used
to pass information between different parts of the grammar
rules or between different nodes in the parse tree. There are
two main types of attributes:
- **Inherited Attributes:**
- These attributes are passed down from parent nodes to child
nodes in the parse tree. They are used when information needs
to flow from higher-level constructs to lower-level constructs
during parsing. Inherited attributes typically reflect information
that is computed or determined higher up in the parse tree.
- **Synthesized Attributes:**
- These attributes are computed or synthesized within the
grammar rules themselves. They derive their value based on the
attributes of their child nodes. Synthesized attributes are used
when information needs to flow from lower-level constructs to
higher-level constructs in the parse tree, aggregating
information as the parse progresses.
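As a rough sketch of both flows, consider a declaration such as `int a, b, c`: the type behaves as an inherited attribute passed down to each identifier in the list, while a value such as the count of declared names is synthesized upward. The function below models this flow; the grammar, the symbol-table shape, and the choice of count as the synthesized attribute are assumptions for illustration.

```python
# Attribute flow for a declaration grammar D -> T L ; L -> L , id | id.
# T.type is synthesized from the keyword; the list L receives it as an
# inherited attribute and pushes it down to every identifier.
def declare(typ, names, symtab):
    """`typ` plays the role of the inherited attribute L.in."""
    for name in names:        # each 'id' node inherits the type from above
        symtab[name] = typ
    return len(names)         # a synthesized attribute: flows upward

symtab = {}
count = declare("int", ["a", "b", "c"], symtab)   # models: int a, b, c
print(symtab)   # {'a': 'int', 'b': 'int', 'c': 'int'}
print(count)    # 3
```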
3. Define three-address codes and discuss the translation of
expressions into three-address code with an example.
Address codes, also known as intermediate codes, are
representations of instructions or operations that serve as an
intermediate step between the source code (high-level
language) and the target code (machine code or lower-level
representation). These codes abstractly represent computations
or actions in a way that can be easily translated into machine
code by a compiler or interpreter.
### Translation of Expressions into Three Address Codes:
Three address codes are a specific type of intermediate code
where each instruction operates on at most three addresses or
operands. These codes are useful for representing complex
expressions in a format that's easier to translate into assembly
language or machine code. Let's discuss how expressions are
typically translated into three address codes using an example.
#### Example Expression:
Consider the following expression in a hypothetical
programming language:
a = (b + c) * (d - e)
#### Steps to Translate into Three Address Codes:
1. **Identify Temporary Variables:**
- Temporary variables are introduced to hold intermediate
results. These are denoted as `t1`, `t2`, etc.
2. **Break Down the Expression:**
- Decompose the expression into simpler parts where each
operation has at most two operands.
3. **Generate Three Address Codes:**
t1 = b + c
t2 = d - e
t3 = t1 * t2
a = t3
#### Explanation:
- **Step 1:** `t1 = b + c`
- Here, `t1` is a temporary variable that stores the result of the
addition of `b` and `c`.
- **Step 2:** `t2 = d - e`
- `t2` is another temporary variable that stores the result of the
subtraction of `d` and `e`.
- **Step 3:** `t3 = t1 * t2`
- `t3` holds the result of multiplying `t1` and `t2`.
- **Step 4:** `a = t3`
- Finally, assign the value of `t3` to variable `a`.
### Three Address Code Representation:
In the above example, the three address codes represent each
operation or assignment with a simple instruction that operates
on up to three addresses (operands):
- `t1 = b + c`
- `t2 = d - e`
- `t3 = t1 * t2`
- `a = t3`
### Benefits of Three Address Codes:
- **Simplicity:** They provide a clear and systematic
representation of operations.
- **Efficiency:** Easier for further optimization or translation
into machine code.
- **Flexibility:** Suitable for a wide range of expressions and
operations.
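A minimal generator sketch for this translation, assuming expressions are represented as nested `(op, left, right)` tuples with variable names at the leaves (the tree representation is an assumption for illustration):

```python
import itertools

def translate(expr, target):
    """Translate a nested (op, left, right) tuple into three-address code."""
    code, temps = [], itertools.count(1)

    def gen(e):
        if isinstance(e, str):            # a leaf: the variable is its own address
            return e
        op, left, right = e
        l, r = gen(left), gen(right)
        t = f"t{next(temps)}"             # a fresh temporary for the result
        code.append(f"{t} = {l} {op} {r}")
        return t

    code.append(f"{target} = {gen(expr)}")
    return code

print("\n".join(translate(("*", ("+", "b", "c"), ("-", "d", "e")), "a")))
# t1 = b + c
# t2 = d - e
# t3 = t1 * t2
# a = t3
```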
4. List and discuss the key issues to consider in the design of a
code generator.
Designing a code generator for a compiler or interpreter involves
several key considerations to ensure efficient and correct
translation of source code into target code (e.g., machine code,
intermediate code). Here are the key issues to consider:
### 1. Target Architecture
- **Description:** The code generator must be aware of the
target hardware architecture where the generated code will
execute.
- **Considerations:**
- **Instruction Set:** Understand the instructions available on
the target platform and how they operate.
- **Registers:** Utilize registers efficiently for storing
intermediate values and results.
- **Memory Access:** Consider memory hierarchy (cache,
main memory) and alignment requirements.
### 2. Intermediate Representation
- **Description:** The intermediate representation (IR) used by
the compiler influences the complexity and efficiency of code
generation.
- **Considerations:**
- **Level of Abstraction:** Choose an IR that balances ease of
manipulation and closeness to source/target.
- **Optimization Opportunities:** Ensure the IR allows for
optimization passes before code generation.
### 3. Instruction Selection
- **Description:** Translating high-level operations into
equivalent low-level machine instructions.
- **Considerations:**
- **Mapping Operations:** Efficiently map high-level
language constructs (e.g., loops, function calls) to machine
instructions.
- **Optimization:** Select instructions that maximize
performance while adhering to semantics.
### 4. Register Allocation
- **Description:** Efficient use of registers for storing
intermediate values during code generation.
- **Considerations:**
- **Allocation Strategy:** Choose between static allocation,
dynamic allocation (e.g., using graph coloring), or hybrid
approaches.
- **Lifetime Analysis:** Determine when registers can be
reused or freed (see the sketch after this list).
### 5. Addressing Modes and Memory Management
- **Description:** Handling memory access and addressing
modes based on the target architecture.
- **Considerations:**
- **Address Calculation:** Ensure correct calculation of
memory addresses considering base registers, offsets, and scale
factors.
- **Memory Management:** Handle allocation and
deallocation of memory for variables and data structures.
### 6. Code Quality and Optimization
- **Description:** Generating efficient and optimized code that
minimizes execution time and resource usage.
- **Considerations:**
- **Code Size:** Optimize generated code size for memory-
constrained environments.
- **Performance:** Apply optimization techniques such as loop
unrolling, constant folding, and common subexpression
elimination.
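To make the register-allocation issue (4) concrete, here is a deliberately simplified sketch for straight-line three-address code: a value gets a register at its definition and releases it after its last use. Real allocators compute liveness across control flow and use graph coloring or linear scan; the triple representation here is an assumption.

```python
# A crude lifetime-based register assigner for straight-line code.
# Each statement is a (dest, src1, src2) triple; None marks an absent operand.
def assign_registers(code, nregs=4):
    last_use = {}
    for i, (_, s1, s2) in enumerate(code):
        for s in (s1, s2):
            if s is not None:
                last_use[s] = i          # remember each operand's last use

    free, reg_of = list(range(nregs)), {}
    for i, (dest, s1, s2) in enumerate(code):
        for s in (s1, s2):               # a register is reusable after last use
            if s in reg_of and last_use.get(s) == i:
                free.append(reg_of.pop(s))
        if not free:
            raise RuntimeError("register pressure too high: a spill is needed")
        reg_of[dest] = free.pop()
        print(f"{dest} -> R{reg_of[dest]}")

assign_registers([("t1", "b", "c"), ("t2", "d", "e"), ("t3", "t1", "t2")])
# t1 -> R3, t2 -> R2; both are freed at their last use, so t3 reuses R2
```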
5. Basic Block Generation
Algorithm: Partition into basic blocks
Input: A sequence of three-address statements
Output: A list of basic blocks with each three-address statement
in exactly one block
Method:
1. We first determine the set of leaders, the first statements of
basic blocks. The rules we use are the following:
a. The first statement is a leader.
b. Any statement that is the target of a conditional or
unconditional goto is a leader.
c. Any statement that immediately follows a goto or conditional
goto statement is a leader.
2. For each leader, its basic block consists of the leader and all
statements up to but not including the next leader or the end of
the program.
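A sketch of this partitioning algorithm in Python; the concrete syntax for labels ("L1:") and jumps ("goto L1", "if ... goto L1") is an assumption for illustration.

```python
# Partition a list of statement strings into basic blocks by finding leaders.
def basic_blocks(stmts):
    leaders = {0}                                    # rule (a): first statement
    labels = {s.split(":")[0]: i for i, s in enumerate(stmts) if ":" in s}
    for i, s in enumerate(stmts):
        if "goto" in s:
            target = s.split("goto")[1].strip()
            leaders.add(labels[target])              # rule (b): jump target
            if i + 1 < len(stmts):
                leaders.add(i + 1)                   # rule (c): after a jump
    order = sorted(leaders)
    return [stmts[a:b] for a, b in zip(order, order[1:] + [len(stmts)])]

code = ["i = 1", "L1: t = i * 8", "i = i + 1", "if i < 10 goto L1", "x = t"]
for block in basic_blocks(code):
    print(block)
# ['i = 1']
# ['L1: t = i * 8', 'i = i + 1', 'if i < 10 goto L1']
# ['x = t']
```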
Transformations on Basic Blocks:
A number of transformations can be applied to a basic block
without changing the set of expressions computed by the block.
Two important classes of transformation are :
• Structure-preserving transformations
• Algebraic transformations
1. Structure-preserving transformations:
a) Common subexpression elimination:
Before:
a := b + c
b := a - d
c := b + c
d := a - d
After:
a := b + c
b := a - d
c := b + c
d := b
Since the second and fourth statements compute the same
expression, a - d, the basic block can be transformed as above.
(The first and third statements both contain b + c, but they are
not common subexpressions, because the value of b changes in
between.)
b) Dead-code elimination:
Suppose x is dead, that is, never subsequently used, at the point
where the statement x : = y + z appears in a basic block. Then
this statement may be safely removed without changing the
value of the basic block.
c) Renaming temporary variables:
A statement t := b + c (t is a temporary) can be changed to
u := b + c (u is a new temporary), and all uses of this instance of
t can be changed to u without changing the value of the basic
block. A block in which each temporary is defined exactly once
is called a normal-form block.
2. Algebraic transformations:
Algebraic transformations can be used to change the set of
expressions computed by a basic block into an algebraically
equivalent set.
Examples:
i) x := x + 0 or x := x * 1 can be eliminated from a basic block
without changing the set of expressions it computes.
ii) The exponential statement x := y ** 2 can be replaced by the
cheaper x := y * y.
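A tiny peephole pass sketch implementing exactly these two identities; the concrete statement syntax (`x := ...`) is an assumption.

```python
import re

# Drop x := x + 0 and x := x * 1; rewrite x := y ** 2 as x := y * y.
def simplify(stmts):
    out = []
    for s in stmts:
        if re.fullmatch(r"(\w+) := \1 (\+ 0|\* 1)", s):
            continue                                  # identity: eliminated
        m = re.fullmatch(r"(\w+) := (\w+) \*\* 2", s)
        if m:                                         # strength reduction
            s = f"{m.group(1)} := {m.group(2)} * {m.group(2)}"
        out.append(s)
    return out

print(simplify(["x := x + 0", "x := y ** 2", "z := a + b"]))
# ['x := y * y', 'z := a + b']
```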
6. Generate code for the arithmetic expression z = (u + v) -
((w + x) - y) by applying the code generation algorithm.
To generate code for the arithmetic expression \( z = (u + v) -
((w + x) - y) \), we will apply a simple code generation algorithm
that translates each operation into a sequence of three-address
code instructions. Here's how we can break down and generate
the code:
### Step-by-Step Code Generation:
1. **Assigning Temporary Variables:**
- We'll use temporary variables \( t1 \), \( t2 \), and \( t3 \) to
hold intermediate results.
2. **Generate Three Address Codes:**
t1 = u + v
t2 = w + x
t3 = t2 - y
z = t1 - t3
#### Explanation:
- **Step 1:** \( t1 = u + v \)
- Compute the sum of variables \( u \) and \( v \), storing the
result in \( t1 \).
- **Step 2:** \( t2 = w + x \)
- Compute the sum of variables \( w \) and \( x \), storing the
result in \( t2 \).
- **Step 3:** \( t3 = t2 - y \)
- Subtract variable \( y \) from \( t2 \), storing the result in \
( t3 \).
- **Step 4:** \( z = t1 - t3 \)
- Subtract \( t3 \) from \( t1 \), storing the final result in \( z \).
### Three Address Code:
Here's the corresponding three-address code:
1. t1 = u + v
2. t2 = w + x
3. t3 = t2 - y
4. z = t1 - t3
### Conclusion:
This sequence of three-address code instructions accurately
represents the arithmetic expression \( z = (u + v) - ((w + x) -
y) \) in a form that can be easily translated into assembly
language or machine code by a compiler or interpreter. Each step
ensures that intermediate results are computed and stored
appropriately before being used in subsequent operations,
adhering to the principles of code generation.
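For completeness, here is a sketch of how a simple code generator might lower this three-address code onto a hypothetical two-address machine with MOV/ADD/SUB/MUL instructions and registers R0, R1, ...; the instruction set and register model are assumptions, not a fixed target.

```python
# Lower (dest, lhs, op, rhs) triples to two-address instructions,
# keeping temporaries in registers.
OPS = {"+": "ADD", "-": "SUB", "*": "MUL"}

def emit(tac):
    reg_of, next_reg = {}, 0
    for dest, lhs, op, rhs in tac:
        if lhs in reg_of:                       # left operand already computed
            r = reg_of.pop(lhs)
        else:                                   # load it into a fresh register
            r, next_reg = f"R{next_reg}", next_reg + 1
            print(f"MOV {lhs},{r}")
        src = reg_of.pop(rhs, rhs)              # register if rhs was a temporary
        print(f"{OPS[op]} {src},{r}")
        reg_of[dest] = r
    for var, r in reg_of.items():               # store the surviving result
        print(f"MOV {r},{var}")

emit([("t1", "u", "+", "v"), ("t2", "w", "+", "x"),
      ("t3", "t2", "-", "y"), ("z", "t1", "-", "t3")])
# MOV u,R0
# ADD v,R0
# MOV w,R1
# ADD x,R1
# SUB y,R1
# SUB R1,R0
# MOV R0,z
```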
7. Discuss various data structures used to implement a symbol
table.
A symbol table is a crucial data structure used in compilers,
interpreters, and programming language implementations to
store information about identifiers (variables, functions,
constants, etc.) defined in the source code. It facilitates efficient
lookup, insertion, and management of symbols during various
stages of compilation or interpretation. Several data structures
can be employed to implement symbol tables, each offering
different trade-offs in terms of speed, memory usage, and ease of
implementation. Here are some common data structures used for
implementing symbol tables:
### 1. Hash Tables
**Description:**
- **Structure:** Hash tables store key-value pairs where the key
is typically the symbol name, and the value is a structure
containing information about the symbol (e.g., type, scope,
memory location).
- **Advantages:**
- **Fast Lookup:** Average-case time complexity for lookup,
insertion, and deletion is \( O(1) \).
- **Efficient Memory Usage:** Can be more memory-efficient
than other structures if properly sized and implemented.
- **Considerations:**
- **Collision Handling:** Requires a collision resolution
strategy (e.g., chaining, open addressing) to handle cases where
different keys hash to the same index.
### 2. Binary Search Trees (BST)
**Description:**
- **Structure:** Binary search trees organize symbols based on
their names in a hierarchical manner.
- **Advantages:**
- **Ordered Access:** Allows traversal in sorted order of
symbol names.
- **Dynamic Operations:** Efficient for operations like insert,
delete, and search (average-case time complexity \( O(\log n) \)
for balanced trees).
- **Considerations:**
- **Balancing:** Requires balancing techniques (e.g., AVL
trees, Red-Black trees) to maintain optimal performance.
### 3. Linked Lists
**Description:**
- **Structure:** Simple linked lists can be used where each node
stores a symbol entry.
- **Advantages:**
- **Simplicity:** Easy to implement and understand.
- **Dynamic Size:** Can dynamically grow or shrink as
symbols are added or removed.
- **Considerations:**
- **Linear Search:** \( O(n) \) time complexity for searching,
which may be inefficient for large symbol tables.
### 4. Symbol Tables as Arrays or Lists
**Description:**
- **Structure:** Arrays or lists where symbols are stored
consecutively.
- **Advantages:**
- **Sequential Access:** Symbols can be accessed sequentially
for iteration or printing.
- **Considerations:**
- **Efficiency:** Search operations may require \( O(n) \) time
complexity unless symbols are sorted or indexed.
### 5. Balanced Search Trees (e.g., AVL Trees, Red-Black Trees)
**Description:**
- **Structure:** Self-balancing binary search trees that ensure
logarithmic time complexity for insert, delete, and search
operations.
- **Advantages:**
- **Guaranteed Performance:** Maintain \( O(\log n) \) time
complexity even in worst-case scenarios.
- **Considerations:**
- **Complexity:** More complex to implement compared to
basic data structures like hash tables.
### 6. Trie (Prefix Tree)
**Description:**
- **Structure:** Trie is used when symbols have a hierarchical
or structured naming convention (e.g., namespaces, packages).
- **Advantages:**
- **Prefix Search:** Efficient for prefix-based symbol lookup.
- **Considerations:**
- **Memory Usage:** Can be memory-intensive depending on
the structure and depth of the trie.
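A minimal sketch combining two of the structures above: a hash table (a Python dict) per scope, with scopes chained to their enclosing scope. The field names are illustrative assumptions.

```python
# A hash-table symbol table with chained (nested) scopes.
class SymbolTable:
    def __init__(self, parent=None):
        self.entries = {}        # name -> attributes (a dict is a hash table)
        self.parent = parent     # enclosing scope, or None for the global one

    def insert(self, name, **attrs):
        self.entries[name] = attrs

    def lookup(self, name):
        scope = self
        while scope is not None:     # innermost scope first, then outward
            if name in scope.entries:
                return scope.entries[name]
            scope = scope.parent
        return None

globals_ = SymbolTable()
globals_.insert("x", type="int", scope="global")
locals_ = SymbolTable(parent=globals_)
locals_.insert("x", type="float", scope="local")
print(locals_.lookup("x"))   # {'type': 'float', 'scope': 'local'} -- inner x wins
print(globals_.lookup("x"))  # {'type': 'int', 'scope': 'global'}
```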
8. Explain scope representation with a symbol table.
Understanding Symbol Tables
A symbol table typically associates each symbol (variable,
function, constant) with metadata that describes its properties,
such as its type, scope (global, local), memory location, and
other relevant attributes. When a compiler or interpreter
processes source code, it maintains and updates the symbol table
to keep track of symbols as they are encountered and defined.
Representation of Scopes in Symbol Tables
1. **Global Scope:**
- Symbols defined outside of any function or block are
typically stored in the global scope.
- Global symbols are accessible throughout the entire program.
2. **Local Scopes (Function and Block Scopes):**
- Each function or procedure typically has its own scope where
local variables and parameters are defined.
- Block scopes may exist within functions or other blocks (like
`if`, `while`, `for` statements), where variables declared within
them are only accessible within that block.
How Scopes are Managed in Symbol Tables
Symbol tables manage scopes by organizing symbols based on
their scope level and visibility rules. Here's how this
organization is typically handled:
- **Nested Scopes:** Symbol tables can be structured
hierarchically to reflect nested scopes. For example:
- A global scope may contain entries for global variables and
functions.
- Each function scope may contain entries for local variables
and parameters specific to that function.
- Block scopes within functions may contain entries for
variables declared within those blocks.
- **Scope Resolution:**
- During name resolution (when a symbol is referenced), the
symbol table is queried starting from the innermost scope going
outwards.
- This ensures that local variables are prioritized over global
variables of the same name, adhering to the scope rules of the
programming language.
The semantic rules are defined in terms of the following
operations:
1. mktable(previous) creates a new symbol table and returns a
pointer to the new table. The argument previous points to a
previously created symbol table, presumably that for the
enclosing procedure.
2. enter(table, name, type, offset) creates a new entry for name
name in the symbol table pointed to by table. enter places type
type and relative address offset in fields within the entry.
3. addwidth(table, width) records the cumulative width of all
the entries in table in the header associated with this symbol
table.
4. enterproc(table, name, newtable) creates a new entry for
procedure name in the symbol table pointed to by table. The
argument newtable points to the symbol table for this procedure
name.
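A minimal Python sketch of these four operations, using objects in place of pointers; the internal field names are assumptions.

```python
# Objects stand in for symbol-table pointers.
class Table:
    def __init__(self, previous=None):
        self.previous = previous   # link to the enclosing procedure's table
        self.entries = {}
        self.width = 0             # the "header" field for addwidth

def mktable(previous):
    return Table(previous)

def enter(table, name, type_, offset):
    table.entries[name] = {"type": type_, "offset": offset}

def addwidth(table, width):
    table.width = width            # cumulative width of all entries

def enterproc(table, name, newtable):
    table.entries[name] = {"proc": newtable}

outer = mktable(None)
enter(outer, "a", "int", 0)
inner = mktable(outer)             # table for a nested procedure p
enter(inner, "b", "real", 0)
addwidth(inner, 8)
enterproc(outer, "p", inner)
print(outer.entries["p"]["proc"].entries)  # {'b': {'type': 'real', 'offset': 0}}
```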
9. Describe the general structure of an activation record.
Explain the purpose of each item used in an activation record.
An activation record, also known as a stack frame, is a
fundamental data structure used by compilers and interpreters to
manage the execution of functions or procedures in a program. It
provides a structured way to manage the state of a function call,
including parameters, local variables, return addresses, and other
necessary information. Let's describe the basic structure of an
activation record and the purpose of each item typically included
in it:
### Structure of an Activation Record:
1. **Return Address:**
- **Purpose:** Stores the address to which control should
return after the function completes execution. This is crucial for
supporting nested function calls and ensuring the correct flow of
control.
2. **Previous Frame Pointer (or Frame Pointer):**
- **Purpose:** Points to the base of the activation record of
the calling function. It allows the current function to access its
caller's variables and parameters, facilitating nested function
calls and stack traversal.
3. **Parameters:**
- **Purpose:** Holds values passed to the function from its
caller. Parameters may be passed by value, reference, or pointer,
depending on the programming language and architecture.
4. **Local Variables:**
- **Purpose:** Stores variables declared within the function
body. Local variables have function scope and are typically
allocated on the stack when the function is called and deallocated
when the function exits.
5. **Temporary Variables:**
- **Purpose:** Holds intermediate values computed during the
execution of the function. Temporary variables are often used by
the compiler or interpreter for optimization purposes.
6. **Saved Machine Status:**
- **Purpose:** Saves the state of the machine registers that the
function modifies. This ensures that the function does not
unintentionally modify or corrupt the state of registers used by
its caller or other functions.
### Purpose of Each Item in an Activation Record:
- **Return Address:** Ensures proper control flow after the
function completes execution or encounters a return statement.
- **Previous Frame Pointer:** Facilitates stack unwinding and
access to variables in the calling function.
- **Parameters:** Provides access to values passed from the
caller, allowing the function to operate on input data.
- **Local Variables:** Stores variables declared within the
function, preserving their values across function calls.
- **Temporary Variables:** Holds intermediate computations
or optimizations performed by the compiler or interpreter.
- **Saved Machine Status:** Maintains the integrity of
machine registers across function calls, preventing unintended
side effects.
- **Dynamic Link:** Supports dynamic scoping and resolution
of nested function calls.
- **Static Link:** Enables access to non-local variables in
lexical scopes, supporting nested function definitions.
- **Exception Handling Information:** Manages and
propagates exceptions or errors encountered during function
execution, ensuring robust error handling.
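A schematic sketch of an activation record as a Python dataclass, with a list modeling the run-time stack; real layouts and field sets are dictated by the target machine and its calling convention, so the fields here simply mirror the items named above.

```python
from dataclasses import dataclass, field

@dataclass
class ActivationRecord:
    return_address: int
    control_link: "ActivationRecord | None"   # previous frame pointer (dynamic link)
    parameters: dict = field(default_factory=dict)
    locals: dict = field(default_factory=dict)
    temporaries: dict = field(default_factory=dict)
    saved_registers: dict = field(default_factory=dict)

stack = []
main = ActivationRecord(return_address=0, control_link=None)
stack.append(main)                             # main is activated
callee = ActivationRecord(return_address=42, control_link=main,
                          parameters={"n": 10})
stack.append(callee)                           # a call pushes a new record
stack.pop()                                    # the return pops it again
print(len(stack))                              # 1 -- only main remains
```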