Assembly Language
Biruk Belete-UU89100
Department Of Computer Science, Unity University
Computer Organization and Assembly Programming, COSC 2042
Section-N12
Mr. Germaye
July 22, 2023
What is Assembly Language?
An assembly language is a symbolic representation of the machine language of a specific
processor, augmented by additional types of statements that facilitate program writing and that
provide instructions to the assembler (a program that translates assembly language into
machine code).
Low-level programming languages such as assembly language are a necessary bridge
between the underlying hardware of a computer and the higher-level programming languages
such as Python or JavaScript in which modern software programs are written.
Today, assembly languages are rarely written directly, although they are still used in some
niche applications, such as when performance requirements are particularly high.
How Does It Work?
Assembly language is low-level code in which each instruction written by the programmer
corresponds closely to an instruction that the machine executes.
Assembly language is hardware dependent, with a different assembly language for each type
of processor. In particular, assembly language instructions can make reference to specific
registers in the processor, include all of the opcodes of the processor, and reflect the bit length of
the various registers of the processor and operands of the machine language. An assembly
language programmer must therefore understand the computer’s architecture. Sometimes there is
more than one assembler for the same architecture, and sometimes an assembler is specific to
an operating system or to particular operating systems. Most assembly languages do not provide
specific syntax for operating system calls, and most assembly languages can be used universally
with any operating system, as the language provides access to all the real capabilities of
the processor, upon which all system call mechanisms ultimately rest. In contrast to assembly
languages, most high-level programming languages are generally portable across multiple
architectures but require interpreting or compiling, much more complicated tasks than
assembling.
The terms assembly language and machine language are sometimes, erroneously, used
synonymously. Machine language consists of instructions directly executable by the processor.
Each machine language instruction is a binary string containing an opcode, operand references,
and perhaps other bits related to execution, such as flags. For convenience, instead of writing an
instruction as a bit string, it can be written symbolically, with names for opcodes and registers.
An assembly language makes much greater use of symbolic names, including assigning names to
specific main memory locations and specific instruction locations. Assembly language also
includes statements that are not directly executable but serve as instructions to the assembler that
produces machine code from an assembly language program.
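The relationship between a symbolic instruction and its bit string can be sketched in C. The example below assumes the 32-bit x86 instruction "mov r32, imm32", whose encoding is one opcode byte (0xB8 plus the register number) followed by the 32-bit immediate in little-endian order. The names reg32 and encode_mov_r32_imm32 are made up for this illustration and do not come from any real assembler.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Register numbers used in 32-bit x86 instruction encodings. */
enum reg32 { EAX = 0, ECX = 1, EDX = 2, EBX = 3,
             ESP = 4, EBP = 5, ESI = 6, EDI = 7 };

/*
 * Hypothetical helper: encode "mov r32, imm32" into out[0..4] and
 * return the instruction length in bytes.
 */
size_t encode_mov_r32_imm32(enum reg32 dst, uint32_t imm, uint8_t out[5])
{
    assert(dst <= EDI);                 /* only the eight general registers */
    out[0] = (uint8_t)(0xB8 + dst);     /* opcode byte carries the register */
    for (int i = 0; i < 4; i++)         /* immediate, least significant byte first */
        out[i + 1] = (uint8_t)(imm >> (8 * i));
    return 5;
}
```

With this sketch, the symbolic statement "mov eax,1" becomes the five bytes B8 01 00 00 00, which is the form the processor actually executes.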
Components of Assembly Language
The common components of an assembly language are the following:
LABEL - the assembler defines the label as equivalent to the address into which the first byte of
the object code generated for that instruction will be loaded. The programmer may subsequently
use the label as an address or as data in another instruction’s address field. The assembler
replaces the label with the assigned value when creating an object program. Labels are most
frequently used in branch instructions.
The main uses of a label are the following:
A label makes a program location easier to find and remember.
The label can easily be moved to correct a program. The assembler will automatically
change the address in all instructions that use the label when the program is reassembled.
The programmer does not have to calculate relative or absolute memory addresses, but
just uses labels as needed.
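How an assembler might replace labels with addresses can be sketched as a two-pass procedure in C. This is only a toy model under stated assumptions: the struct line layout, the MAX_LINES limit, and the function resolve_labels are invented for the illustration, and a real assembler keeps far more information per statement.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_LINES 16   /* arbitrary limit for this toy model */

/* One statement of a toy program. */
struct line {
    const char *label;   /* label defined on this line, or NULL */
    const char *target;  /* label this line refers to, or NULL */
    unsigned    size;    /* bytes of object code the line produces */
};

struct symbol { const char *name; unsigned addr; };

/* Fill resolved[i] with the address line i refers to, or -1 if none. */
void resolve_labels(const struct line *prog, size_t n, long *resolved)
{
    struct symbol table[MAX_LINES];
    size_t nsym = 0;
    unsigned pc = 0;

    assert(n <= MAX_LINES);

    /* Pass 1: a label's value is the address of the first byte of its line. */
    for (size_t i = 0; i < n; i++) {
        if (prog[i].label) {
            table[nsym].name = prog[i].label;
            table[nsym].addr = pc;
            nsym++;
        }
        pc += prog[i].size;
    }

    /* Pass 2: replace each label reference with the recorded address. */
    for (size_t i = 0; i < n; i++) {
        resolved[i] = -1;
        if (!prog[i].target)
            continue;
        for (size_t s = 0; s < nsym; s++)
            if (strcmp(table[s].name, prog[i].target) == 0)
                resolved[i] = (long)table[s].addr;
    }
}
```

If the size of an earlier line changes, running resolve_labels again yields the new addresses for every reference, which mirrors the point above that the programmer never has to recalculate addresses by hand.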
MNEMONIC - The mnemonic is the name of the operation or function of the assembly
language statement. As discussed subsequently, a statement can correspond to a machine
instruction, an assembler directive, or a macro. In the case of a machine instruction, a mnemonic
is the symbolic name associated with a particular opcode.
OPERAND - An assembly language statement includes zero or more operands. Each operand
identifies an immediate value, a register value, or a memory location. Typically, the assembly
language provides conventions for distinguishing among the three types of operand references,
as well as conventions for indicating addressing mode.
An assembly language statement may refer to a register operand by name. Each register has a
symbolic name and a bit encoding, and the assembler translates the symbolic name into the
binary identifier for the register.
COMMAND - A command is a symbolic representation of a machine language instruction. Almost
invariably, there is a one-to-one relationship between an assembly language instruction and a
machine instruction. The assembler resolves any symbolic references and translates the assembly
language instruction into the binary string that comprises the machine instruction. Commands
usually have short, self-descriptive abbreviations, such as “ADD” for addition and “MOV” to
transfer data.
DIRECTIVES - Directives, also called pseudo-instructions, are assembly language statements
that are not directly translated into machine language instructions. Instead, directives are
instructions to the assembler to perform specified actions during the assembly process. Examples
include the following:
Define constants
Designate areas of memory for data storage
Initialize areas of memory
Place tables or other fixed data in memory
Allow references to other programs
MACRO - A macro is a statement that stands for a sequence of other instructions and directives.
Once a macro is defined, you can invoke it by name wherever that sequence is needed, and the
assembler expands each invocation into the full sequence. This keeps your source code shorter by
letting you repeat an extensive code section as a brief macro call. You can also update your code
more efficiently, because changing the macro definition changes every place where the macro is
used.
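The idea of macro expansion can be illustrated with the C preprocessor, which is used here only as an analogy and is not an assembler macro facility. The macro SWAP_INT and the function sort3 are names chosen for this sketch.

```c
#include <assert.h>

/*
 * Each use of SWAP_INT is expanded into the full statement sequence
 * before translation, so editing this one definition changes every
 * expansion in the program.
 */
#define SWAP_INT(x, y)   \
    do {                 \
        int tmp_ = (x);  \
        (x) = (y);       \
        (y) = tmp_;      \
    } while (0)

/* Put three integers in ascending order using three macro expansions. */
void sort3(int *a, int *b, int *c)
{
    assert(a && b && c);
    if (*a > *b) SWAP_INT(*a, *b);
    if (*b > *c) SWAP_INT(*b, *c);
    if (*a > *b) SWAP_INT(*a, *b);
}
```

The three calls are short in the source text, but the translated program contains three full copies of the swap sequence, which is the trade-off a macro makes compared with a subroutine.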
Sample Code
As an example of the use of assembly language, we look at a program to compute the greatest
common divisor of two integers. We define the greatest common divisor of the integers a and b
as follows:
    gcd(a, b) = max{k : k divides a and k divides b}
where we say that k divides a if there is no remainder. Euclid’s algorithm for the
greatest common divisor is based on the following theorem. For any nonnegative integer a and
any positive integer b,
    gcd(a, b) = gcd(b, a mod b)
Here is a C language program that implements Euclid’s algorithm, using repeated subtraction in
place of the mod operation:
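unsigned int gcd(unsigned int a, unsigned int b)
{
    if (a == 0 && b == 0) b = 1;    /* gcd(0, 0) is taken to be 1 here */
    else if (b == 0) b = a;         /* gcd(a, 0) = a */
    else if (a != 0)
        while (a != b)              /* subtract the smaller from the larger */
            if (a < b) b -= a;
            else a -= b;
    return b;
}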
The compiled assembly version of the preceding code is the following:
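gcd:    mov   ebx,eax      ; ebx = a (the listing implies a arrives in eax)
        mov   eax,edx      ; eax = b (the listing implies b arrives in edx)
        test  ebx,ebx      ; is a zero?
        jne   L1
        test  edx,edx      ; is b zero?
        jne   L1
        mov   eax,1        ; a == 0 and b == 0: return 1
        ret
L1:     test  eax,eax      ; is b zero?
        jne   L2
        mov   eax,ebx      ; b == 0: return a
        ret
L2:     test  ebx,ebx      ; is a zero?
        je    L5           ; yes: return b
L3:     cmp   ebx,eax      ; compare a with b
        je    L5           ; a == b: done
        jae   L4           ; a > b: subtract b from a
        sub   eax,ebx      ; b -= a
        jmp   L3
L4:     sub   ebx,eax      ; a -= b
        jmp   L3
L5:     ret                ; result is in eax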
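To see how the listing corresponds to the C program, the assembly can be rendered back into C one instruction at a time. The sketch below is an illustration, not compiler output: the variables eax, ebx and edx stand in for the registers, each goto mirrors a jump, and it assumes that a arrives in eax, b arrives in edx, and the result is returned in eax. The names gcd_listing and check_listing are invented for this example.

```c
#include <assert.h>
#include <stdint.h>

/* The C function from the text, with braces added, for comparison. */
unsigned int gcd(unsigned int a, unsigned int b)
{
    if (a == 0 && b == 0) b = 1;
    else if (b == 0) b = a;
    else if (a != 0)
        while (a != b)
            if (a < b) b -= a;
            else a -= b;
    return b;
}

/* Instruction-by-instruction rendering of the assembly listing. */
uint32_t gcd_listing(uint32_t a, uint32_t b)
{
    uint32_t eax = a, edx = b, ebx;   /* assumed register contents on entry */

    ebx = eax;                        /* gcd: mov ebx,eax            */
    eax = edx;                        /*      mov eax,edx            */
    if (ebx != 0) goto L1;            /*      test ebx,ebx ; jne L1  */
    if (edx != 0) goto L1;            /*      test edx,edx ; jne L1  */
    eax = 1;                          /*      mov eax,1              */
    return eax;                       /*      ret                    */
L1: if (eax != 0) goto L2;            /* L1:  test eax,eax ; jne L2  */
    eax = ebx;                        /*      mov eax,ebx            */
    return eax;                       /*      ret                    */
L2: if (ebx == 0) goto L5;            /* L2:  test ebx,ebx ; je L5   */
L3: if (ebx == eax) goto L5;          /* L3:  cmp ebx,eax ; je L5    */
    if (ebx >= eax) goto L4;          /*      jae L4                 */
    eax -= ebx;                       /*      sub eax,ebx            */
    goto L3;                          /*      jmp L3                 */
L4: ebx -= eax;                       /* L4:  sub ebx,eax            */
    goto L3;                          /*      jmp L3                 */
L5: return eax;                       /* L5:  ret                    */
}

/* Abort if the rendering and the original function disagree. */
void check_listing(uint32_t a, uint32_t b)
{
    assert(gcd_listing(a, b) == gcd(a, b));
}
```

Reading the two functions side by side shows that the compiler kept the structure of the C code: the three tests at the top handle the zero cases, and the loop between L3 and L4 is the while loop.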
Advantages and Disadvantages
Disadvantages of Assembly Language
1. Writing code in assembly language takes much longer than writing in a high-level
language.
2. Reliability and security - It is easy to make errors in assembly code. The assembler does not
check whether the calling conventions and register save conventions are obeyed, and nothing
checks that the number of PUSH and POP instructions is the same in all possible
branches and paths. There are so many possibilities for hidden errors in assembly code that it
affects the reliability and security of the project unless you have a very systematic approach to
testing and verifying.
3. Debugging and verifying - Assembly code is more difficult to debug and verify because
there are more possibilities for errors than in high-level code.
4. Maintainability - Assembly code is more difficult to modify and maintain because the
language allows unstructured spaghetti code and all kinds of tricks that are difficult for others to
understand. Thorough documentation and a consistent programming style are needed.
5. Portability - Assembly code is platform-specific. Porting to a different platform is
difficult.
6. System code can use intrinsic functions instead of assembly. The best modern C compilers
have intrinsic functions for accessing system control registers and other system instructions.
Assembly code is no longer needed for device drivers and other system code when intrinsic
functions are available.
7. Application code can use intrinsic functions or vector classes instead of assembly. The best
modern C compilers have intrinsic functions for vector operations and other special instructions
that previously required assembly programming.
8. Compilers have been improved a lot in recent years. The best compilers are now quite
good. It takes a lot of expertise and experience to optimize better than the best C compiler.
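Items 6 and 7 above can be illustrated with a small C sketch. It assumes a GCC or Clang compiler, where __builtin_popcount is a documented built-in function that counts the set bits in an unsigned int and that the compiler may translate into a single instruction on processors that have one. On other compilers the sketch falls back to a portable loop. The function names popcount_portable, popcount_builtin and check_popcount are invented for the example.

```c
#include <assert.h>

/* Portable bit count: clear the lowest set bit until none remain. */
unsigned popcount_portable(unsigned x)
{
    unsigned n = 0;
    while (x) {
        x &= x - 1;
        n++;
    }
    return n;
}

/* Same result through a compiler built-in where one is available. */
unsigned popcount_builtin(unsigned x)
{
#if defined(__GNUC__) || defined(__clang__)
    return (unsigned)__builtin_popcount(x);   /* GCC/Clang built-in */
#else
    return popcount_portable(x);              /* assumed fallback */
#endif
}

/* Abort if the two versions disagree. */
void check_popcount(unsigned x)
{
    assert(popcount_builtin(x) == popcount_portable(x));
}
```

The point of the design is that the source stays in C and remains portable, while the compiler is free to choose a special instruction where the target processor offers it.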
Advantages of Assembly Language
1. Debugging and verifying - Looking at compiler-generated assembly code or the
disassembly window in a debugger is useful for finding errors and for checking how well a
compiler optimizes a particular piece of code.
2. Making compilers - Understanding assembly coding techniques is necessary for making
compilers, debuggers and other development tools.
3. Embedded systems - Small embedded systems have fewer resources than PCs and
mainframes. Assembly programming can be necessary for optimizing code for speed or size in
small embedded systems.
4. Hardware drivers and system code - Accessing hardware, system control registers, etc.
may sometimes be difficult or impossible with high-level code.
5. Accessing instructions that are not accessible from a high-level language - Certain assembly
instructions have no high-level language equivalent.
6. Self-modifying code - Self-modifying code is generally not profitable because it interferes
with efficient code caching. It may, however, be advantageous, for example, to include a small
compiler in math programs where a user-defined function has to be calculated many times.
7. Optimizing code for size - Storage space and memory are so cheap nowadays that it is not
worth the effort to use assembly language for reducing code size. However, cache size is still
such a critical resource that it may be useful in some cases to optimize a critical piece of code for
size in order to make it fit into the code cache.
8. Optimizing code for speed - Modern C compilers generally optimize code quite well in most
cases. But there are still cases where compilers perform poorly and where dramatic increases in
speed can be achieved by careful assembly programming.
9. Function libraries - The total benefit of optimizing code is higher in function libraries that
are used by many programmers.
10. Making function libraries compatible with multiple compilers and operating systems. It is
possible to make library functions with multiple entries that are compatible with different
compilers and different operating systems.