Report FPGA

This document outlines a project focused on the design and implementation of an Arithmetic Logic Unit (ALU) using Verilog at the Viet Nam – Korea University of Information & Communication Technology. The project aims to educate students on the principles of ALU operation, design basic arithmetic and logic functions, and evaluate performance through simulation and testing. The report includes a detailed implementation plan, theoretical foundations, and applications of ALUs in various computing contexts.


VIET NAM – KOREA UNIVERSITY OF INFORMATION & COMMUNICATION TECHNOLOGY

FACULTY OF COMPUTER ENGINEERING AND ELECTRONICS

FPGA/ASIC DESIGN WITH VERILOG

ARITHMETIC AND LOGIC PROCESSORS
(ALU - ARITHMETIC LOGIC UNIT)

Students: Tang Van Binh - 21CE006
Dang Anh Cuong - 21CE007
Huynh Van Tri - 21CE052
Class: 21CE1
Instructor: Dr. Duong Ngoc Phap

Da Nang, June 2025




LECTURER'S COMMENTS

……………………………………………………………………………………

Da Nang, June 2025


Lecturer

TABLE OF CONTENTS

INTRODUCTION
1. Introduction
2. Objectives of the project
3. Contents and implementation plan
4. Report layout
Chapter 1. TOPIC OVERVIEW
1. Analyze the requirements of the project
1.1. Title of the topic
1.2. Basic Functions
1.3 Application of the topic
Chapter 2. THEORETICAL BASIS
2.1 Overview of ALU (ARITHMETIC LOGIC UNIT)
2.2 Overview of Quartus II
2.3 Function Introduction
2.3.1. Add 2 binary numbers
2.3.2. Subtracting 2 binary numbers
2.3.3. Multiplying 2 binary numbers
2.3.4. AND
2.3.5. OR
2.3.6. XOR
2.3.7. NOT
2.3.8. Bit shift operations
2.4 Verilog Overview
2.5 Board DE2 Overview
CHAPTER 3. CONSTRUCTION IMPLEMENTATION
3.1 Block Diagram
3.2 Waveform Simulation on Modelsim
3.3 Optimal Results
3.4 Video test on board DE2
CONCLUSION
REFERENCES
APPENDIX

LIST OF FIGURES

Figure 2. 1 ALU
Figure 2. 2 Verilog Overview
Figure 2. 3 board DE2
Figure 3. 1 Block Diagram
Figure 3. 2 Waveform Simulation on Modelsim
Figure 3. 3 SDC Files
Figure 3. 4 Setup slack
Figure 3. 5 Hold slack
Figure 3. 6 Clocks
Figure 3. 7 Fmax
Figure 3. 8 Power Analyzer

LIST OF TABLES

Table 1. 1 Implementation Plan
Table 3. 1 Test Function

INTRODUCTION

1. Introduction

- Arithmetic Logic Units (ALUs) are a key functional component in the
architecture of central processing units (CPUs). The ALU is responsible for
executing arithmetic operations (such as addition, subtraction, multiplication,
and division) and logical operations (AND, OR, XOR, NOT, comparison, bit
shifting, etc.), thereby supporting data processing in most computational
tasks of computer systems. In the development of computer hardware, the
design and optimization of ALUs is an important step to improve processing
performance, reduce latency, and optimize hardware resources. ALUs are
typically implemented as combinational circuits, which use basic logic blocks to
perform binary operations on the input data. In addition, the ALU can be
combined with control components and registers to form complete processing
systems.
- This project focuses on the operating principles, functional structure, and design
process of a simple ALU. Through the implementation of the ALU model using
a hardware description language (HDL) such as Verilog or VHDL, students
have the opportunity to apply their knowledge of digital logic,
combinational circuit design, and digital system simulation. The goal of the
project is to build an ALU that is capable of performing a set of basic
arithmetic and logic operations, while ensuring accuracy, scalability, and resource
efficiency when deployed on an actual hardware platform (FPGA or simulation
environment).
- The topic not only contributes to strengthening the foundational knowledge of
digital systems and computer architecture, but also serves as an important
stepping stone in the development of microprocessor and microcontroller
systems in modern industrial and embedded applications.
2. Objectives of the project

The main goal of the project "Arithmetic Logic Unit (ALU)" is to research, design
and realize a simple ALU model capable of performing basic operations in data
processing. Specifically, the topic aims at the following objectives:

- Learn the theory and working principles of the ALU

Study the roles, functions, general architecture and operation modes
of arithmetic and logic processors in microprocessor systems.

- Design an ALU with a set of basic operations

Build an ALU model capable of performing arithmetic operations
(addition, subtraction) and logic operations (AND, OR, XOR, NOT), with
support for bit shifting (shift left, shift right) and comparison.

- Simulate and test the operation of the ALU

Apply a hardware description language (HDL) such as Verilog or VHDL
to design and simulate ALUs in specialized software environments such
as ModelSim or Vivado. Evaluate functional correctness through
a set of test cases.

- Evaluate performance and scalability

Analyze factors such as logic complexity, propagation delay,
scalability, and the ability to integrate ALUs in larger microprocessor
systems.

- Strengthen digital system design skills

Help students master the method of designing combinational circuits and
understand the process of implementing a hardware functional block
from logical description to simulation and actual testing.

3. Contents and implementation plan


a) Implementation content
Phase 1: Research the theoretical basis (Weeks 1-3)
● ALU Overview Study:
o Learn about the ALU's role and place in computer architecture.
o Survey the development history of the ALU and various ALU architectures.
o Study the types of operations that an ALU usually performs
(arithmetic and logic).
● Number representation in computers:
o Study integer representation methods (signed and unsigned): two's
complement, one's complement, sign-magnitude.
o Study floating-point representation according to the IEEE 754
standard.
o Analyze the influence of representation methods on the performance of
arithmetic operations.
● Theory of basic arithmetic operations:
o Research algorithms and methods for performing addition and
subtraction (e.g., ripple-carry addition, carry-lookahead addition, using
two's complement for subtraction).
o Research algorithms and methods for performing multiplication (e.g.,
shift-and-add multiplication, Booth multiplication).
o Research algorithms and methods for performing division (e.g.,
restoring division, non-restoring division).
Phase 2: ALU Architecture and Design Analysis (Weeks 4-7)
● ALU architecture overview analysis:
o Identify the main functional blocks in an ALU capable of addition,
subtraction, multiplication, and division (adder/subtractor, multiplier,
divider, selector, temporary registers, etc.).
o Research methods for connecting and controlling these functional
blocks.
● Design of arithmetic functional blocks:
o Research and compare efficient adder/subtractor circuit design
methods (e.g., using logic libraries, optimization techniques).
o Research and compare efficient multiplier circuit design methods (e.g.,
sequential multiplication, parallel multiplication, using advanced
multiplication algorithms).
o Research and compare methods of designing efficient division circuits
(e.g., sequential division, division by reciprocal multiplication).
● Study of ALU control signals:
o Identify the control signals needed to select the operation and control the
data flow in the ALU.
o Analyze how control signals are generated and used.
Phase 3: Performance Evaluation and Optimization (Weeks 8-10)
● Factors affecting ALU performance:
o Analyze the ALU's critical performance indicators (e.g., latency,
throughput, area, power consumption).
o Study the influence of architecture, algorithms, and fabrication
technology on performance.
● ALU optimization techniques:
o Investigate optimization techniques at the logic level (e.g., use
complex logic gates, reduce the number of gates).
o Investigate optimization techniques at the architectural level (e.g.,
pipelining, parallelism).
o Research simulation tools and methods to evaluate ALU performance.
● Comparison of different ALU architectures:
o Compare the performance and cost of different ALU architectures that
have the same function.
o Evaluate the pros and cons of each architecture in different applications.
Phase 4: Aggregation and Reporting (Weeks 11-12)
● Summary of research results:
o Systematize the knowledge and results obtained during the research
process.
o Make comments and reviews on ALU's design methods and
performance.
● Write a report:
o Prepare a detailed report on the research process and the results
achieved.
o Present analyses, comparisons, and conclusions in a clear and logical
manner.
● Give a presentation:
o Prepare and present the research results to the committee.
b) Implementation plan
Time                 Implementation content
Week 1 – Week 3      - Come up with ideas
                     - Learn about the topic
Week 4 – Week 7      Architectural analysis and design of ALU functional blocks
Week 8 – Week 10     - Performance evaluation and optimization techniques
                     - Finishing off the rest of the product
Week 11 – Week 12    Review the product and finalize the report and slides.
Table 1. 1 Implementation plan
4. Report layout
After the Introduction, the report is presented in three chapters, specifically as
follows:
Chapter 1. Topic overview. In this chapter, the report provides an overview of
the basic concepts and their importance.
Chapter 2. Theoretical basis. In this chapter, the report covers the programming
languages, etc. used in the project.
Chapter 3. Construction implementation. This chapter covers the block diagram,
how it works, and the finished product.
Finally, there are the Conclusion, References and Appendices related to the topic.
Chapter 1. TOPIC OVERVIEW

1. Analyze the requirements of the project


1.1. Title of the topic
ARITHMETIC AND LOGIC UNIT (ALU- ARITHMETIC LOGIC UNIT)

1.2. Basic Functions


Arithmetic:
● Add, subtract, multiply, divide (if supported).
Logic:
● AND, OR, XOR, NOT.
Compare:
● a = b, a > b.
Bit shift:
● Shift left (multiply by 2), shift right (divide by 2).
Status flags:
● Update the zero, carry, overflow, and sign flags.

1.3 Application of the topic


● Processing data in CPUs: ALUs are the core component of CPUs,
performing arithmetic (addition, subtraction, multiplication, division) and
logic (AND, OR, XOR, NOT) operations to process data in every
computer program, from office applications to games.
● Program Flow Control: The ALU supports comparisons (a = b, a > b, a
< b) used to execute conditional jump instructions, which helps control the
execution flow in software.
● Embedded systems: ALUs are used in microcontrollers and embedded
devices (such as sensors, IoT devices, automobiles) to perform fast,
energy-efficient calculations.
● Digital Signal Processing (DSP): ALUs handle logical and arithmetic
operations in applications such as audio, video, radar, and
telecommunications.
● Artificial Intelligence and Machine Learning: ALUs support matrix
computation and arithmetic and logic operations in AI processors (such as
GPUs and TPUs).
● IC and FPGA design: Understanding ALUs helps optimize digital circuit
design and integrate complex arithmetic into custom chips.
● Encryption and Security: An ALU's logical operations (XOR, AND) are
used in encryption algorithms, error detection, and random number
generation.
● Education and Research: The ALU project is the foundation for
teaching computer architecture, hardware design, and programming, and
the basis for research on microprocessor optimization.
Chapter 2. THEORETICAL BASIS

2.1 Overview of ALU (ARITHMETIC LOGIC UNIT)


The ALU (Arithmetic Logic Unit), or arithmetic and logic processor, is one of
the core and most important components in the architecture of microprocessors and
computers.
The ALU is responsible for performing arithmetic and logical operations; it is
where data is processed directly.
1. Role of ALU
ALUs play a central role in executing instructions, performing computational
operations, and processing data in the CPU. The results of the operations are sent to
other parts such as registers or memory, or are used in subsequent processing
operations.
2. Main Functions
A basic ALU typically performs the following groups of operations:
● Arithmetic: addition, subtraction, sometimes including multiplication and
division.
● Logic: AND, OR, XOR, NOT.
● Bit shift: shift left, shift right.
● Compare: equal, not equal, greater than, less than, etc.
Depending on the complexity of the processor, the ALU can be expanded to
handle signed numbers and floating-point numbers, or to integrate specialized
processing blocks such as FPUs (Floating Point Units).
3. General structure
An ALU typically consists of key components such as:
● Operation blocks: adder, subtractor, logic circuits.
● Control Unit: receives a control signal from the CPU to select the operation to
be performed.
● Input/output registers: temporarily store operands and calculation results.
● Flags: signal special conditions such as overflow, carry, negative, zero, etc.
4. Application
ALUs are integrated in most processors from simple to complex (such as RISC,
CISC, ARM, MIPS) and are the foundation for embedded systems, personal
computers, control devices, robots, and automation systems.
Figure 2. 1 ALU
2.2 Overview of Quartus II
Quartus II is a specialized digital circuit design software developed by Altera (now
part of Intel), which is commonly used in the design, simulation, and programming of
FPGA (Field Programmable Gate Array) and CPLD (Complex Programmable
Logic Device) devices.
The software provides a comprehensive integrated development environment (IDE)
that allows users to perform the full range of digital circuit design steps, from
hardware description in HDL languages (such as Verilog or VHDL), to synthesis,
analysis, simulation, placement and routing, and programming the chip.
Key features of Quartus II include:
● Digital hardware design: supports HDL languages, schematics, and state
diagrams.
● Design synthesis and analysis: optimization of logic resources, checking for
syntax and logic errors.
● Simulation and testing: combined with software such as ModelSim to test the
function and timing of the circuit.
● Placement and Routing (Place & Route): defines the layout of resources on
the FPGA.
● Creating a configuration file and downloading it to the chip via the JTAG port.
Quartus II supports multiple Altera FPGA families such as Cyclone, Stratix, and Arria,
and provides the MegaWizard Plug-In Manager tool to quickly create IP blocks
such as ALUs, counters, registers, RAM, etc. With its intuitive interface, powerful
features, and high compatibility, Quartus II is an essential tool in teaching and
developing modern embedded digital systems.
2.3 Function Introduction

2.3.1. Add 2 binary numbers


● Description: Perform the addition of two binary numbers (a + b), generating
the sum and the status flags (carry, overflow, zero, sign).
● Mechanism: Uses a full adder circuit to add each pair of bits, passing the
carry bit to the next pair of bits. The adder circuit can be optimized to
reduce latency and handle numbers with large bit lengths.
● Status flags:
o Carry: The carry out of the most significant bit.
o Overflow: Signals that the result exceeds the representable range.
o Zero: The result is 0.
o Sign: The sign bit of the result.
● Application: Calculating sums, incrementing register values, computing memory addresses.
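
The bit-by-bit mechanism described above can be sketched as a small software model. This is a Python behavioral sketch rather than the project's Verilog; the function names `full_adder` and `ripple_add` and the 8-bit default width are assumptions made for illustration.

```python
def full_adder(a, b, cin):
    """One-bit full adder: return (sum_bit, carry_out)."""
    sum_bit = a ^ b ^ cin
    carry_out = (a & b) | (cin & (a ^ b))
    return sum_bit, carry_out


def ripple_add(a, b, width=8):
    """Add two unsigned width-bit values one bit pair at a time."""
    result = 0
    carry = 0
    carry_into_msb = 0
    for i in range(width):
        if i == width - 1:
            carry_into_msb = carry  # carry entering the sign position
        sum_bit, carry = full_adder((a >> i) & 1, (b >> i) & 1, carry)
        result |= sum_bit << i
    flags = {
        "carry": carry,                      # carry out of the top bit
        "overflow": carry_into_msb ^ carry,  # signed result out of range
        "zero": int(result == 0),
        "sign": (result >> (width - 1)) & 1,
    }
    return result, flags


print(ripple_add(2, 1))        # small operands, no flag set
print(ripple_add(0x7F, 0x01))  # 127 + 1: signed overflow
print(ripple_add(0xFF, 0x01))  # 255 + 1: carry out, result wraps to 0
```

The overflow flag here is derived as the XOR of the carry into and the carry out of the most significant bit, which is one common way to detect signed overflow.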

2.3.2. Subtracting 2 binary numbers


● Description: Perform subtraction (a - b) by adding the two's complement of b
(a + (-b)).
● Mechanism:
1. Calculate the two's complement of b: invert the bits (NOT) and add 1.
2. Use the adder circuit to compute a + (-b).
3. Update the status flags as for addition.
● Status flags: Carry (often interpreted as borrow), overflow, zero, sign.
● Application: Counting down, decrementing register values, comparing numbers.
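
The three steps above can be illustrated with a short Python model. It is a sketch under stated assumptions, not the project's Verilog: the function name `subtract`, the 8-bit default width, and the convention that borrow is the inverse of the adder's carry out are all choices made here for illustration.

```python
def subtract(a, b, width=8):
    """Compute a - b as a + (~b) + 1 on width-bit values."""
    mask = (1 << width) - 1
    a &= mask
    b &= mask
    total = a + (~b & mask) + 1    # invert b, add 1, add to a
    result = total & mask
    carry = (total >> width) & 1   # 1 means no borrow was needed
    sign_a = a >> (width - 1)
    sign_b = b >> (width - 1)
    sign_r = result >> (width - 1)
    flags = {
        "carry": carry,
        "borrow": carry ^ 1,
        # signed overflow: operands differ in sign and the result's
        # sign differs from the sign of a
        "overflow": int(sign_a != sign_b and sign_r != sign_a),
        "zero": int(result == 0),
        "sign": sign_r,
    }
    return result, flags


print(subtract(3, 2))     # plain case
print(subtract(2, 3))     # wraps to 255 and sets borrow
print(subtract(0x80, 1))  # -128 - 1: signed overflow
```

Because the zero, sign, and borrow flags of a - b encode the relation between a and b, the same circuit can also serve the comparison function.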

2.3.3. Multiplying 2 binary numbers


● Description: Perform the multiplication of two binary numbers, which is
equivalent to repeated addition and bit shifting.
● Mechanism:
o Using the binary multiplication algorithm (shift-and-add): check each bit
of the multiplier; if the bit is 1, add the (left-shifted) multiplicand
to the partial product.
o Specialized multiplier circuits (such as Wallace tree or Dadda tree) can be
used to speed up processing.
o The result is twice the bit length of the operands (n-bit × n-bit = 2n-bit).
● Status flags: Overflow (if the result is truncated), zero, sign.
● Application: Scientific calculation, graphics processing, digital signal
processing.
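
A minimal Python sketch of the shift-and-add algorithm follows. It models behavior only and is not the project's Verilog; the function name `shift_add_multiply`, the 8-bit operand width, and the choice to derive the zero and sign flags from the truncated result are assumptions for illustration.

```python
def shift_add_multiply(multiplicand, multiplier, width=8):
    """Multiply two unsigned width-bit values by shift-and-add."""
    mask = (1 << width) - 1
    multiplicand &= mask
    multiplier &= mask
    product = 0
    for i in range(width):
        if (multiplier >> i) & 1:         # bit i of the multiplier is 1
            product += multiplicand << i  # add the shifted multiplicand
    truncated = product & mask            # what a width-bit output keeps
    flags = {
        "overflow": int(product > mask),  # upper bits were lost
        "zero": int(truncated == 0),
        "sign": truncated >> (width - 1),
    }
    return product, truncated, flags


print(shift_add_multiply(10, 9))   # fits in 8 bits
print(shift_add_multiply(16, 16))  # 256 needs 9 bits, so it is truncated
```

The full product never exceeds 2 × width bits, which matches the n-bit × n-bit = 2n-bit rule stated above.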

2.3.4. AND
● Description: Perform a logical AND on each pair of bits of the two operands,
returning 1 if both bits are 1 and 0 otherwise.
● Mechanism: Uses a logical AND gate for each pair of bits, processing them in
parallel to optimize speed.
● Status flags: Zero (if the result is 0), sign (the highest bit of the result).
● Application: Bit masking, condition testing, signal processing.

2.3.5. OR

● Description: Perform a logical OR, returning 1 if at least one of the
two bits is 1 and 0 otherwise.
● Mechanism: Uses a logical OR gate for each bit pair.
● Status flags: Zero, sign.
● Application: Bit combination, flag setting, logic processing.

2.3.6. XOR
● Description: Perform a logical XOR, returning 1 if the two bits are different
and 0 if they are the same.
● Mechanism: Uses a logical XOR gate for each bit pair.
● Status flags: Zero, sign.
● Application: Encryption, error detection, random sequence generation.

2.3.7. NOT
● Description: Perform logical negation, returning 1 if the input is 0 and 0 if
the input is 1.
● Mechanism: Uses a NOT gate in the digital circuit or the NOT operator in
the programming language.
● Status flags: Zero, sign.
● Application: Logic inversion, signal inversion, digital circuit design.
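
The four bitwise operations and their two flags can be modeled in a few lines of Python. This is an illustrative sketch, not the project's Verilog; the function name `logic_op`, the string operation names, and the 8-bit default width are assumptions.

```python
def logic_op(op, a, b=0, width=8):
    """Apply a bitwise operation and return (result, flags)."""
    mask = (1 << width) - 1
    a &= mask
    b &= mask
    if op == "AND":
        result = a & b
    elif op == "OR":
        result = a | b
    elif op == "XOR":
        result = a ^ b
    elif op == "NOT":
        result = ~a & mask  # mask keeps the result within width bits
    else:
        raise ValueError(f"unknown operation: {op}")
    flags = {"zero": int(result == 0), "sign": result >> (width - 1)}
    return result, flags


print(logic_op("AND", 0b0100, 0b0011))  # no common bits, so zero is set
print(logic_op("OR", 0b0101, 0b0100))
print(logic_op("XOR", 0b0110, 0b0101))
print(logic_op("NOT", 0b00000111))      # top bit becomes 1, so sign is set
```

In hardware every bit position is handled by its own gate in parallel; the Python integer operators simply apply the same per-bit rule to all positions at once.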

2.3.8. Bit shift operations

● Shift left:
o Description: Shift the bits of the operand to the left by a number of
positions, filling the vacated bits on the right with 0.
o Mechanism:
▪ Use a shift register to move bits.
▪ Equivalent to multiplying by 2 for each position shifted (shifting by
n bits corresponds to multiplying by 2^n).
o Status flags: Carry (the bit shifted out), zero, sign.
o Application: Fast multiplication, bit manipulation, algorithm optimization.
● Shift right:
o Description: Shift the bits to the right, filling the vacated bits on the
left with 0 (logical shift) or with the sign bit (arithmetic shift).
o Mechanism:
▪ Logical shift: Fill the highest bit with 0.
▪ Arithmetic shift: Keep the sign bit to preserve the sign of the
value.
▪ Equivalent to dividing by 2 for each position shifted.
o Status flags: Carry (the bit shifted out), zero, sign.
o Application: Fast division, bit manipulation, data normalization.
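
The difference between the logical and arithmetic right shift is easiest to see on a value whose top bit is set. The sketch below is a Python model of single-position shifts; the function names and the 8-bit default width are assumptions for illustration, and each function returns the result together with the bit shifted out (the carry).

```python
def shift_left(a, width=8):
    """Shift left by one position; return (result, carry)."""
    mask = (1 << width) - 1
    a &= mask
    carry = a >> (width - 1)  # top bit is pushed out
    return (a << 1) & mask, carry


def shift_right_logical(a, width=8):
    """Shift right by one position, filling the top bit with 0."""
    a &= (1 << width) - 1
    return a >> 1, a & 1      # bottom bit is pushed out


def shift_right_arithmetic(a, width=8):
    """Shift right by one position, keeping the sign bit."""
    a &= (1 << width) - 1
    sign = a & (1 << (width - 1))
    return (a >> 1) | sign, a & 1


print(shift_left(0b00001000))              # 8 -> 16
print(shift_right_logical(0b00001001))     # 9 -> 4, carry 1
print(shift_right_logical(0b11110000))     # top bit becomes 0
print(shift_right_arithmetic(0b11110000))  # -16 -> -8 in two's complement
```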

2.4 Verilog Overview


Verilog is a Hardware Description Language (HDL) that is widely used in the
design and simulation of digital circuit systems, especially for FPGAs and
ASICs. Verilog allows designers to model the behavior, structure,
and timing of digital electronic circuits at different levels of abstraction.
The Verilog language has a syntax similar to the C programming language,
making it easily accessible to learners, especially engineering students. Verilog not
only enables the description of combinational and sequential circuits, but also supports
simulation, testing, and synthesis for hardware realization.
Some of the key features of Verilog:
● Hardware structure description: Easily represents logic circuits as
functional blocks such as registers, adders, ALUs, controllers, etc.
● Simulation support: Functional and timing simulations can be run to test
the correctness of the design before downloading it to the hardware.
● Good integration with EDA software such as Quartus II, ModelSim, and Vivado,
used for designing, synthesizing, and programming FPGAs.
● The syntax is concise and close to software programming, but it describes
hardware that operates in parallel rather than sequentially.
Applications of Verilog:
● Design of microprocessors, controllers, data transmission systems.
● Building IP cores (reusable processing cores).
● Development of embedded systems on FPGAs.
● Simulating and testing digital logic circuits in tools like ModelSim.

Figure 2. 2 Verilog Overview

2.5 Board DE2 Overview


Altera's (now Intel) DE2 (Development and Education Board) is a development and
education platform designed for learning, research, and development of digital
systems, especially in the fields of multimedia, storage, and networking. It is a
powerful tool, suitable for laboratory environments at universities and colleges, as well
as for complex digital system design and development projects.
Main features:
● FPGA: The DE2 board uses the Cyclone II FPGA chip EP2C35F672 with
approximately 35,000 logic elements (LEs), providing powerful processing
capability for digital applications.
● Memory:
o 8 MB SDRAM
o 512 KB SRAM
o 4 MB Flash
● Communication and I/O:
o TV decoder
o 10/100 Ethernet
o RS-232 port
o USB Host/Device
o Switches (toggle and pushbutton), LEDs, 7-segment LED displays
o 16x2 character LCD display
o Audio (microphone, line-in) and video connection (VGA with 10-bit DAC)
o Extended connectivity: USB 2.0, infrared (IrDA) port, SD card slot
● Software support: Altera offers the Quartus II toolkit, documentation, hands-on
exercises, and demonstrations.
Purpose and Application:
● Education: The DE2 board is designed to support the teaching of subjects in
digital logic, computer organization, and embedded system design. Students
can practice FPGA programming using VHDL/Verilog and use the Nios II IDE
to develop embedded systems.
● Design projects: With a variety of interfaces and large memory, the DE2 is
suitable for multimedia (audio and video processing), networking, and storage
projects.
● System development: The board supports the development of complex digital
systems, from simple applications such as LED control to systems that require
signal processing or networking.
Advantages:
● Integrates a variety of I/O interfaces, meeting needs from basic to advanced.
● Good support for learning and experimentation, with extensive documentation
from Altera.
● Expandable through connectors (expansion headers), allowing connection to
other boards.
Limitations:
● The DE2 uses the Cyclone II FPGA, which is older than the devices on newer
boards such as the DE2-115 (Cyclone IV), so its performance and power efficiency
are lower.
● Some modern features (such as Gigabit Ethernet) are not available; they
appeared in later versions such as the DE2-115.
Figure 2. 3 board DE2
CHAPTER 3. CONSTRUCTION IMPLEMENTATION

3.1 Block Diagram

Figure 3. 1 Block Diagram

General Block Diagram Explained

1. Start:
● The starting point of the process, indicating where the system starts
working.
2. Set the opcode switches (+, -, *, AND, OR, XOR, NOT, A<<1, A>>1):
● This step allows the user to select a specific arithmetic or logic operation
through switches. The operations include:
● Arithmetic: add (+), subtract (-), multiply (*)
● Logic: AND, OR, XOR, NOT
● Compare: A>B, A<B
● Bit shift: A<<1 (shift left), A>>1 (shift right)
● This step sets the operation code (opcode) that determines the operation to be
performed.
3. Set input switches A and B:
● The user enters two values, A and B, through the switches. This is the input
data for the arithmetic or logic operation selected in the previous step.
4. Press the key:
● This step confirms or triggers the operation by pressing a key. It
signals that the values and the operation have been entered completely, and the
system is ready to process them.
5. Display on 7-segment LED:
● The result of the operation (based on A, B, and the opcode) is displayed
on a 7-segment LED display. 7-segment LEDs are often used to display
simple numbers or symbols.
6. End:
● The end point of the process, where the system completes the processing
and displays the results.

3.2 Waveform Simulation on Modelsim


Figure 3. 2 Waveform Simulation on Modelsim
Simulation results:
Total simulation time is 900ns
# OP[0]: A=2, B=1, OUT=3 (ADD)
# OP[1]: A=3, B=2, OUT=1 (SUB)
# OP[2]: A=4, B=3, OUT=0 (AND)
# OP[3]: A=5, B=4, OUT=5 (OR)
# OP[4]: A=6, B=5, OUT=3 (XOR)
# OP[5]: A=7, B=6, OUT=248 (NOT)
# OP[6]: A=8, B=7, OUT=16 (A<<1)
# OP[7]: A=9, B=8, OUT=4 (A>>1)
# OP[8]: A=10, B=9, OUT=90 (Multi)
# result_reg = 90

Explanation:
● OP[0]: Addition
A = 2, B = 1
OUT = 2 + 1 = 3 (correct)
● OP[1]: Subtraction
A = 3, B = 2
OUT = 3 - 2 = 1 (correct)
● OP[2]: AND
A = 4 (0100), B = 3 (0011)
OUT = A & B = 0100 & 0011 = 0000 = 0 (correct)
● OP[3]: OR
A = 5 (0101), B = 4 (0100)
OUT = A | B = 0101 | 0100 = 0101 = 5 (correct)
● OP[4]: XOR
A = 6 (0110), B = 5 (0101)
OUT = A ^ B = 0110 ^ 0101 = 0011 = 3 (correct)
● OP[5]: NOT
A = 7 (00000111)
OUT = ~A = 11111000 = 248 (unsigned 8-bit) (correct)
● OP[6]: Shift Left
A = 8 (00001000)
OUT = A << 1 = 00010000 = 16 (correct)
● OP[7]: Shift Right
A = 9 (00001001)
OUT = A >> 1 = 00000100 = 4 (correct)
● OP[8]: Multi
A = 10 (00001010), B = 9 (00001001)
OUT = A * B = 90 (correct)
● Result_reg = 90 (01011010)

The register records the value on the falling edge of Key(0). The
result of OP[8] is 90, so the final value saved in the register is 90.
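
The simulation log above can be reproduced with a small Python reference model, which is one way to cross-check testbench output. This is a sketch, not the project's testbench: the names `OPS`, `alu`, and `ResultRegister` are hypothetical, the stimulus pattern A = opcode + 2, B = opcode + 1 is inferred from the logged values, and the key is assumed to idle high before it is pressed.

```python
MASK = 0xFF  # 8-bit result, matching the unsigned 8-bit outputs in the log

# Opcode order taken from the simulation log (OP[0] .. OP[8]).
OPS = [
    ("ADD", lambda a, b: a + b),
    ("SUB", lambda a, b: a - b),
    ("AND", lambda a, b: a & b),
    ("OR", lambda a, b: a | b),
    ("XOR", lambda a, b: a ^ b),
    ("NOT", lambda a, b: ~a),
    ("A<<1", lambda a, b: a << 1),
    ("A>>1", lambda a, b: a >> 1),
    ("Multi", lambda a, b: a * b),
]


def alu(opcode, a, b):
    """Return the 8-bit result of the selected operation."""
    _, fn = OPS[opcode]
    return fn(a, b) & MASK


class ResultRegister:
    """Latch the input value on a falling edge of the key signal."""

    def __init__(self):
        self.value = 0
        self._prev_key = 1  # assumption: key idles high

    def sample(self, key, data):
        if self._prev_key == 1 and key == 0:  # falling edge detected
            self.value = data
        self._prev_key = key


outputs = []
for opcode in range(9):
    a, b = opcode + 2, opcode + 1
    out = alu(opcode, a, b)
    outputs.append(out)
    print(f"OP[{opcode}]: A={a}, B={b}, OUT={out} ({OPS[opcode][0]})")

reg = ResultRegister()
reg.sample(1, outputs[-1])  # key released
reg.sample(0, outputs[-1])  # key pressed: falling edge latches the value
print("result_reg =", reg.value)
```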

3.3 Optimal Results:


The design's timing was checked with the TimeQuest Timing Analyzer, using the clock
constraint defined in the SDC file as follows:
create_clock -name clk -period 50.0 [get_ports clk]
(The system operates at 20 MHz, which corresponds to a period of 50 ns.)
Figure 3. 3 SDC Files

Figure 3. 4 Setup slack


Setup slack of the clk clock in the slow timing model (worst-case scenario).
● Slack = Required time - Actual arrival time.
Meaning:
● A positive slack (43.209 ns) means there is ample margin: the signal
arrives well before it is required.
Conclusion: No setup time violation.

Figure 3. 5 Hold slack


Hold slack checks whether the signal arrives too early.
Meaning:

A positive slack (1.717 ns) means that the signal does not arrive too early, so
the hold time is not violated.
Conclusion: Good.

Figure 3. 6 Clocks
Explanation:
● Type: Base (a basic base-type clock).
● Period = 50 ns, equivalent to a frequency of 20 MHz.
● Rise: 0 ns, fall: 25 ns, meaning the rising edge is at 0 ns and the falling edge at 25 ns.
Conclusion: This is the normal configuration for a 20 MHz clock.

Figure 3. 7 Fmax
Explanation:
● Fmax: The maximum frequency at which the design can run safely according to
timing analysis with the slow (worst-case) model.
● Restricted Fmax: The value after limits imposed by other factors (here it is
still equal to Fmax, so it has no effect).
Meaning:
● The design can run at up to 147.25 MHz, which is much higher than the actual
requirement (20 MHz).
Conclusion: Very good (plenty of timing margin).
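
The reported figures are consistent with one another, as the short calculation below shows. It is a Python sketch under two assumptions: the reported setup slack is read as 43.209 ns, and the worst-case path is a register-to-register path in the single clk domain, so that the minimum usable period is approximately the clock period minus the setup slack. The variable names are chosen here for illustration.

```python
clock_period_ns = 50.0   # from the SDC constraint (create_clock -period 50.0)
setup_slack_ns = 43.209  # worst-case setup slack from the timing report

target_mhz = 1000.0 / clock_period_ns               # constrained frequency
critical_path_ns = clock_period_ns - setup_slack_ns  # time the slowest path needs
fmax_mhz = 1000.0 / critical_path_ns                 # estimated maximum frequency
margin = fmax_mhz / target_mhz                       # headroom over the target

print(f"target frequency : {target_mhz:.2f} MHz")
print(f"critical path    : {critical_path_ns:.3f} ns")
print(f"estimated Fmax   : {fmax_mhz:.2f} MHz")
print(f"margin           : {margin:.2f}x")
```

Under these assumptions the estimate reproduces the 147.25 MHz reported by the tool, roughly seven times the 20 MHz target.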

Figure 3. 8 Power Analyzer


Explanation:
● Total Thermal Power Dissipation: This value indicates how much heat the
cooling system has to handle. If it is too high, the chip may run hot and require
additional cooling (such as a heatsink or fan).
● Core Dynamic Thermal Power:
This index is higher if the circuit is active a lot or runnin

● Core Static Thermal Power:


If this value is high it could be because the chip is large, or

● I/O Thermal Power Dissipation:


If using many LEDs, screens, or communication buses

Meaning:
● Very low total power (~198 mW): indicates a low-power design, suitable for small
circuits.
● Low dynamic power (~2.5 mW): the logic circuit is not active frequently or
has very little signal switching.
● Static power (~80 mW): core leakage at this level is normal for small/medium
FPGAs.
● I/O is the largest share (~58%): indicates a system using many LEDs or signal
output devices such as 7-segment displays.
Conclusion: Good
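
The percentage split can be checked with a few lines of arithmetic. The sketch below is in Python and uses the exact figures reported for this design (197.94 mW total, 2.46 mW core dynamic, 80.22 mW core static); it assumes that total thermal power is the sum of the core dynamic, core static, and I/O components, so the I/O figure is derived here rather than read from the report.

```python
total_mw = 197.94        # total thermal power dissipation
core_dynamic_mw = 2.46   # core dynamic thermal power
core_static_mw = 80.22   # core static thermal power

# Assumption: total = core dynamic + core static + I/O
io_mw = total_mw - core_dynamic_mw - core_static_mw

breakdown = {
    "core dynamic": core_dynamic_mw,
    "core static": core_static_mw,
    "I/O": io_mw,
}
for name, mw in breakdown.items():
    print(f"{name:12s}: {mw:7.2f} mW ({100.0 * mw / total_mw:.1f}%)")
```

The derived I/O component comes to about 115 mW, which matches the roughly 58% share noted above.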
⮚ Summary of Timing, Performance and Power
o The circuit meets timing with a wide safety margin, satisfying all
setup and hold requirements and ensuring that data is captured correctly
when synchronized with the 20 MHz clock.
o The analysis results show that the circuit has a maximum frequency
(Fmax) of 147.25 MHz; that is, the circuit can run about 7 times faster
than the current frequency while still meeting timing.
This demonstrates that the circuit is capable of processing very quickly and
can be upgraded to operate at higher frequencies, depending on
application requirements.
o Total power consumption is 197.94 mW, with low dynamic power (2.46
mW) and acceptable static power (80.22 mW).
3.4 Video test on board DE2

https://drive.google.com/file/d/1n2lrbKzJgzw6Hq-ncrHzkX8b9XydC_yz/view?usp=sharing

Test function on board:


No.  SW[11:8] (opcode)  Operation  A (SW[3:0])  B (SW[7:4])  Result         Notes
1    0000               A+B        0011 (3)     0010 (2)     00000101 (5)   ADD
2    0001               A-B        0101 (5)     0011 (3)     00000010 (2)   SUB
3    0010               A&B        1100 (C)     1010 (A)     00001000 (8)   AND
4    0011               A|B        1100 (C)     1010 (A)     00001110 (E)   OR
5    0100               A^B        1111 (F)     0101 (5)     00001010 (A)   XOR
6    0101               ~A         0111 (7)     XXXX         1000 (8)       NOT
7    0110               A << 1     0101 (5)     XXXX         00001010 (10)  Shift Left
8    0111               A >> 1     1000 (8)     XXXX         00000100 (4)   Shift Right
9    1000               A*B        0011 (3)     0100 (4)     00001100 (12)  Multi


Table 3. 1 Test Function
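
The switch assignment in the table header (opcode on SW[11:8], B on SW[7:4], A on SW[3:0]) can be modeled as a simple bit-field decode. The following Python sketch is illustrative only: the names `decode_switches` and `alu` are hypothetical, the 12 switches are packed into one integer for convenience, and results are masked to 8 bits to match the 8-bit result column.

```python
def decode_switches(sw):
    """Split a 12-bit switch word into (opcode, a, b)."""
    a = sw & 0xF              # SW[3:0]
    b = (sw >> 4) & 0xF       # SW[7:4]
    opcode = (sw >> 8) & 0xF  # SW[11:8]
    return opcode, a, b


def alu(opcode, a, b):
    """Return the 8-bit result for the opcodes listed in the table."""
    operations = {
        0b0000: lambda: a + b,
        0b0001: lambda: a - b,
        0b0010: lambda: a & b,
        0b0011: lambda: a | b,
        0b0100: lambda: a ^ b,
        0b0101: lambda: ~a,
        0b0110: lambda: a << 1,
        0b0111: lambda: a >> 1,
        0b1000: lambda: a * b,
    }
    return operations[opcode]() & 0xFF  # assumption: 8-bit result register


# Row 1 of the table: opcode 0000, B = 0010, A = 0011
opcode, a, b = decode_switches(0b0000_0010_0011)
print(opcode, a, b, format(alu(opcode, a, b), "08b"))

# Row 9 of the table: opcode 1000, B = 0100, A = 0011
opcode, a, b = decode_switches(0b1000_0100_0011)
print(opcode, a, b, format(alu(opcode, a, b), "08b"))
```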


CONCLUSION

Results Achieved
● The project "Arithmetic Logic Unit (ALU)" has completed the initial
objectives, including researching, designing and deploying a simple ALU
model on the DE2 FPGA platform. The system is successfully built with the
ability to perform arithmetic (addition, subtraction) and logic (AND, OR,
XOR, NOR), as well as the functions of comparison (A == B, A > B) and bit
translation (shift left, shift right), meeting the basic requirements of an ALU
in computer architecture.
● The design was written in Verilog, compiled with Quartus II, and simulated
in ModelSim, where the waveforms confirmed that the operations produce the
expected results. On the DE2 board, data is entered with the switches, the
result is stored with a push button (KEY), and the values are shown on the
7-segment LEDs, demonstrating that the design is feasible in practice.
● Besides processing binary data, the system provides status flags (zero and
carry) that can support control-flow decisions, which makes it suitable for
teaching and research in digital logic. The implementation work also
strengthened the team's skills in combinational circuit design, HDL
programming, and hardware resource optimization.
● Because it is implemented on an FPGA and is easy to extend, this ALU model
has the potential to be integrated into larger microprocessor or
microcontroller systems and to serve as a foundation for future hardware
development projects.
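
A common way to derive zero, carry, sign and overflow flags for 4-bit addition and subtraction is sketched below. This is a minimal illustration under stated assumptions: the module alu_flags_4bit and its ports are hypothetical and are not taken from the project code.

```verilog
// Hypothetical helper (not part of the project code): flags for 4-bit add/subtract.
module alu_flags_4bit (
    input  [3:0] a,
    input  [3:0] b,
    input        sub,       // 0: a + b, 1: a - b
    output [3:0] result,
    output       carry,     // carry out of bit 3 (for subtraction, 1 means no borrow)
    output       zero,
    output       sign,
    output       overflow   // signed (two's complement) overflow
);
    // Two's complement subtraction: a - b = a + ~b + 1.
    wire [3:0] b_eff = sub ? ~b : b;
    wire [4:0] sum   = {1'b0, a} + {1'b0, b_eff} + {4'b0000, sub};

    assign result   = sum[3:0];
    assign carry    = sum[4];
    assign zero     = (sum[3:0] == 4'd0);
    assign sign     = sum[3];
    // Overflow: both effective operands share a sign and the result sign differs.
    assign overflow = (a[3] == b_eff[3]) && (sum[3] != a[3]);
endmodule
```

For example, a = 0111 and b = 0001 with sub = 0 give result = 1000 with overflow = 1, sign = 1, carry = 0 and zero = 0, because 7 + 1 does not fit in a signed 4-bit value.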
Research Directions

● In the future, the project can be improved by expanding the ALU's set of
arithmetic operations, for example by adding division and a dedicated
multiplier, as well as by supporting floating-point numbers according to
the IEEE 754 standard to enhance scientific computing capabilities. This
requires further research into efficient multiplication algorithms such as
Booth encoding or the Wallace tree.
● An important research area is to optimize ALU performance through
pipelining or parallelism, in order to reduce latency and increase throughput,
especially when deployed on newer FPGAs such as Cyclone IV or Stratix.
● Adding a user interface (UI), built with QML or as a web-based monitoring
application, can improve interactivity and enable remote monitoring and
adjustment of the ALU. This is especially useful in embedded or IoT
systems.
● In addition, research on reducing power consumption and optimizing logic
resources on FPGAs will make ALUs more suitable for mobile devices or
applications that require high energy efficiency. Integration with advanced
simulation tools or AI for design automation is also a potential direction.
● Finally, applying the ALU to real-world projects such as digital signal
processing (DSP), encryption, or artificial intelligence will expand the
scope of application and the value of the project in industrial and
research settings.
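
The pipelining direction mentioned above can be illustrated with registers placed before and after the combinational ALU core. The sketch below is hypothetical: the module alu_pipelined, its ports and the reduced opcode set are assumptions for illustration, not the project's design. For a 4-bit ALU the speed gain is small; the point is the structure, which accepts a new operation on every clock cycle.

```verilog
// Hypothetical two-stage pipelined ALU sketch (not the project's design).
module alu_pipelined (
    input            clk,
    input      [3:0] a,
    input      [3:0] b,
    input      [3:0] opcode,
    output reg [7:0] result   // updated on the second rising edge after the inputs change
);
    // Stage 1: register the operands and the opcode.
    reg [3:0] a_q, b_q, op_q;
    always @(posedge clk) begin
        a_q  <= a;
        b_q  <= b;
        op_q <= opcode;
    end

    // Combinational core working on the registered values.
    reg [7:0] alu_comb;
    always @(*) begin
        case (op_q)
            4'b0000: alu_comb = a_q + b_q;
            4'b0001: alu_comb = a_q - b_q;
            4'b1000: alu_comb = a_q * b_q;
            default: alu_comb = 8'd0;
        endcase
    end

    // Stage 2: register the result.
    always @(posedge clk)
        result <= alu_comb;
endmodule
```

In a larger design the combinational core itself would be split across stages, for example by registering partial products inside the multiplier, so that the longest path between two registers becomes shorter.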
APPENDIX

Code: alu_4bit

module alu_4bit (
    input  [12:0] SW,
    input  [3:0]  KEY,
    input         clk,      // External clock signal
    output [7:0]  LEDR,
    output [8:0]  LEDG,
    output [0:6]  HEX7, HEX6,
    output [0:6]  HEX5, HEX4,
    output [0:6]  HEX3, HEX2, HEX1, HEX0
);

    // === Input signals ===
    wire [3:0] A_switch = SW[3:0];
    wire [3:0] B        = SW[7:4];
    wire [3:0] opcode   = SW[11:8];
    wire       use_mem  = SW[12];

    // === Register that stores the result ===
    reg [7:0] result_reg = 8'd0;

    // === Select operand A: stored result (low nibble) or switches ===
    wire [3:0] A_selected = use_mem ? result_reg[3:0] : A_switch;

    // === ALU ===
    reg [7:0] alu_out;
    always @(*) begin
        case (opcode)
            4'b0000: alu_out = A_selected + B;
            4'b0001: alu_out = A_selected - B;
            4'b0010: alu_out = A_selected & B;
            4'b0011: alu_out = A_selected | B;
            4'b0100: alu_out = A_selected ^ B;
            4'b0101: alu_out = ~A_selected;   // inverts the 8-bit zero-extended value
            4'b0110: alu_out = A_selected << 1;
            4'b0111: alu_out = A_selected >> 1;
            4'b1000: alu_out = A_selected * B;
            default: alu_out = 8'd0;
        endcase
    end

    // === Save the result on the falling edge of KEY0 (button press),
    //     synchronized to the clock ===
    reg key0_prev = 1'b1;
    always @(posedge clk) begin
        key0_prev <= KEY[0];
        // Falling-edge detection on KEY[0]
        if (key0_prev & !KEY[0])
            result_reg <= alu_out;
    end

    // === Status flags ===
    // Note: alu_out[7] is set, for example, when A - B wraps below zero or
    // for ~A; the carry out of a 4-bit addition appears in alu_out[4].
    wire carry_flag = alu_out[7];
    wire zero_flag  = (alu_out[3:0] == 4'd0);

    assign LEDR      = alu_out[7:0];
    assign LEDG[0]   = zero_flag;
    assign LEDG[1]   = carry_flag;
    assign LEDG[8:2] = 7'd0;   // unused green LEDs are driven low

    // === Display A and B ===
    hex_ssd H7 (.bin(4'd0),       .seg(HEX7));
    hex_ssd H6 (.bin(A_selected), .seg(HEX6));
    hex_ssd H5 (.bin(4'd0),       .seg(HEX5));
    hex_ssd H4 (.bin(B),          .seg(HEX4));

    // === Display the result in BCD ===
    wire [3:0] tens, ones;
    binary_to_bcd_0to99 bcd (.binary(alu_out[7:0]), .tens(tens), .ones(ones));

    hex_ssd H3 (.bin(4'd0), .seg(HEX3));
    hex_ssd H2 (.bin(4'd0), .seg(HEX2));
    hex_ssd H1 (.bin(tens), .seg(HEX1));
    hex_ssd H0 (.bin(ones), .seg(HEX0));

endmodule

// Converts a binary value to two BCD digits; values of 100 or more show as 99.
module binary_to_bcd_0to99 (
    input      [7:0] binary,
    output reg [3:0] tens,
    output reg [3:0] ones
);
    always @(*) begin
        if (binary < 100) begin
            tens = binary / 10;
            ones = binary % 10;
        end else begin
            tens = 4'd9;
            ones = 4'd9;
        end
    end
endmodule

// Hexadecimal to 7-segment decoder; seg[0] to seg[6] drive segments a to g,
// and a segment lights when its bit is 0 (active low).
module hex_ssd (
    input      [3:0] bin,
    output reg [0:6] seg
);
    always @(*) begin
        case (bin)
            4'h0: seg = 7'b0000001;
            4'h1: seg = 7'b1001111;
            4'h2: seg = 7'b0010010;
            4'h3: seg = 7'b0000110;
            4'h4: seg = 7'b1001100;
            4'h5: seg = 7'b0100100;
            4'h6: seg = 7'b0100000;
            4'h7: seg = 7'b0001111;
            4'h8: seg = 7'b0000000;
            4'h9: seg = 7'b0000100;
            4'hA: seg = 7'b0001000;
            4'hB: seg = 7'b1100000;
            4'hC: seg = 7'b0110001;
            4'hD: seg = 7'b1000010;
            4'hE: seg = 7'b0110000;
            4'hF: seg = 7'b0111000;
            default: seg = 7'b1111111;
        endcase
    end
endmodule

Code: alu_tb

`timescale 1ns / 1ps

module alu_tb;

    // Simulation signals
    reg  [12:0] SW;
    reg  [3:0]  KEY;
    reg         clk;
    wire [7:0]  LEDR;
    wire [8:0]  LEDG;
    wire [0:6]  HEX7, HEX6, HEX5, HEX4, HEX3, HEX2, HEX1, HEX0;

    // Instantiate the ALU
    alu_4bit uut (
        .SW(SW),
        .KEY(KEY),
        .clk(clk),
        .LEDR(LEDR),
        .LEDG(LEDG),
        .HEX7(HEX7), .HEX6(HEX6), .HEX5(HEX5), .HEX4(HEX4),
        .HEX3(HEX3), .HEX2(HEX2), .HEX1(HEX1), .HEX0(HEX0)
    );

    // Clock with a 10 ns period
    always #5 clk = ~clk;

    // Simulate one press of KEY0 (active low)
    task press_key0;
        begin
            KEY[0] = 1;
            #10;
            KEY[0] = 0;
            #10;
            KEY[0] = 1;
            #10;
        end
    endtask

    // Stimulus
    integer i;
    reg [3:0] A, B;

    initial begin
        $display("=== START ALU SIMULATION ===");

        // Initialization
        clk = 0;
        KEY = 4'b1111;
        SW  = 13'd0;

        // Exercise the 9 opcodes
        for (i = 0; i < 9; i = i + 1) begin
            A = i + 2;   // A takes a different value on every iteration
            B = i + 1;   // B takes a different value on every iteration

            SW[3:0]  = A;
            SW[7:4]  = B;
            SW[11:8] = i[3:0];
            SW[12]   = 0;

            #20;
            press_key0;
            #50;

            $display("OP[%0d]: A=%0d, B=%0d, OUT=%0d", i, A, B, LEDR);
            $display("result_reg = %0d", uut.result_reg);
        end

        $display("=== END SIMULATION ===");

        #500;
        $finish;
    end

endmodule
