VLSI Design and Test View of Computer Security
Shih-Lien Lu
School of Innovation and Technology
Warner Pacific University
Portland Oregon, USA
[email protected]
2023 International VLSI Symposium on Technology, Systems and Applications (VLSI-TSA/VLSI-DAT) | 979-8-3503-3416-6/23/$31.00 ©2023 IEEE | DOI: 10.1109/VLSI-TSA/VLSI-DAT57221.2023.10134130
Abstract— Computer security is becoming more and more important as more aspects of our lives rely on information technology. The fundamental building block of information technology is the semiconductor. Semiconductors enable Very Large-Scale Integrated (VLSI) circuits, which power the digital infrastructure. This presentation examines the challenges and opportunities of security from the point of view of VLSI design and technology. We first explain why security in general, and hardware security in particular, is challenging. We present a survey of different aspects of hardware security. We argue that more work needs to be done in building the components that enhance hardware security. Moreover, we look at how design considerations for these components need to differ from regular design.
As an example, we will take a deeper dive into Physically Unclonable Functions (PUFs). PUFs have been gaining attention since the early 2000s and are now an established secure alternative to other key storage methods for many integrated circuits (ICs) such as FPGAs and microcontrollers. There are many works on the design of PUFs. However, the merits of a particular PUF should go beyond the normal Power-Performance-Area (PPA) of a standard digital integrated circuit. This is because the security primitive occupies only a very small portion of the whole chip. Improving a PUF design by some percentage makes little impact on the PPA of the whole chip. As a security primitive, it should also be evaluated on its designated functionality. We summarize some of the evaluation criteria important for PUFs. We also discuss some experience we had with the challenges of SRAM-based PUFs.
Keywords—security, VLSI, design, test, cybersecurity, hardware
I. INTRODUCTION
Computing devices are increasingly internet-connected. Protecting them against unauthorized access (confidentiality), assuring that they provide truthful information (integrity), and guaranteeing reliable access to approved users (availability) are the security goals of any device, system and organization. The hardware of a device is the foundation upon which its software and data security rests. If the hardware is compromised, the security of the entire system can be at risk. For example, a hacker who gains physical access to a device can bypass its software security measures and gain access to sensitive information, such as passwords and personal data. In addition, hardware vulnerabilities can be exploited to install malware, which can then be used to gain remote access to the device and its data.
Hardware security refers to the measures taken to protect the physical components of a device, such as the processor (ICs), circuit board, memory, storage and connectivity, from unauthorized access, tampering, or theft. Hardware security measures need to encompass many aspects, as attacks on a device may span different levels. For example, the use of secure boot processes ensures that a device boots from system software that is trusted by the original equipment manufacturer. Secured enclosures of systems and/or components prevent threat actors from accessing and tampering with sensitive components. The masked AES design approach is an example of hiding side channels against differential power analysis attacks, which undermine the confidentiality of encryption.
There are many good survey papers on different topics of hardware security measures. Instead of providing comprehensive coverage of these areas, we will focus on outlining the differences between the design and test aspects of a hardware security component, the integrated circuit. We argue that the current evaluation merits of power, performance and area (PPA) should not be the only criteria. Security properties need to be included to reflect the main purpose of the design. We also maintain that design margins for an IC's security components need to be considered differently from the standard design margins used currently. We will examine the design of a specific sub-component, the physically unclonable function (PUF), in detail to illustrate our position.
II. CHALLENGE OF SECURITY
It is difficult, if not impossible, to ensure that a design is secure. Not only does the specification of the product need to be correct, that is, free of bugs; one must also ensure that no additional features beyond the specification are included in the implementation. In other words, there should be no extra features purposely or inadvertently incorporated into the final implementation. To verify the functionality of a design and implementation, one just needs to make sure the final product meets all the requirements in the architectural specification. However, to assure security one must assure that the implemented design is exactly the specification, no more and no less. Security vulnerabilities can also be introduced when security features are incorporated and implemented. Moreover, goals for security may conflict with other goals such as reliability [1]. Reliability assumes things may go wrong, while security needs to assume there will be attacks to make things go wrong. As with software and network security, the following factors contribute to the hardware security challenge.
A. Complexity
Security involves protecting not just individual chips or devices but the entire systems they are part of. Not only is each device growing in complexity, but the complexity is further extended by the connectivity of devices in a system and by the interaction of software and hardware. With each additional device in a system, the possible combinations increase, as devices that are in the same system or are otherwise interconnected can interact with each other. Moreover, the addition of security features to the system can further complicate matters, as these security elements may introduce weaknesses. There are also interactions between software and hardware that are difficult to detect from the hardware side or the software side alone.
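To make the growth in possible combinations concrete, the following minimal Python sketch counts interactions under a simplifying assumption of ours (it is not a model taken from the cited literature): any two devices may interact, and any group of two or more devices may interact jointly. The function names are hypothetical and chosen only for illustration.

```python
from math import comb


def pairwise_links(n: int) -> int:
    """Number of distinct device pairs among n interconnected devices."""
    return comb(n, 2)


def interacting_groups(n: int) -> int:
    """Number of device subsets with at least two members: 2^n - n - 1."""
    return 2 ** n - n - 1


# Each added device increases both counts; the group count roughly doubles.
for n in (2, 4, 8, 16):
    print(f"devices={n:2d}  pairs={pairwise_links(n):4d}  groups={interacting_groups(n)}")
```

Under this assumption the number of pairs grows quadratically and the number of groups grows exponentially, which is one way to read the statement that exhaustive checking of device interactions quickly becomes impractical.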
Authorized licensed use limited to: HKBK College of Engineering. Downloaded on December 03,2024 at 07:46:19 UTC from IEEE Xplore. Restrictions apply.
B. Evolving Threat
As technology evolves, so do the methods used by attackers to breach systems. Recently, it was reported that ChatGPT [2] was used to generate threats [3]. This means that security measures must continually be updated and improved to counter new threats that can be generated by both AI and humans.
C. Human Factor
People are often the weakest link in security. Human behavior is unpredictable and difficult to control, which makes it a common target for threat actors looking to exploit vulnerabilities in hardware security. Attack points can be due to human ignorance of security practices, careless behavior, or intentional malfeasance.
D. Global Threat Landscape
Security threats can come from anywhere in the world, making it difficult to protect against all possible attacks. Global supply chain practices also complicate the security mitigation methods.
E. Resource Constraints
Implementing strong security measures can sometimes conflict with the need for ease of use and accessibility. In many cases, people are not willing to take security measures because of the inconvenience. Moreover, employing and maintaining strong security measures can be resource-intensive, and many people are not willing to pay for the overhead. Organizations may not have the necessary resources to fully enforce security measures.
III. HARDWARE SECURITY
Many good recent papers, including [5][6][7][8][9], have done a great job of summarizing the state of hardware security in different respects. We attempt to list them in the way we find helpful for understanding the issues related to the position this presentation is trying to argue.
A. Types of Attacks
The objectives of threat actors are to violate the confidentiality of information, erode the reliability of data, and obstruct access to it. They achieve these goals through different means such as reverse engineering, physical tampering with devices, information gathering through side channels, insertion of rogue components throughout the supply chain, and substitution of firmware and software. It is usual practice to identify the attacks and propose actions to counter the identified attacks.
B. Counter Measures
Actions to defeat attacks fall into a few categories.
Attack prevention is the first line of defense. To avoid back-door, hardware trojan and malicious component insertions, the simplest way is to ensure the supply chain is trustworthy. Devices can be hardened at the design phase to avoid hardware attacks. An example method is called logic locking. An alternative method for implementing security functions is through the use of a finite state machine or a specialized circuit. This approach sacrifices the flexibility of a Turing machine but offers the advantage of significantly reducing the possibility of unintended execution. A specialized circuit can handle complex logical functions, but it differs from a standard CPU in that it does not rely on sequential processing and random access to the central memory or storage. Implementing the specialized circuits in FPGAs is a practical way to prevent hardware attacks and has been adopted by designs that need high security.
Detection is another key category of techniques in countering hardware attacks. If an attack cannot be prevented, it is important to detect when an attack has been initiated so that proper action can be taken to stop it or alert the security system. This involves monitoring the system for signs of an attack, such as abnormal or suspicious behavior from a device, or parameters that fall outside of their normal range (such as high or low temperature, out-of-bound voltage, etc.). For instance, hardware sensors like temperature sensors can be used to detect tampering by detecting changes in temperature or other physical parameters that may indicate an attack is underway. Detection can also be done periodically, for example through self-testing and verification.
Obfuscation is another technique used to conceal the design from hardware attacks. This can be achieved at different abstraction levels of a system. For example, much work has been done on measures to hide the power supply signature to thwart a side-channel attack called differential power analysis (DPA). Obfuscation can also be achieved by setting up decoys like honeypots that can distract attackers and prevent them from accessing sensitive information. A recent novel approach is called Morpheus [9]. Morpheus is a security system designed to protect against side-channel attacks on cryptographic algorithms. The principle behind Morpheus is the concept of masking, which involves obscuring sensitive information so that it cannot be extracted by attackers. In the context of side-channel attacks, Morpheus uses dynamic software-based masking to obscure the power consumption of cryptographic operations, making it more difficult for attackers to extract secret keys through power analysis. Morpheus works by constantly rearranging the operations of a cryptographic algorithm, so that attackers cannot use information about the power consumption of a single operation to determine the secret key.
Of course, the last measure is recovery. In hardware, recovery may just mean resetting the system or sending an alarm signal.
C. Electronic Design Automation Tools
The role of Electronic Design Automation (EDA) tools in hardware security is a widely discussed topic in academic literature. Recently, special sessions have been added to conferences as security becomes a more important topic. A paper [10] describes current hardware security verification practices. Tools can be used to ensure the security of hardware systems by facilitating the design and implementation of secure systems, preventing hardware attacks, and detecting and mitigating attacks that may occur. Despite the promising studies, there are still areas that require further attention, such as the effective compilation of assumptions and constraints for security schemes, the modeling and evaluation of security-relevant metrics, and the automated synthesis of various countermeasures without causing negative cross-effects.
D. Foundational IPs for Hardware Security
Since security is a very large field, we will focus on a few types of hardware security mitigation solutions. We are particularly interested in a class of IPs (intellectual
properties) that are building blocks for detection or for generating bits for the encryption part of hardware security. One type of IP falls in the category of sensors. The other type is used to generate random bits that can be used for encryption and authentication. We will examine these two particular types of IPs and discuss their design criteria.
IV. DESIGN CRITERIA
There are several design criteria that are considered when designing VLSI circuits. The first criterion is performance. The performance of a VLSI circuit is a crucial factor in determining its overall success. Performance is usually measured by frequency and by the time needed to complete tasks. The second criterion is power consumption. It determines the overall efficiency and, for example, the battery life of a portable device. The physical size (area) of a VLSI circuit is another important consideration, as smaller circuits are typically more cost-effective and easier to integrate into other systems. These three factors are commonly referred to as the PPA (Performance, Power and Area) of a circuit. In addition to PPA, there are other criteria such as reliability, manufacturability, testability and scalability. Recently, security has been added as a criterion.
A. Design Integration
One important criterion for designing security building blocks is the integration of the security element with the chip it is to protect. First, when a security element is off-chip, a connection must be placed between the security element and the processing unit to allow communication. This opens up a possible attack surface. Protecting this communication channel is challenging. Second, if the security element is integrated with the processing unit, then there should be no extra processing steps required to build the security element. Security elements occupy a small percentage of the total area of a chip. Adding any extra processing steps will increase the cost of protection. A simple example illustrates this. Suppose a chip takes 20 masks to process. Adding an extra mask step adds 5% to the cost, while the security element may take up less than 1% of the total area. A common category of elements for security purposes is sensors. These may include temperature, voltage, frequency and light sensors. These sensors are used to detect any tampering with a device. They differ from normal sensors in that they mostly need only to detect out-of-range operation rather than provide accurate or linear readings. An example is described in [11].
B. Design Margin for Integrated Circuits
Design margin in VLSI refers to the extra headroom that is intentionally built into a digital circuit design to accommodate variations in manufacturing processes, temperature, power supply voltage, and other operating conditions. It ensures that the circuit will still function correctly even if the actual operating conditions deviate from the ideal specifications. The larger the design margin, the more robust the circuit will be against these variations, but it also increases the cost and reduces the efficiency of the design. Design margins are determined by trade-offs between device PPA and reliability requirements. The goal of a design margin is to provide sufficient margins for manufacturing, environmental, and aging-related parameters to ensure that the circuit meets its performance and reliability specifications over its intended lifetime. The amount of design margin in VLSI circuits varies depending on the specific requirements of the system, the application, and the manufacturing process. A typical range is between 5% and 20%. A higher margin provides more headroom for the circuit to operate within its specified limits, but also increases the cost and size of the circuit. On the other hand, a lower margin reduces the cost and size, but also increases the risk of the circuit failing to operate within its specified operating range. Margins are needed because of variations.
(1) Static variations are caused by inherent manufacturing processes and the underlying physical limitations of semiconductors. There are two types: random and systematic. Figure 1 shows that process variations can occur from lot to lot, across the same wafer and on the same die [12]. These variations can occur at the transistor level, the interconnect level, and the circuit level. They lead to mismatches, skews, and other non-idealities in VLSI and may cause a design to have various performance, power consumption, and reliability issues.
Figure 1. Variations of two wafers of the design
Figure 2. Impact of Static Variations
Figure 2 illustrates that a circuit's frequency can vary by more than 30% and its leakage power can span a 5X range. The impact of these variations can be mitigated through process control, design for variability, and statistical analysis. Design for variability involves taking into account the expected range of variation in device performance and incorporating margins in the design to ensure reliable
operation. Statistical analysis techniques can be used to quantify and mitigate the impact of systematic variations on the overall performance and reliability of the system.
(2) Dynamic variations in circuit design refer to the fluctuations that occur during normal circuit operation, such as changes in power supply voltage, temperature, and load conditions. These variations can have a significant impact on the performance and reliability of VLSI circuits. Dynamic variations can cause the circuit parameters to deviate from their nominal values, leading to changes in the operating conditions of the circuit. For example, a change in temperature can cause the threshold voltage of transistors to shift, affecting the circuit's gain and power consumption. Similarly, a change in power supply voltage can affect the circuit's output levels and cause logical errors. To account for these dynamic variations, designers typically leave a design margin in the circuit, which allows for a certain degree of tolerance to these changes. By leaving this margin, the circuit can still operate correctly despite the dynamic variations, and the risk of malfunctions or failures can be reduced.
The amount of design margin needed will depend on the specific requirements and constraints of the application, such as the maximum allowable deviation from nominal values and the required operating lifetime of the circuit. The decision to leave a certain amount of design margin is a trade-off between performance, power, cost, and reliability. We argue that security should also be considered, especially for security building blocks.
C. Design Margin of Security
How do we formulate design margins for security purposes? We believe that this is an important research area which has not been addressed in detail.
V. EXAMPLE SECURITY COMPONENT
We now examine a particular example as an illustration of our position so far.
A. Physically Unclonable Functions
A physically unclonable function (PUF) is unclonable in every sense. It has gained much attention recently. It harvests the static variations that inherently exist in the process to generate a unique signature. Many papers on the design of PUFs have been published [13]. Reliability, energy, and area have been the focus when comparing different designs [14]. As pointed out previously, reliability, area and energy tend not to be the most important factors. It is more important to assure the security aspect of the design and implementation.
B. Impact of Variations on PUF
PUFs mainly harvest the random static variations to generate signatures. However, systematic variations have been observed, and they become more prominent with technology advancement. They show up within a die, across a wafer and between wafers, as shown in Figure 1. A published paper has demonstrated that the power-up states of SRAM bits are biased within a die [15] (Figure 4). We have observed this bias with advanced FinFET SRAM as well, within a die, across a wafer and between wafers. In fact, there are dice on wafers where the bias is so prominent that almost all bits power up to a particular state. How does the bias affect security? How do we mark a die as functional or not functional in terms of security? Unlike design margins for device functionality, how should we define security design margins?
C. PUF Properties and Design Criteria
Common parameters used to evaluate PUFs include Hamming weight, intra- and inter-Hamming distance, the autocorrelation function and entropy [16]. It is important to use min-entropy [17] for elements such as PUFs and random bit generators. An evaluation process for these elements should be clearly defined. Designers need to work closely with the manufacturers on this defined process. It is best that an on-die testing block is employed as part of this process.
VI. SUMMARY
Cybersecurity is becoming more and more important. At the root of cybersecurity is hardware security. Much emphasis has been placed on the design novelty and efficiency of security elements. We argue that more effort should be spent on understanding the design criteria for the targeted security functions. It is critical to understand how to specify security margins and how to quantify them precisely, just like the other design margins we are accustomed to.
References
[1] H. Adkins et al., Building Secure and Reliable Systems, O'Reilly Media (online).
[2] https://openai.com/blog/chatgpt
[3] https://research.checkpoint.com/2023/opwnai-cybercriminals-starting-to-use-chatgpt/
[4] https://static.googleusercontent.com/media/sre.google/en//static/pdf/building_secure_and_reliable_systems.pdf
[5] W. Hu, C.-H. Chang, A. Sengupta, S. Bhunia, R. Kastner, and H. Li, "An Overview of Hardware Security and Trust: Threats, Countermeasures, and Design Tools," IEEE Trans. on CAD of Integrated Circuits and Systems, vol. 40, no. 6, 2021.
[6] J. Knechtel et al., "Towards Secure Composition of Integrated Circuits and Electronic Systems: On the Role of EDA," 2020 Design, Automation & Test in Europe Conference & Exhibition, Grenoble, France, 2020, pp. 508-513.
[7] A. Ehret, K. Gettings, B. R. Jordan and M. A. Kinsy, "A Survey on Hardware Security Techniques Targeting Low-Power SoC Designs," 2019 IEEE High Performance Extreme Computing Conference (HPEC), Waltham, MA, USA, 2019, pp. 1-8.
[8] K. Yang, D. Blaauw and D. Sylvester, "Hardware Designs for Security in Ultra-Low-Power IoT Systems: An Overview and Survey," IEEE Micro, vol. 37, no. 6, pp. 72-89, November/December 2017.
[9] M. Gallagher et al., "Morpheus: A Vulnerability-Tolerant Secure Architecture Based on Ensembles of Moving Target Defenses with Churn," Proceedings of the 24th International Conference on Architectural Support for Programming Languages and Operating Systems, April 2019, pp. 469–484.
[10] R. Kastner et al., "Automating hardware security property generation: invited," DAC '22: Proceedings of the 59th ACM/IEEE Design Automation Conference, July 2022, pp. 1384–1387.
[11] S. L. Lu, Capacitor-based temperature-sensing device, US Patent No. 11009404.
[12] T. M. Mak, private communication.
[13] A. Shamsoshoara et al., "A survey on physical unclonable function (PUF)-based security solutions for Internet of Things," Computer Networks, vol. 183, 24 December 2020.
[14] https://www.green-ic.org/physically-unclonable-function-database-pufdb/
[15] H. Liu et al., "Methods for Estimating the Convergence of Inter-Chip Min-Entropy of SRAM PUFs," IEEE Trans. on Circuits and Systems-I, vol. 65, no. 2, Feb. 2018.
[16] C. Gu, W. Liu, N. Hanley, R. Hesselbarth, and M. O'Neill, "A Theoretical Model to Link Uniqueness and Min-Entropy for PUF Evaluations," IEEE Transactions on Computers, vol. 68, no. 2, p. 287, 2018.
[17] https://arxiv.org/pdf/1209.3744.pdf (min entropy)
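As a supplementary illustration of the PUF evaluation parameters named in Section V.C, the following minimal Python sketch computes Hamming weight, fractional Hamming distance (usable for both intra-chip and inter-chip comparisons) and a simple per-bit min-entropy estimate. It is a sketch under stated assumptions, not the evaluation procedure of [15], [16] or [17]: responses are assumed to be equal-length lists of 0/1 power-up bits, bit positions are treated as independent, and all function names are hypothetical.

```python
from itertools import combinations
from math import log2


def hamming_weight(bits):
    """Fraction of 1s in one response; near 0.5 for an unbiased PUF."""
    return sum(bits) / len(bits)


def fractional_hd(a, b):
    """Fraction of bit positions in which two equal-length responses differ."""
    if len(a) != len(b):
        raise ValueError("responses must have the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def mean_pairwise_hd(responses):
    """Mean fractional Hamming distance over all pairs of responses.

    Fed with repeated readouts of one chip it gives the intra-Hamming
    distance (ideally near 0); fed with one readout per chip it gives the
    inter-Hamming distance (ideally near 0.5).
    """
    pairs = list(combinations(responses, 2))
    return sum(fractional_hd(a, b) for a, b in pairs) / len(pairs)


def min_entropy_per_bit(chips):
    """Average over bit positions of -log2(max(p1, 1 - p1)).

    p1 is the fraction of chips that power up to 1 at that position.
    Assumption: bit positions are independent (a simplification).
    """
    n_chips, n_bits = len(chips), len(chips[0])
    total = 0.0
    for i in range(n_bits):
        p1 = sum(chip[i] for chip in chips) / n_chips
        total += -log2(max(p1, 1 - p1))
    return total / n_bits


# Toy data: a strongly biased population, like the dice described in V.B
biased = [[1, 1, 1, 1], [1, 1, 1, 1], [1, 1, 1, 1]]
print(hamming_weight(biased[0]), mean_pairwise_hd(biased), min_entropy_per_bit(biased))
```

On the fully biased toy population every chip produces the same response, so the inter-chip distance and the min-entropy estimate both collapse to zero even though each individual die still passes a purely functional test. This is the kind of gap between functional margin and security margin that Section V.B asks about.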