SoC Fuzzing Intro
Abstract—The ever-increasing usage and application of system-on-chips (SoCs) has driven the tremendous modernization of
these architectures. A modern SoC design, with its numerous complex and heterogeneous intellectual properties (IPs) and its
privacy-preserving requirements, contains a wide variety of highly sensitive assets. These assets must be protected from any
unauthorized access and against a diverse set of attacks. Attacks aimed at obtaining such assets can be mounted from different
sources, including malicious IPs, malicious or vulnerable firmware/software, unreliable and insecure interconnection and communication
protocols, and side-channel vulnerabilities exposed through power/performance profiles. Any unauthorized access to such highly sensitive
assets may result in either a breach of company secrets for original equipment manufacturers (OEMs) or identity theft for the end-user. Unlike
the enormous advances in functional testing and verification of SoC architectures, security verification is still in its early stages, and
comparatively little effort has been devoted to it by academia and industry. Unfortunately, there exists a huge gap between the modernization of
SoC architectures and their security verification approaches. Given the lack of automated SoC security verification in modern electronic
design automation (EDA) tools, this paper provides a comprehensive overview of the requirements that must be realized as the fundamentals
of the SoC security verification process. By reviewing these requirements, including the creation of a unified language for
SoC security verification, the definition of security policies, the formulation of security verification, etc., we put forward the
use of self-refinement techniques, such as fuzz, penetration, and AI testing, for security verification purposes. We evaluate
the challenges and possible resolutions, and we outline potential approaches for realizing SoC security verification via
these self-refinement techniques.
✦
1 INTRODUCTION
[Fig. 1: IP Re-use vs. Threats vs. Security-related Spending. (Plot of the number of CWE hardware vulnerabilities per year, 2001-2024, showing identified/investigated vulnerabilities and those projected to be identified.)]

[Fig. 2: An SoC Design with the Integration of a wide variety of IPs (e.g., Crypto, TRNG, DSP, DMA, DRAM, a full-duplex channel, and a peripheral bus).]

vulnerabilities may be introduced unintentionally by a designer during the transformation of a design from specification to implementation. This is because the designers at the design house spend most of their design, implementation, and integration efforts on meeting area, power, and performance criteria. For verification, they mostly focus on functional correctness [6]. Additionally, flaws in computer-aided design (CAD) tools can unintentionally result in the emergence of additional vulnerabilities in SoCs. Moreover, rogue insiders in the design houses can intentionally perform malicious design modifications that create a backdoor in the design. An attacker can utilize the backdoor to leak critical information, alter functionality, and control the system.

It should be noted that many of the existing security countermeasures introduced in the literature or widely used in industry, such as hardware obfuscation, watermarking, metering, camouflaging, etc. [7], [8], [9], [10], [11], [12], have nothing to do with such SoC-level vulnerabilities: many of these security vulnerabilities originate precisely from unexpected interactions between layers and components, and traditional techniques either fail to catch these cross-layer problems or do not scale to real-world designs. Therefore, apart from the existing hardware security countermeasures that might be applied to the design, security verification must be carried out meticulously, particularly for SoC-level vulnerabilities.

Considering that the different IP components of an SoC have their own highly sensitive security assets that should be accessed/exploited by some other components, most system design specifications include a (limited) number of security policies that define access constraints and permissions to these assets at different phases during system execution. As SoC complexity continues to grow and time-to-market shrinks, it has become even more difficult to verify these security policies and to ensure that the design under investigation does not violate them; a violation can lead to the emergence of vulnerabilities or backdoors for attackers. It might not be beyond the possibility that the definition of a set of well-established security properties can help verification engineers alleviate the problem. However, some of these policies relate to different phases of the design flow, from specification to implementation, and they might involve the design house, different IP vendors, and the integration team. Additionally, the realization of these policies may require a multi-layer implementation through a combination of hardware, firmware, and software in the SoC. Moreover, the definition of these policies might undergo significant changes or refinements across the design flow, which invalidates many of them.

The definition of the security policies and their expansion is also entirely dependent on pre-defined and pre-assessed scenarios and test cases that enumerate prohibited and illegal actions. As SoCs become larger and more complex, with larger sets of IPs and components integrated into them, the process of detecting, gathering, and building all the test cases that lead to the definition of security policies becomes ever harder. Hence, unknown security vulnerabilities will still appear in SoC transactions, leading to breaches of confidentiality, integrity, or authenticity, even when verification has been properly performed against the known security policies. Additionally, due to the lack of reciprocal trust between the different entities involved in SoC design and implementation, or depending on the time of security verification (i.e., pre- or post-verification), the access of the security verification engine/tool to the system will vary, from full access with knowledge of all internal operations, wires, registers, etc., to no access to the internal knowledge of the system. The access differs case by case; however, in all cases, it will affect the outcome of security verification in terms of security policy conformance, performance, the complexity of the security verification flow, etc.

A literature review on the software testing domain reveals that the procedure of software testing has been suffer-
FUTURE MICROELECTRONICS SECURITY RESEARCH SERIES 3
ing for almost two decades from the same challenges and difficulties. Software testing is the main way of verifying software, from specification to release, against the defined requirements, and it accounts for about half of the cost and time of development [13], [14]. Considering the main objective of software testing and verification, which is systematically evaluating the software in carefully controlled circumstances, the scope of software testing is heavily dependent on knowledge of and access to the internals, which categorizes software testing into three main breeds, namely (i) white box, with full knowledge of the code; (ii) black box, with no knowledge of the internal structure of the code; and (iii) gray box, with limited knowledge of the internal structure. In this case, the automation of verification depends on the breadth of testing. Recently, with the emergence of automated and semi-automated techniques in software testing, like the usage of self-guidance and self-refinement concepts (e.g., artificial intelligence, machine learning, and mutation-based approaches like fuzzing [15]), the testing procedure has evolved tremendously in this domain. These techniques are highly successful in detecting software vulnerabilities since they are automated, scalable to large codebases, do not require knowledge of the underlying system, and are highly efficient in detecting many security vulnerabilities.

[Fig. 3: Property-Driven Functional Verification. (Flow: from the design spec, (non)-security property statements are declared and assertions are created; formal checking of the design and integration (HW implementation) either passes (done) or fails, leading to identifying the source of vulnerability and building countermeasures (fixing the design).)]

The utilization of existing techniques for the security verification of modern SoCs is mainly limited to expert review, and they do not provide acceptable scalability [16]. A top view of such techniques is demonstrated in Fig. 3. In such solutions, the conventional formal verification techniques, in spite of significant recent advances in automated formal technologies (functional verification) such as satisfiability (SAT) checking and satisfiability modulo theories (SMT), which are also widely used for the evaluation of IP protection techniques (hardware obfuscation) [17], [18], [19], cannot promise the desired scalability for the security verification of modern SoCs, and the gap between the scale and complexity of modern SoC designs and those that can be handled by formal verification techniques has continued to grow. Similarly, symbolic execution and model checking [20] suffer from scalability issues for verification at the SoC level. As an instance, the work in [3], [21] proposed a technique, called AVFSM, to analyze the vulnerabilities of FSMs. However, apart from FSMs, an SoC contains other modules (exception handlers, test and debug infrastructures, random number generators, etc.) which must be inspected during security verification. Authors in [22] provided a method to write the security-critical properties of a processor. They found that the quality of the security properties is only as good as the developer's knowledge and experience. Moreover, there is a lack of a comprehensive threat model definition, which must be considered while developing security properties. There are some other approaches that have developed security properties and metrics by considering only a small subset of vulnerabilities (e.g., vulnerabilities in hardware crypto-systems [23], side-channel vulnerabilities [24], [25], and abstraction-level limitations like the behavioral model [26]).

Also, since many of the security vulnerabilities in an SoC originate precisely from unexpected interactions between layers and components, identifying novel methodologies is hard because researchers often do not have enough access to all parts of the system, which is particularly true for proprietary hardware micro-architectures. Authors in [27] demonstrate how commercially available tools can fail to detect security-relevant RTL bugs, including some that originate from cross-modular and cross-layer interactions. The work in [28] presented a methodology to infer security-critical properties from known processor errata. However, manufacturers only release errata documents enumerating known bugs; unknown security vulnerabilities can exist in an SoC design that are not listed in available errata documents. Another approach for finding security bugs is information flow tracking (IFT). The authors in [29], [30], [31], [32], [33] utilize IFT and statistical models to map hardware security properties. However, this technique requires design instrumentation and tainting of all the input variables, which requires more computational time and memory resources. Hence, IFT and statistical modeling, which require expert knowledge of the design, become more complex with increasing design complexity. There is an increasing need for methodologies and frameworks for the security verification of modern SoCs that are scalable to large and complex designs, highly automated, effective, and efficient in detecting security-critical vulnerabilities.

Based on the trend of software verification and testing techniques and their efficacy for the evaluation of software specifications and requirements, it is evident that the same trend will potentially happen in the area of SoC security verification. However, there definitely exist numerous limitations and challenges, in terms of concept migration, implementation, assumptions, metrics, and outcomes. Hence, with such a gap, and due to the notable lack of detailed and comprehensive evaluations of SoC security verification, in this paper we examine and re-evaluate the principles and fundamentals of SoC security verification through automated and semi-automated architectures. Moving forward, with the ever-increasing complexity and size of SoCs, and with the contribution of more and more IPs and less and less trustworthiness between components, SoC security verification through a more closed environment, like the gray- and black-box models, will get more attention. Hence, in this paper, with more focus on such models, and by trying to benefit from semi-automated or automated approaches, like AI or ML, fuzz testing, and penetration testing, we provide a comprehensive overview of SoC security verification as follows:

1) We first identify the source of vulnerabilities indi-
cating the necessity of an automated verification framework for SoC security verification.
2) We then define the assumptions for SoC security verification based on different factors, like the designer's desire, followed by the requirements of the framework, such as scalability, high coverage, etc.
3) We examine the possibility of engaging software approaches, with a more specific concentration on self-guided or self-refinement approaches, such as fuzz, penetration, and AI testing, for SoC security verification.
4) We discuss the future research directions and challenges for implementing an automated verification framework that identifies security vulnerabilities based on the self-refinement approaches.

The rest of the paper is organized as follows. Section 2 provides the sources of SoC security vulnerabilities at different stages of an SoC lifecycle. Section 3 reviews the terminologies that are directly and indirectly related to SoC security verification. Then, after reviewing the challenges for SoC security verification in Section 4, we define the main assumptions required to be considered through SoC verification in Section 5, followed by the description of the SoC security verification flow in Section 6. Then, the models for the usage of fuzz testing, penetration testing, and AI testing are covered in Sections 7-9. We elaborate upon possible future research directions and the existing challenges for these self-refinement approaches in Section 10, and we conclude the paper in Section 11.

2 SOC SECURITY: SOURCE OF VULNERABILITIES

Fig. 4 first shows the main steps of a modern IC supply chain, which is plunged into globalization with the involvement of multiple IPs. As also demonstrated in Fig. 4, an SoC design can encounter security vulnerabilities during different stages of its design and life cycle, each arising from a unique source. Beginning from the very early stages of IC design through its fabrication, the following are the main sources of such security vulnerabilities in SoC design and implementation:

(SV1) Inadvertent designer mistakes: The development of multiple IPs/components may be distributed between different design teams as well as third-party IP vendors. This can result in an unclear definition of the interaction between these IPs, unsophisticated exception handling for inter-component communications, limited knowledge about the behavior of neighboring components, (communication) protocol malfunctioning, and a lack of understanding of security problems due to the high complexity of the designs and the variety of assets. Hence, it may result in different forms of vulnerabilities, such as secret information leakage or loss of reliability of the SoC.

(SV2) Rogue employee (insider threat): Unlike (SV1), deliberate malfunctions can be introduced by rogue employee(s), posing significant threats to the security of the whole SoC.

(SV3) Untrusted third-party IP vendors: Similar to (SV2), through violation of the rules/protocols of communication between their IPs and other components, or through unauthorized, unmonitored interactions, third-party IPs can pose the same security issues.

(SV4) EDA optimizations: Almost all efforts in the development and improvement of CAD tools have been directed at optimizing and increasing the efficiency of synthesis, floorplanning, placement, and routing in terms of area, power, throughput, delay, and performance. These CAD tools are not well equipped with an understanding of the security vulnerabilities [21] of integrated circuits (ICs) and can, therefore, introduce additional vulnerabilities into the design. As an instance, CAD tools can potentially, but unintentionally, merge trusted blocks with untrusted ones, or move a security asset to an untrusted observable point, which opens the possibility of different attacks, such as eavesdropping via side-channel analysis.

(SV5) Security-compromising stages of the IC supply chain: For modern SoCs, design-for-test (DFT) and design-for-debug (DFD) infrastructures are designed to increase observability and controllability for post-silicon verification and debug efforts. However, increasing observability and preserving security are two conflicting factors. Test and debug infrastructures may create confidentiality and integrity violations. For example, in [26], scan chains have been exploited to leak security-critical information, and numerous studies over the last decade have evaluated the utilization of DFT/DFD infrastructure for attacking the design [10]. Similar to (SV4), many of these vulnerabilities emerge unintentionally from the accomplishment of different stages of the IC design flow.

(SV6) Impact of hardware Trojan insertion: This case is a derivative of (SV2) and (SV3), in which SoC designs are also prone to many maliciously introduced threats, such as hardware Trojans. These hardware Trojans can be inserted by untrusted entities involved in a number of design and supply-chain stages, including third-party IP (3PIP) vendors and rogue in-house employees, causing sensitive information leakage, denial-of-service, reduction in reliability, etc., in the SoC. Insider threats are particularly dangerous since they have full observability of and access to the whole design and source files. When a chip carrying a Trojan inserted during these stages is deployed into the final design, an attacker can monitor the physical characteristics of the design (such as delays, power consumption, and transient current leakage) to recover secret information.

(SV7) Lack of trustworthiness in EDA tools: Although it is mostly assumed that CAD tools are trusted nodes within the IC design flow, modernizing SoC design infrastructure for compatibility with cloud-based design development, and leveraging fully distributed processes with remote EDA tools, violates this assumption. In such cases, software tools cannot be considered trusted anymore (the current assumption, even for cloud-based systems, is simply that the cloud infrastructure is secure, which is not always correct), and SoC designs are accordingly again prone to numerous malicious threats.

(SV8) Untrusted fabrication site: The consequence of this case is similar to (SV7). In this case, the fab site, as an untrusted entity with full knowledge of the layout of the design, can apply manipulations before fabrication for further exploitation after fabrication.
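To make the DFT/DFD concern in (SV5) concrete, the following toy model sketches how a secret-bearing register placed on a scan chain can be serially shifted out in test mode unless it is cleared on the mode switch. Everything here (the `ScanChain` class, the bit width, the `clear_on_switch` flag) is invented for illustration and does not model any particular design or DFT flow.

```python
# Toy model of the (SV5) scan-chain concern: if a secret-bearing register is
# part of the scan chain and is not cleared on entry to test mode, shifting
# the chain out reveals the secret.

class ScanChain:
    def __init__(self, key_bits):
        self.flops = list(key_bits)   # flip-flops holding a secret key
        self.test_mode = False

    def enter_test_mode(self, clear_on_switch):
        self.test_mode = True
        if clear_on_switch:           # the mitigation: reset on the mode switch
            self.flops = [0] * len(self.flops)

    def shift_out(self):
        """Serially shift the whole chain out, as DFT tooling would."""
        assert self.test_mode
        out = []
        for _ in range(len(self.flops)):
            out.append(self.flops[-1])          # observe the scan output pin
            self.flops = [0] + self.flops[:-1]  # shift the chain by one
        return out

secret = [1, 0, 1, 1, 0, 0, 1, 0]

leaky = ScanChain(secret)
leaky.enter_test_mode(clear_on_switch=False)
print(leaky.shift_out())   # the key comes out, in reversed bit order

safe = ScanChain(secret)
safe.enter_test_mode(clear_on_switch=True)
print(safe.shift_out())    # all zeros: nothing to learn
```

The point of the sketch is only that the leak is a cross-cutting property of the mode switch, not of any single block, which is exactly what makes it hard for block-level functional verification to catch.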
[Fig. 4: The modern IC supply chain: within the design house, the design team takes the design spec through system/RTL design, logic synthesis, physical synthesis, and layout, with IP integration from third-party IP owners/designers; the chip then goes through fabrication (wafer), testing, and packaging to the end-user. Hardware Trojan insertion (of malicious functions, e.g., through 3PIPs or at gate level on the synthesized netlist) can occur along this chain.]
(SV9) Efforts on SoC design optimization: Most designers concentrate on minimizing overhead in terms of area, power, and performance. However, many of these optimizations can introduce security issues, such as sharing memory banks between non-trusting processes in multi-tenanted systems, which was the main enabler of the Spectre and Meltdown attacks [34], [35].

(SV10) Hammering by the end-user: This group of threats can relate to either the logical or the physical attributes and specification of the components integrated into the SoC, such as memory. Similar to the rowhammer attack on DRAMs, logic-level hammering, i.e., repeating different sequences of action(s) around a targeted component, might lead to the revelation of some information; then, by repeating the scenario (sequence of actions), some form(s) of information breach can happen in the SoC.

Table 1 provides a top view of these sources of vulnerabilities and their characteristics. It clearly shows that there is a critical need to verify the security of the SoC at each of these stages and to verify the trustworthiness of the SoC. However, if vulnerabilities reach the post-silicon stage, there is limited flexibility (almost none) for changing or fixing them. Moreover, the cost of fixing the design grows significantly as we advance through the later stages of the design flow. Furthermore, vulnerabilities that reach the manufacturing stage cause revenue loss. Therefore, it is essential to develop efficient security verification approaches that ensure the security and trustworthiness of SoC designs, with more concentration on the pre-silicon stage.

3 SOC SECURITY VERIFICATION: TERMINOLOGY

The main aim of this section is to provide the basics and principles around the concept of SoC security verification. For this purpose, we comprehensively review the main terminologies that are directly related to this trend.

3.1 Security Assets in SoC

For any IP component, firmware, software, etc., involved in and integrated into the SoC, there exists a set of information or data whose leakage can lead to catastrophic consequences with huge financial/reputation loss. This sensitive information or data is known as security-critical values, a.k.a. security assets, which must be protected against any potential form of threat. Any successful retrieval of such information or data in an illegal way might result in trade secret loss for device manufacturers or content providers, identity theft, and even the destruction of human life. These assets are usually known to the designers, and they are defined based on the specifications of the design. As an instance, the encryption/decryption or private keys in cryptographic primitives are assets, and their location and usage are known to the designers throughout the SoC design and implementation [36]. The following gives some insight into the main primitives in an SoC that must be considered security assets:

(SA1) On-device key: (Secret/private key(s) of an encryption algorithm) These assets are stored on the chip, mostly in some form of non-volatile memory. If they are breached, the confidentiality requirement of the device is compromised.

(SA2) Manufacturer firmware: (Low-level program instructions, proprietary firmware, protected states of the controller(s)) These assets may contain intellectual property and system-level configuration values, and compromising them would allow an attacker to counterfeit the device.

(SA3) On-device protected data: (Sensitive user data and meter readings) Leakage of these assets relates primarily to identity theft: an attacker can invade someone's privacy by stealing these assets, or benefit by tampering with them.

(SA4) Device configuration: (Service/resource access configuration) These assets determine which particular services or resources are available to a particular user, and an attacker may want to tamper with them to gain illegal access to the resources.
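As a small illustration of why device configuration (SA4) is an asset, the sketch below treats the configuration as a table of role-to-resource grants and checks a runtime configuration against the "golden" configuration from the specification; any extra grant indicates tampering. The roles, resources, and the `GOLDEN_CONFIG` table are all hypothetical names invented for this example.

```python
# Illustrative sketch of (SA4): device configuration as a security asset.
# The "golden" configuration comes from the specification; a verification
# check can flag any runtime configuration granting accesses never allowed.

GOLDEN_CONFIG = {
    "user":  {"sensor_read"},
    "admin": {"sensor_read", "fw_update", "key_store"},
}

def illegal_grants(runtime_config):
    """Return (role, resource) pairs granted at runtime but absent from spec."""
    bad = []
    for role, resources in runtime_config.items():
        allowed = GOLDEN_CONFIG.get(role, set())
        bad.extend((role, r) for r in sorted(resources - allowed))
    return bad

# An attacker tampers the on-device table to reach the key store as "user".
tampered = {"user": {"sensor_read", "key_store"}, "admin": {"sensor_read"}}
print(illegal_grants(tampered))   # [('user', 'key_store')]
```

A real SoC would hold such a table in fuses, non-volatile memory, or a policy engine rather than a Python dictionary, but the verification question is the same: does the deployed configuration stay within what the specification authorizes?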
TABLE 1: Overview of Sources of SoC Security Vulnerabilities.

(SV1) Inadvertent designer mistakes — Design flow stage: software, firmware, boot loader, register file, cache, register-transfer (RT) level, high level (HLL). Examples: (1) insecure implementation of a controller circuit (FSM) or boot loader; (2) incorrect synchronization or protocol handshaking between IPs (master/slave); (3) incorrect mutual exclusion of write/execute operations leading to illegal access. Threat model: insecure boot mode, inter-IP illegal accesses, information breach, malfunctioning of security operations.

(SV2) Rogue employee at design house (insider threat) — Design flow stage: software, firmware, boot loader, register file, cache, RTL, HLL. Examples: (1) manipulating hardware, firmware, or software to facilitate obtaining the security assets after fabrication, including logic analyzer module insertion, disabling security permissions/policies, etc. Threat model: insecure boot mode, inter-IP illegal accesses, information breach, malfunctioning of security operations.

(SV3) Untrusted third-party IP vendors — Design flow stage: HLL, RTL, gate level. Examples: (1) continuous watching/monitoring of the bus by the IP to obtain information; (2) incorrect protocol handshaking leading to illegal actions (illegal memory writes/reads). Threat model: inter-IP illegal actions, bypassing IP-based (security) checks.

(SV4) EDA optimizations — Design flow stage: logical or physical synthesis flow. Examples: (1) insecure optimization of the design, such as sharing (merging between trusted and untrusted regions), insecure control flow, insecure data flow. Threat model: information breach, inter-IP illegal accesses, flaws in security policy implementation.

(SV5) Security-compromising stages of the IC supply chain — Design flow stage: gate level, design service provider (DFT/DFD). Examples: (1) opening a backdoor for attacks through test/debug infrastructures; (2) reading internal values of the design. Threat model: information breach.

(SV6) Impact of hardware Trojan insertion — Design flow stage: gate level. Examples: (1) manipulating hardware to facilitate obtaining the security assets after fabrication; (2) insertion of a Trojan for malfunctioning. Threat model: information breach, inter-IP privacy violation, malfunctioning.

(SV7) Lack of trustworthiness in EDA tools — Design flow stage: logical and physical synthesis flow. Examples: (SV2) + (SV6). Threat model: (SV2) + (SV6).

(SV8) Untrusted fabrication site — Design flow stage: GDSII at fabrication site. Examples: (SV6). Threat model: (SV6).

(SV9) Efforts on SoC design optimization — Design flow stage: RTL, HLL. Examples: (1) incorrect optimization with open corner cases that leads to security vulnerabilities. Threat model: information breach, inter-IP illegal accesses, insecure protocol implementation.

(SV10) Hammering by the end-user — Design flow stage: post-silicon, over the fabricated chip. Examples: (1) applying continuous, repeating tests on a specific target based on physical or logical effects. Threat model: information breach.
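The hammering scenario of (SV10) can be sketched as a test loop that repeats an action sequence against a target while watching an observable the sequence never touches. The `ToyMemory` model and its fault threshold below are invented for illustration; in a real component the disturbance mechanism would be physical (as in rowhammer) or microarchitectural, not an explicit counter.

```python
# Minimal sketch of the (SV10) "hammering" idea: drive the same action
# sequence many times and watch a neighboring observable for disturbance.
# HAMMER_THRESHOLD stands in for a physical/logical weakness of the part.

HAMMER_THRESHOLD = 10_000

class ToyMemory:
    def __init__(self):
        self.cells = {0: 0, 1: 1}      # cell 1 is the victim, initially 1
        self.activations = 0

    def write(self, addr, value):
        self.cells[addr] = value
        if addr == 0:                   # hammering the aggressor cell...
            self.activations += 1
            if self.activations == HAMMER_THRESHOLD:
                self.cells[1] ^= 1      # ...eventually disturbs its neighbor

def hammer_and_observe(mem, repetitions):
    before = mem.cells[1]
    for _ in range(repetitions):
        mem.write(0, 1)                 # the repeated action sequence
    return before != mem.cells[1]       # did the untouched cell change?

print(hammer_and_observe(ToyMemory(), 100))      # False: too few repetitions
print(hammer_and_observe(ToyMemory(), 20_000))   # True: disturbance observed
```

The takeaway for verification is that a single stimulus reveals nothing here; only sustained repetition of a legal operation exposes the flaw, which is a poor fit for one-shot directed tests and a natural fit for automated, long-running test generation.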
(SA5) Entropy: (Random numbers generated for cryptographic primitives) These assets are directly related to the strength of the cryptographic algorithms embedded in the SoC, e.g., initialization vectors or cryptographic key generation. Successful attacks on these assets would weaken the cryptographic strength of the device.

The choice of security assets varies design by design and abstraction layer by abstraction layer. Mainly, the declaration of security assets is heavily dependent on the security policies defined by the designers of the different components integrated into the SoC. Hence, apart from the general security assets listed here, which are known to the hardware designers, the security assets begin to expand within the SoC due to the interaction of different IPs. Consequently, as the list of security assets grows, SoC security verification through traditional methodologies, e.g., formal-based and satisfiability-based approaches, becomes almost impractical.

3.2 Security Policies/Properties in SoC

Based on the sources of vulnerabilities, the threat model for each source, and the security assets defined for the design under investigation, a set of requirements will be defined whose realization will assist the design team in guaranteeing the protection of the security assets against the given threat models. Different threat model categorizations can be evaluated for SoC verification, such as (i) overwriting or manipulating confidential data by an unauthorized entity (integrity violation), (ii) unauthorized disclosure of confidential information/data (confidentiality violation), and (iii) malfunctioning or disruption of the function/connectivity of a module (availability violation) [37].

For SoC security verification, security policies/properties define a mapping between the requirements and some design constraints. Then, to fulfill the desired protection level, these constraints must be met by building some infrastructure, and this infrastructure is built by the IP design team(s) or the SoC integration team. The definition of security policies/properties depends on different actions/behaviors located at multiple stages or abstraction layers, and they might also be updated or refined through various stages [38], [39]. As an instance, the requirement defined as switching the chip from functional to test mode should not leak any information related to the security assets can, in a typical SoC, be mapped to a constraint (policy/property) defined as an asynchronous reset signal assertion for the scan chain is required for secret/private key registers while the chip's mode is switching from functional to test mode.

For each requirement, when mapping it to a security policy/property, the details and underlying conditions must be considered meticulously, and then these conditions/constraints must be met to guarantee the protection of the asset(s). It is evident that the definition of security policies/properties may vary depending on multiple factors, like the architecture and components of the SoC, the interconnections and bus interface, the state of execution (e.g., boot time, normal execution, test time), the state in the development life-cycle (e.g., manufacturing, production, test/debug), etc. Below we provide a categorization of general policies that are required to be considered for security purposes in SoCs. This categorization covers both system-level and lower-level security policies/properties.
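The requirement-to-property mapping discussed in this section (clearing key registers when the chip switches from functional to test mode) can be sketched as a simple trace check. The trace format and function names below are invented for illustration; in a real flow this would be expressed as an assertion in a property language bound to the design, not as a Python function over a recorded trace.

```python
# Sketch of checking the example property: on every functional-to-test mode
# switch, the secret key registers must read as zero (i.e., have been reset).

def key_cleared_on_mode_switch(trace):
    """trace: list of (mode, key_register_value) pairs, one per cycle."""
    for (prev_mode, _), (mode, key) in zip(trace, trace[1:]):
        if prev_mode == "functional" and mode == "test" and key != 0:
            return False   # violation: the key survives into test mode
    return True

good = [("functional", 0xBEEF), ("test", 0x0000), ("test", 0x0000)]
bad  = [("functional", 0xBEEF), ("test", 0xBEEF)]   # key leaks into test mode
print(key_cleared_on_mode_switch(good))   # True
print(key_cleared_on_mode_switch(bad))    # False
```

Even in this toy form, the check shows why such policies are cross-layer: the mode signal, the reset logic, and the key registers may live in different IPs, so no single block owns the property.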
the constraints regarding how different components, either hardware, framework, or software, can access security assets. The definition of access restriction policies/properties directly/indirectly affects other policies/properties as well. For instance, a given access restriction will change the data flow or control flow in the SoC, so other policies/properties can be met/violated based on a newly defined access restriction policy/property.

(SP2) Definition of data/control flow restricting policies: In many cases, particularly for security assets related to cryptographic primitives, the security assets can be retrieved with no direct access, by observing the responses of the components under investigation to a sequence of actions. In such cases, even though (SP1) has been defined and established properly, confidentiality would be violated because there exists an insecure form of data/control flow. Unlike (SP1), the realization of this group of policies/properties requires highly sophisticated protection techniques with an advanced formulation/model. So, to keep the complexity of the security policies at a reasonable degree, this group of policies/properties is most efficiently employed for highly critical assets with high confidentiality requirements.

(SP3) Definition of HALT/OTS/DOS restricting policies: This policy/property indicates the liveness of components throughout the execution of different operations. This can mostly be done by checking the status signals per each request for the component, to make sure that there is no halt, out-of-service (OTS), or denial-of-service (DOS) condition that violates the system availability requirements. Policies/properties related to malfunctioning can also be part of availability violation (as the component does not provide the correct functionality or correct timing behavior).

(SP4) Definition of insecure sequence execution restricting policies: An authorization is always required once a component needs to get access to a security asset in the SoC. However, the flow between the authorization and getting access must be flawless, with no possibility of changes that invalidate the access control. One example for this group of policies is time-of-check to time-of-use (TOCTOU), which shows that, between authorization and usage, changes to the state of the security asset can lead to invalid/illegal actions that happen only when the resource is in an unexpected state.

(SP5) Definition of insecure boot restricting policies: The boot process may involve multiple critical security assets, including the definition of access restriction policies, cryptographic and on-device keys, firmware, etc., and any security leakage can lead to multiple vulnerabilities at different layers. Policies for protecting the boot can be defined individually, related to (SP1-4), or unified over a set of actions/requirements.

(SP6) Definition of inter-component integrity policies: This policy is more likely low-level, i.e., IP-level inter-communication, stating that any (secure) communication between two components must be kept intact, with no changes made by a third component.

(SP7) Definition of inter-component confidentiality policies: Similar to (SP6), at the low level, any inter-communication between two components must be kept fully secure and confidential between these two components, and there should be no possibility for any third component to receive any part of the inter-component communicated data.

(SP8) Definition of inter-component authenticity policies: Another low-level policy, which verifies the authenticity of the component requesting a security asset.

The definition of security assets, the definition of security policies, and building a clear relationship between them are indispensable preliminary steps of SoC security verification. In the context of security verification, a security policy/property must be a complete statement that can check assumptions, conditions, and expected behaviors of a design [40], which is essentially a formal description of the design behavior/specification. The coverage of security policies/properties can be considered a metric for the security assessment of an SoC, and the violation of the policies/properties implies that the design should be fixed.

3.3 Security Policy/Property Languages in SoC

To build automated SoC security verification, a straightforward constraint definition is required to verify that, based on the specification of the security policies/properties, the design adheres to those properties. To meet such requirements, there should be a unified language for the declaration of the security properties, and the security verification framework must be able to convert this language to hardware implementation and testing. Due to the dynamic nature of the threat model, the language must be rich and amenable to expansion as needed, and able to specify characterizations like sensitivity levels, affectability, and observability. As an instance, security policies/properties determine the secure vs. insecure region(s), and accordingly, the security language must be able to check policies like confidentiality, as sensitive data from the secure region should not leak to any insecure region.

These days, designers mostly use one of the powerful assertion languages, such as the Property Specification Language (PSL) [41] and SystemVerilog Assertions (SVA) [42], to describe interesting behavioral events of a design. These languages use temporal logic representations such as Linear Temporal Logic (LTL) [43] and Computation Tree Logic (CTL) [44]. Languages based on LTL and CTL usually describe design behaviors and properties in four layers: the Boolean expression, sequence, property specification, and assertion directive layers. These layers can be used on top of different HDL languages, including Verilog and VHDL.

3.4 Pre-silicon vs. Post-silicon Verification in SoC

SoC security verification can be done either before or after the fabrication stage, called pre-silicon and post-silicon verification, respectively. Generally, in pre-silicon verification, the verification target is typically a model of the design (a representation of the design at a specific design stage, like a post-synthesis netlist or the generated layout) rather than an actual silicon artifact. Pre-silicon verification activities consist of code and design reviews, simulation and testing, as well as formal analysis. These tests are run at different corner cases with constrained inputs. One of the biggest advantages of pre-silicon verification is its high observability: as the design representation is available, any internal signal/wire/register can be observed and verified.
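To make the idea of checking such policies on a fully observable pre-silicon trace concrete, the sketch below expresses an SP1-style access-restriction check and an SP4-style TOCTOU check as a simple trace monitor. The event names, trace format, and component names are invented for illustration; a real flow would state these properties in SVA or PSL over actual design signals.

```python
# Toy pre-silicon trace monitor (hypothetical trace format, for illustration):
# it checks an SP1-style property ("no component reads the asset without a
# prior GRANT") and an SP4/TOCTOU-style property ("the asset must not change
# between authorization and use").

def check_trace(events):
    """events: list of (cycle, component, action) tuples, e.g. (0, 'cpu0', 'GRANT')."""
    granted = set()          # components currently authorized (SP1 state)
    asset_version = 0        # bumped whenever the protected asset is rewritten
    auth_version = {}        # asset version observed at authorization time (SP4 state)
    violations = []
    for cycle, comp, action in events:
        if action == "GRANT":
            granted.add(comp)
            auth_version[comp] = asset_version
        elif action == "REVOKE":
            granted.discard(comp)
        elif action == "WRITE_ASSET":
            asset_version += 1
        elif action == "READ_ASSET":
            if comp not in granted:                        # SP1 violated
                violations.append((cycle, comp, "SP1: unauthorized access"))
            elif auth_version.get(comp) != asset_version:  # SP4/TOCTOU violated
                violations.append((cycle, comp, "SP4: asset changed after check"))
    return violations

trace = [(0, "cpu0", "GRANT"),
         (1, "dma0", "READ_ASSET"),    # dma0 was never granted access
         (2, "dbg0", "WRITE_ASSET"),   # asset changes after cpu0's check
         (3, "cpu0", "READ_ASSET")]
print(check_trace(trace))
```

On this short trace the monitor flags dma0's unauthorized read (SP1) and cpu0's use of an asset that changed after its authorization was checked (SP4).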
However, since pre-silicon verification is mostly simulation-based, at MHz speeds limited by simulator performance, it takes a lot of time to verify all policies/properties.

On the other side, post-silicon verification can be considered one of the most crucial yet expensive and complex forms of SoC verification, in which a fabricated but pre-production silicon of the targeted SoC (initial prototypes of the chips that are fabricated and used as test objects) is picked as the verification vehicle, and then a comprehensive set of tests is executed on it. The goal of post-silicon verification is to ensure that the silicon design works properly under actual operating conditions while executing real software, and to identify (and fix) errors that may have been missed during pre-silicon verification. In post-silicon verification, since the silicon is used as the verification vehicle, tests can run at the target clock speed, enabling the execution of long use cases in a much smaller time slot. However, it is considerably more complex to control or observe the execution of silicon than that of pre-silicon verification, as access to, and observability of, many internal nodes is lost.

3.5 Adversarial Model in SoC

To ensure that an asset is protected, the design team needs, in addition to the security policies/properties, to define and determine a comprehension of the power of the adversary. The notion of the adversary can vary depending on the asset being considered. For instance, regarding cryptographic and on-device keys as the security asset, the end-user would be an adversary, and the keys must be protected against the end-user. As another example, the content provider (and even the system manufacturer) may be included among adversaries in the context of protecting the private information of the end-user. So, based on the definition of security assets, sources of vulnerabilities, and security policies/properties, the adversary model also needs to be clearly defined, helping to build a stronger SoC security verification mechanism. Hence, rather than focusing on a specific class of users as adversaries, it is more convenient to model adversaries corresponding to each policy and define protection and mitigation strategies with respect to that model.

3.6 Verification Model in SoC

SoC security verification can be done at different stages, each focusing on different sets of security policies/properties. Targeting the design at different stages will affect the model defined for the verification. In addition, the model can also be related to the adversarial model as the source of threats. For instance, from an untrusted foundry with access to all mask layer information, to a malicious end-user that has access to the fabricated chip mostly as a black box, the security verification can be modeled differently. Hence, based on the access provided to the SoC verification framework, it can generally be divided into three main categories: white, gray, and black. The definition of these verification models in SoC is very close to their definition at the software level. In the white verification model, all internal wires, signals, nodes, registers, etc., are fully observable, allowing the security properties/policies to be implemented in detail and very specifically based on the requirement. The black verification model treats the SoC as almost a black box, and only the general ports and potentially the scan pins are available for testing. The gray model is any verification model that stands between the white and black models, and the level of access to internal parts may differ per case.

3.7 Scope of Security Verification in SoC

With the involvement of multiple abstraction layers in a complex and heterogeneous SoC, based on the security policies/properties associated with each abstraction layer, SoC security verification can be categorized into three main domains:

(SC1) Low-Level (Hardware-level): Once the security vulnerabilities arise from the underlying hardware, such as the RTL or gate-level netlist, the scope of verification needs to be concentrated at the logic level (low level), e.g., hardware Trojan insertion, counterfeiting, fault injection, etc.

(SC2) Platform-Level (System-level): This category covers the vulnerabilities associated with inter-IP communication and system-level bugs that are exploited mostly by untrusted third-party components during run-time. As an example, any confidentiality or integrity violation in any inter-component communication that can lead to information leakage or unexpected actions can be considered a system-level vulnerability that requires system-level security policy/property definition.

(SC3) Software-Level (Framework-level): This group refers to vulnerabilities arising from the intercommunication between hardware parts and the software or the framework. Additionally, network-based vulnerabilities, like the communication of an embedded computing unit with off-chip modules or the cloud, can be categorized as members of this group. In this group, the definition of security policies/properties would be more at the higher level of abstraction, combined with checking flags/status at hardware levels.

4 SoC Security Verification: Challenges

Based on what we learned so far, for the automation of SoC security verification, we need to accomplish some steps: (i) identification of the sources of vulnerabilities (SVs), which helps to define the threat models (adversarial model); (ii) indicating the security assets per component (SAs); (iii) definition of security policies/properties (SPs) based on the threat model and the chosen security assets; (iv) formalizing the security policies/properties using the unified language with consideration of the security domain (SCs); and (v) implementation and testing. To accomplish these steps, there exist challenges showing why new approaches like the self-refinement techniques, i.e., fuzz, penetration, and AI testing, are needed for SoC security verification. The following covers some of the biggest challenges that cannot be solved using the existing approaches, like formal satisfiability-based techniques, model checking, information flow tracking, etc.:

(i) Preciseness: For almost all aforementioned steps, the course of action(s) decided and accomplished by the designer(s) requires the highest precision to make the whole SoC security verification procedure a successful practice. Precise evaluation of SVs and understanding of the threat
models, precise selection of SAs, precise definition of SPs, and formalization using the selected language directly and significantly affect the outcome of the framework. As SoCs get more complex and more heterogeneous, precisely indicating the exact SVs, SAs, and SPs becomes more and more challenging. The designer(s) needs to know all the underlying information about all components, modules, and frameworks, their intercorrelation, handshaking, their corresponding security levels, etc., which is almost impractical in modern SoCs.

(ii) Multi-stage/layer verification: The definitions of SVs, SAs, and SPs are fully dependent on the stage of the design flow and the abstraction layer of the design. There is no guarantee that, when the security verification is passed at one stage of the design, the same SPs will be passed at another abstraction layer. Changes and refinements per stage, and moving from one abstraction layer to another, might give rise to new vulnerabilities. A simple example of this case is the transitions done by synthesis tools, like high-level synthesis (HLS) and other computer-aided design (CAD) tools, which may add new data/control flows that can be exploited, leading to new vulnerabilities [21], [45]. This is why the invocation of SoC security verification is required at different stages and different layers of abstraction.

(iii) Verification vs. Dynamicity: Based on how the SVs, SAs, and SPs are defined, dynamicity might happen during run-time, meaning that potentially new SAs are introduced when a specific set of operations is executed on the original SAs. This will propagate the original ones, and based on the dependency/relation to other variables, some other variables may preserve critical information that must be considered as new SAs. Additionally, the threat models will be updated over time, resulting in new SVs, which lead to the introduction of new SPs. Also, protecting the design against one SV may make it vulnerable to another one. For example, protecting a design against information leakage may create side-channel leakage that can be exploited by an attacker to retrieve sensitive information.

(iv) Unknownness: As we mentioned previously, the utilization of conventional techniques and tools, i.e., formal satisfiability-based techniques, model checking, information flow tracking, etc., for SoC security verification is mainly limited to expert review, and they do not provide acceptable scalability. This gets worse when the side effect of dynamicity comes into action, which is the introduction of unknown vulnerabilities. With unknown vulnerabilities, there is no precise definition of the SV, SA, and SP, and they might be caught by accident/chance, through either the tests by the designer(s) or the attacks (like hammering) by the adversary. This is where the self-refinement techniques' contribution plays an important role, by using smart hammering and testing to detect such vulnerabilities before releasing the SoC into the field/market.

5 SoC Security Verification: Assumptions

To have a successful SoC security verification solution, there exist some fundamental assumptions that must be considered meticulously as the most basic requirements of the proposed solution:

(i) Time of Verification: Since moving towards the final stages of SoC design, implementation, and testing makes the evaluation and investigation much harder, it is paramount to accomplish the verification, particularly for primary SAs and SPs, at the earliest possible stage. Assuming that the vulnerabilities reach the post-silicon stage, the verification model will change from a low-level white model to a chip-level black model, which makes the verification much harder. Moreover, the cost of fixing the design is significantly higher as we advance through the later stages of the design. Furthermore, vulnerabilities that reach the manufacturing stage will cause revenue loss. According to the rule of ten for product design [46], modifying a design at the later stages of the SoC design flow is ten times costlier than doing so in the previous steps.

(ii) Evolutionary Mechanism: Almost all existing security verification frameworks that rely on conventional formal-based, satisfiability-based, or model-checking-based tools tend to provide a binary response to the SP(s) associated with a set of SAs. So, the analysis behind the verification is very limited to a specific set of events, and the verification solutions just indicate that no violating flow occurs or that there is at least one that violates the defined SP(s). However, they do not provide any indication in between, e.g., predicting whether we are approaching a vulnerability or not (the majority of the SP(s) will be satisfied/passed). Providing an evolutionary response (like providing feedback) can help the framework mutate the test cases by itself, based on the collected data, in a smart way, and narrow the search space for reaching potential vulnerabilities. For building such a structure, a definition of some security metrics or coverage metrics is needed as well, to guide the test cases to approach the vulnerability and eventually reach the global minimum.

(iii) Hammering-based Verification: As mentioned in Section 4, due to the dynamic nature of the threat model, as well as the propagation of SAs that results in the introduction of newer SAs, unknown security vulnerabilities will appear in SoC transactions, leading to breaches of confidentiality, integrity, or authenticity, even while the verification has been done appropriately based on the known security policies. Hence, it is crucial for a security verification framework to be capable of finding both known and unknown vulnerabilities in the SoC, even though there exists no clear/precise definition for the actual vulnerabilities in the targeted SoC. This is when evolutionary-based verification comes into action and may help to detect such unknown vulnerabilities by smart hammering, helping to avoid such scenarios before moving to the next design stage.

(iv) Hardware-software Co-Verification: Some hardware vulnerabilities in SoC designs are not explicitly vulnerable unless triggered by the software [27]. In some cases, it might be possible to formalize such vulnerabilities via complex sequence checking at the hardware level, but if the verification mechanism allows doing either hardware-software co-analysis or software-level verification, the definition of the SP(s) can become more straightforward. So, the SoC security verification framework should handle such interactions as well, to ensure the security of the SoC.

(v) Verification at Different Levels of Access: As the contribution of proprietary third-party IPs in modern SoCs keeps growing, it decreases the visibility of the verification
engineer into the internal specifications of the SoC. Therefore, the security verification framework should be able to verify the security of the SoC considering a gray-box or black-box model.

6 SoC Security Verification: Flow

Considering the aforementioned challenges and assumptions, to get the benefit of self-refinement techniques for SoC security verification, the following are the major steps that must be considered as fundamentals of SoC security verification for both known and unknown vulnerabilities:

(step1) Risk Assessment: The main purpose of this step is to identify the SA(s). The SA(s) will be determined based on the ownership, domain, usage, propagation, static or dynamic nature, etc., and the outcome of this step would be a semantic definition serving as the requirements of the verification activities.

(step2) Definition of Adversarial Entry Region: Per each defined SA, this step defines the most intuitive adversarial actions (SVs) around the SA that might lead to governing the assets, such as different entry point candidates, data/control relevancy between assets, untrusted region(s), etc.

(step3) Definition of Security Policies/Properties (Known): Based on the SA(s) and SV(s), each vulnerability is converted to a set of rules, and then those rules are converted to a set of properties (SPs). For widely arisen vulnerabilities that can be categorized as known vulnerabilities, the formalism can be done by defining assertions that monitor illegal events/sequences. The richness of the unified language plays an important role in this step (e.g., LTL [43] and CTL [44]), paving the way for converting the SPs to hardware implementation.

(step4) Definition of Security Policies/Properties (Unknown): Unlike known vulnerabilities, in the case of unknown vulnerabilities (e.g., any possible scenario that leads to security asset leakage), the policy/property can be defined in the form of a cost function, which represents the evolutionary behavior of the vulnerability and tries to approach a certain state/location of the design in a different way to trigger unknown vulnerabilities. Cost functions can be considered a relaxed or higher-level version of assertion-based SPs, and by using self-refinement tools, as discussed hereinafter, self-mutation can help to move towards testing data that excites the conditions leading to the specific class of vulnerability under investigation.

(step5) Hardware Implementation of the Security Verification Model: For both known and unknown cases, by using the unified languages, all SPs must be converted to hardware implementation. For known cases, this more likely takes the form of assertion-based scenarios that check the sequence of events for a specific incident. For unknown cases, it is based on instrumentation and mutation, which helps to build the evolutionary mechanism.

(step6) Security Verification and Testing: This step involves running verification, testing, and refinement-based tools based on the definition of the SPs and their hardware implementation, leading to verifying the security of the SoC. This step can be done at different stages of the IC design, from high level to GDSII as pre-silicon, and after fabrication on initial prototypes as post-silicon.

In the following sections, we will discuss how self-refinement techniques and tools, like fuzz, penetration, and AI testing, can be engaged, along with the consideration of cost function definition, for finding both known and unknown vulnerabilities of the SoC.

7 SoC Security Verification: Fuzzing

The words fuzz testing or fuzzing represent randomized testing of programs to find anomalies and vulnerabilities. Fuzzing is usually an automatic or semi-automatic approach that is intended to cover numerous pre-defined (instrumented) corner cases of invalid inputs to trigger the existing vulnerabilities in a program. Fuzzing was first applied by Miller et al. [47] to generate random inputs, which were provided to Unix utilities to find the specific inputs that cause crashes.

Generating random inputs without any feedback from the design under investigation is referred to as blind fuzz (close to random) testing, which suffers from low coverage, especially when more complex input models are introduced. This has led to additional stages being introduced into fuzzing platforms. Generally, three main steps are involved in fuzzing, which are invoked iteratively on the program:

(i) Test scheduling, which relates to the problem of ordering the initial seeds of each fuzz iteration in a way that leads to full coverage as fast as possible.

(ii) Mutation, which incorporates a variety of methods, ranging from a simple crossing of seeds to optimized genetic algorithms, in order to produce new seeds for further exploit generation.

(iii) Selection, where the useful seeds are pruned out of all generated seeds. This process requires metric evaluation as the prominent way of deciding which seeds result in better coverage of the whole design under test [48].

Amongst the existing and widely used fuzzer tools, American Fuzzy Lop (AFL) [15] is one of the most popular software fuzzers, which uses an approximation of branch coverage as its metric, while another famous fuzzer, named Honggfuzz [49], accounts for the unique basic blocks of code visited. Based on the benchmarks used by different fuzzers, the baseline of fuzzing, the crash type, the coverage mode, and the seed formulation, past fuzzing mechanisms are divided into numerous groups, whose details and comparison can be found in [50]. The following first shows how the verification model can change the way a fuzzer acts on the targeted program, and then we investigate how the fuzzer can be engaged for SoC security verification. As discussed previously, the purpose of using such self-refinement approaches is to overcome the scalability issues of formal verification methods [51].

7.1 Formal Definition of Fuzz Testing

Based on the verification model, fuzzing-based techniques can be classified into three categories: white box, gray box, and black box, which are defined based on the availability of information during the verification or run-time phases.
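As a minimal, self-contained illustration of the gray-box style, and of the scheduling, mutation, and selection stages listed above, the following sketch fuzzes a toy target whose "coverage" signal and planted bug are invented for the example; it is not modeled on any particular tool.

```python
import random

# Minimal coverage-guided (gray-box) mutational fuzz loop. The "DUT" and its
# coverage instrumentation are toy stand-ins for a real instrumented model.

def dut(data):
    """Toy target: coverage = how many leading bytes match the 'magic'
    unlock word, plus a length bucket; raises on the full match (the bug)."""
    magic = b"KEY"
    n = 0
    while n < len(magic) and n < len(data) and data[n] == magic[n]:
        n += 1
    if n == len(magic):
        raise RuntimeError("unauthorized key-register access reached")
    return {("prefix", n), ("len", len(data) % 4)}

def mutate(seed):
    b = bytearray(seed)
    op = random.randrange(3)
    if op == 0 and b:                                   # overwrite a byte
        b[random.randrange(len(b))] = random.randrange(256)
    elif op == 1:                                       # insert a byte
        b.insert(random.randrange(len(b) + 1), random.randrange(256))
    elif op == 2 and len(b) > 1:                        # delete a byte
        del b[random.randrange(len(b))]
    return bytes(b)

def fuzz(max_iters=500_000, rng_seed=1234):
    random.seed(rng_seed)
    corpus, seen_cov = [b"seed"], set()
    for _ in range(max_iters):
        parent = random.choice(corpus)   # (i) scheduling: pick a seed
        child = mutate(parent)           # (ii) mutation: derive a new test
        try:
            cov = dut(child)
        except RuntimeError:
            return corpus, child         # first bug-triggering input found
        if not cov <= seen_cov:          # (iii) selection: keep only tests
            seen_cov |= cov              #       that reach new coverage
            corpus.append(child)
    return corpus, None

corpus, crash = fuzz()
print(f"corpus size {len(corpus)}; bug-triggering input: {crash!r}")
```

The loop only keeps mutants that reach new coverage points, which is what lets it walk byte-by-byte toward the buried `b"KEY"` condition instead of having to guess it blindly, mirroring how coverage feedback narrows the search space described above.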
This information may include the source code of the hardware or software design, detailed information about the security specification and functionalities, code coverage, control and data flow graphs, and execution (simulation or emulation) related information such as CPU utilization, memory usage, etc. The following describes these three categories of fuzzing.

the vulnerabilities and explore new paths in the program, which significantly improves the detection capability and coverage of the fuzzing approach.
[Fig. 5 components: the RTL description of the SoC undergoes RTL-to-HLL conversion into a C/C++ model, which is wrapped (C/C++ wrapper), instrumented per the security policies or properties, and compiled into an executable binary; the fuzzer feeds seeds/inputs to the binary, monitors the simulated hardware execution for crashes and timeouts, maintains a corpus from the feedback, and records findings in a database of vulnerabilities before exiting.]
Fig. 5: A Fuzzing Framework on Software Level of RTL.
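A heavily simplified sketch of the flow in Fig. 5, written in Python rather than C/C++ for brevity: a hand-written stand-in for the translated RTL model, plus a wrapper that decodes each fuzzer-supplied byte into one cycle of port values and converts a violated security property into a crash-like signal the fuzzer can observe. The lock design, its planted backdoor, and the byte-to-port encoding are all invented for illustration.

```python
# Stand-in for a tool-generated C/C++ model of a small RTL lock, plus the
# Fig. 5-style wrapper that turns a fuzzer byte string into per-cycle stimuli.

class LockModel:
    """'Translated' model of a lock that should open only on the
    write sequence 0xA, 0x5, 0x3."""
    SEQ = (0xA, 0x5, 0x3)

    def __init__(self):
        self.state = 0
        self.unlocked = False

    def clock(self, we, data):            # one simulated clock cycle
        if not we:
            return
        if data == 0xF:                   # planted bug: debug backdoor
            self.unlocked = True
        elif data == self.SEQ[self.state]:
            self.state += 1
            if self.state == len(self.SEQ):
                self.unlocked = True
        else:
            self.state = 0

def run_testcase(stimulus: bytes):
    """Wrapper: bit 7 of each byte drives the write-enable port, the low
    nibble drives the data port. Returns the cycle where the security
    assertion ('never unlocked without the full legal sequence') fires,
    or None if the property holds for the whole stimulus."""
    dut, writes = LockModel(), []
    for cycle, byte in enumerate(stimulus):
        we, data = byte >> 7, byte & 0xF
        dut.clock(we, data)
        if we:
            writes.append(data)
        if dut.unlocked and tuple(writes[-3:]) != LockModel.SEQ:
            return cycle                  # assertion failure -> 'crash' signal
    return None

print(run_testcase(bytes([0x8A, 0x85, 0x83])))  # legal unlock -> None
print(run_testcase(bytes([0x8F])))              # backdoor fires at cycle 0
```

A fuzzer driving `run_testcase` would treat a non-None return exactly like a software crash, which is how a hardware property violation is surfaced to an off-the-shelf software fuzzer in this style of flow.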
the vulnerability instance that is currently being targeted. This means that, instead of general feedback from the fuzzer outputs, one can make more suitable decisions for further analysis, because now the feedback is tailored to the targeted bug in the running verification session.

7.2.2 Prior Art on HW-to-SW Fuzzing

This section reviews the studies that incorporate hardware-to-software abstraction conversion for realizing software fuzzing for hardware verification, and the prominent deficiencies of each method are mentioned. The authors in [57] proposed a mutational coverage-guided fuzzing-based framework to resolve modern SoC verification challenges, such as co-verification of hardware and firmware, as well as scalability. The framework includes adversarial behavior and coverage metrics to evaluate the security properties written in their proposed logic, HyperPLTL (Hyper Past-time Linear Temporal Logic). However, the primary disadvantage of this technique is that it requires a huge amount of technical expertise, particularly for building SPs, and hence incurs higher chances of erroneous and low-coverage results, which could eventually limit the scope of bug detection. Trippel et al. [51] developed software models of hardware and then applied fuzzing on the software, leveraging Google's OSS-Fuzz. The main problem with this approach is its dependence on general metrics such as code coverage rather than a dedicated cost function targeted at hardware model properties.

Moghimi et al. [58] utilized fuzzing by mutating the seeds of existing Meltdown variants in order to discover various Meltdown-type attack variants. The authors provided randomly mutated inputs to the associated faulty loads and used the cache as proof of a covert channel for side-channel leakage. Oleksenko et al. [59] developed a dynamic testing methodology, SpecFuzz, for identifying speculative execution vulnerabilities (e.g., Spectre). SpecFuzz instruments the program and simulates speculative execution at the software level, traversing all possible reachable code paths that may be triggered due to branch mispredictions. During simulated execution of the program, speculative memory accesses are visible to integrity checkers, which is combined with traditional fuzzing techniques. SpecFuzz can detect potential Spectre-like vulnerabilities, but it is only useful for this certain type of attack that exploits speculation in the CPU pipeline.

DifFuzz, proposed in [60], is a fuzzing-based approach to detect side-channel vulnerabilities related to time and space. This technique analyzes two copies of the same program, which have different secret data, while providing the same inputs to them both. The authors developed a cost metric which estimates the difference in resource consumption, such as the number of executed instructions and the memory footprint, between the secret-dependent paths of the two programs. However, this technique is primarily dependent on the details of microarchitectural implementation information. The more visibility into the microarchitectural state, the more coverage is achieved. Unfortunately, good visibility is not always available for many hardware designs, which binds these techniques to low accuracy. Xiao et al. [61] developed a software framework leveraging fuzzing concepts to identify Meltdown-type vulnerabilities in existing processors. The authors build up the code using some templates, which are executed to find the vulnerabilities. The authors leveraged a cache-based covert channel and differential tests to gain visibility into the microarchitectural state changes, which eventually helps to analyze the attack scenarios.

Ghaniyoun et al. [62] proposed a pre-silicon framework for detecting transient execution vulnerabilities, called IntroSpectre. IntroSpectre is developed on top of Verilator. The authors resolved the challenge of lacking visibility into the microarchitectural state by integrating the framework into the RTL design flow, which allows it to identify otherwise unreachable potential side-channel leakages. The authors utilize the fuzzing approach to generate different attack scenarios consisting of code gadgets and analyze the logs obtained from simulation to identify potential transient execution vulnerabilities.

7.2.3 Limitations and Challenges

There are many rudimentary differences between fuzzing a software program and an RTL hardware model, the first of which is due to the difference in input arguments. A digital circuit has the notion of input ports that take different values in each cycle, unlike software, which reads its inputs from a variety of sources, including arguments, files, or OS system calls. Fuzzing requires a solid definition of the format of the input so it can do meaningful mutations and generate new passable tests. So the actual input to be fuzzed should be thoroughly explained to the fuzzer when working with a translated hardware design.
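One hypothetical way to give the fuzzer such a definition is to fix a frame-per-cycle encoding, so that any byte string decodes into a structurally valid sequence of port assignments; the port names and widths below are made up for the example.

```python
# Illustrative input-format definition for fuzzing a translated hardware
# design: the fuzzer's flat byte stream is decoded into fixed-width frames,
# one frame per clock cycle, so byte-level mutations always yield a
# structurally valid stimulus.

PORTS = [("rst", 1), ("we", 1), ("addr", 6), ("wdata", 8)]  # (name, bits)
FRAME_BITS = sum(bits for _, bits in PORTS)                 # 16 bits/cycle
FRAME_BYTES = (FRAME_BITS + 7) // 8                         # 2 bytes/cycle

def decode(stream: bytes):
    """Turn a fuzzer-generated byte stream into a list of per-cycle
    {port: value} dictionaries; trailing bytes that do not fill a whole
    frame are dropped."""
    cycles = []
    for off in range(0, len(stream) - FRAME_BYTES + 1, FRAME_BYTES):
        word = int.from_bytes(stream[off:off + FRAME_BYTES], "big")
        frame, shift = {}, FRAME_BITS
        for name, bits in PORTS:
            shift -= bits
            frame[name] = (word >> shift) & ((1 << bits) - 1)
        cycles.append(frame)
    return cycles

stimulus = bytes([0b11000001, 0x5A,   # cycle 0: rst=1, we=1, addr=1, wdata=0x5A
                  0b01000010, 0xFF])  # cycle 1: rst=0, we=1, addr=2, wdata=0xFF
for cyc in decode(stimulus):
    print(cyc)
```

Because every byte position maps to a fixed field of a fixed cycle, generic mutations (byte flips, insertions, deletions of whole frames) remain "passable" tests, which is exactly the property the mutation engine needs.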
The format of the input has a great impact on how well the mutation engine performs.

The second issue to address is the difference between software and hardware coverage metrics. Many fuzzers depend on instrumentation to obtain coverage metrics and direct seed generation towards uncovered sections of the design. While some metrics, such as branch and line coverage, can approximately be mapped to each other in both hardware and software models, other metrics such as FSM coverage or toggle rate are not translatable [51]. Traditional hardware verification uses these metrics to target specific classes of vulnerabilities, and any effort in the software domain should comply with these established platforms as well.

The third dominant issue with fuzzing translated hardware concerns the cost function. Software fuzzers look for crashes, exceptions, and memory-check failures as the vulnerabilities that can exist in a software model. These scenarios are not convertible to hardware designs, particularly low-level or platform-level vulnerabilities, because such forms of error, like exceptions, do not exist in the hardware realm. Hardware is inherently different when it comes to targeting vulnerabilities, because software crashes can still happen even when the hardware is fully functional and secure. Fuzzing the translated model without introducing our own notion of hardware vulnerabilities through a cost function will result in the fuzzer expending its resources on detecting bugs induced by the translator rather than by the hardware model itself [63]. Another problem related to this issue is the change of variables and functions during translation. This makes the translation of properties from the hardware RTL model to software a crucial task that has not yet been investigated.

7.3 Direct Fuzzing on Hardware RTL
Verification faces additional challenges as the design's size and scope grow, e.g., verification of a full-scale, complex, and heterogeneous SoC. The primary challenge for SoC security verification is the lack of end-to-end verification methods that can capture the behavior of every hardware component and the run-time implications of the software components or framework(s) executing on it, i.e., hardware/software co-verification with maximum coverage. Again, scalability is the biggest challenge in modern SoC verification due to the complexity and massive size of the designs. In order to tackle scalability, an automated and systematic verification platform is required. However, automation in SoC verification is very difficult for several reasons. Firstly, the verification engineer faces extreme challenges in precisely assessing the security policies/properties of an SoC design due to the enormous number of components that interact with each other. Identifying the security policies/properties in a comprehensive fashion largely influences the quality of SoC verification. Secondly, the verification engineer faces challenges in modeling the attack scenarios that may occur in the SoC. These may include side-channel attacks as well as direct hardware- or software-exploitable attacks that could run on the SoC. It is very challenging to prepare different attack models from the specifications for different untrusted third-party IPs [64], untrusted OEM firmware [65], and untrusted software [20]. Again, the verification engineer may not get a complete manual for third-party IPs. Hence, SoC verification is still a burdensome task that demands significant research to develop a robust and scalable solution.

In such an environment, emulation-based fuzzing methods, with more concentration on gray-box fuzzing mode, can greatly improve the performance of simulation-based approaches by running the design on an FPGA and interfacing the fuzzer to the prototype design under test in a more realistic manner. Figure 6 represents a general platform for interfacing an FPGA with a test-generator fuzzer. The difference between this method and the one discussed in Figure 5 lies mostly in the generation of a bitstream as opposed to a binary under test, and in the fact that test cases are introduced to the actual design through a direct memory access channel. The instrumentation output in this approach is obtained through the probe analyzers available in the FPGA monitoring implementation, and it helps the cost function to better guide the mutation engine.

Direct fuzzing on FPGA-accelerated simulation incurs some major challenges, such as resetting the FPGA, which is required to take the program to a particular known state to decrease the verification time, and defining branch coverage, which is needed to track verification coverage. To solve the first challenge, memory snapshotting techniques can be used to reset the program and set the desired state in preparation for each test intended to be performed with new fuzzing inputs. To estimate branch coverage, branches are mapped to multiplexers, which output one of two input values in each cycle.

Laeufer et al. [66] proposed a coverage-directed fuzzing approach for RTL testing on FPGAs, leveraging ideas from the software testing community (RFuzz). The authors proposed a new approach for coverage metrics, which uses multiplexer control signals as branch coverage. RFuzz employs FPGA-accelerated fuzzing to speed up the fuzzing procedure, which may increase specialized hardware costs and is also limited in language support. Moreover, it still lacks a definition of the cost function for security verification of the design under investigation. This mechanism also supports a very limited set of design types, as it cannot accomplish the testing while the processor is in place, and the verification is required to be intercorrelated between hardware, software, and firmware.

In general, fuzzing can be a promising solution for developing an end-to-end mechanism for full-system verification, and it requires a minimum amount of adjustment since it is already well developed in other areas of research. However, standalone simulation-based fuzzing or FPGA-based interfacing still cannot cover a wide variety of threat models and sources of vulnerabilities, showing that further investigation is inevitable in this domain.

8 SoC Security Verification: Penetration Testing
Penetration testing (PT) is a methodology to assess the vulnerabilities in an application or a network and exploit those weaknesses to gain access to its resources. Since accessing the security-critical resources of a computing system necessitates the exploration of exploitable vulnerabilities, vulnerability assessment (VA) is an indispensable precursor
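The multiplexer-to-branch mapping just described can be sketched as follows: a 2-to-1 mux select line counts as covered once it has been observed at both 0 and 1 across simulated cycles. The signal names below are invented for illustration; they are not from RFuzz or any cited design.

```python
from collections import defaultdict

# Map RTL branches to 2-to-1 mux select signals: a branch counts as
# covered once its select line has been observed at both 0 and 1.
seen = defaultdict(set)  # select signal name -> set of observed values


def record_cycle(mux_selects):
    """mux_selects: {signal_name: 0 or 1} sampled in one simulated cycle."""
    for sig, val in mux_selects.items():
        seen[sig].add(val)


def coverage(all_signals):
    covered = sum(1 for s in all_signals if seen[s] == {0, 1})
    return covered / len(all_signals)


signals = ["fsm_next_sel", "alu_bypass_sel", "wb_src_sel"]
record_cycle({"fsm_next_sel": 0, "alu_bypass_sel": 1, "wb_src_sel": 0})
record_cycle({"fsm_next_sel": 1, "alu_bypass_sel": 1, "wb_src_sel": 0})
print(coverage(signals))  # only fsm_next_sel covered so far -> 1/3
```

A fuzzer's mutation engine can then prioritize seeds whose mutants raise this ratio, mirroring how branch coverage guides software fuzzers.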
[Fig. 6: A general platform for interfacing an FPGA with a test-generator fuzzer. Seeds and the HLL code on the SoC are compiled into an executable binary, while the RTL description of the SoC goes through synthesis/PnR/bitstream generation; the fuzzer mutates inputs under cost-function feedback, and the monitored output response is checked against the security policies or properties and a database of vulnerabilities.]
to PT. In the literature, therefore, VA and PT are often considered a single framework to assess the security of a network or application [67].

8.1 Higher Abstraction Penetration Testing: Types, Practices and Objectives
Motivated by the ever-growing attack surface of software, the secure software development lifecycle has been adopted by leading software companies over the last two or three decades. Security issues are considered, analyzed, and attempted to be discovered even at early stages of development [68]. Consequently, in the software development lifecycle, VAPT has become a well-defined methodology to weed out bugs and vulnerabilities. In fact, there are governmental and private organizations that accredit individuals and groups based on their ability [67], [69]. The Open Web Application Security Project (OWASP), for example, is a non-profit organization that has published and sanctioned a detailed step-by-step process for performing penetration testing of web applications and firmware [70]. The Open Source Security Testing Methodology Manual (OSSTMM) also offers detailed guidelines on how to conduct VAPT on different systems [71]. CREST, Tigerscheme, and the Cyberscheme are some other professional bodies that provide industrial certifications and qualifications. CREST in particular provides an intelligence-driven red-team penetration testing framework to assess the robustness of an operational team against cyberthreats [72]. Mitre sponsors the maintenance of two detailed databases of known software vulnerabilities: the CVE (Common Vulnerabilities and Exposures) list and the CWE (Common Weakness Enumeration). These two databases contain extensive information on the most common software vulnerabilities and example implementations, and even discuss potential mitigation solutions. Therefore, VAPT is a well-delineated methodology in software and network security that has been implemented in practice successfully over the course of the last two or three decades.

The main objective of a penetration test pertaining to an application or network is to find exploitable vulnerabilities. Colloquially, VAPT has also been described as ethical hacking [73]. This terminology is inspired by the fact that often, the best approach to finding unanticipated vulnerabilities in a computer system is to hack into the system from the outside looking in. This is essentially a simulated attack, in which the penetration test engineer tries to anticipate the course of action that malicious actors might adopt while trying to compromise the security of the system. An analogy often used in penetration testing colloquialism compares it to hiring a thief to break into one's own house to find the loopholes in the implemented security system. The term penetration in this context, therefore, stands for acquiring legitimate access to resources illegitimately. We present the typical steps of a VAPT approach in the following:

(i) Reconnaissance: In this step, the tester gathers extensive knowledge of the application, network, or computing system to be pen tested. The principal goal of this step is to get familiarized with the particular technology, protocols, software versions, IP addresses, and configuration being used by the system. Websites, job postings, and even social engineering can be used to gather information [74]. Httrack, Harvester, and Whois are some tools that can help in this step.

(ii) Scanning: In the scanning step, the system or network is first probed to find points of entry. An example is the Nmap program for finding open ports in a network. Subsequently, a commercially available tool like Nessus is used to find known vulnerabilities in the system.

(iii) Exploitation: From the list of known vulnerabilities, the tester comes up with an attack or exploitation plan. The goal of this step is to identify the sequence of actions that can be taken to gain access to unprivileged resources in the system. Metasploit is an open-source tool that is frequently used in academic and professional settings to realize this step. Effective exploit management (search, upgrade, documentation) and a large number of payloads (tasks performed after successful exploitation of the target system) are available in Metasploit. In general, payloads can be either simple and focused on a single activity (for example, user creation), or complex and comprehensive, providing more advanced functionality.

(iv) Post Exploitation: The post-exploitation step documents the steps taken to gain access to non-permitted resources (if successful). It might also involve the tester attempting to escalate privileges already gained in the system. The documentation of the steps taken gives valuable insight into the weaknesses of the system.
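As a minimal illustration of the scanning step, the following is a toy TCP connect probe written with the Python standard library. It is a stand-in for the basic idea behind port scanners like Nmap, not a substitute for them; real scanners add SYN scans, service and version detection, and OS fingerprinting on top of this.

```python
import socket


def probe(host, ports, timeout=0.5):
    """Return the subset of `ports` that accept a TCP connection on `host`."""
    open_ports = []
    for port in ports:
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
            s.settimeout(timeout)
            if s.connect_ex((host, port)) == 0:  # 0 means the connect succeeded
                open_ports.append(port)
    return open_ports


# Probe localhost for a few well-known service ports.
print(probe("127.0.0.1", [22, 80, 443]))
```

The list of responsive ports produced here would feed the next step, where a vulnerability scanner checks the discovered services against known CVEs.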
8.2 Formal Definition of Penetration Testing
Similar to other testing approaches, based on the depth and level of access as well as the verification model, there are three types of penetration testing:

8.2.1 Black-box Pen Testing
The testers do not have any prior access to any resources on the test target when performing black-box penetration testing. They are expected to figure out all of the minutiae of the system, as well as any flaws, depending on their previous experience and individual expertise. The tester's primary goal is to audit the external security boundary of the test target; as a result, the tester replicates the activities and procedures of an actual attacker who may be located outside the test target's boundary and who does not know anything about the target. OSSTMM makes a distinction, within what would typically be referred to as black-box PT, between blind and double-blind testing. In double-blind testing, the target is not notified ahead of time of the audit, whereas in blind testing it is informed ahead of time.

8.2.2 White-box Pen Testing
Contrary to black-box PT, the testers are provided with all of the internal information about the system. This is meant to simulate an attack from an internal threat like a malicious employee. White-box PT offers higher granularity of testing while at the same time offering the benefit of not relying heavily on trial and error, as is common in black-box PT.

hardware. In contrast to software, which can be updated with a patch once a vulnerability has been discovered, hardware cannot be easily patched (especially ASICs). A pre-silicon testing methodology would be much more beneficial to the designers and verification engineers.

In light of the challenges and foregoing differences with the software domain, we define pre-silicon hardware penetration testing as a testing methodology that propagates the effects of a vulnerability to an observable point in the design in spite of the cross-modular and cross-layer effects present in the design. In contrast to randomized testing, which develops test patterns without knowledge of the vulnerability it is seeking to detect, hardware penetration testing assumes a gray- or black-box knowledge of the specification of the design and a gray-box knowledge of the bug or vulnerability it is targeting. The gray-box knowledge of the bug implies that the tester knows the type of bug or vulnerability and how it might impact the system, but not the precise location of its origin or the precise point in the design where it might manifest in a complex SoC. Penetration in this context, therefore, refers to the propagation of a vulnerability from an unobservable point in the design to an observable point.

8.4 Penetration Testing on Hardware: Framework
In this section, we demonstrate how a binary particle swarm optimization (BPSO) based penetration testing framework can be used as a promising solution for the SoC security verification domain.
[Fig. 7 depicts the flow: an initial swarm is generated and updated into mutated inputs applied to the DUT at the desired abstraction level (e.g., HLL model, netlist, FPGA emulation); the monitored output response and past outputs feed a cost-function generator, which evaluates fitness against the security policies or properties; detected violations are recorded in a database of vulnerabilities.]
Fig. 7: A BPSO-based Hardware Penetration Testing Framework for Detecting RTL Vulnerabilities.
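The BPSO search that drives such a framework can be sketched generically as follows. This is a standard binary PSO loop (velocity update toward personal and global bests, sigmoid-of-velocity bit resampling, velocity clamping) with a stand-in Hamming-distance cost function; in the framework above, the cost would instead score the monitored DUT output against a security policy.

```python
import math
import random

N_BITS, SWARM, GENS, V_MAX = 32, 20, 60, 4.0
TARGET = 0xDEADBEEF  # stand-in "trigger" pattern for the toy cost function


def cost(bits):
    """Toy cost: Hamming distance to TARGET. A real flow would score how
    close the monitored DUT output is to a security-policy violation."""
    x = int("".join(map(str, bits)), 2)
    return bin(x ^ TARGET).count("1")


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


# Initialize particles (binary stimulus vectors) and velocities.
pos = [[random.randint(0, 1) for _ in range(N_BITS)] for _ in range(SWARM)]
vel = [[0.0] * N_BITS for _ in range(SWARM)]
pbest = [p[:] for p in pos]
gbest = min(pbest, key=cost)[:]

for _ in range(GENS):
    for i in range(SWARM):
        for j in range(N_BITS):
            r1, r2 = random.random(), random.random()
            # Pull each bit toward the particle's and the swarm's best.
            vel[i][j] += 2 * r1 * (pbest[i][j] - pos[i][j]) \
                       + 2 * r2 * (gbest[j] - pos[i][j])
            vel[i][j] = max(-V_MAX, min(V_MAX, vel[i][j]))  # clamp velocity
            # BPSO rule: a bit becomes 1 with probability sigmoid(velocity).
            pos[i][j] = 1 if random.random() < sigmoid(vel[i][j]) else 0
        if cost(pos[i]) < cost(pbest[i]):
            pbest[i] = pos[i][:]
            if cost(pbest[i]) < cost(gbest):
                gbest = pbest[i][:]

print(cost(gbest))  # decreases toward 0 as the swarm converges
```

To incorporate sequential stimuli, as discussed later, each particle would simply concatenate the per-cycle input vectors into one longer bit string.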
8.4.2 BPSO-based Hardware Penetration Testing
Based on our earlier discussions on hardware penetration testing, the framework shown in Figure 7 is a BPSO-based pen testing approach on hardware that is applicable to SoC security verification. At the core of the vulnerability detection process is the cost function generator. This generator (ideally automatically) produces a mathematical function that describes the vulnerability the tester is attempting to detect. The input to the design is described as a binary vector, which is mutated based on the evaluation of the cost function for a generation of input test vectors. For each generation of the swarm, the algorithm tries to minimize (or maximize) the cost function (which builds the evolutionary mechanism). The observed output can be any observable point in the SoC, including hardware signals during functional simulation, memory address contents available from simulation or emulation, observable points or signals created by RTL instrumentation or scan chain insertion, and even the output of a user-space program. The cost function generator produces the mathematical description of the vulnerability based on the observable output point, a description of the security policy, and an append-only database of observed past outputs for a particular sequence of inputs applied or actions performed. The cost function generator must rely on keeping a record of outputs attained for a sequence of inputs, since only a sequence of inputs can trigger certain hard-to-detect vulnerabilities. For example, the AES-T1100 Trojan described in the Trust-Hub database [79] is activated upon the application of a predefined sequence of plaintexts.

In order to apply such a BPSO-based pen testing framework, there are three prerequisites that must be met:

(i) The tester should possess preliminary knowledge of the vulnerability, in addition to the impact it might have on the observable output and how it can lead to the violation of predefined security policies of the device. We argue that this is a reasonable supposition, since a significant portion of RTL hardware vulnerabilities have well-studied effects. For example, hardware Trojans can cause integrity, confidentiality, and availability violations in a circuit [6], [80]. Security-unaware design practices can lead to a design having unanticipated leakage of information and assets to an observable point or to an unauthorized 3PIP in the design [81], [82], [83], [84]. Furthermore, there are open-source databases (e.g., the Common Weakness Enumeration) that catalogue commonly found vulnerabilities in hardware along with the impact they might have. The framework's primary application would be to test vulnerabilities for which the tester has a high-level working knowledge of how they can result in a security policy violation.

(ii) The tester should have access or visibility to certain points in the design anticipated to be affected by the triggering of the vulnerability. The encryption key used in the crypto core of an SoC is an asset. To test whether a vulnerability exists in the design that can lead to implicit or explicit flow of this asset to a primary output (PO), the observable point in the design would be the PO. On the other hand, if we are testing whether the vulnerability enables flow of the asset to an unauthorized 3PIP, the observable point should be the SoC bus through which this type of transaction might take place.

(iii) The tester should have reasonable (but not necessarily exact) knowledge of how to trigger the vulnerability. This in turn dictates the input to mutate. For example, let us consider the debug unit vulnerability described in [85], where the password check of the JTAG interface could potentially be bypassed by resetting the unit. In this case, the hardware debug signals exposed to the outside world would be the relevant inputs. For the key asset scenario described earlier, the input to mutate would be the physical signals of the crypto core exposed to the outside world or a user-space program that can access AES resources. Similarly, for triggering software-exploitable vulnerabilities, the input to mutate would be the data and control flow of a user-space program.

This stands in contrast to random and blind fuzz testing, which finds vulnerabilities by applying random inputs to the design, which in turn can unexpectedly lead the design to a non-functional or vulnerable state. The BPSO algorithm is suited for vulnerability detection at the pre-silicon level, since any input to a digital design can be considered in terms of binary vectors. Hardware signals are discernibly binary quantities. The user-space programs run on modern SoCs can also be described in terms of binary vectors by translating the associated program into corresponding assembly instructions. Additionally, to incorporate sequential inputs, each particle in the swarm can be considered as input vectors applied at different clock cycles.

8.4.3 Validity of Gray-Box Assumptions
Since the BPSO-based penetration test framework assumes knowledge on the part of the tester and the availability of
RTL code, the readers might assume that this violates the gray-box testing goals of the framework. We now discuss why the prerequisite knowledge assumed earlier does not necessarily violate gray-box assumptions, especially in the context of SoC verification. Modern SoCs can contain tens of different third-party IPs, many of which are too complex, with their implementation details abstracted away by integrating CAD tools. Even during functional verification, the verification engineer would only have high-level knowledge of the vulnerability and functionality of the integrated 3PIP, and not the minutiae of the RTL implementation. In such cases, it can become extremely challenging for the verification engineer to trigger the vulnerability with existing verification tools due to complex transactions occurring inside the SoC, implicit timing flows, or unanticipated security-unaware design flaws. We also note that formal verification tools such as Cadence JasperGold®, even with definitive knowledge of the impact a vulnerability might have, can throw false positives or suffer from state explosion problems [86].

9 SoC Security Verification: AI Testing
Pre-silicon verification is an essential but time-consuming and tedious part that consumes about 70% of the total time allocated for the hardware design flow [87]. Similar to fuzz and pen testing, machine learning (ML) can be used for SoC security verification to make the process automatic and

learning, the first step is to formulate the problem statement. After that, collecting appropriate data is an essential stage in training the model to solve the problem like a human. The overall performance of the ML model depends on how representative the training data is. An ML model learns from extracted features of the relevant data. The higher the amount of data, the better the training will be. Hence, the training data must be adequate in amount (the more data, the better) and must have appropriate depth and features (if inherently 2-D data is represented by 3-D data, the training method will not be effective). The data also have to be representative and unbiased, and should include every possible corner case. Once the developers have enough data, they can move forward to train different ML models to see which model works best for their problem statement and collected data.

Fig. 9 shows different tasks a machine learning model can perform (classification, anomaly detection, etc.). Depending on the task, developers have to choose a method of training, for example, supervised or unsupervised learning. Once the developers have selected the task, the training method, and the model itself, they can then fine-tune the best model for their application and deploy it for usage.

[Fig. 8: The ML workflow: define the problem statement; collect, analyze, and pre-process data; train, evaluate, and test different models with the collected data; select the best model; deploy the trained model.]
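The train-compare-select step of this workflow can be sketched with a toy, self-contained setup. The synthetic data and the two stand-in "models" below are illustrative only, not part of any cited flow; the point is that every candidate is scored by one shared held-out cost function and the cheapest wins.

```python
import random

random.seed(0)
# Synthetic "verification traces": feature x in [0, 1]; label 1 if x > 0.6
# (standing in for, e.g., "stimulus covers a new branch"), plus label noise
# standing in for messy real-world data.
labeled = []
for _ in range(400):
    x = random.random()
    labeled.append((x, int(x > 0.6) ^ (random.random() < 0.05)))
train, held_out = labeled[:300], labeled[300:]


def train_threshold(samples):
    """Model 1: choose the decision threshold minimizing training error."""
    best_t = min((t / 20 for t in range(21)),
                 key=lambda t: sum(int(x > t) != y for x, y in samples))
    return lambda x: int(x > best_t)


def train_majority(samples):
    """Model 2: always predict the majority training label (a baseline)."""
    maj = int(sum(y for _, y in samples) * 2 >= len(samples))
    return lambda x: maj


def cost(model, samples):
    """Shared cost function: held-out error rate, identical for all models."""
    return sum(model(x) != y for x, y in samples) / len(samples)


models = {"threshold": train_threshold(train), "majority": train_majority(train)}
best = min(models, key=lambda name: cost(models[name], held_out))
print(best)  # the threshold model should win on held-out cost
```

In a real flow, the candidates would be, e.g., tree-based and neural models trained on instrumented verification traces, with the same held-out cost function arbitrating among models 1 through N.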
ulus, which will ensure hitting the hard-to-hit combinations in a highly complex functional design space. Hutter et al. [89] used AI to automatically tune the decision-making procedure of bounded model checking SAT solvers, which in turn would boost the verification procedure. Sometimes verification engineers need to recreate a failure to trace back the input that causes a system failure. However, stimulating the origin of a system failure is time-consuming and computationally draining. Gaur et al. [90] proposed an ML model for this debugging purpose. The ML model is trained to calculate the switching probability of the design output, which can be used to simulate system failures with negligible overhead.

9.2 AI for Hardware Verification
When trying to use ML for hardware verification, the ML workflow shown in Figure 8 has to be followed. Figure 10 illustrates the workflow of how ML can be used to enhance the hardware verification process. In the following subsections, different requirements and possible challenges of using this workflow for hardware verification are discussed in detail.

9.2.1 Requirements/Workflow for Using ML in Verification
Hardware verification is done at every abstraction level of the hardware design flow. For example, after materializing a design's concept and architectural specification, behavioral verification is done at the RTL level. Next, functional verification is done at the gate level, at the transistor level, or during DFT insertion. Finally, the synthesized layout goes through a physical verification process. Therefore, while implementing ML for automating verification, the first requirement is to select the level of abstraction, meaning which method (behavioral, functional, or physical verification) and which level (RTL, gate, transistor, or layout) to use. This is the first step of using ML in pre-silicon verification, as shown in Figure 10.

Next, verification engineers must choose what part of the verification to automate through ML. For example, ML can generate stimuli, generate new test cases to increase code or branch coverage, and produce new guided, constrained-random, or entirely random inputs to hit more functional or behavioral nodes. Therefore, it is required to select the abstraction level and the role of ML in automation as a problem statement in the first part of the ML workflow. As shown in Figure 10, verification engineers have to divide the verification steps into two parts. The first part can be automated through ML, and the second part comprises the state-of-the-art verification steps that do not need further optimization. Verification engineers have to build problem statements and cost functions corresponding to the first part and leave the second part as-is.

One factor that goes into consideration while establishing a problem statement is whether the verification engineer has full access to the design (white-box setting) or has access only to the primary inputs and primary outputs of the design (black-box setting). Traditionally, the design engineer and the verification engineer are two different entities, and the verification engineer does not require any knowledge of the design itself. Also, if the IP came from a 3PIP vendor, white-box knowledge of the IP is inaccessible, as 3PIPs are often encrypted. However, the level of access to the design information (white-box/gray-box/black-box setting) is crucial for collecting training data for the machine learning model. As shown in Figure 10, to get the appropriate training data, the verification engineer may need to instrument the IP. This training data has to fulfill the requirements of being comprehensively representative of the complete problem statement, unbiased, and inclusive of all corner cases. After collecting the required training data and analyzing the data structure, it is mandatory to extract features that best represent the problem statement. These data features will be used to statistically model the problem statement and give human-like predictions using ML models.

Depending on the data available on the IP (as the model can take one of the black-, gray-, or white-box approaches), the verification engineer must select the best ML method (shown in Figure 9) suited for the predefined problem statement and collect data features. Selecting an appropriate model will be a trial-and-error process, because the model will work differently for different problem statements and data features. For example, in Figure 10, different ML models (model 1, model 2, ..., model N) are trained using the same training data, the performance of all these different ML models is compared on the basis of the same cost function, and the best-performing model is selected for automation. While selecting the appropriate method, the verification engineer must also consider the computational power, platform, and resources available for training.

After analyzing, optimizing, and evaluating the trained model, the automated part of the verification process will be integrated with the verification steps that do not require further optimization. This integrated whole will produce an accelerated ML-based hardware verification method.

One of the critical requirements for using ML in verification is forming an evaluation metric and objective function to measure the performance of the ML model. This objective function and evaluation metric must be dynamic, comprehensive, and continuous, so that reinforcement feedback can be provided to increase model accuracy.

9.2.2 Challenges of ML-based Verification
The accuracy and performance of the ML model depend hugely on the quality and quantity of the collected data, and collecting an adequate amount of relevant, unbiased, comprehensive data is always a challenge. It takes numerous man-hours, computational resources, and manual interventions to collect these data. Moreover, most of the collected data are noisy, biased, and not comprehensive in real-life cases. For this reason, pre-processing and data sorting pose a challenge for hardware verification. Extracting appropriate features from the collected data is another demanding task. While building a model, verification engineers need to spend most of their time analyzing the data to get the appropriate specification and feature depth for a model that best represents the problem statement. Selecting the wrong data features will result in poor model performance. This is why careful selection of data features is an essential task.

Efficient model selection has one of the highest impacts. The model structure and depth must be coherent and complementary with the data structure and must train itself
[Fig. 10: ML-based hardware verification workflow: select the abstraction level (HLL model, netlist, FPGA-based emulation, ...); identify the verification steps with high and low automation probability; feed the security policies or properties and a cost-function generator into the instrumentation; collect, analyze, and pre-process data; perform feature extraction; train models 1 through N; select the highest-performing model; and deploy the trained model against a database of vulnerabilities.]
from the given data features. As there are many ML methods and models to choose from, training different models and comparing different methods can give the best-suited model for a particular hardware verification problem. How and where to instrument the code, what platform to use, and what properties to check are also very challenging questions from the verification perspective. Testing, developing, and implementing appropriate objective functions is challenging in real life. For example, one objective function may be appropriate for a certain scenario and irrelevant for another. Developing a generalized, scalable, and appropriate objective function poses a challenge from this perspective.

vulnerabilities in emerging SoCs. In addition, while design-time security verification techniques can detect certain types of vulnerabilities, it is infeasible to remove all possible vulnerabilities during pre-silicon security verification. Due to observability constraints in fabricated SoCs, post-silicon security verification (on initial prototypes or FPGA-based emulation) approaches should be considered as well. Therefore, future research needs to employ both post-silicon and pre-silicon security verification. Finally, security verification tools need to check for various security vulnerabilities across different phases of the design cycle. Specifically, per each technique, the following draws some of the possible future research directions:
software based on fuzz testing that can be incorporated as is into other verification tools.

10.2 Pen Testing

The efficacy of a pre-silicon hardware Pen Testing framework, such as the previously-mentioned BPSO-based architecture, is dictated by how effectively the associated cost functions encapsulate the vulnerability being targeted. The effectiveness of the cost function, in turn, is contingent upon the tester's ability to identify the corresponding inputs and effects of the vulnerability. As we mentioned previously, there is an ever-growing database of vulnerabilities that the community understands how to trigger (at a high level) and what impacts they might have. However, due to the modular design practices prevalent in the industry today, visibility and accessibility within the design are shrinking. This means that gaining access to the signals or points of interest may be a challenge, especially when time constraints are taken into consideration. For example, the designer may anticipate that the impact of a vulnerability will be visible through the common bus used in the SoC; in a pre-silicon setting, however, the time taken to simulate the design for the number of clock cycles needed to trigger the vulnerability may become unacceptably high. In such cases, FPGA emulation of the design can be considered a promising approach to speed up the process.

We note that the previously-mentioned BPSO-based framework assumes no particulars on how the observable point is observed: it can be through simulation, emulation, or any other approach preferred by the tester based on time and cost considerations. So long as the tester can provide the algorithm with observed outputs, the algorithm can mutate the input based on the feedback provided by evaluating the cost function. Manual formulation of cost functions can become non-scalable if the designers want to test a large variety of vulnerabilities across different platforms and architectures. The best approach to tackle this challenge is to devise an automatic cost-function generation methodology based on a general description of the type and scope of the vulnerability as well as microarchitectural implementation details.

10.3 AI Testing

State-of-the-art verification processes require many hours of manual intervention yet still fail to achieve the desired coverage goals. Moreover, so many different IPs (hard, soft, and firm) are integrated from so many different vendors in a practical system that developing scalable and reusable test cases becomes a daunting task. ML has the potential to overcome both of these issues faced by the traditional verification process. However, ML brings its own challenges, because its usage for hardware verification is still in its early stages. Once verification engineers pass the initial hurdle of trial and error and start exploring ML for each verification stage, everything required to build an automated structure will be established: the specification of the collected data, the best extracted features, and the appropriate model structure. From these established structures, and using transfer learning, verification of different IPs with different specifications can be automated to increase coverage. Of course, there will always be unique cases where manual intervention will be needed. However, with the help of reinforcement learning, the rate of required human intervention will drop substantially. One possible future direction would be building this kind of unbiased, scalable, and reusable ML model, which would increase verification coverage and drastically decrease the manual effort required by the existing process. The best way to implement all of this is to tackle one abstraction level at a time, starting at RTL. Verification engineers have to develop an automated method to check the scope of using ML in each step of the verification process and start utilizing ML for the promising steps. Also, generating an evolving cost function with dynamic behaviors is another task. After collecting data and training the model, verification engineers have to check the performance of the whole verification step, with and without ML. If the ML model outperforms the traditional approach with lower overhead, then the ML approach should be established as the standard procedure.

11 CONCLUSION

In this paper, we re-evaluated the fundamentals and principles of SoC security verification. By reviewing the definitions, requirements, and existing challenges throughout the SoC security verification process, we investigated the possibility of utilizing self-refinement techniques to build a more efficient and scalable security verification methodology, specifically for complex and heterogeneous SoCs. We discussed the need for further investigation of these techniques, and by assessing the challenges and possible directions, we hope this article will serve as a navigator for building the next steps in this domain.
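The feedback loop behind such a BPSO-based framework can be sketched as follows. This is a toy discrete binary PSO (in the style of Kennedy and Eberhart) in which the cost function is a stand-in: it measures the Hamming distance of a candidate input to a hidden trigger pattern, whereas a real framework would score outputs observed at the chosen observable point via simulation or emulation:

```python
import math, random

random.seed(1)
N_BITS, SWARM, ITERS = 16, 20, 60
TRIGGER = [random.randint(0, 1) for _ in range(N_BITS)]  # hidden stand-in

def cost(bits):
    # Toy cost: Hamming distance to the (unknown to the tester) trigger
    # pattern; a real cost function would score observed DUT effects.
    return sum(b != t for b, t in zip(bits, TRIGGER))

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-max(-30.0, min(30.0, v))))

# Initialize particles (candidate inputs), velocities, and bests.
xs = [[random.randint(0, 1) for _ in range(N_BITS)] for _ in range(SWARM)]
vs = [[0.0] * N_BITS for _ in range(SWARM)]
pbest = [x[:] for x in xs]
pcost = [cost(x) for x in xs]
g = pbest[min(range(SWARM), key=lambda i: pcost[i])][:]  # global best

for _ in range(ITERS):
    for i in range(SWARM):
        for d in range(N_BITS):
            r1, r2 = random.random(), random.random()
            vs[i][d] = (0.7 * vs[i][d]
                        + 1.4 * r1 * (pbest[i][d] - xs[i][d])
                        + 1.4 * r2 * (g[d] - xs[i][d]))
            # Binary PSO: a bit flips to 1 with probability sigmoid(velocity).
            xs[i][d] = 1 if random.random() < sigmoid(vs[i][d]) else 0
        c = cost(xs[i])
        if c < pcost[i]:                  # update personal best
            pbest[i], pcost[i] = xs[i][:], c
            if c < cost(g):               # update global best
                g = xs[i][:]

print(cost(g))  # residual cost of the best mutated input
```

The key point mirrored here is that the mutation step needs nothing beyond cost-function evaluations of observed outputs, which is why the framework is agnostic to whether those outputs come from simulation or emulation.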
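The transfer-learning idea above can be sketched with a deliberately tiny example. The per-IP "design features" and security labels below are hypothetical stand-ins: a simple logistic-regression model is trained on data from one IP, and its weights then warm-start training for a second IP with a related labeling rule, reusing the established model structure rather than starting from scratch:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-max(-30.0, min(30.0, z))))

def train(data, w=None, b=0.0, epochs=200, lr=0.5):
    """Full-batch gradient descent on logistic loss; returns (w, b)."""
    n = len(data[0][0])
    w = [0.0] * n if w is None else w[:]
    for _ in range(epochs):
        gw, gb = [0.0] * n, 0.0
        for x, y in data:
            p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
            for d in range(n):
                gw[d] += (p - y) * x[d]
            gb += p - y
        w = [wi - lr * gi / len(data) for wi, gi in zip(w, gw)]
        b -= lr * gb / len(data)
    return w, b

def accuracy(w, b, data):
    return sum((sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) > 0.5)
               == bool(y) for x, y in data) / len(data)

# Hypothetical per-IP data: 3-bit feature vectors with a security label
# (e.g., "security assertion likely to fire"). IP A and IP B use related
# labeling rules, so IP A's model is a useful starting point for IP B.
bits = [(a, b_, c) for a in (0, 1) for b_ in (0, 1) for c in (0, 1)]
ip_a = [(x, int(x[0] and x[1])) for x in bits]
ip_b = [(x, int(x[0] and x[2])) for x in bits]

w_a, b_a = train(ip_a)                            # model established on IP A
w_b, b_b = train(ip_b, w=w_a, b=b_a, epochs=50)   # transfer: warm start on IP B
print(accuracy(w_b, b_b, ip_b))
```

In practice the "established structure" would be a far richer model over RTL or netlist features, but the workflow is the same: train once, then fine-tune per IP instead of rebuilding the verification model for every vendor's specification.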