8 CYBERSECURITY FOR THE SMART GRID
Adam Sorini and Ernesto Staroswiecki
Exponent Inc., Menlo Park, CA, United States
CHAPTER OUTLINE
8.1 Introduction
8.2 Cybersecurity Best Practices and Guidelines for the Power Industry
8.2.1 Risk-Based Thinking
8.2.2 C2M2 for Cybersecurity
8.2.3 ES-C2M2 for Cybersecurity
8.2.4 Specific Smart Grid Related Cybersecurity Guidance and Regulations
8.3 Cybersecurity Risk Assessments and Tools
8.3.1 Penetration Testing
8.3.2 Preliminary Hazards Analysis
8.3.3 Failure Modes and Effects Analysis
8.3.4 Fault Tree Analysis
8.4 Challenges and Conclusions
Further Reading
8.1 Introduction
The term “grid” generally refers to the network of wires, substa-
tions, transformers, switches, and other utility industry equipment
used to carry electricity from the source to consumers. Until recently,
data gathering and management of devices in the “grid” was a man-
ual process involving utility workers physically visiting the locations
in the grid. For example, utility workers visit consumer premises to
read meters, inspect for broken equipment, measure voltages, etc.
Slowly, however, computerization and automation are transforming
the industry. This computerization of the grid has led to the use of
the term “smart grid,” much as the term “smart phone” has been
used to refer to the integration of the computer and the phone.1
1
http://energy.gov/oe/services/technology-development/smart-grid.
The Power Grid. DOI: http://dx.doi.org/10.1016/B978-0-12-805321-8.00008-2
Copyright © 2017 Elsevier Ltd. All rights reserved.
234 Chapter 8 CYBERSECURITY FOR THE SMART GRID
Broadly speaking, the term “smart grid” refers to computer-
based remote control and automation technologies that enable
efficiencies in energy use for consumers. The smart grid relies
on digital control, monitoring, and telecommunications to pro-
vide bi-directional flow of energy and information to various
stakeholders in the energy chain, including the electricity gener-
ation plant, commercial, industrial, and residential end users.
This enables a number of efficiencies, such as energy management
during peak hours, “microgrids,” real-time pricing, and mobility
services, and ultimately allows customers to better manage their
consumption and even sell unused home-produced electricity back
to the grid.2
However, with computers and connectivity arises the oppor-
tunity for mischief and attacks. Essentially, a malicious user, be
it a government or another powerful entity, could potentially
launch an attack that undermines the integrity of the smart grid
system, or causes localized-to-widespread failures. As the smart
grid becomes the predominant infrastructure to generate
and distribute electricity, the impact of a widespread attack
could be catastrophic to the economy of the nation and its
infrastructure.
In fact, recent events underscore the potential impact of
well-planned attacks on computerized energy systems. A notable
attack against the U.S. Government’s Office of Personnel
Management was reported in 2015, in which approximately
21.5 million individual background investigation records were
stolen by cyber criminals (some including fingerprint data).3 An
example (now classic) of a true cyber-physical attack is the
Stuxnet computer worm, which is believed to have been developed
in order to manipulate and destroy nuclear centrifuge
systems.4 Another well-known cyber-physical attack was the
recent demonstration5 of remote manipulation of control sys-
tems in automobiles (which are mainly controlled by micropro-
cessors on a controller area network), which resulted in a recall
of 1.4 million vehicles.6
“The architecture of the nation’s digital infrastructure, based
largely upon the Internet, is not secure or resilient. Without
major advances in the security of these systems or significant
2
Aillerie, Y; Kayal, S; Mennella, J; Samani, R; Sauty, S; Schmitt, L. Smart Grid Cyber
Security. McAfee White Paper. http://www.mcafee.com/in/resources/white-papers/
wp-smart-grid-cyber-security.pdf.
3
See, e.g., https://www.opm.gov/news/releases/2015/09/cyber-statement-923/.
4
See, e.g., https://www.wired.com/2014/11/countdown-to-zero-day-stuxnet/.
5
http://illmatics.com/Remote%20Car%20Hacking.pdf.
6
https://www.wired.com/2015/07/jeep-hack-chrysler-recalls-1-4m-vehicles-bug-fix/.
change in how they are constructed or operated, it is doubtful
that the United States can protect itself from the growing threat
of cybercrime and state-sponsored intrusions and operations.
Our digital infrastructure has already suffered intrusions that
have allowed criminals to steal hundreds of millions of dollars
and nation-states and other entities to steal intellectual prop-
erty and sensitive military information. Other intrusions
threaten to damage portions of our critical infrastructure. These
and other risks have the potential to undermine the nation’s
confidence in the information systems that underlie our eco-
nomic and national security interests.”7
8.2 Cybersecurity Best Practices and
Guidelines for the Power Industry
8.2.1 Risk-Based Thinking
Risk-based thinking (RBT) is the currently dominant trend in
quality management practices. RBT is featured prominently in
the 2015 version of the most recognized quality management
standard, ISO 9001, becoming the official tag line for the stan-
dard itself.
RBT is a paradigm that emphasizes risk and risk manage-
ment as concepts that are to be taken into account during every
stage of a project lifecycle. Under this paradigm, risks should first
be identified and documented early on, during the inception
of a product or process. As a project advances, new risks are
identified and the ones previously documented are updated.
These risks are used throughout the development of a product
to inform the decision making process at every stage.
Within the RBT framework, risk is defined as “the effect
of uncertainty on an expected result.” This variance or deviation
from an expected outcome does not need to be construed
exclusively as a negative effect. Risk, as defined, can also lead
to opportunity. Furthermore, risks, as defined, do not necessarily
need to be eliminated or minimized, yet they do need to
be understood as thoroughly as possible, and managed
appropriately.
The energy industry is very familiar with and has been a
trendsetter in risk management. Energy generation and distri-
bution involve activities and products that are inherently risky.
7
White House, Cyberspace Policy Review, available from http://www.whitehouse.gov/
assests/documents/Cyberspace_Policy_Review_Final.pdf.
Power plants and power lines have risks relating to safety, secu-
rity, quality of service, etc. Furthermore, as part of the critical
infrastructure of the country, some of the most stringent stan-
dards relating to risk and quality management are applied to
the energy sector.
The smart grid adds an entirely new set of risks to
those typically present for the energy sector: those involving the
elements of information technology used to make the power
grid a smart grid. Now it is necessary to consider risks involving
cybersecurity, bad actors, and inadvertent or intentional software
errors that could be present at any point in the grid, from
the power generating plant, to the final residential consumer.
From an RBT point of view, these risks should be assessed as
early as possible in the deployment of a smart grid, and enter
the risk management process immediately. For example, con-
sider the risk of a zero-day exploit being used to access metering
devices at consumers’ locations without authorization. After this
risk is identified, a further analysis and assessment of this item
would lead to an understanding of the potential effects of this
event. One of these effects could be a disruption of service to the
affected consumer, which would be a potential safety concern in
cases when consumers depend on the electrical network.
Another effect could be the loss of privacy of consumers if such
unauthorized access allows hackers to track energy usage by
consumers. Further analyses should shed light on other aspects
of these events associated with risks, including their causes, like-
lihood, potential preventive measures, detection, correction after
the event, notification of users, mitigation, etc. This allows for
the management of these risks as is necessary to maintain the
levels of services required for the energy grid.
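The metering example above can be captured as a single entry in a risk register. The following is a minimal sketch only: the `RiskEntry` class, its field names, and the listed cause are hypothetical illustrations and are not taken from any standard or from the C2M2 documentation.

```python
from dataclasses import dataclass, field


@dataclass
class RiskEntry:
    """One row of a hypothetical risk register (field names are illustrative)."""
    description: str
    causes: list = field(default_factory=list)
    effects: list = field(default_factory=list)
    preventive_measures: list = field(default_factory=list)
    status: str = "identified"  # updated as the project advances

    def is_analyzed(self) -> bool:
        # Treat a risk as analyzed once at least one cause and one effect are recorded.
        return bool(self.causes) and bool(self.effects)


meter_risk = RiskEntry(
    description="Zero-day exploit used to access consumer metering devices "
                "without authorization",
)
# Effects named in the text.
meter_risk.effects.append("Disruption of service to the affected consumer")
meter_risk.effects.append("Loss of consumer privacy through tracking of energy usage")
# Hypothetical cause, added only to show how later analysis extends the entry.
meter_risk.causes.append("Unpatched vulnerability in meter firmware")
```

Keeping causes, effects, and preventive measures as separate lists lets the entry grow as further analyses are completed, which mirrors the RBT idea that documented risks are updated throughout the project.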
Some of the classical tools and methods of risk and quality
management, including PHA, FMEA, FTA, etc., have been used
extensively by the energy sector, and in some cases even introduced
or developed by this industry. With some changes, which
are introduced in the following sections, they can be adapted
to account for the cybersecurity concerns
introduced by the deployment of the smart grid. The current
trend of RBT makes these tools, and risk management in
general, essential to the successful development and deploy-
ment of smart grid technologies.
8.2.2 C2M2 for Cybersecurity
The Cybersecurity Capability Maturity Model (C2M2) pro-
gram was developed by the United States Department of Energy
to help improve industrial cybersecurity. It is a subsector-independent
model that was developed based on a previous electricity-subsector-specific
model, the ES-C2M2 (Electricity Subsector
Cybersecurity Capability Maturity Model). The C2M2 seeks to
help strengthen cybersecurity capabilities of organizations and
to help them effectively measure their cybersecurity capabilities.
This model is applicable to organizations regardless of size
or specific industry and concentrates on cybersecurity implementation
and management in relation to information technology (IT) and
operations technology (OT) assets. The C2M2 supports adopting
the NIST Cybersecurity Framework.8
The C2M2 is a generalization of an electricity subsector spe-
cific model called the ES-C2M2. It was also specialized for the
oil and natural gas subsector in a model called the ONG-C2M2.
At the time of the publication of this book, the most recent versions
of the C2M2 and ES-C2M2 are versions 1.1, which are available here:
http://energy.gov/sites/prod/files/2014/03/f13/C2M2-v1-1_cor.pdf;
http://energy.gov/sites/prod/files/2014/02/f7/ES-C2M2-v1-1-Feb2014.pdf.
The C2M2 model itself is described in the document “C2M2-
v1-1_cor.pdf” and the recommended usage of the model is
depicted in Fig. 8.1. An “initial evaluation,” illustrated as the top
hexagon in Fig. 8.1, is the starting point of the model and the
return point for reevaluation after a cycle is complete.
In order to enter into the evaluation cycle and complete a
C2M2-based self-evaluation it is important to select an effective
facilitator. A facilitator is a person who is familiar with the
model and who can help an organization to establish scope for
the cybersecurity evaluation and help a group within the orga-
nization work through the model and achieve the model goals.
The C2M2 supplies a Facilitator Guide.9 An initial evaluation is
performed (with the help of a facilitator) in a workshop-like set-
ting and the end result of the evaluation is a scoring report that
helps identify gaps in performance. The evaluation relies on
participant responses to evaluation questions. The second step
in the model involves analyzing the identified performance
gaps to determine if the identified gaps are truly meaningful for
the organization performing the evaluation. The gaps will gen-
erally relate to a specific Maturity Indicator Level (MIL) that the
organization desires to achieve for a given function. MILs are
8
See, http://energy.gov/oe/services/cybersecurity/cybersecurity-capability-maturity-
model-c2m2-program; http://energy.gov/oe/services/cybersecurity/reducing-cyber-
risk-critical-infrastructure-nist-framework.
9
See, http://energy.gov/mode/795826.
Figure 8.1 Recommended usage of the C2M2 model: Evaluate a function against
model practices; Identify gaps in performance; Prioritize and plan actions
required to achieve desired capabilities; Implement plans to address gaps;
Repeat. [See C2M2-v1-1_cor.pdf document referenced in text at Section 4].
discussed in greater detail below. A given organization will not
necessarily achieve the highest MIL in all the domains of the
C2M2, but rather should seek to achieve the appropriate level
in light of the organization’s overall cybersecurity strategy. In
the third stage of the model, the participants in the model
workshop will determine how the important gaps in cybersecurity
practices affect the organization’s objectives. The costs and
benefits of implementing plans to fill the cybersecurity gaps
will be evaluated and a specific plan will be developed. The
fourth stage entails actually implementing the plans developed
previously, tracking progress toward the planned goals, and
periodic reevaluation.
The C2M2 is divided into 10 domains, which are groupings
of cybersecurity objectives and practices. The 10 domains of the
C2M2 are illustrated in Fig. 8.2.
Each of the 10 domains is associated with specific objectives
and practices, which are achievements and activities that
establish capabilities and support the domain. For example,
the Risk Management domain has three objectives: (1) Establish
Cybersecurity Risk Management Strategy; (2) Manage
Cybersecurity Risk; (3) Management Activities. Each objective can
be rated at a given MIL depending on the level of practice. For
example, the three objectives in the Risk Management domain can
each be described by one of four MILs: MIL0, MIL1, MIL2, and
MIL3, as shown in Fig. 8.3 [See C2M2-v1-1_cor.pdf at Section 5.1].
Figure 8.2 The 10 domains of the C2M2. [See, C2M2-v1-1_cor.pdf document referenced in text at Section 5].
In general, MIL0 indicates that there is no practice; however,
MIL1 can also be attained for some objectives with no practice.
Typically, practices at MIL1 are ad hoc. In addition, MIL1 has
been designed such that all organizations should be capable of
achieving at least MIL1 in all 10 domains. Each of the practices
required for MIL1 in all 10 domains is illustrated in Fig. 8.4.
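The gap-identification step of the evaluation cycle can be sketched as a comparison of achieved and target MILs per domain. This is an illustrative sketch, not part of the C2M2 toolkit: only "Risk Management" is a domain named in the text, while "Domain B", "Domain C", all scores, and the function name `find_gaps` are placeholders assumed for the example.

```python
# Hypothetical self-evaluation results: achieved vs. target MIL (0-3) per domain.
# Only "Risk Management" is named in the text; the other domain names and all
# scores are illustrative placeholders.
achieved = {"Risk Management": 1, "Domain B": 2, "Domain C": 0}
target = {"Risk Management": 2, "Domain B": 2, "Domain C": 1}


def find_gaps(achieved, target):
    """Return {domain: (achieved MIL, target MIL)} where achieved is below target."""
    gaps = {}
    for domain, goal in target.items():
        level = achieved.get(domain, 0)  # unevaluated domains count as MIL0
        if not 0 <= level <= 3 or not 0 <= goal <= 3:
            raise ValueError(f"MIL out of range for {domain}")
        if level < goal:
            gaps[domain] = (level, goal)
    return gaps


gaps = find_gaps(achieved, target)
# Largest shortfall first, as one possible way to prioritize follow-up work.
prioritized = sorted(gaps, key=lambda d: gaps[d][1] - gaps[d][0], reverse=True)
```

Because the target level is set per domain, a domain that already meets its target (here "Domain B") does not appear as a gap even though it is below MIL3, which reflects the point that an organization need not reach the highest MIL everywhere.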
8.2.3 ES-C2M2 for Cybersecurity
The ES-C2M2 was developed by the United States
Department of Energy to help improve cybersecurity in the
electricity subsector. This electricity subsector-specific guidance
was used as the basis for the subsector-independent guidance
of the C2M2 discussed above.
The general purpose of the ES-C2M2 is the same as that of the
C2M2 described above. That is, the purpose is to improve
cybersecurity capabilities, allow consistent benchmarking, and
so on. The 10 domains of the ES-C2M2 are also identical to the
10 domains of the C2M2. Indeed, much of the ES-C2M2 model
and program described in the main body of the document
“ES-C2M2-v1-1-Feb2014.pdf” (available at
http://energy.gov/sites/prod/files/2014/02/f7/ES-C2M2-v1-1-Feb2014.pdf) is
effectively identical to the C2M2 described above, except for Section 3
Figure 8.3 Example MILs for the three objectives of the Risk Management domain of the C2M2, reproduced from
the C2M2 documentation. [See, C2M2-v1-1_cor.pdf document referenced in text at Section 5.1].
of the ES-C2M2, which gives some general information regarding
the electricity subsector. In particular, it is noted that the
ES-C2M2 was developed with four primary electricity sector
functions in mind: (1) generation; (2) transmission; (3) distribu-
tion; (4) markets. Each of these electricity sector functions is
linked either directly or indirectly with the others and each func-
tion also depends on a complex information technology and
operations technology infrastructure.
An additional difference between the ES-C2M2 and the
C2M2 is that some threat, vulnerability, and incident reporting
resources are identified in the electricity subsector specific
form. For example, the Electricity Sector Information Sharing
and Analysis Center (ES-ISAC, See https://www.esisac.com/) is
Figure 8.4 The MIL1 practices of all ten C2M2 domains. [See, C2M2-v1-1_cor.pdf document referenced in text at
Section 5].
mentioned specifically in the ES-C2M2, whereas the C2M2
mentions ISACs in general. As an additional example, the
ES-C2M2 explicitly mentions the DOE Form OE-417 (See
https://www.oe.netl.doe.gov/oe417.aspx) for reporting electric
emergency incidents and disturbances.
8.2.4 Specific Smart Grid Related Cybersecurity
Guidance and Regulations
Electrical power grid high-level domains include: generation,
transmission, distribution, service providers, customers, opera-
tions, and markets.10 Electrical energy flows between the gener-
ation, transmission, distribution, and customer domains.
However, information flows among all these high-level domains
as indicated in Fig. 8.5 below.
10
See, NIST SP-1108r3 table 5-1 and figure 5-1.
Figure 8.5 Smart Grid domains as specified in the NIST Smart Grid Framework (NIST SP 1108, rev 3).
Indeed, the “smart” grid is characterized by a fast two-way
information flow among elements of the high-level domains.
For example, a “smart” meter may be able to transmit informa-
tion from a customer site to a service provider computer. If this
information flow is to take place wirelessly or over public networks,
the data channels may have to be secured. It is the introduction
of new fast data flow channels that allows the grid to act
“smartly” and this massive flow of data can present significant
cybersecurity challenges. Of course, an increase in cybersecurity
challenges affects all industries, not just the electric power
industry. However, there are cybersecurity challenges that are
unique to the electric power industry. For instance, since the
electrical power industry employs numerous and varied high-power
physical control systems, it is subject to high-impact combined
cyber-physical attacks, wherein a purely “cyber” attack is
combined with physical control systems to create a higher
impact attack.11
11
See, e.g., NIST IR 7628 at Section 1.4.
8.3 Cybersecurity Risk Assessments
and Tools
8.3.1 Penetration Testing
Penetration testing is the process of simulating realistic
attacks to test the functionality of security systems. The attacks
are simulated in the sense that even if the vulnerability is
exploited successfully by a penetration tester the underlying
assets are not subsequently harmed by the tester. Moreover, a typical
penetration test is performed with the asset owner’s permission.
The purpose of the test is to gain information about
vulnerabilities and to determine whether suspected vulnerabilities
are actually exploitable.
Penetration testing (in the context of cybersecurity) is often
performed using specially developed computer applications
that assist in testing. For example, the network tool “nmap” is a
freely available application that can be used to automatically
explore computer networks.12 Many other free and proprietary
tools are available that are used by professional penetration tes-
ters. Indeed, even entire operating system distributions (e.g.,
Kali Linux13) are designed specifically for penetration testing
and preloaded with numerous free “pen testing” tools. In this
context, pen tests often focus on computer networks since the
network is the path by which most information flows and by
which access to informational assets is likely to be most easily
attained surreptitiously. The concept of network penetration
testing is immediately applicable to smart grid technology.
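As a rough illustration of the kind of network exploration that tools such as nmap automate, the sketch below checks which TCP ports on a host accept connections. It uses only the Python standard library; the function name `open_tcp_ports` and the timeout value are assumptions made for this example, and it is far simpler than a real scanner. As with any penetration test, it should only be run against systems for which the asset owner has given permission.

```python
import socket


def open_tcp_ports(host, ports, timeout=0.5):
    """Return the subset of `ports` on `host` that accept a TCP connection."""
    found = []
    for port in ports:
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
            s.settimeout(timeout)
            # connect_ex returns 0 on success instead of raising an exception.
            if s.connect_ex((host, port)) == 0:
                found.append(port)
    return found


# Example usage against the local machine only (ports chosen for illustration):
# print(open_tcp_ports("127.0.0.1", [22, 80, 443]))
```

A full TCP connect is the simplest and most visible form of probing; dedicated tools offer quieter and more informative techniques, but the underlying question (which network services are reachable) is the same.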
In a smart grid system, the smart components are typically
connected via networks in order to make use of information
being produced by grid components. The amount of informa-
tion and the ease with which this information is transferred is
the underlying feature (or vulnerability) of the smart grid that
was not present in the “old” grid system. For example, in the
“old” electrical grid system the service provider would have to
be informed that the power was “out” via a land-line phone call
from a residential customer. However, with a smart meter, the
meter itself can sense and report a power outage to the service
provider via the Internet.
The connection of smart “things” (e.g., meters, solar panels,
light bulbs, cars, refrigerators, etc.) via network interfaces to the
Internet is known as the “Internet of Things.” This collection of
12
https://nmap.org/.
13
https://www.kali.org/.
nonhuman “smart” components communicates amongst them-
selves and with humans via the Internet. Electrical grid compo-
nents on the Internet of Things are particularly interesting from
a cyber-security perspective since they can control important
electrical-physical resources and are subject to cyber-physical
attacks.
Consider an example of a solar power generation system for
residential use. A typical system consists of roof-mounted solar
panels connected to inverter modules, and a distribution/com-
munication module located in the home. A system’s distribu-
tion/communication module can include built-in Power Line
Communication functionality for monitoring individual solar
panels/inverters. The built-in communication unit typically
includes a computer that can perform a number of functions.
For example, the residential solar power system may be
able to: read programs/data from an “SD card”; display information
to the user; and perform any number of other typical computer
functions.
The typical example system described above could also be
connected via Cat-5 cable to a router and it could send data
about the home solar power system to a remote server for sub-
sequent analysis via Web-based tools. Data sent to the external
remote server can typically be accessed via a Web interface by
the home owner. The registered system installer can also likely
access system data, possibly at a deeper level (more data) than
the home owner for diagnostic purposes.
Network-connected systems like the above-described solar
panel communication system communicate via the Internet. If
hundreds, or thousands, or millions of these systems are manu-
factured and set up in homes, they can present a large geo-
graphically distributed attack surface, where attack may be
possible via the Internet. For example, if these types of solar
energy communication systems (or any smart electrical system)
accept incoming connections or display banners on any
Internet socket port then they can likely be located by penetra-
tion testers or hackers on an Internet-of-Things search engine
such as shodan.io.14 For example, a search for “solar” on
Shodan.io turns up hundreds of hits in the United States alone.
Similarly, a search for “electric” turns up thousands of hits,
some of which are shown below in Fig. 8.6.
Penetration testing of smart grids and other distributed
cyber-physical systems can be augmented by IoT search
engines as described above. Realistic penetration testing of the
14
https://www.shodan.io/.
Figure 8.6 Screen shot of an Internet-of-Things search result on “shodan.io” for the search term “electric,”
showing the first six search results with IP addresses located in the United States. Annotation in the form of black
bars has been added to obfuscate the IP addresses and other information regarding the “smart” devices
(modified from shodan.io).
grid will include similar types of attacks to test the functionality
of the system. The results of a penetration test will include use-
ful data on how to better secure the grid system against real-
world attackers.
A penetration test performed on the smart grid components for
a given company could involve testing the entire set of smart com-
ponents the company has produced and distributed to geographi-
cally disparate locations. If any of the smart devices are running
old operating systems with known vulnerabilities, one starting
point for penetration testing could be to try all known exploits of
the operating systems using automated techniques.15
8.3.2 Preliminary Hazards Analysis
An electrical power grid cybersecurity Preliminary Hazards
Analysis (PHA) is a top-down risk assessment process that is used
for identifying assets, threats, and vulnerabilities of power grid
components. The analysis is conducted via a team approach where
the analysis team is made up of experts from different disciplines
including power grid technologies, cybersecurity principles, and
other diverse aspects of the specific electrical power components
under investigation. The goal is exhaustive analysis, and this is
more easily achieved when a multidisciplinary team attacks the
problem from a number of different perspectives.
The first step in the PHA is to identify the assets that need to
be protected. Although “asset discovery” may seem simple, it is
essential because if there are no assets to secure then there is
no need for security. Therefore one must identify all the relevant
assets. The valuation of assets is not performed at this initial
stage. Rather, initially, the team seeks to exhaustively identify all
assets that require protection. As part of the asset discovery
phase the PHA team needs to learn as much as possible about
the specific power grid functionality, service, or product that is
the subject of investigation. The team may have general exper-
tise, but specific details of the product/service are required in
order to have a comprehensive understanding of all the assets.
The second stage of the PHA is the “threat discovery phase”
in which all the possible threats to the previously
identified assets are cataloged. The diverse make-up of the PHA
team adds to the effectiveness of threat discovery “brainstorming.”
Again, at this stage, the threats do not have to be ranked,
15
For example, using Metasploit (see https://www.metasploit.com/index.jsp) or other
penetration testing suites.
but rather the goal of threat discovery is to catalog all the possi-
ble threats to the power grid technology under investigation.
After the asset-discovery and threat-discovery phases, the next
step is to estimate the severity level and the likelihood for each of
the identified threats. Severity level refers to the amount of damage
that would occur if the threat were to be successfully carried out.
The severity level scale is typically a qualitative one such as: “low,”
“medium,” and “high.” Likelihood refers to the probability of the
threat actually being carried out. Typically this scale will also be
qualitative (e.g., “low,” “medium,” “high”) or semi-quantitative. In
general, estimates of likelihood and severity are subject to uncertainty,
and quibbling over specific values is probably not useful.
When in doubt, go with the more conservative estimate.
Establishing these severities and likelihoods for each identified
threat is important for having a well-defined system for ranking
threats. The risk associated with the threat is a function of both the
severity and the likelihood and can be taken as the product of
severity and likelihood in a semi-quantitative model.
As illustrated in Fig. 8.7, this relationship models the fact
that a high severity does not necessarily result in a high risk if
the likelihood is low enough (and vice-versa). For example, the
power grid system under consideration is certainly vulnerable
to destruction by a massive asteroid from space, but since this
particular threat is of very low likelihood the risk is low enough
that it need not even be considered.
After the likelihood and severity have been determined for
each threat, the risk associated with all the identified threats
Figure 8.7 Qualitative “Risk
Matrix” illustration of the
relationship between Severity
and Likelihood.
can be established. A key deliverable of the PHA is a risk-ranked
catalog of threats along with the associated documentation
used to establish risk, threat, and asset information. This pro-
vides a complete picture of the security environment and can
be used as a launching point for subsequently instituting risk
mitigation strategies in a focused manner.
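The risk-ranked catalog can be sketched with the semi-quantitative model described above, in which risk is the product of severity and likelihood. The mapping of "low", "medium", and "high" to the scores 1 to 3, the threat descriptions other than the asteroid example, and the assigned levels are all assumptions made for illustration.

```python
# Ordinal scores for the qualitative scales (the 1-3 mapping is an assumption).
LEVELS = {"low": 1, "medium": 2, "high": 3}


def risk_score(severity, likelihood):
    """Semi-quantitative risk: product of severity and likelihood scores."""
    return LEVELS[severity] * LEVELS[likelihood]


# Hypothetical threat catalog: (threat, severity, likelihood).
threats = [
    ("Asteroid impact destroys substation", "high", "low"),
    ("Unauthorized remote access to smart meters", "medium", "high"),
    ("Tampering with a single pole-mounted sensor", "low", "medium"),
]

# Highest risk first; this ordering is the risk-ranked catalog.
ranked = sorted(threats, key=lambda t: risk_score(t[1], t[2]), reverse=True)
for threat, severity, likelihood in ranked:
    print(risk_score(severity, likelihood), threat)
```

In this sketch the high-severity asteroid threat ranks below the medium-severity meter intrusion because its likelihood is low, which is the behavior the risk matrix is meant to capture.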
8.3.3 Failure Modes and Effects Analysis
One of the most commonly used tools for risk management
is the Failure Modes and Effects Analysis (FMEA). FMEA is one
of the earliest systematic techniques developed for failure anal-
ysis, and has been part of the basic toolset of reliability and
quality engineers since the 1950s. A closely related analysis is
the Failure Modes, Effects, and Criticality Analysis, where the
criticality analysis is undertaken in more detail.
FMEAs are widely used in almost all stages of a product’s life-
cycle, and there are different variations of traditional FMEAs,
including Process, Design, and Functional FMEAs. In all cases, gen-
erating a successful FMEA requires first identifying all potential
failure causes and the failure modes they generate in every
system, subsystem, and component of the product being developed.
An FMEA typically requires a worksheet with an entry for each failure
mode identified. On each entry the potential effects and causes of
the failure are recorded. Following that, an FMEA requires an assess-
ment of each failure’s probability (P) (also called likelihood or
occurrence), severity (S), and detectability (D) (also called observability).
Typically, each of these variables, P, S, and D, is assigned a
value between 1 and 10, with 1 being the least probable, least severe,
and most detectable value, and 10 being the most likely, most severe,
and least detectable value. The product P × S is called the criticality
(C) of the failure, and the product P × S × D is the Risk Priority Number (RPN).
These values can be used to prioritize risk management activities
(i.e., the order in which the different potential failures should be
addressed). FMEA is a bottom-up, inductive reasoning analysis, and
is especially well suited to analyzing the effects of single function or
component failures on systems or subsystems.
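The worksheet arithmetic can be sketched as follows: each entry carries P, S, and D on the 1 to 10 scales described above, criticality is P × S, and the Risk Priority Number is P × S × D. The `FailureMode` class, the failure mode descriptions, and the scores are hypothetical values chosen only to show the calculation.

```python
from dataclasses import dataclass


@dataclass
class FailureMode:
    name: str
    p: int  # probability, 1 (least probable) to 10 (most likely)
    s: int  # severity, 1 (least severe) to 10 (most severe)
    d: int  # detectability, 1 (most detectable) to 10 (least detectable)

    def __post_init__(self):
        for value in (self.p, self.s, self.d):
            if not 1 <= value <= 10:
                raise ValueError("P, S, and D must each be between 1 and 10")

    @property
    def criticality(self) -> int:
        return self.p * self.s

    @property
    def rpn(self) -> int:
        return self.p * self.s * self.d


# Hypothetical worksheet entries.
worksheet = [
    FailureMode("Meter reports stale readings", p=4, s=3, d=6),
    FailureMode("Inverter communication module locks up", p=2, s=7, d=3),
    FailureMode("Relay fails to open on command", p=1, s=10, d=8),
]

# Address the highest Risk Priority Number first.
by_rpn = sorted(worksheet, key=lambda fm: fm.rpn, reverse=True)
```

Note that ranking by RPN and ranking by criticality can disagree: the relay failure has the lowest criticality of the three entries but the highest RPN, because it is assumed to be hard to detect.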
Traditional FMEAs work well for the design and other life-
cycle stages of traditional manufactured products. However,
with some modifications, FMEAs can be an effective and valuable
tool for software and for products that include software,
including for assessing risks related to cybersecurity concerns.
The use of Software FMEAs (SFMEAs) was proposed as
early as 1979. The changes required to make the FMEA process
useful for software products are based on the aspects of software
development that differ from hardware product manufacturing.
Hardware products are susceptible to manufacturing defects (i.e.,
defects resulting from the copying of a design into many products)
as well as wear from use. This is not the case for most software pro-
ducts or software components of products, where the defects, or
bugs, are introduced during the development of the software (i.e.,
the design or coding of the software product), a stage typically
associated with the design stage of a hardware product lifecycle.
Furthermore, it is understood that SFMEAs are more useful if they
are undertaken early on in the software development lifecycle.
Additionally, the traditional hardware product FMEA does
not capture intentional actions by bad actors, yet this is neces-
sary in order to take into account threats, vulnerabilities, and
other cyber-security considerations. Several FMEA-based approaches
(e.g., Intrusion Modes, Effects, and Criticality Analysis, or
IMECA; Failure Modes, Vulnerabilities, and Effects Analysis, or
FMVEA) use the structure and methods of FMEAs,
and add to the traditional failure modes those caused by intrusions
that have been enabled by system vulnerabilities.
As an example, FMVEAs are created using the basic structure of
an FMEA. An early step of an FMEA is to identify the failure causes
involved in the system under analysis. The FMVEA approach additionally
requires the identification of combinations of potential vulnerabilities
and threats related to the same system as causes of
failures. The next step involves the identification of the failure modes
(in the case of failure causes) or threat modes (in the case of vulner-
abilities and threats) produced by the causes of failures. The follow-
ing stages are common to FMEAs and FMVEAs: evaluate the effects
of the failure or threat modes, assess the severity of said effects, and
finally estimate or measure the probability of each failure or threat
mode, allowing at this point an estimation of the criticality of each
failure or threat mode. Lastly, incorporating the observability of each
failure or threat mode into the analysis results in a more complete
picture of the risks associated with the system analyzed.
Finally, Software FMEAs and FMEA-based security
assessment frameworks are extremely valuable tools for risk
management involving software products and cyber-security
concerns. However, given the high level of uncertainty in evaluating the
likelihood of software failures and cyber-security intrusions,
these tools have proven most useful as qualitative
tools that are well suited for comparing and prioritizing
risk mitigation or elimination activities.
8.3.4 Fault Tree Analysis
Another widely used technique for safety and reliability
analysis is Fault Tree Analysis (FTA). FTA was developed in the
early 1960s, and it is a top-down, deductive reasoning analysis,
making it complementary to FMEA. While FMEAs are intended to
analyze the effects of single function or component failures on
systems or subsystems, FTAs are better suited to analyzing the
resistance of a system to single or multiple initiating faults. Given
their contrasting nature, both FMEAs and FTAs are typically
carried out during a product’s lifecycle to evaluate the associated risks,
reliability, quality, and safety of that product.
The FTA is a procedure used to determine which events,
actions, and faults, and which combinations of events, actions,
and faults can lead to a system failure or other undesired event.
The process begins with a conclusion, the top-level failure or fault
condition to be analyzed. The next step in an FTA is to construct
a multi-level logic diagram (called a fault tree) of the potential
causes for the analyzed failure, starting in a very general sense for
the causes at the second level, and increasing in specificity as the
levels increase. For example, consider a top-level failure of a
“computer crash.” Appropriate second-level causes include “hardware
failure” and “software failure.” Third-level causes for
“software failure” could include “operating system failure,” “driver
failure,” and “application program failure.” Each of these failures
is considered a node, and each can in turn have causes.
The causes of a node are related to that node through a logical
operation. For example, if any one of several causes is enough to
generate the failure described in the node, the node
failure is the result of an OR operation with the causes as inputs; if
all of the causes must occur to generate the failure,
the node failure is the result of an AND
operation with the causes as inputs. In many cases it is possible to
envision a failure node as the result of a Boolean-logic operation on
the causes considered for that node. However, there are extensions
to Boolean logic used for FTAs that take into account the order
in which causes would need to occur to generate the failure
described in the node.
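The AND/OR structure of a fault tree can be sketched in a few lines of Python. The fragment below is only an illustration under simple assumptions: it handles plain Boolean gates (not the order-dependent extensions mentioned above), the class and function names are hypothetical, and the two power supply causes placed under “hardware failure” are invented here to show an AND gate; they are not part of the example in the text.

```python
from dataclasses import dataclass
from typing import List, Set, Union


@dataclass
class BasicEvent:
    """A lowest-level cause that either has or has not occurred."""
    name: str


@dataclass
class Gate:
    """A failure node whose causes are combined by AND or OR."""
    name: str
    kind: str  # "AND" or "OR"
    causes: List[Union["Gate", BasicEvent]]


def occurs(node: Union[Gate, BasicEvent], active: Set[str]) -> bool:
    """Return True if the node's failure follows from the active basic events."""
    if isinstance(node, BasicEvent):
        return node.name in active
    results = [occurs(cause, active) for cause in node.causes]
    if node.kind == "AND":
        return all(results)
    if node.kind == "OR":
        return any(results)
    raise ValueError(f"unknown gate kind: {node.kind}")


# Third-level causes taken from the "computer crash" example.
software = Gate("software failure", "OR", [
    BasicEvent("operating system failure"),
    BasicEvent("driver failure"),
    BasicEvent("application program failure"),
])
# Hypothetical causes, added only to illustrate an AND gate.
hardware = Gate("hardware failure", "AND", [
    BasicEvent("primary power supply failure"),
    BasicEvent("backup power supply failure"),
])
crash = Gate("computer crash", "OR", [hardware, software])

print(occurs(crash, {"driver failure"}))                # single cause via OR
print(occurs(crash, {"primary power supply failure"}))  # AND gate not satisfied
```

A single driver failure is enough to reach the top-level event through the OR gates, while the loss of only one of the two power supplies is not, because the AND gate requires both.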
FTAs are typically used to identify potential causes of a sys-
tem failure before it occurs, and to guide testing and failure
analysis efforts once an undesirable event has taken place. If
the failure rates or probabilities of all lower-level nodes are
known, FTAs can also provide estimates of the probability of
the top-level event occurring. As with many other quantitative
and qualitative analysis techniques, there are software tools that
facilitate their application. Within the energy sector, one of the
most widely used FTA software tools is EPRI’s CAFTA
(Computer Aided Fault Tree Analysis).
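The quantitative use of a fault tree can be sketched as follows. This is a simplified illustration, not a description of how CAFTA or any other tool works: it assumes that all basic events are statistically independent and that no basic event appears more than once in the tree, so an AND gate multiplies the probabilities of its causes and an OR gate gives one minus the product of their complements. The tree representation, the function name, and the probability values are hypothetical.

```python
from math import prod


def top_event_probability(node):
    """Estimate the probability of a fault tree node.

    A node is either a number (the probability of a basic event) or a
    tuple (gate, causes), where gate is "AND" or "OR" and causes is a
    list of nodes. Assumes independent, non-repeated basic events.
    """
    if isinstance(node, (int, float)):
        if not 0.0 <= node <= 1.0:
            raise ValueError("probability must be between 0 and 1")
        return float(node)
    gate, causes = node
    probabilities = [top_event_probability(cause) for cause in causes]
    if gate == "AND":
        # All causes must occur.
        return prod(probabilities)
    if gate == "OR":
        # At least one cause occurs.
        return 1.0 - prod(1.0 - p for p in probabilities)
    raise ValueError(f"unknown gate: {gate}")


# Illustrative values: two redundant components (AND) or one single
# point of failure lead to the top-level event.
tree = ("OR", [("AND", [0.1, 0.2]), 0.05])
p_top = top_event_probability(tree)
print(f"estimated top-level event probability: {p_top:.3f}")
```

When basic events are shared between branches or are not independent, this simple product rule no longer holds and a more careful treatment is needed.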
The application of FTAs to software products is called a
Software FTA (SFTA), and was first described in the early 1980s.
As with other techniques, the modifications required to make
FTAs useful for software are related to the different nature of
software faults and hardware faults. Software faults are the
result of a defect, typically incorporated into the product during
the design stages, unlike hardware faults, which can be the result
of wear or manufacturing problems. Therefore, software faults
are not probabilistic in the same way that hardware faults are: a
software fault is either present in the software (and in every
copy of it) or not. So while SFTAs are still powerful tools to
understand system failures, their quantitative aspects are
limited compared to hardware FTAs.
In contrast with FMEAs, FTAs are well suited to considering
external causes of failure in their analysis. These external causes
may include actions of users or bad actors. An FTA-based
technique focused on top-level undesirable events
resulting from attacks on a system in conjunction with the system’s
vulnerabilities is called an Attack Tree Analysis (ATA) or a
Threat Tree Analysis. ATAs have been described since the 1990s,
and are useful tools to understand the potential security threats
to a system, the protective measures already in place, and the
risks remaining. They can also be used to prioritize these risks,
and to evaluate the impact of mitigating activities on the overall
security of the system being analyzed.
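One way to picture how an attack tree supports this kind of evaluation is to list the combinations of attacker steps that reach the top-level event, and then see which of them survive a given mitigation. The Python sketch below does this for a small tree. The tree, the step names, and the function names are all hypothetical, and the sketch makes the simplifying assumption that a mitigation blocks a step completely.

```python
from itertools import product


def scenarios(node):
    """List the sets of attacker steps that achieve the given node.

    A node is either a string (a single attacker step) or a tuple
    (gate, children), where gate is "AND" or "OR".
    """
    if isinstance(node, str):
        return [frozenset([node])]
    gate, children = node
    child_scenarios = [scenarios(child) for child in children]
    if gate == "OR":
        # Any one child is enough: collect the scenarios of every child.
        return [s for group in child_scenarios for s in group]
    if gate == "AND":
        # Every child is needed: combine one scenario from each child.
        return [frozenset().union(*combo) for combo in product(*child_scenarios)]
    raise ValueError(f"unknown gate: {gate}")


def remaining_scenarios(node, mitigated):
    """Scenarios that use none of the fully mitigated steps."""
    return [s for s in scenarios(node) if not (s & mitigated)]


# Hypothetical top-level event: an unauthorized command reaches a field device.
attack_tree = ("OR", [
    ("AND", ["steal operator credentials", "reach control network remotely"]),
    "gain physical access to device port",
])

before = scenarios(attack_tree)
after = remaining_scenarios(attack_tree, {"reach control network remotely"})
print(len(before), "scenarios before mitigation")
print(len(after), "scenario(s) after mitigation:", [sorted(s) for s in after])
```

In this example, blocking remote access to the control network removes the credential-based scenario but leaves the physical access scenario in place, which is the remaining risk the analysis would flag.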
8.4 Challenges and Conclusions
We have presented a number of cybersecurity practices and
tools for protecting the smart grid. To be effective, these
practices and tools must be properly applied and maintained.
To this end, employee awareness and education regarding
cybersecurity are essential. The cybersecurity capabilities of the
grid must be tested periodically via “penetration testing”
techniques. Periodic threat and risk assessment is also important.
Finally, even the most secure system will eventually be broken.
As such, a disaster recovery plan for use in the event of a security
breach should also be implemented and tested.
There are many resources available to stakeholders, including
education and training from the International Information
System Security Certification Consortium, (ISC)², or the
International Council of E-Commerce Consultants (EC-Council);
literature from a variety of sources, including the
American Bar Association’s books The ABA
Cybersecurity Handbook and A Playbook for Cyber Events; and a
plethora of cybersecurity conferences throughout the world.
Further Reading
American Bar Association Standing Committee on Law and National Security.
2014. Cybersecurity working group. A Playbook for Cyber Events, second ed.
Greenberg, A., 2015. After Jeep hack, Chrysler recalls 1.4M vehicles for bug fix,
Wired. Available from: <https://www.wired.com/2015/07/jeep-hack-chrysler-recalls-1-4m-vehicles-bug-fix/>
(accessed 30.08.16).
Kali, 2016. Our most advanced penetration testing distribution, ever, Kali Linux.
Available from: <https://www.kali.org/> (accessed 30.08.16).
McAfee, 2013. Smart grid cyber security, Alstom Grid, Intel, McAfee. Available
from: <http://www.mcafee.com/in/resources/white-papers/wp-smart-grid-cyber-security.pdf>
(accessed 30.08.16).
Metasploit, n.d. Metasploit home page. Available from:
<https://www.metasploit.com/index.jsp> (accessed 30.08.16).
Miller, C., Valasek, C., 2015. Remote exploitation of an unaltered passenger vehicle,
Illmatics. Available from: <http://illmatics.com/Remote%20Car%20Hacking.pdf>
(accessed 30.08.16).
National Institute of Standards and Technology, 2010. Introduction to NISTIR
7628 guidelines for smart grid cyber security, US Department of Commerce.
Available from: <http://www.nist.gov/smartgrid/upload/nistir-7628_total.pdf>
(accessed 31.08.16).
National Institute of Standards and Technology, 2014. NIST framework and
roadmap for smart grid interoperability standards, release 3.0, US Department
of Commerce. Available from: <http://www.nist.gov/smartgrid/upload/NIST-SP-1108r3.pdf>
(accessed 31.08.16).
NMAP.org, n.d. NMAP home page. Available from: <https://nmap.org/>
(accessed 30.08.16).
OPM.Gov, 2015. Statement by OPM press secretary Sam Schumach on background
investigations incident, US Government. Available from:
<https://www.opm.gov/news/releases/2015/09/cyber-statement-923/> (accessed 30.08.16).
Rhodes, J.D., Polley, V.I., 2013. The ABA cybersecurity handbook: a resource for
attorneys, law firms, and business professionals. American Bar Association.
Shodan, 2016. Shodan home page. Available from: <https://www.shodan.io/>
(accessed 30.08.16).
U.S. Department of Energy, n.d. Cybersecurity capability maturity model (C2M2)
program, US Government. Available from:
<http://energy.gov/oe/services/cybersecurity/cybersecurity-capability-maturity-model-c2m2-program>
(accessed 30.08.16).
U.S. Department of Energy, n.d. Reducing cyber risk to critical infrastructure: NIST
framework, US Government. Available from:
<http://energy.gov/oe/services/cybersecurity/reducing-cyber-risk-critical-infrastructure-nist-framework>
(accessed 30.08.16).
U.S. Department of Energy, n.d. Smart grid, US Government. Available from:
<http://energy.gov/oe/services/technology-development/smart-grid> (accessed
30.08.16).
Zetter, K., 2014. An unprecedented look at Stuxnet, the world’s first digital
weapon, Wired. Available from: <https://www.wired.com/2014/11/countdown-to-zero-day-stuxnet/>
(accessed 30.08.16).