Cyber Infrastructure for the Power Grid
Dr. Anurag K. Srivastava, Dr. Carl Hauser, Dr. Dave Bakken
Computation Lecture 2: Data Management
Today's Content
1. Background
2. UML
3. XML
4. RDF
5. CIM (IEC 61970)
6. IEC 61850 and PMUs/WANs
Power Systems Data Formats and System Integration
EMSs long had app-specific, proprietary file formats
Some crude: column-oriented, fixed-width, tab-separated (similar to FORTRAN)
Deregulation is pushing vendors, utilities, etc. away from
doing their own custom software and data formats
Interoperability is key in the evolving EMS world
Problems arise
Sharing data between different vendors'/utilities' SW
Sharing data even between two versions of same SW
Solution possibilities
1. Maintain multiple copies of same data in multiple
formats
2. Store the data in a format compatible with every
piece of SW
Have to remove app-specific data & lose precision
3. Store data in single, highly-detailed format and
create SW to convert to app-specific formats
4. Use highly detailed format compatible w/every app
#3 and #4 don't require separate copies and don't
lose precision
#4 is ideal solution
Data Management in the Grid
#4: Use highly detailed format compatible w/every
app
Requirements for this to work
1. Highly detailed model to describe the power
system
2. File format capable of storing extended data
without affecting the core data
3. Power system SW vendors and utilities adopt and
embrace the data model
CIM can do #1
XML with RDF can do #2
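Requirement 2 can be sketched with XML namespaces: a vendor stores extended data in its own namespace, and an application that only knows the core elements still reads the same file. The element names and namespace URIs below are made up for illustration and are not taken from CIM.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespaces: "core" stands for the shared model,
# "vx" for one vendor's private extension.
DOC = """
<core:Breaker xmlns:core="http://example.org/core#"
              xmlns:vx="http://example.org/vendor-x#">
  <core:name>BRK-1</core:name>
  <core:ratedCurrent>1200</core:ratedCurrent>
  <vx:firmwareVersion>2.7</vx:firmwareVersion>
</core:Breaker>
"""

CORE = "{http://example.org/core#}"


def read_core(xml_text):
    """Return only the core-namespace fields; extension elements are skipped."""
    root = ET.fromstring(xml_text)
    return {
        child.tag[len(CORE):]: child.text
        for child in root
        if child.tag.startswith(CORE)
    }


core_fields = read_core(DOC)
print(core_fields)
```

The vendor element travels in the same file, but the core reader never sees it, so the core data is unaffected.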
Unified Modeling Language (UML)
Used for modeling wide variety of SW components
Data structures
System interactions
Use cases
Not tied to any one implementation technology:
realizable on multiple platforms
Widely used outside the electricity sector
Standardized by the Object Management Group
(OMG), which largely does middleware
Classes
Class represents specific type of object
being modeled
Class hierarchy is an abstract model: defines
every type of component within a system as
a separate class
Very similar to object-oriented
programming languages: Smalltalk, C++,
Java, C#.
Each UML class can have
Its own internal attributes
Relationships with other classes
Example circle class
Each instance has its
own copy of the data
Useful to have more
kinds of data than just
a circle!
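A minimal sketch of the circle class in an OO language, assuming a position and a radius as its attributes (the slide's figure is not reproduced here, so these attribute names are illustrative):

```python
import math
from dataclasses import dataclass


@dataclass
class Circle:
    # Illustrative attributes; each instance stores its own values.
    x: float
    y: float
    radius: float

    def area(self) -> float:
        return math.pi * self.radius ** 2


a = Circle(0.0, 0.0, 1.0)
b = Circle(5.0, 2.0, 3.0)
b.radius = 4.0  # changing one instance leaves the other untouched
print(a.radius, b.radius)
```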
Inheritance
Inheritance (AKA generalization) defines class and
sub-class relationships
Sub-classes inherit attributes from parent class (and up)
Kinds of classes
Concrete: can instantiate instances
Abstract: can't instantiate instances, use as common
parent class for other classes
E.g. Shape class hierarchy (abstract class)
Circle
Triangle
Rectangle
Square
UML inheritance: clear triangle on parent side of line
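The same hierarchy can be sketched in code. Placing Square under Rectangle and the choice of attributes are assumptions made for the example, not read off the slide's diagram.

```python
from abc import ABC, abstractmethod


class Shape(ABC):
    """Abstract parent: holds what every shape shares."""

    def __init__(self, x: float, y: float):
        self.x = x
        self.y = y

    @abstractmethod
    def area(self) -> float:
        ...


class Rectangle(Shape):
    def __init__(self, x, y, width, height):
        super().__init__(x, y)
        self.width = width
        self.height = height

    def area(self):
        return self.width * self.height


class Square(Rectangle):
    def __init__(self, x, y, side):
        super().__init__(x, y, side, side)


sq = Square(1, 2, 3)
print(sq.x, sq.area())  # x comes from Shape, area() from Rectangle

try:
    Shape(0, 0)         # abstract class: instantiation is refused
    refused = False
except TypeError:
    refused = True
print("Shape() refused:", refused)
```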
Inheritance (cont.)
Association
Relationships other than parent-child are useful!
E.g., a Shape can have a Style
UML depiction: a line with role and cardinality at
each end
A Shape has a role Style with cardinality 1
A Style has a role Shapes with cardinality 0..*
Implications: subclasses of Shape
Must all have a Style
Different instances may share or have a different Style
Different instances may share or have a different Style
( below this as in previous slides )
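One way to picture the Shape/Style association in code, assuming each end of the association is held as a plain object reference (the back-reference list on Style is a design choice of this sketch):

```python
class Style:
    def __init__(self, name: str):
        self.name = name
        self.shapes = []  # role "Shapes", cardinality 0..*


class Shape:
    def __init__(self, style: Style):
        # Role "Style", cardinality 1: a Shape cannot exist without one.
        if style is None:
            raise ValueError("a Shape needs exactly one Style")
        self.style = style
        style.shapes.append(self)


class Circle(Shape):
    pass


class Square(Shape):
    pass


dashed = Style("dashed")
solid = Style("solid")
c1, c2, s1 = Circle(dashed), Circle(dashed), Square(solid)
print(len(dashed.shapes), len(solid.shapes))
```

Both circles share one Style object, the square has a different one, and a Style with no shapes at all is still valid.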
Aggregation
A special kind of association,
indicating one is a container of
instances of the other
UML depiction: a line with a clear diamond on the
container end, with role & cardinality at each end
Clear diamond: If the container is
destroyed, the contained objects
still exist
Example: a Layer that contains
Shape objects and can be toggled
on and off
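A rough sketch of the Layer example, assuming the layer simply keeps references to shapes that were created elsewhere:

```python
class Shape:
    def __init__(self, name: str):
        self.name = name


class Layer:
    """Container in an aggregation: it refers to shapes but does not own them."""

    def __init__(self):
        self.shapes = []
        self.visible = True

    def add(self, shape: Shape):
        self.shapes.append(shape)

    def toggle(self):
        self.visible = not self.visible


circle = Shape("circle")
layer = Layer()
layer.add(circle)
layer.toggle()              # hide everything on the layer
hidden = not layer.visible
del layer                   # destroy the container...
print(hidden, circle.name)  # ...the shape is still usable
```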
Composition
Specialized form of aggregation: contained object is
fundamental part of the container object
If container destroyed then contained objects are too
E.g., an Anchor class to show where a line can be
attached (anchored) on a Shape
UML depiction: filled diamond on the container end
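Composition can be sketched by letting the container create and tear down its parts. The explicit `destroy()` method is an assumption of this example; Python has no destructors in the C++ sense, so the lifetime rule is made visible by hand.

```python
class Anchor:
    def __init__(self, owner, dx: float, dy: float):
        self.owner = owner
        self.dx, self.dy = dx, dy  # offset from the shape's origin


class Shape:
    def __init__(self, x: float, y: float):
        self.x, self.y = x, y
        # Composition: the Shape creates its own anchors and never
        # shares them with another Shape.
        self.anchors = [Anchor(self, 0.0, 0.0), Anchor(self, 1.0, 0.0)]

    def destroy(self):
        # Destroying the container destroys its parts with it.
        for anchor in self.anchors:
            anchor.owner = None
        self.anchors.clear()


s = Shape(10.0, 20.0)
first = s.anchors[0]
s.destroy()
print(first.owner, len(s.anchors))
```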
Final example
Adding a new class, Connection, to represent the
connection between two Anchor objects
Two associations from Connection to the same class
(Anchor)!
But different role names on each end
FromAnchor, ToAnchor
FromConnections, ToConnections
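The two associations can be sketched as two separate pairs of references. Which anchor-side list pairs with which connection-side role is an assumption here: FromConnections is taken to list the connections that start at the anchor.

```python
class Anchor:
    def __init__(self, label: str):
        self.label = label
        self.from_connections = []  # role FromConnections
        self.to_connections = []    # role ToConnections


class Connection:
    def __init__(self, from_anchor: Anchor, to_anchor: Anchor):
        self.from_anchor = from_anchor  # role FromAnchor
        self.to_anchor = to_anchor      # role ToAnchor
        from_anchor.from_connections.append(self)
        to_anchor.to_connections.append(self)


a, b = Anchor("a"), Anchor("b")
link = Connection(a, b)
print(len(a.from_connections), len(a.to_connections))
```

Without distinct role names, the two references from Connection to Anchor could not be told apart.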
UML for Final Example
UML Discussion
In programming languages, what is the
difference between a procedure/routine and
an object?
What two things does an OO language have
that UML does not, and why?
Extensible Markup Language (XML)
Markup language (based on SGML, like HTML)
Defining a set of rules for encoding documents/data
Both human-readable and machine-readable
Design goals: simplicity, generality, and usability
over the Internet
Element expressed as either
<tag> contained data </tag>
<tag/>
Element can contain attributes, either
<tag attributeOne="how" attributeTwo="why"/>
<tag attributeOne="how" attributeTwo="why"> ... </tag>
Above, anything in the "..." is a child of the parent
Also, "tag" is NOT a keyword: you're defining it!
Simple XML Example: Book
<book title="Hi" author="Joe">
  <rev num="2">
    <year>2006</year>
    <month>January</month>
    <day>1</day>
  </rev>
  <chapter title="Preface">
    <paragraph> ... </paragraph>
    ...
  </chapter>
  ... (another chapter)
</book>
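To see how an application walks such a document, here is a short sketch using Python's standard `xml.etree.ElementTree`. The elided parts of the book are replaced by one placeholder paragraph so that the text is well-formed; that placeholder is not from the slide.

```python
import xml.etree.ElementTree as ET

BOOK = """
<book title="Hi" author="Joe">
  <rev num="2">
    <year>2006</year>
    <month>January</month>
    <day>1</day>
  </rev>
  <chapter title="Preface">
    <paragraph>Placeholder paragraph.</paragraph>
  </chapter>
</book>
"""

book = ET.fromstring(BOOK)
rev = book.find("rev")

print(book.tag, book.attrib)                 # element name and its attributes
print([child.tag for child in book])         # direct children of <book>
print(rev.get("num"), rev.findtext("year"))  # attribute value, child text
```

Note that the parser returns every value as a string; it has no way to know that `num` or `year` are meant to be numbers.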
XML Schema
XML per se has no set syntax or semantics for a tag, etc.
Apps have to know this to be able to parse and then
interpret an XML document
A schema expresses this
XML Schema Definition (XSD) defines
Elements and attributes that may appear in a doc
Parent-child relationships
Number of children allowed for an element
Whether an element can include text
Data types for elements and attributes
If data items have fixed values or default values
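An XSD is itself an XML document, so its declarations can be inspected with the same parser. The schema fragment below is a hypothetical one written for the `rev` element of the book example. The code only lists what the schema declares; it does not validate a document, which would need a schema-aware library.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"

# Hypothetical schema fragment for the <rev> element of the book example.
REV_XSD = """
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="rev">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="year" type="xs:gYear"/>
        <xs:element name="month" type="xs:string"/>
        <xs:element name="day" type="xs:positiveInteger"/>
      </xs:sequence>
      <xs:attribute name="num" type="xs:positiveInteger" use="required"/>
    </xs:complexType>
  </xs:element>
</xs:schema>
"""

schema = ET.fromstring(REV_XSD)
rev_decl = schema.find(f"{XS}element[@name='rev']")

# Children the schema allows inside <rev>, with their data types.
children = {
    el.get("name"): el.get("type")
    for el in rev_decl.iter(f"{XS}element")
    if el is not rev_decl
}
# Attributes the schema allows on <rev>, and whether each is required.
attributes = {
    at.get("name"): at.get("use", "optional")
    for at in rev_decl.iter(f"{XS}attribute")
}
print(children)
print(attributes)
```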
Resource Description Framework (RDF)
XML so far only has parent-child links
RDF generalizes this
An element can be assigned a unique ID attribute, which other elements can then reference
E.g., for a book, sequel and sequelTo elements
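A small sketch of the sequel/sequelTo idea in RDF/XML, read with the standard library parser. The `lib` namespace URI, the IDs, and the book titles are invented for the example; `rdf:ID` and `rdf:resource` are the standard RDF/XML attributes for naming a resource and pointing at one.

```python
import xml.etree.ElementTree as ET

RDF = "{http://www.w3.org/1999/02/22-rdf-syntax-ns#}"
LIB = "{http://example.org/lib#}"  # hypothetical vocabulary

DOC = """
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:lib="http://example.org/lib#">
  <lib:book rdf:ID="_b1">
    <lib:book.title>First Book</lib:book.title>
    <lib:book.sequel rdf:resource="#_b2"/>
  </lib:book>
  <lib:book rdf:ID="_b2">
    <lib:book.title>Second Book</lib:book.title>
    <lib:book.sequelTo rdf:resource="#_b1"/>
  </lib:book>
</rdf:RDF>
"""

root = ET.fromstring(DOC)
# Index every book by its rdf:ID so references can be followed.
books = {b.get(RDF + "ID"): b for b in root.findall(LIB + "book")}


def follow(book, prop):
    """Resolve the '#id' reference held in the given property element."""
    ref = book.find(LIB + prop).get(RDF + "resource")
    return books[ref.lstrip("#")]


sequel = follow(books["_b1"], "book.sequel")
print(sequel.findtext(LIB + "book.title"))
```

Neither book is nested inside the other: the link is a reference between siblings, which plain parent-child XML cannot express.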
RDF Schema (RDFS)
RDF can express relationships between resources
RDF can't define a vocabulary for them
RDF Vocabulary Description Language AKA RDFS
RDF Schema
Does not provide vocabulary for specific resources or
classes
Does allow description of how classes and properties
should be used together
Basically provides a type system for RDF
E.g.
lib:book.title used to describe a lib:book
lib:book.sequel is an element of lib:book and refers to
another lib:book entry
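The "type system" reading can be illustrated with a toy checker. The dictionary layout, the `lib:author` class, and the instance data are assumptions of this sketch and are not RDFS syntax. Strictly, RDFS uses domain and range for inference rather than rejection; treating them as checks is a simplification.

```python
# Schema: for each property, the class it describes (domain) and the
# kind of value it points to (range).
SCHEMA = {
    "lib:book.title":  {"domain": "lib:book", "range": "literal"},
    "lib:book.sequel": {"domain": "lib:book", "range": "lib:book"},
}

# Instance data: resource id -> class, plus (subject, property, value) triples.
TYPES = {"_b1": "lib:book", "_b2": "lib:book", "_a1": "lib:author"}
TRIPLES = [
    ("_b1", "lib:book.title", "First Book"),
    ("_b1", "lib:book.sequel", "_b2"),
    ("_b1", "lib:book.sequel", "_a1"),  # wrong: an author is not a book
]


def violations(triples):
    """Return the triples whose subject or value breaks the schema."""
    bad = []
    for subj, prop, value in triples:
        rule = SCHEMA[prop]
        ok_subject = TYPES.get(subj) == rule["domain"]
        ok_value = rule["range"] == "literal" or TYPES.get(value) == rule["range"]
        if not (ok_subject and ok_value):
            bad.append((subj, prop, value))
    return bad


print(violations(TRIPLES))
```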
CIM
Implementation-agnostic model, defined in UML, of the
data used by electric utilities
Example: a Breaker (very common grid component)
Mechanical switching device capable of making,
carrying, and breaking currents under normal circuit
conditions
Making, carrying for a specified time, and breaking
current under specified abnormal circuit conditions
CIM Breaker Example
IdentifiedObject: root class
PowerSystemResource: any
resource in a grid
Equipment: a physical device
(electrical or mechanical)
ConductingEquipment: carries
current or is conductively
connected to network
Switch: device that opens and closes
ProtectedSwitch: operated by
protection equipment
Breaker: also has a transit time
Subclasses of Switch
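The Breaker inheritance chain can be mirrored directly as classes. The class names come from the slide; the attributes (`mrid`, `name`, `normal_open`, `transit_time_s`) are simplified stand-ins chosen for this sketch and should not be read as the exact CIM attribute list.

```python
class IdentifiedObject:
    def __init__(self, mrid: str, name: str):
        self.mrid = mrid
        self.name = name


class PowerSystemResource(IdentifiedObject):
    pass


class Equipment(PowerSystemResource):
    pass


class ConductingEquipment(Equipment):
    pass


class Switch(ConductingEquipment):
    def __init__(self, mrid, name, normal_open: bool):
        super().__init__(mrid, name)
        self.normal_open = normal_open


class ProtectedSwitch(Switch):
    pass


class Breaker(ProtectedSwitch):
    def __init__(self, mrid, name, normal_open, transit_time_s: float):
        super().__init__(mrid, name, normal_open)
        self.transit_time_s = transit_time_s


brk = Breaker("_brk1", "BRK-1", normal_open=False, transit_time_s=0.05)
chain = [cls.__name__ for cls in type(brk).__mro__ if cls is not object]
print(" -> ".join(chain))
```

A Breaker picks up its name from IdentifiedObject six levels up, which is the "(and up)" part of inheritance in practice.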
Review of IEC 61850:
The Good, the Bad and the Ugly
Designed for substation automation (see Prof.
Hauser's slides for Segment 2)
Main features (from Wikipedia):
1. Data Modeling
2. Reporting Schemes: server-client relationship which can be
triggered based on pre-defined trigger conditions.
3. Fast Transfer of Events: Generic Substation Events (GSE): GOOSE
& GSSE.
4. Setting Groups: the Setting Group Control Blocks (SGCB) for
config
5. Sampled Data Transfer
6. Commands: various command types to manipulate substation equipment
7. Data Storage: Substation Configuration Language (SCL) is
defined for complete storage of configured data of the substation
in a specific format.
IEC 61850: For More Info
Wikipedia http://en.wikipedia.org/wiki/61850
Great detailed slide set:
http://seclab.web.cs.illinois.edu/wpcontent/uploads/2011/03/iec61850-intro.pdf
includes some UML
WAN issues not addressed here
The Good
HUGE benefit compared to wires in substation
Data model elegant
Substation Configuration Language (SCL) elegant
The Bad
Complexity
Far more complex than it has to be given the problem it
is tackling
Double the size/bandwidth of IEEE C37.118 with no
extra useful info
Feels to me like a spec doc by a 1975 Mechanical
Engineer specifying HW not a 1995 (or later) SW
Engineer specifying SW
Hype
Almost sounds like it will cure cancer at times
PJM engineer: 4 substations (ISO has ~30% of the USA
footprint)
The Bad (cont.)
Performance
Subscriber apps have to be able to detect missing and
duplicate messages (no sophisticated fault-tolerant multicast)
GOOSE authentication via RSA signatures (initial version):
way too expensive for many embedded devices
UIUC paper (Jianqing Zhang and Carl Gunter, IEEE
SmartGridComm 2010)
WSU paper (Hauser et al., HICSS-45 (2012))
Shared-key multicast authentication flavor allows
subscribers to spoof a publisher
GOOSE messages very CPU-intensive with ASN.1 integer
fields etc, expensive for many embedded devices
Have to be careful that the multicast (Ethernet
broadcast) does not overload small embedded devices
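What "detect missing and duplicates" puts on the subscriber can be sketched with a plain per-publisher sequence number. This is a generic illustration, not the GOOSE state machine: real GOOSE carries its own state and sequence counters and retransmission rules, none of which are modeled here, and a production version would also need to bound the `seen` set and handle counter wraparound.

```python
class SequenceChecker:
    """Classify each received message by its publisher sequence number."""

    def __init__(self):
        self.expected = None  # next sequence number we hope to see
        self.seen = set()

    def receive(self, seq: int) -> str:
        if seq in self.seen:
            return "duplicate"
        self.seen.add(seq)
        if self.expected is None or seq == self.expected:
            self.expected = seq + 1
            return "ok"
        if seq > self.expected:
            missing = list(range(self.expected, seq))
            self.expected = seq + 1
            return f"gap, missing {missing}"
        return "late"  # an earlier gap is being filled


checker = SequenceChecker()
results = [checker.receive(n) for n in (1, 2, 2, 5, 3)]
print(results)
```

Every subscribing application has to carry logic of this kind itself, because the multicast layer underneath gives no delivery guarantees.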
The Bad (cont.)
Misc
$3K just to read the spec
Design by Committee before Full Implementation
Way better way: IETF and OMG
David Clark, Internet pioneer (1992)
"We reject: kings, presidents, and voting.
We believe in: rough consensus and running code"
Richard Schantz, father of middleware (mid-90s)
"Any time you standardize beyond the state of the practice you
are in trouble"
The Bad (cont.)
Misc (cont.)
PMUs often need many:one (to a PDC) not 1:many
communication
Lack of a reference implementation and reference test
suite
Have to test devices pairwise
Standard so huge many vendors don't implement all of it;
most vendors violate the standard in some way
The Ugly
No tools (configuration, administration, etc) that
work across multiple vendors
WANs are very different from LANs: partial failures &
widely-varying performance (incl. network jitter)
61850 assumes the same interface for a LAN will
magically work in a WAN
Known by distributed computing practitioners and
applied researchers to be false since <= 1990
See "A Note on Distributed Computing" by Waldo et al.,
1994 (from reading list for this class segment)
The Ugly (cont.)
61850-90-5 is the WAN extension
Dec 2010 draft says communications redundancy is crucial
But the draft has less than one page on it (Sec 8.8) that has
no meaningful details
61850-90-5 (cont.)
IETF RFC 2991 it relies on has nothing about end-to-end
latency, availability, exploiting a more controllable utility
infrastructure, tradeoffs below, etc
Advanced multicast is hard, fault-tolerant is harder, real-time is harder yet, with security (not ruining perf.) worse
Wide range of properties could trade off, incl. latency,
jitter, consistency, throughput, resource consumption,
availability, ...
Do implementers (or drafters) know what this space of
possible properties is, what tradeoffs their given
implementations make? Very unlikely
Do utilities/ISOs know what tradeoffs they are being
sold, and how appropriate they are for them? Unlikelier!
The Ugly (cont.)
Bottom line: a lead control engineer from a large
utility to me
2009: "No way in hell am I letting it outside my
substations"
2011: (ruefully) "I was overruled from above, because it's
a standard."
But a standard for doing what? With what properties
traded off?
Email from that Same Utility
I have little insight into the particulars, but I've been involved in
conversations about aligning the IEC 61850 with the CIM (an elusive
goal), plus some sidebar conversations on the "immaturity" of the
standard (although it's been kicking around for 10 years). I think the
underlying reason for this perception is the vendor equipment-specific
configuration tools for 61850 and how each vendor cherry-picks
the standard with little regard to its impact on the overall
substation configuration problem faced by a utility.
There is a need for a vendor-agnostic toolset that mirrors the utility
engineering process for constructing (or upgrading) a substation,
and the long-term maintenance of the substation configuration. This
process goes through several hands over several years, starting with
a substation designer and ending with project engineers. The
designer typically has templates to follow for the design, necessarily
at a high level to explain (and sell) the design. The electrical
equipment vendors associated with the utility at the beginning of the
design may not be the same when the time comes to purchase
equipment. [ continued]
Email from that Same Utility (cont.)
[ continued] Thus the need for the vendor-agnostic toolset to
support the design process and "seamlessly" transition to vendor-specific
61850 implementations as purchase orders are cut. Having all
the tools CIM compliant would be a nice touch, but the two standards
are not easily made compatible. There is much work to be done to
solve the 61850 design/maintenance tool problem.
There are a lot of communication protocols in the electric grid domain,
each reflecting the needs (and IT maturity state) of the time - from
Modbus to DNP3 to 61850 to GridStat. Unfortunately a utility cannot
green-field a new grid as each new protocol is developed, it has to
ensure its deployed assets remain useful while trying to realize the
benefits offered by maturing Information and Communications
Technologies. That is a major driver behind the XYZ Advanced
Technology lab - to determine which technologies have the potential to
improve the XYZ grid's "ities" : reliability, stability, profitability, etc.