Comparison Of Network On
Chip Topologies
OUTLINE
Introduction
Basic Definitions
Properties of a Topology
NOC Topologies
Evaluation
Conclusion
Introduction to NOC
NOC
A micronetwork of components
Transfers information between nodes
Challenges
Performance requirements
Latency as small as possible
As many concurrent transfers as possible
Tight energy boundaries
Reliability requirements
Low Cost
NOC Motivation
Moore's Law: the number of gates doubles
every 18 months by shrinking the
technology dimensions
Smaller wire dimensions: higher resistance (R = ρL/A)
Smaller inter-wire spacing: higher capacitance (C = ε₀A/d)
Require the periodic insertion of repeaters
Consume more dynamic and leakage
power
50% of the power dissipation is due to the
(long) wires.[1]
What Characterizes a NOC
Topology (What)
Physical interconnection structure of the network graph
Routing Algorithm (Which)
Restricts the set of paths that messages may follow
Switching Strategy (How)
How data in a message traverses a route
Circuit / Packet / Wormhole
Flow Control Mechanism (When)
When a message or portions of it traverse a route
What happens when traffic is encountered?
Properties of a Topology
Performance
Diameter (Max routing Distance)
Average Distance
Cost
Avg. Nodal Degree (Avg number of links for each
node)
Number of links (Total number of links)
Reliability
Min number of links to disconnect the graph
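The performance and cost properties above can be computed directly from a network graph. A minimal sketch, assuming an undirected, connected topology given as an adjacency dictionary; the function names are hypothetical, and reliability (the minimum number of links whose removal disconnects the graph) is left out to keep it short:

```python
from collections import deque

def bfs_distances(adj, source):
    """Hop counts from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def topology_metrics(adj):
    """adj maps each node to the set of its neighbours (undirected, connected)."""
    n = len(adj)
    total = 0
    diameter = 0
    for s in adj:
        dist = bfs_distances(adj, s)
        total += sum(dist.values())
        diameter = max(diameter, max(dist.values()))
    degree_sum = sum(len(nbrs) for nbrs in adj.values())
    return {
        "diameter": diameter,                    # max routing distance
        "avg_distance": total / (n * (n - 1)),   # over ordered pairs of distinct nodes
        "links": degree_sum // 2,                # each link is counted at both ends
        "avg_degree": degree_sum / n,
    }
```

For example, `topology_metrics({0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {2, 0}})` describes a 4-node ring.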
NOC Topologies
Shared-Medium Local Networks
Contention Bus, Token Bus and Token
Ring
Direct Networks
1D: Linear, Ring
2D: Mesh, Tree
3D: Cube, Toroid
Indirect Networks
Crossbar, Benes, Perfect shuffle and
Omega
Shared-Medium Local
Networks
Local Area Networks
Contention Bus (Ethernet)
Token Bus (Arcnet)
Token Ring
All communication devices share the
transmission medium.
Only one device can drive the network at a
time
Contention Bus (Ethernet)
All devices can monitor the state of the bus,
such as idle, busy, and collision.
A collision means that two or more devices
are using the bus at the same time and their
data collided.
When a collision is detected, the
competing devices stop transmitting
and retry later.
Ethernet adopts carrier-sense multiple
access with collision detection (CSMA/CD)
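The idle / busy / collision states and the "try later" rule can be sketched as a slotted simulation. This is a simplified model, not the Ethernet standard: every frame is assumed to fit in one slot, carrier sensing is therefore not modelled, and the randomized doubling backoff window and all names are illustrative assumptions.

```python
import random

def bus_state(senders):
    """State every device observes on the shared bus in one slot."""
    if not senders:
        return "idle"
    if len(senders) == 1:
        return "busy"
    return "collision"

def simulate_contention_bus(devices, max_slots=1000, seed=0):
    """Each device has one frame to send; returns (slot, device) deliveries."""
    rng = random.Random(seed)
    next_try = {d: 0 for d in devices}   # slot of each device's next attempt
    attempts = {d: 0 for d in devices}   # collisions seen so far per device
    delivered = []
    for slot in range(max_slots):
        senders = [d for d, t in next_try.items() if t == slot]
        state = bus_state(senders)
        if state == "busy":
            # Exactly one sender: the frame goes through.
            delivered.append((slot, senders[0]))
            del next_try[senders[0]]
        elif state == "collision":
            # All colliding devices quit and retry after a random delay.
            for d in senders:
                attempts[d] += 1
                window = 2 ** min(attempts[d], 10)
                next_try[d] = slot + 1 + rng.randrange(window)
        if not next_try:
            break
    return delivered
```

Because the retry delay is random, the delivery slot of any given device is not bounded in advance, which is the nondeterminism the token-based schemes address.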
Token Bus & Token Ring
Contention Bus has a nondeterministic nature
Not suitable for Real-Time applications
Solution:
Passing a token among network devices
The owner of the token has the right to access the bus
Maximum token holding time
Token Ring:
Natural extension of token bus
Passing of the token forms a ring structure
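Token passing with a maximum holding time can be sketched as a round-robin schedule. This is an illustrative model only: the holding time is counted in frames, and the function name and arguments are assumptions.

```python
def token_ring_schedule(pending, max_hold):
    """pending[i] is the number of frames device i wants to send (ring order).

    Returns the order in which devices transmit, one entry per frame.
    """
    if max_hold < 1:
        raise ValueError("max_hold must be at least 1")
    queues = list(pending)
    order = []
    holder = 0                                    # device currently owning the token
    while any(queues):
        sent = min(queues[holder], max_hold)      # bounded by the holding time
        order.extend([holder] * sent)
        queues[holder] -= sent
        holder = (holder + 1) % len(queues)       # pass the token to the next device
    return order
```

With this rule, a device waits for at most `max_hold` frames from each of the other devices before the token returns, which is what makes the access delay bounded.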
Properties of Shared Medium
LAN
A bus system is not scalable because the bus becomes the
bottleneck.
Fully connected to each other
Bus systems:
Diameter = 1
Avg. Dist = 1
Reliability = 1
Number of links = N + Bus
Nodal Degree = 1
Ring Systems:
Diameter: N/2
Avg. Dist = (N-1)*N / (2*(N-1)) = N/2
Number of links : N
Nodal Degree = 2
Reliability = 2
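The N/2 average distance corresponds to traffic travelling in one direction around the ring, where the distances 1, 2, ..., N-1 sum to (N-1)*N/2. A small sketch, with hypothetical helper names, that compares this with a ring whose links can be used in both directions:

```python
def ring_distances(n, bidirectional):
    """Distances from node 0 to nodes 1..n-1 on an n-node ring."""
    if bidirectional:
        # Take the shorter way around the ring.
        return [min(k, n - k) for k in range(1, n)]
    # One direction only: node k is exactly k hops away.
    return list(range(1, n))

def ring_summary(n, bidirectional):
    d = ring_distances(n, bidirectional)
    return {"diameter": max(d), "avg_distance": sum(d) / len(d)}
```

Under these assumptions, a diameter of N/2 matches the bidirectional case, while an average distance of N/2 matches the unidirectional one.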
Direct Networks (Router Based)
Strictly Orthogonal Topologies
Mesh
Torus
Hypercube
Other Topologies
Trees
Cube connected cycles
Node processors are connected directly with each
other by the network
Each node performs dataflow routing
Every direct network can be represented as
indirect, by splitting each node into a terminal and
a switch
Orthogonal
Every link and node can be arranged in
such a way that it produces a displacement
in a single dimension
Most of the implemented networks have an
orthogonal topology.
Orthogonal Topologies
4-ary 2-dim Mesh
Diameter = 6
Number of Links = 24
Node Degree = 3
Avg. Distance = 3
Reliability = 2
8 Cube
Diameter = 3
Number of Links = 12
Node Degree = 3
Avg. Distance = 1.71
Reliability = 3
Hypercubes
Diameter = log2(N)
Node Degree = log2(N)
Reliability = log2(N)
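The mesh and hypercube figures can be reproduced by building the graphs and measuring them. A minimal sketch with hypothetical helper names; the average distance here is taken over pairs of distinct nodes, which is an assumption about how the slide values were obtained:

```python
from collections import deque
from itertools import product

def mesh(k, dims):
    """k-ary mesh with `dims` dimensions, no wraparound links."""
    adj = {node: set() for node in product(range(k), repeat=dims)}
    for node in adj:
        for d in range(dims):
            if node[d] + 1 < k:
                nbr = node[:d] + (node[d] + 1,) + node[d + 1:]
                adj[node].add(nbr)
                adj[nbr].add(node)
    return adj

def hypercube(n):
    """n-dimensional binary hypercube: neighbours differ in exactly one bit."""
    return {x: {x ^ (1 << b) for b in range(n)} for x in range(2 ** n)}

def summarize(adj):
    """Return (diameter, number of links, average distance)."""
    total, diameter = 0, 0
    for s in adj:
        dist = {s: 0}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        diameter = max(diameter, max(dist.values()))
    n = len(adj)
    links = sum(len(v) for v in adj.values()) // 2
    return diameter, links, total / (n * (n - 1))
```

`summarize(mesh(4, 2))` gives diameter 6 and 24 links with an average distance of about 2.67, and `summarize(hypercube(3))` gives diameter 3, 12 links and an average distance of about 1.71.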
Trees
Binary Tree
diameter: 2 log(N)
Reliability: 1
Total Number of links : N-1
Nodal Degree : 1 < Avg. Nodal Degree < 2
Problems
Congestion
Fault tolerance is low
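For a complete binary tree these properties have closed forms. A sketch assuming a complete tree with a given number of levels (so N = 2^levels - 1); the function name is hypothetical:

```python
def complete_binary_tree_properties(levels):
    """Properties of a complete binary tree with `levels` levels (levels >= 2)."""
    n = 2 ** levels - 1          # total number of nodes
    links = n - 1                # every node except the root has one parent link
    return {
        "nodes": n,
        "links": links,
        "avg_degree": 2 * links / n,     # always between 1 and 2
        "diameter": 2 * (levels - 1),    # leaf to root to another leaf
        "reliability": 1,                # cutting any single link splits the tree
    }
```

The diameter 2 * (levels - 1) equals 2 * (log2(N + 1) - 1), which approaches 2 log2(N) for large N. Since every leaf-to-leaf path between the two halves crosses the root, the root links are where congestion concentrates.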
Fat Trees
Links get fatter (in practice, more links in parallel) as you go
up, so bisection bandwidth scales with N
There are many possible paths, so at each
level the routing processor chooses a path
at random, in order to balance the load.
Cube Connected Cycles
Like an n-dimensional hypercube of virtual nodes
Each virtual node is a ring with n nodes, for a total of n·2^n nodes
Each node in the ring is connected along a single dimension of the hypercube
Diameter is of the same order as that of a hypercube of similar size
Cube Connected Cycles
Total number of links : ( 3 * n·2^n ) / 2
Node Degree = Reliability : 3 (two ring links plus one hypercube link)
Diameter: 2*n
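A cube connected cycles graph is small enough to build and measure directly. A sketch under the usual construction for n >= 3, where node (x, i) sits at position i of the ring that replaces hypercube corner x, and has two ring links plus one hypercube link along dimension i; the function names are hypothetical:

```python
from collections import deque

def cube_connected_cycles(n):
    """Nodes are (x, i): x is the hypercube corner, i the position in its ring."""
    adj = {}
    for x in range(2 ** n):
        for i in range(n):
            adj[(x, i)] = {
                (x, (i + 1) % n),     # next node on the ring
                (x, (i - 1) % n),     # previous node on the ring
                (x ^ (1 << i), i),    # hypercube link along dimension i
            }
    return adj

def diameter(adj):
    """Largest shortest-path distance, by BFS from every node."""
    worst = 0
    for s in adj:
        dist = {s: 0}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        worst = max(worst, max(dist.values()))
    return worst
```

For n = 3 this yields 24 nodes, each with three links, 36 links in total, and a diameter of 6.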
Embed Multiple Dimensions
Embed multiple logical dimensions in
one physical dimension using long
wires
Indirect Networks (Switch Based)
Crossbar
Fully Connected
Perfect Shuffle
Multistage Interconnection Networks
Blocking Networks
Omega
Banyan
Non Blocking Networks
Clos Network
Benes Network
[Figure: node processors 1 ... n connected through node switches]
Switches
Switches
perform the routing
Provide a programmable connection
between their ports
Do not perform information processing
Crossbar
Free of interconnect contention
Crossbar networks are used in the design
of high-performance small-scale
multiprocessors
However, the bit energy will increase
linearly with the number of input and
output ports N
Fully Connected Switch
Using a single N × N crossbar is much
cheaper than using a fully connected
direct network topology,
which requires N routers, each one having
an internal N × N crossbar
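The cost gap can be made concrete by counting crosspoints. This is a rough sketch only: using the crosspoint count as the cost measure is an assumption, and the function names are hypothetical.

```python
def crosspoints_single_crossbar(n):
    """One N x N crossbar has one crosspoint per input/output pair."""
    return n * n

def crosspoints_fully_connected(n):
    """N routers, each with its own internal N x N crossbar."""
    return n * crosspoints_single_crossbar(n)
```

For N = 8 this gives 64 crosspoints for the single crossbar against 512 for the fully connected direct network, so the gap grows by a factor of N.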
Perfect Shuffle Network
a) The perfect shuffle
b) Inverse perfect shuffle
c) Bit reversal permutations for N=8
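The three permutations can be written as operations on the binary address of each port. A minimal sketch with hypothetical function names, where `bits` is log2(N), so `bits = 3` for N = 8:

```python
def perfect_shuffle(i, bits):
    """Rotate the address one bit to the left."""
    mask = (1 << bits) - 1
    return ((i << 1) | (i >> (bits - 1))) & mask

def inverse_shuffle(i, bits):
    """Rotate the address one bit to the right."""
    return (i >> 1) | ((i & 1) << (bits - 1))

def bit_reversal(i, bits):
    """Reverse the order of the address bits."""
    out = 0
    for _ in range(bits):
        out = (out << 1) | (i & 1)
        i >>= 1
    return out
```

For N = 8, `[perfect_shuffle(i, 3) for i in range(8)]` is `[0, 2, 4, 6, 1, 3, 5, 7]`, and applying `inverse_shuffle` to each result restores the original order.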
Omega Networks
The omega network is another
example of a banyan multistage
interconnection network that can be
used as a switch fabric
The omega uses the perfect shuffle
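Routing through an omega network can be sketched with destination-tag routing: each stage applies a perfect shuffle, then a 2 × 2 switch picks its upper or lower output from one bit of the destination address. The function name and the choice of using destination bits most significant first are assumptions of this sketch.

```python
def omega_route(src, dst, bits):
    """Return the link position occupied after each of the `bits` stages."""
    mask = (1 << bits) - 1
    pos, path = src, []
    for stage in range(bits):
        pos = ((pos << 1) | (pos >> (bits - 1))) & mask   # perfect shuffle wiring
        want = (dst >> (bits - 1 - stage)) & 1            # destination bit, MSB first
        pos = (pos & ~1) | want                           # switch output: 0 upper, 1 lower
        path.append(pos)
    return path
```

After log2(N) stages every source bit has been shifted out and replaced by a destination bit, so the last entry of the path is always the destination port.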
[Figure: Omega Networks, routing example (animation frames)]
Path Contention
The omega network has the same problems
as the delta network: output port
contention and path contention
Again, the result in a bufferless switch
fabric is cell loss (one cell wins, one
loses)
Path contention and output port
contention can seriously degrade the
achievable throughput of the switch
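Path contention can be detected by comparing the links two cells need at each stage. A sketch assuming an omega network with destination-tag routing (perfect shuffle, then a 2 × 2 switch per stage); the function names are hypothetical:

```python
def omega_route(src, dst, bits):
    """Link position occupied after each stage of an omega network."""
    mask = (1 << bits) - 1
    pos, path = src, []
    for stage in range(bits):
        pos = ((pos << 1) | (pos >> (bits - 1))) & mask   # perfect shuffle
        pos = (pos & ~1) | ((dst >> (bits - 1 - stage)) & 1)
        path.append(pos)
    return path

def first_contention(cell_a, cell_b, bits):
    """Return the first stage where both cells need the same link, else None.

    Each cell is a (source port, destination port) pair.
    """
    path_a = omega_route(*cell_a, bits)
    path_b = omega_route(*cell_b, bits)
    for stage, (a, b) in enumerate(zip(path_a, path_b)):
        if a == b:
            return stage
    return None
```

In an 8-port network, the cells (0 to 0) and (4 to 1) collide at the first stage even though their destinations differ, which is path contention rather than output port contention; the cells (0 to 0) and (1 to 7) do not collide.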
[Figure: Path Contention example (animation frames)]
Batcher Sorter & Banyan
Network
One solution to the contention
problem is to sort the cells into
increasing order based on desired
destination port
Banyan networks are a
class of MINs with the
property that there is a
unique path between
any pair of source and
destination
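The effect of sorting can be sketched on a small example. Several stand-ins are assumed here: Python's built-in sort replaces the Batcher sorting network, an omega network with destination-tag routing plays the role of the banyan stage, and the sorted cells are packed onto the lowest-numbered inputs. All names are hypothetical.

```python
def omega_route(src, dst, bits):
    """Link position occupied after each stage of an omega network."""
    mask = (1 << bits) - 1
    pos, path = src, []
    for stage in range(bits):
        pos = ((pos << 1) | (pos >> (bits - 1))) & mask   # perfect shuffle
        pos = (pos & ~1) | ((dst >> (bits - 1 - stage)) & 1)
        path.append(pos)
    return path

def has_contention(cells, bits):
    """True if any two (source, destination) cells need the same link."""
    paths = [omega_route(s, d, bits) for s, d in cells]
    for stage in range(bits):
        used = [p[stage] for p in paths]
        if len(set(used)) != len(used):
            return True
    return False

def sort_and_concentrate(cells):
    """Sort cells by destination and place them on inputs 0, 1, 2, ..."""
    dests = sorted(d for _, d in cells)   # stand-in for the Batcher sorter
    return list(enumerate(dests))
```

For the arrivals `[(0, 1), (4, 3), (2, 4), (6, 6)]` on an 8-port network, `has_contention` reports a conflict, while the same destinations after `sort_and_concentrate` pass through without one.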
[Figure: Batcher-Banyan Example (animation frames)]
Clos Networks
Clos networks have three stages: the
ingress stage, middle stage, and the
egress stage. Each stage is made up
of a number of crossbar switches
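The cost of a three-stage Clos network can be compared with a single crossbar by counting crosspoints. A sketch for the symmetric case, using the conventional parameter names n, m and r, which do not appear on the slides: r ingress switches of size n × m, m middle switches of size r × r, and r egress switches of size m × n.

```python
def clos_crosspoints(n, m, r):
    """Crosspoints in a symmetric three-stage Clos network with n*r ports."""
    ingress = r * (n * m)
    middle = m * (r * r)
    egress = r * (m * n)
    return ingress + middle + egress

def is_strict_sense_nonblocking(n, m):
    """Clos condition: a free middle switch always exists when m >= 2n - 1."""
    return m >= 2 * n - 1

def is_rearrangeably_nonblocking(n, m):
    """Connections can always be rearranged to fit when m >= n."""
    return m >= n
```

For 36 ports with n = 6, r = 6 and m = 11, the network is strict-sense non-blocking with 1188 crosspoints, compared with 36 × 36 = 1296 for a single crossbar.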
Benes Networks
Clos networks may also be generalised to
any odd number of stages. By replacing
each centre stage crossbar switch with a 3-stage Clos network, Clos networks of five
stages may be constructed. By applying the
same process repeatedly,
Hybrid Networks
Multiple-backplane
Hierarchical buses
Cluster tightly coupled computational units with
high communication bandwidth
Provide lower bandwidth intercluster
communication link structures
Performance comparable with homogeneous,
high-bandwidth architectures
Energy efficiency is a strong driver toward using
hybrid architectures.
Cluster Based 2-D Mesh
At the lower level, each cluster consists of four
processors connected by a bus.
At the higher level, a 2-D mesh connects the
clusters. The broadcast capability of the bus is used
at the cluster level
Evaluation I
Columns: # of links, Nodal degree, Diameter, Avg. Dist, Reliability
7 BinTree: 1.71, 2.21
8 Ring: 2.21
9 Mesh: 12, 2.66
8 Cube: 12, 1.71
Evaluation I
Columns: # of links, Nodal degree, Reliability, Diameter, Avg. Dist
15 BinTree: 14, 1.87, 3.5
16 Mesh: 24
16 HyperCube: 32, 2.13
16 Chord.Ring: 32
Power Consumption Under
Different Number of Ports
Conclusion
Shared-medium topologies are bottlenecked
by the shared medium, so they are not
scalable
Direct topologies are easily
extensible, but there are trade-offs
between cost, performance and
reliability
Embedding multiple logical dimensions in
one physical dimension requires long
wires, which is another disadvantage
Conclusion
Among indirect topologies, blocking networks
have contention problems. Non-blocking
networks have extra stages
and costs.
Non-blocking networks are cheaper
than a crossbar of the same size
Hybrid networks achieve high bandwidth
and energy efficiency through clustering
Conclusion
Interconnect contention (internal
blocking) induces significant power
consumption on internal buffers, and
the power consumption on buffers will
increase sharply as throughput
increases.
References
[1] N. Magen, A. Kolodny, U. Weiser, and N. Shamir,
Interconnect-power dissipation in a microprocessor. In SLIP '04,
Feb. 2004.
[2] I. Cidon and I. Keidar, Zooming in on Network on Chip
Architectures. Technion Department of Electrical Engineering,
2005.
[3] J. Duato, S. Yalamanchili, and L. Ni,
Interconnection Networks: An Engineering Approach, IEEE
Computer Society Press, Los Alamitos, CA, 1997.
[4] T. T. Ye, On-Chip Multiprocessor Communication Network
Design and Analysis. Stanford University, Electrical
Engineering, Dec. 2003.
[5] L. Benini and G. De Micheli, Networks on chips: a new SoC
paradigm. IEEE Computer 35(1), 2002, pp. 70-78.
Questions ???
Thanks