
Lecture 6 SoC

The document discusses the importance of interconnects in System-on-Chip (SoC) designs, emphasizing that effective communication architecture is crucial for performance, power consumption, and cost. It covers various architectures such as bus-based systems and Networks-on-Chip (NoCs), detailing their components, functions, and challenges. The evolution of these architectures highlights the need for advanced interconnect solutions to manage increasing complexity in modern SoCs.

Uploaded by

Athmajan Vu

Interconnects on SoCs
Zaheer Khan
Outline

• Introduction

• Bus-based architecture

• On-chip communication standards

• Networks-on-Chip (NoCs)


• SoC design involves the integration of intellectual property (IP) cores, each separately designed and verified.
• The most important issue is the method by which the IP cores are connected together.
• To the uninitiated, the interconnect is a bunch of wires.
• In modern complex SoC chips, it comprises several million gates of standard cells.
• The logic is used for arbitration, buffering, clock and power domain crossing, muxing, and scheduling.
Introduction
• Most wireless research work focuses on algorithms (computation)

• Communication is the key performance bottleneck in deep submicron (DSM) technologies

Figure: Trend of total interconnect length on a chip


Example 1:

• For connecting multiple masters and/or multiple slaves, an interconnect is required.

• If the system has multiple masters attempting to communicate with a single slave, the interconnect may contain an arbiter.

• The interconnect routes data between the master and slave interfaces.

• The arbiter could be implemented using simple priorities, a round-robin architecture, or whatever suits the designer's needs.
Example 2:

• What if there are multiple slaves with a single master?
• For this to work, the interconnect would need to interpret the address and route the transaction to the proper slave.
• In this case a decoder could work.
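
The decoder in Example 2 can be sketched as a lookup over an address map. This is a minimal illustration only: the address ranges and the slave names (`dram`, `uart`, `timer`) are hypothetical and do not come from the slides.

```python
from typing import Optional

# Hypothetical address map: (base address, size in bytes, slave name).
ADDRESS_MAP = [
    (0x0000_0000, 0x1000_0000, "dram"),
    (0x4000_0000, 0x0000_1000, "uart"),
    (0x4000_1000, 0x0000_1000, "timer"),
]


def decode(address: int) -> Optional[str]:
    """Return the slave whose address range contains `address`."""
    for base, size, slave in ADDRESS_MAP:
        if base <= address < base + size:
            return slave
    # No slave is mapped at this address; a real interconnect would
    # typically report a decode error back to the master.
    return None
```

With a single master no arbitration is needed; the decoder alone selects the slave from the transaction address.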
Example 3:

• Interconnects allow multiple masters and/or multiple slaves to interface with each other.
• Master and slave communicate via channels.

• Notice that the write address channel and the read address channel have their own dedicated arbiters and decoders.
• This way, reads and writes can happen simultaneously.
Example 4:
Communication architecture design is critical
• Exploding core counts require more advanced interconnects

• Complexity is too high to handle (and it needs to be verified!)

• The critical decision is the interconnect choice

• Communication architecture design and verification are becoming the highest priority in contemporary SoC design!
Example Complex SoC today
Modern SoCs need Communication Centric Design

• Communication is the most critical aspect affecting SoC system performance

• Communication architecture can consume up to 50% of total on-chip power

• The ever increasing number of wires, repeaters, and bus components (arbiters, bridges, encoders/decoders etc.) increases system cost

• Communication architecture design, customization, exploration, verification and implementation take up the largest chunk of a design cycle

• Communication architectures significantly affect performance, power, cost and time-to-market
Evolution of on-chip communication architectures and system level view

Some Challenges
Evolution

Figure: timeline of on-chip communication architectures (1990, 1995, 2000, 2005, 2010), with labels: Custom shared Bus, Hierarchical Bus, Shared Bus, Network on Chip


Buses
• Simplest case: A processor would perform read and write transactions over the bus to a DRAM memory and, if it used a different address, to other target peripherals.

• Eventually, other initiators used the bus, too, and arbiters became necessary to alternately grant different initiators access to their requested targets.

• Many companies owned and developed their own bus interconnect IP.

• 1996 brought about the first de facto industry standard bus protocol for on-chip interconnects: ARM's Advanced Microcontroller Bus Architecture (AMBA).

• This ushered in the advancement of IP core interoperability.
Bus-based Architectures
• Master (or Initiator)
– IP component that initiates a read or write data transfer

• Slave (or Target)
– IP component that does not initiate transfers and only responds to incoming transfer requests

• Master/Slave
– Hybrid components (act as both master and slave)

• Arbiter
– Controls access to the shared bus
– Uses an arbitration scheme to select the master to grant access to the bus

• Decoder
– Determines which component a transfer is intended for

• Bridge
– Connects two buses
– Acts as slave on one side and master on the other


Bus Signals
A bus typically consists of three types of signal lines

• Address

– Carry address of destination for which transfer is initiated

– Can be shared or separate for read, write data

• Data

– Carry information between source and destination components

– Can be shared or separate for read, write data

– Choice of data width critical for application performance

• Control

– Requests and acknowledgements

– Specify more information about type of data transfer

– Byte enable, burst size, cacheable/bufferable, write-back/through
Bus Signals
Bus width:

• The number of signals used to transmit the address is typically a power of 2 (common values are 16, 32, or 64) and is referred to as the address bus width.

Shared/separate bus:

• Instead of a single shared address bus for both reads and writes, it is possible to have separate address buses for read and write data transfers.

Multiple Address Buses: Having multiple address buses improves the concurrency in the system, since more data transfers can occur in parallel. However, this comes at the cost of a larger number of wires, which can increase area and power consumption.

Data Bus: The typical number of signals in a data bus is 16, 32, 64, 128, 256, 512, or 1024 (called the data bus width).

Packing/Unpacking: The choice of data bus width is important because it determines whether any packing or unpacking of data is necessary at component interfaces.

Example: Consider a case where the memory word size of a memory component is 64 bits and the data bus width is 32 bits.

Then, every time a master requests data from the memory, the read data needs to be unpacked (or split) into two data items of 32 bits in width before being transmitted onto the bus.

The data also needs to be packed (or merged) at the master interface before being sent to the master component.

The packing and unpacking of data at the interfaces can introduce an overhead in terms of power, performance, and area of the interface logic.
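
The 64-bit word over a 32-bit bus example can be sketched in a few lines. This is an illustrative model, not interface logic from the slides: the function names are hypothetical, and sending the least significant half first is an assumption (a real protocol defines its own ordering).

```python
from typing import List


def unpack_word(word: int, word_bits: int = 64, bus_bits: int = 32) -> List[int]:
    """Split one memory word into bus-width items, least significant first (assumed order)."""
    if word_bits % bus_bits != 0:
        raise ValueError("word width must be a multiple of the bus width")
    mask = (1 << bus_bits) - 1
    return [(word >> (i * bus_bits)) & mask for i in range(word_bits // bus_bits)]


def pack_items(items: List[int], bus_bits: int = 32) -> int:
    """Merge bus-width items back into one word at the master interface."""
    word = 0
    for i, item in enumerate(items):
        word |= item << (i * bus_bits)
    return word
```

Each read of a 64-bit word costs two bus transfers here, which is where the performance overhead mentioned above comes from.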
Decoding and Arbitration
➢ Decoding

• Determines the target for any transfer initiated by a master

➢ Arbitration

• Decides which master can use the shared bus if more than one master requests bus access simultaneously

➢ Decoding and Arbitration can either be

• Centralized

• Distributed

➢ Arbitration Schemes

• Random

– Randomly selects a master to grant bus access to

• Static priority

– Masters are assigned static priorities

– The higher priority master's request is always serviced first

– May lead to starvation of low priority masters

• Round-robin

– Masters are allowed to access the bus in a round-robin manner

– No starvation: every master is guaranteed bus access

– Inefficient if masters have vastly different data injection rates

– High latency for critical data streams
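
The difference between static priority and round-robin arbitration can be sketched as follows. This is a behavioral model under stated assumptions: one grant per call, master index 0 has the highest static priority, and the names `static_priority` and `RoundRobinArbiter` are hypothetical.

```python
from typing import List, Optional


def static_priority(requests: List[bool]) -> Optional[int]:
    """Grant the requesting master with the lowest index (index 0 = highest priority)."""
    for master, requesting in enumerate(requests):
        if requesting:
            return master
    return None


class RoundRobinArbiter:
    """Grant requesting masters in rotating order, starting after the last grant."""

    def __init__(self, num_masters: int) -> None:
        self.num_masters = num_masters
        self.last_grant = num_masters - 1  # so that master 0 is checked first

    def grant(self, requests: List[bool]) -> Optional[int]:
        for offset in range(1, self.num_masters + 1):
            master = (self.last_grant + offset) % self.num_masters
            if requests[master]:
                self.last_grant = master
                return master
        return None
```

If all three masters request continuously, `static_priority` keeps granting master 0 (masters 1 and 2 starve), while the round-robin arbiter grants 0, 1, 2, 0, and so on.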


CROSSBAR

❑ For high performance systems that require extensive data transfer parallelism, the full bus crossbar (also called full bus matrix) is a suitable choice.

❑ A crossbar switch system permits simultaneous transfers from all memory modules because there is a separate path associated with each module.
➢ Point-to-point connection between all.

➢ Very high throughput, but very costly wiring.

❑ Can be reduced into a partial crossbar/matrix.

❑ In a crossbar, there is a communication path between every source of traffic and every destination of traffic, and all paths can work in parallel as long as there is no contention.

❑ Contention arises when two sources need to send traffic to the same target at the same time. It is managed by an arbiter for each target, and flow control is at the source.
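
One cycle of a full crossbar with a per-target arbiter can be modeled as below. This is a sketch under assumptions: each target uses a fixed-priority arbiter (the slides do not say which scheme is used), and the source and target names in the usage note are hypothetical.

```python
from typing import Dict, List


def crossbar_cycle(requests: Dict[str, str], priority: List[str]) -> Dict[str, str]:
    """Resolve one cycle of a full crossbar.

    `requests` maps each requesting source to its target.
    `priority` lists sources from highest to lowest priority.
    Returns a map from target to the source granted that target.
    """
    grants: Dict[str, str] = {}
    for source in priority:
        target = requests.get(source)
        # Each target accepts at most one source per cycle; a source that
        # loses must hold its request and retry (flow control at the source).
        if target is not None and target not in grants:
            grants[target] = source
    return grants
```

For example, `crossbar_cycle({"cpu": "dram", "dma": "sram"}, ["cpu", "dma"])` grants both transfers in the same cycle, while two sources targeting `"dram"` leave the lower priority one waiting.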
Crossbar (Contd)
❑ Crossbar-based switching works well when the number of sources and the number of destinations is small.

❑ But as SoC designs grew, the crossbar architecture, which scales in a quadratic way, would become impractical and overdesigned: way too large.

❑ Using a large crossbar in modern designs leaves much of the crossbar underused, with many paths idle at any one time.

❑ Large crossbars consume a huge die area.

❑ Wires do not shrink as fast as transistors in advanced processes, making wire-related congestion in a large crossbar a real challenge.

❑ One remedy is to partition a large crossbar into smaller units and connect them to implement the desired topology.

❑ This requires a lot of logic at the interface between two crossbars to enforce the protocol rules chosen for the connection.

❑ Other inefficiencies, depending on the vendor specific
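
The quadratic scaling mentioned above follows from counting one crosspoint per source/target pair in a full crossbar. A minimal sketch (the function name is hypothetical, and crosspoint count is used only as a rough proxy for wiring and area):

```python
def crossbar_crosspoints(num_sources: int, num_targets: int) -> int:
    """Number of source-to-target paths in a full crossbar."""
    return num_sources * num_targets


# Doubling both port counts quadruples the number of paths.
for ports in (4, 8, 16, 32):
    print(ports, crossbar_crosspoints(ports, ports))
```

Since each target serves at most one source at a time, at most `min(num_sources, num_targets)` of these paths can be active in a cycle, which is why most of a large crossbar sits idle.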
Crossbar (Contd)

❑ When the number of elements that need to communicate in a chip is small, a simple crossbar approach to the interconnect function is a possible choice.

❑ When the number of elements in the system starts to grow, and the distance between them becomes large with respect to the intended clock period, crossbars no longer work and a network-on-chip (NoC) approach is required.


Example:

• The crossbar has an integrated interrupt controller.

• It supports optional logic to transfer data across multiple clock domains and across multiple interface widths.
• Pipeline registers can be added at any point on a crossbar bus to increase the maximum allowed clock frequency on the bus.
• Separate read and write data buses can have widths of up to 128 bits for the example shown.
Unlike traditional bus-based on-chip communication architectures, NoCs use packets to route data from the source to the destination component, via a network fabric that consists of switches (routers) and interconnection links (wires).

Two or more IPs are connected through this shared network.

Networks-on-Chip

An example of a (mesh type) NoC interconnection fabric

Several processing elements (PEs) are connected together via routers and regular sized wires.

A PE (also referred to as a node) in this case can be any component such as a microprocessor, application-specific integrated circuit (ASIC) block or memory, or a combination of components connected together.

A network interface (NI) at the boundary of each PE is used to packetize any data generated by the PE.
NOCs
This NI is connected to a router, which has buffers at its input to accept data packets from a PE or from other routers connected to it.

A crossbar switch inside the router is used to route the data packets from the input buffers to the appropriate output link, based on the address in the packet header.

An arbiter component is used to determine which packets get priority when multiple packets from different sources are vying for transfer on the same interconnection link.
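
The path from NI to router output link can be sketched as follows. The slides only say that the output link is chosen from the address in the packet header; the 2-D mesh coordinates, the dimension-order (XY) routing rule, and the port names below are assumptions added for illustration.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class Packet:
    dest: Tuple[int, int]  # header: (x, y) coordinates of the destination router
    payload: bytes


def packetize(dest: Tuple[int, int], data: bytes) -> Packet:
    """Network interface: wrap data generated by a PE in a packet with a header."""
    return Packet(dest=dest, payload=data)


def select_output(current: Tuple[int, int], packet: Packet) -> str:
    """Router: pick an output port from the header using XY routing (assumed)."""
    cx, cy = current
    dx, dy = packet.dest
    if dx > cx:
        return "east"
    if dx < cx:
        return "west"
    if dy > cy:
        return "north"
    if dy < cy:
        return "south"
    return "local"  # packet has arrived; deliver it to the attached PE
```

A packet from router (0, 0) to (2, 1) would leave on `"east"` twice, then `"north"` once, then `"local"`. When two packets select the same output port in the same cycle, the router's arbiter decides which one goes first.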
NOCs

Bus-based architectures suffer from long global interconnect lengths that require several clock cycles to traverse in DSM (deep submicron) technologies, where wire delay is increasingly dominating gate delay.

Pipelining or inserting buffers/registers can be used to overcome these large delays on long interconnects.

Pipelining divides a long interconnect into shorter sized segments that are typically traversed in a single clock cycle by the data on a bus.

However, resource utilization in such a case is usually inefficient because the whole bus is in a busy state (and vacant wire segments typically cannot be used by other masters) until a data transfer is completed.

In a NoC, data packets typically traverse the links between routers (also called switch-to-switch links) in a single cycle, and are then buffered in the routers, before being routed to another link in the subsequent cycle.

In NoCs, arbitration is inherently distributed, occurring at each router.
NOC Topologies

Topologies for NoCs can be classified into three broad categories: direct networks, indirect networks, and irregular networks. These are described below.

In a direct network, each node has direct point-to-point links to a subset of other nodes in the system, called neighboring nodes. The nodes consist of computational blocks and/or memories, as well as a NI block that acts as a router.

This router is connected to the routers of the neighboring nodes through links.

The direct network topology has the property that as the number of nodes in the system increases, the total available communication bandwidth also increases.

A fundamental trade-off in direct network design is between connectivity and cost. Higher connectivity results in higher performance, but has greater energy and area costs for the router and link implementations.
Every node in a 2-D mesh is connected to four neighboring nodes, except for the nodes at the edges. The area of a mesh grows linearly with the number of nodes.

Figure: Examples of direct network topologies: (a) point-to-point. (b) mesh
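
The neighbor rule for a 2-D mesh can be sketched as below. The coordinate scheme (x, y) with the origin in one corner and the function name are assumptions made for illustration.

```python
from typing import List, Tuple


def mesh_neighbors(x: int, y: int, width: int, height: int) -> List[Tuple[int, int]]:
    """Return the coordinates of the nodes directly linked to node (x, y)."""
    candidates = [(x - 1, y), (x + 1, y), (x, y - 1), (x, y + 1)]
    # Nodes on an edge or in a corner have fewer than four neighbors.
    return [(nx, ny) for nx, ny in candidates if 0 <= nx < width and 0 <= ny < height]
```

In a 3 x 3 mesh, the center node has four neighbors, an edge node has three, and a corner node has two.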
