01 - 01 PCI Express Basics & Background

Uploaded by

jimmy
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
101 views46 pages

01 - 01 PCI Express Basics & Background

Uploaded by

jimmy
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 46

PCI Express® Basics & Background
Richard Solomon
PCI-SIG® PWG Member
Synopsys
Acknowledgements
Thanks are due to Ravi Budruk, Mindshare, Inc. for legacy material on PCI Express® Basics

PCI Express Background

Revolutionary AND Evolutionary
• PCI™ (1992/1993)
• Revolutionary
• Plug and Play jumperless configuration (BARs)
• Unprecedented bandwidth
• 32-bit / 33MHz – 133MB/sec
• 64-bit / 66MHz – 533MB/sec
• Designed from day 1 for bus-mastering adapters

• Evolutionary
• System BIOS maps devices then operating systems boot and run without further knowledge of PCI
• PCI-aware O/S could gain improved functionality
• PCI 2.1 (1995) doubled bandwidth with 66MHz mode

Revolutionary AND Evolutionary
• PCI-X™ (1999)
• Revolutionary
• Unprecedented bandwidth
• Up to 1066MB/sec with 64-bit / 133MHz
• Registered bus protocol
• Eased electrical timing requirements
• Brought split transactions into PCI “world”

• Evolutionary
• PCI compatible at hardware *AND* software levels
• PCI-X 2.0 (2003) doubled bandwidth
• 2133MB/sec at PCI-X 266 and 4266MB/sec at PCI-X 533
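
The PCI and PCI-X bandwidth figures quoted so far follow from bus width times transfer rate. The sketch below reproduces them, assuming the nominal clocks are multiples of 33.33 MHz and that PCI-X 266 and PCI-X 533 perform two and four transfers per 133 MHz clock; the function and table names are illustrative only.

```python
def peak_mb_per_s(width_bits: int, mega_transfers_per_s: float) -> float:
    """Peak bandwidth of a parallel bus: bytes per transfer times transfers per second."""
    return (width_bits / 8) * mega_transfers_per_s


# (bus width in bits, millions of transfers per second)
MODES = {
    "PCI 32-bit / 33MHz": (32, 100 / 3),
    "PCI 64-bit / 66MHz": (64, 200 / 3),
    "PCI-X 64-bit / 133MHz": (64, 400 / 3),
    "PCI-X 266": (64, 800 / 3),    # assumed: 2 transfers per 133 MHz clock
    "PCI-X 533": (64, 1600 / 3),   # assumed: 4 transfers per 133 MHz clock
}

for name, (width, rate) in MODES.items():
    # The slides quote the truncated integer value.
    print(f"{name}: {int(peak_mb_per_s(width, rate))} MB/sec")
```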

Revolutionary AND Evolutionary
• PCI Express – aka PCIe® (2002)
• Revolutionary
• Unprecedented bandwidth
• x1: up to 4GB/sec in *EACH* direction (PCIe 5.0)
• x16: up to 64GB/sec in *EACH* direction (PCIe 5.0)
• “Relaxed” electricals due to serial bus architecture
• Point-to-point, low voltage, dual simplex with embedded clocking

• Evolutionary
• PCI compatible at software level
• Configuration space, Power Management, etc.
• Of course, PCIe-aware O/S can get more functionality
• Transaction layer familiar to PCI/PCI-X designers
• System topology matches PCI/PCI-X
• Doubling of bandwidth each generation (from 250MB/s/lane):
• PCIe 2.0 (2006) 500MB/s/lane
• PCIe 3.0 (2010) ~1GB/s/lane
• PCIe 4.0 (2017) ~2GB/s/lane
• PCIe 5.0 (2019) ~4GB/s/lane

PCI Concepts

Address Spaces – Memory & I/O
• Memory space mapped cleanly to CPU semantics
• 32-bits of address space initially
• 64-bits introduced via Dual-Address Cycles (DAC)
• Extra clock of address time on PCI/PCI-X
• 4 DWORD header in PCI Express
• Burstable
• I/O space mapped cleanly to CPU semantics
• 32-bits of address space
• Actually much larger than CPUs of the time
• Non-burstable
• Most PCI implementations didn’t support
• PCI-X codified
• Carries forward to PCI Express

Address Spaces – Configuration
• Configuration space???
• Allows control of devices’ address decodes without conflict
• No conceptual mapping to CPU address space
• Memory-based access mechanisms in PCI-X and PCIe
• Bus / Device / Function (aka BDF) form hierarchy-based address (PCIe 3.0 calls this “Routing ID”)
• “Functions” allow multiple, logically independent agents in one physical device
• E.g. combination SCSI + Ethernet device
• 256 bytes or 4K bytes of configuration space per function
• PCI/PCI-X bridges form hierarchy
• PCIe switches form hierarchy
• Look like PCI-PCI bridges to software
• “Type 0” and “Type 1” configuration cycles
• Type 0: to same bus segment
• Type 1: to another bus segment
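
As a rough illustration of how a Bus / Device / Function triple becomes an address, the sketch below packs a BDF into a 16-bit Routing ID and then into a memory-mapped configuration address, assuming the PCIe Enhanced Configuration Access Mechanism (ECAM) layout of 4K bytes per function. The base address is a made-up value; real systems report it through firmware tables.

```python
ECAM_BASE = 0xE000_0000  # hypothetical base address, for illustration only


def routing_id(bus: int, device: int, function: int) -> int:
    """Pack Bus (8 bits), Device (5 bits), Function (3 bits) into 16 bits."""
    if not (0 <= bus < 256 and 0 <= device < 32 and 0 <= function < 8):
        raise ValueError("BDF out of range")
    return (bus << 8) | (device << 3) | function


def ecam_address(base: int, bus: int, device: int, function: int, offset: int) -> int:
    """Memory address of a configuration register, assuming 4K bytes per function."""
    if not 0 <= offset < 4096:
        raise ValueError("offset must fit in the 4K-byte configuration space")
    return base + (routing_id(bus, device, function) << 12) + offset


print(hex(routing_id(1, 2, 3)))
print(hex(ecam_address(ECAM_BASE, 1, 2, 3, 0x10)))
```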

Configuration Space (cont’d)
[Figure: example bus numbering with Processors, Main Memory, and two Host/PCI Bridges, each reached through an Address Port and a Data Port.]

Host/PCI Bridge (Bus = 0, Subord = 3)
  PCI Bus 0
    PCI-to-PCI Bridge (Primary = 0, Secondary = 1, Subord = 3)
      PCI Bus 1
        PCI-to-PCI Bridge (Primary = 1, Secondary = 2, Subord = 2)
          PCI Bus 2
        PCI-to-PCI Bridge (Primary = 1, Secondary = 3, Subord = 3)
          PCI Bus 3

Host/PCI Bridge (Bus = 4, Subord = 5)
  PCI Bus 4
    PCI-to-PCI Bridge (Primary = 4, Secondary = 5, Subord = 5)
      PCI Bus 5
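
The Primary / Secondary / Subordinate numbers in this figure are what a bridge uses to decide what to do with a configuration request. A minimal sketch of that decision, using the bridge values from the figure, might look like this; the class and method names are illustrative, not taken from any specification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Bridge:
    primary: int
    secondary: int
    subordinate: int

    def handle_config(self, target_bus: int) -> str:
        """Decide how a Type 1 configuration request seen on the primary bus is handled."""
        if target_bus == self.secondary:
            # Target is the bus directly below this bridge.
            return "convert to Type 0"
        if self.secondary < target_bus <= self.subordinate:
            # Target is somewhere further down the hierarchy.
            return "forward as Type 1"
        return "ignore"


# Bridges from the figure.
bridge_0_to_1 = Bridge(primary=0, secondary=1, subordinate=3)
bridge_1_to_3 = Bridge(primary=1, secondary=3, subordinate=3)

for bus in (1, 3, 5):
    print(bus, bridge_0_to_1.handle_config(bus))
```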

Configuration Space
• Device Identification
• VendorID: PCI-SIG® assigned
• DeviceID: Vendor self-assigned
• Subsystem VendorID: PCI-SIG
• Subsystem DeviceID: Vendor
• Address Decode controls
• Software reads/writes BARs to determine
required size and maps appropriately
• Memory, I/O, and bus-master enables
• Other bus-oriented controls
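
The BAR sizing step can be sketched as follows, assuming the usual probe in which software writes all 1s to a 32-bit BAR and reads it back. Only the decode of the read-back value is shown; the function name is illustrative and 64-bit BARs are left out.

```python
def bar_size(readback: int) -> int:
    """Size in bytes implied by the value read back after writing 0xFFFFFFFF to a 32-bit BAR."""
    if readback == 0:
        return 0  # BAR not implemented
    if readback & 0x1:
        # Bit 0 set: I/O space BAR; bits 1:0 are not address bits.
        mask = readback & 0xFFFFFFFC
    else:
        # Memory space BAR; bits 3:0 carry type and prefetchable flags.
        mask = readback & 0xFFFFFFF0
    # The lowest writable address bit gives the required size and alignment.
    return ((~mask) & 0xFFFFFFFF) + 1


print(bar_size(0xFFF00000))  # memory BAR
print(bar_size(0xFFFFFF01))  # I/O BAR
```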

Configuration Space – Capabilities List
• Linked list
• Follow the list! Cannot assume fixed location of any given feature in any given device
• Features defined in their related specs:
• PCI-X
• PCIe
• PCI Power Management
• Etc.

Capability structure layout:
Dword 0: [31:16] Feature-specific Configuration Registers | [15:8] Pointer to Next Capability | [7:0] Capability ID
Dword 1 .. Dword n: Feature-specific Configuration Registers
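
"Follow the list" can be sketched over a 256-byte configuration space image. The sketch assumes the standard PCI locations for the Status register (offset 06h, Capabilities List bit 4) and the Capabilities Pointer (offset 34h); the sample contents are invented for illustration.

```python
STATUS = 0x06
CAP_PTR = 0x34
CAP_LIST_BIT = 1 << 4


def walk_capabilities(cfg: bytes) -> list:
    """Return (offset, capability ID) pairs by following the linked list."""
    status = int.from_bytes(cfg[STATUS:STATUS + 2], "little")
    if not status & CAP_LIST_BIT:
        return []  # device has no capabilities list
    found = []
    ptr = cfg[CAP_PTR] & 0xFC          # low two bits are reserved
    while ptr and len(found) < 48:     # guard against a malformed, looping list
        found.append((ptr, cfg[ptr]))  # byte 0 of each entry: Capability ID
        ptr = cfg[ptr + 1] & 0xFC      # byte 1: Pointer to Next Capability
    return found


# Invented sample: Power Management (01h) -> MSI (05h) -> PCI Express (10h)
cfg = bytearray(256)
cfg[STATUS] = CAP_LIST_BIT
cfg[CAP_PTR] = 0x40
cfg[0x40:0x42] = bytes([0x01, 0x50])
cfg[0x50:0x52] = bytes([0x05, 0x70])
cfg[0x70:0x72] = bytes([0x10, 0x00])

for offset, cap_id in walk_capabilities(cfg):
    print(f"offset {offset:#04x}: capability ID {cap_id:#04x}")
```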

Configuration Space – Extended Capabilities List
• Linked list – new with PCI Express
• Follow the list! Cannot assume fixed location of any given feature in any given device
• First entry in list is *always* at 100h
• Features defined in PCI Express and related (e.g. MR-IOV, SR-IOV) specifications
• Consolidated in PCI Code and ID Assignment Spec

Extended capability structure layout:
Dword 0: [31:20] Pointer to Next Capability | [19:16] Version | [15:0] Capability ID
Dword 1 .. Dword n: Feature-specific Configuration Registers
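
The extended list can be walked the same way, starting at 100h and decoding each header Dword with the field positions given above. This is a sketch over an invented 4K-byte configuration image; the helper names are illustrative.

```python
def walk_extended_capabilities(cfg: bytes) -> list:
    """Return (offset, capability ID, version) tuples starting from offset 100h."""
    found = []
    seen = set()
    offset = 0x100
    while offset and offset not in seen:   # stop on end of list or a loop
        seen.add(offset)
        header = int.from_bytes(cfg[offset:offset + 4], "little")
        if header in (0, 0xFFFFFFFF):      # no extended capability present here
            break
        cap_id = header & 0xFFFF           # bits 15:0
        version = (header >> 16) & 0xF     # bits 19:16
        found.append((offset, cap_id, version))
        offset = (header >> 20) & 0xFFC    # bits 31:20, Dword aligned
    return found


def ext_header(cap_id: int, version: int, next_offset: int) -> bytes:
    """Build a header Dword for the sample image below."""
    return (cap_id | (version << 16) | (next_offset << 20)).to_bytes(4, "little")


# Invented sample: AER (0001h) at 100h, then SR-IOV (0010h) at 148h
cfg_ext = bytearray(4096)
cfg_ext[0x100:0x104] = ext_header(0x0001, 2, 0x148)
cfg_ext[0x148:0x14C] = ext_header(0x0010, 1, 0x000)

print(walk_extended_capabilities(cfg_ext))
```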

Interrupts
• PCI introduced INTA#, INTB#, INTC#, INTD# - collectively referred to as INTx
• Level sensitive
• Decoupled device from CPU interrupt
• System controlled INTx to CPU interrupt mapping
• Configuration registers
• report A/B/C/D
• programmed with CPU interrupt number
• PCI Express mimics this via “virtual wire” messages
• Assert_INTx and Deassert_INTx

What are MSI and MSI-X?
• Memory Write replaces previous interrupt semantics
• PCI and PCI-X devices stop asserting INTA/B/C/D and PCI Express devices stop sending
Assert_INTx messages once MSI or MSI-X mode is enabled
• MSI uses one address with a variable data value indicating which “vector” is asserting
• MSI-X uses a table of independent address and data pairs for each “vector”
• NOTE: Boot devices and any device intended for a non-MSI operating system generally
must still support the appropriate INTx signaling!
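
The difference between the two schemes can be sketched by modelling the Memory Write each one produces. The sketch assumes MSI encodes the vector in the low-order bits of the data value, with a power-of-two number of enabled vectors; all addresses and data values are invented.

```python
def msi_write(address: int, data: int, enabled_vectors: int, vector: int) -> tuple:
    """MSI: one address; the vector number replaces the low-order data bits."""
    if not 1 <= enabled_vectors <= 32 or enabled_vectors & (enabled_vectors - 1):
        raise ValueError("enabled_vectors must be a power of two up to 32")
    if not 0 <= vector < enabled_vectors:
        raise ValueError("vector not enabled")
    low_mask = enabled_vectors - 1
    return address, (data & ~low_mask) | vector


def msix_write(table: list, vector: int) -> tuple:
    """MSI-X: each vector has its own independent (address, data) table entry."""
    return table[vector]


# Invented values for illustration.
print(msi_write(0xFEE00000, 0x4020, enabled_vectors=4, vector=3))
msix_table = [(0xFEE00000, 0x0041), (0xFEE01000, 0x0052), (0xFEE02000, 0x0063)]
print(msix_write(msix_table, 1))
```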

Split Transactions – Background
• PCI commands contained no length
• Bus allowed disconnects and retries
• Difficult data management for target device
• Writes overflow buffers
• Reads require pre-fetch
• How much to pre-fetch? When to discard? Prevent stale data?

• PCI commands contained no initiator information


• No way for target device to begin communication with the initiator
• Peer-to-peer requires knowledge of system-assigned addresses

Split Transactions
• PCI-X commands added length and Routing ID of initiator
• Writes: allow target device to allocate buffers
• Reads: Pre-fetch now deterministic
• PCI-X retains “retry” & “disconnect”, adds “split”
• Telephone analogy
• Retry: “I’m busy go away”
• Delayed transactions are complicated
• Split: “I’ll call you back”
• Simple
• More efficient

Benefits of Split Transactions
[Figure: two stacked charts, "Bandwidth Usage with Conventional PCI Protocols" and "Bandwidth Usage with PCI-X Enhancements". Axes: Bandwidth (MegaBytes/sec) and Percent of Total Bandwidth versus Number of Load Exerciser Cards. Regions: Transaction Data Payload (actual user data), Transaction Overhead (addressing and routing), System Overhead (scheduling), Idle Time (unused BW).]

PCI Express Basics

PCI Express Features
• Dual Simplex point-to-point serial connection
• Independent transmit and receive sides
• Scalable Link Widths
• x1, x2, x4, x8, x12, x16, x32
• Scalable Link Speeds
• 2.5, 5.0, 8.0, 16.0, 32.0, 64.0 GT/s
• Packet based transaction protocol

[Figure: PCIe Device A and PCIe Device B connected by a Link (x1, x2, x4, x8, x12, x16 or x32), with a Packet flowing in each direction.]

PCI Express Terminology
[Figure: PCI Express Device A connected to PCI Express Device B, labelling the terms Signal, Wire, Lane, and Link.]

Upstream/Downstream
• Relative to root – up is towards, down is away
• Note “streamness” of devices vs their ports
• The direction a gremlin standing on the device looks…

PCI Express Throughput
Bandwidth (GB/s) by Link Width
Data rate                  x1     x2     x4     x8     x16
“2.5 GT/s” (PCIe 1.0+)     0.25   0.5    1      2      4
“5 GT/s” (PCIe 2.0+)       0.5    1      2      4      8
“8 GT/s” (PCIe 3.0+)       ~1     ~2     ~4     ~8     ~16
“16GT/s” (PCIe 4.0+)       ~2     ~4     ~8     ~16    ~32
“32GT/s” (PCIe 5.0+)       ~4     ~8     ~16    ~32    ~64
“64GT/s” (PCIe 6.0)        8      16     32     64     128

Derivation of these numbers:
• 20% overhead due to 8b/10b encoding in 1.x and 2.x
• Note: ~1.5% overhead due to 128b/130b encoding is not reflected above in the 8GT/s through 32GT/s values
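
The derivation can be checked with a few lines of arithmetic: raw transfer rate times encoding efficiency, divided by 8 bits per byte, times the number of lanes. The sketch covers 2.5 through 32 GT/s only and, unlike the rounded table values, applies the 128b/130b overhead; the names are illustrative.

```python
# Encoding efficiency per data rate (payload bits per transmitted bit).
ENCODING = {
    2.5: 8 / 10,
    5.0: 8 / 10,
    8.0: 128 / 130,
    16.0: 128 / 130,
    32.0: 128 / 130,
}


def bandwidth_gb_s(gt_per_s: float, lanes: int = 1) -> float:
    """Per-direction bandwidth in GB/s after encoding overhead."""
    return gt_per_s * ENCODING[gt_per_s] / 8 * lanes


for rate in ENCODING:
    row = ", ".join(f"x{w}: {bandwidth_gb_s(rate, w):.2f}" for w in (1, 2, 4, 8, 16))
    print(f"{rate} GT/s -> {row}")
```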

Additional Features
• Data Integrity and Error Handling
• Link-level “LCRC”
• Link-level “ACK/NAK”
• End-to-end “ECRC”
• Credit-based Flow Control
• No retry as in PCI
• MSI/MSI-X style interrupt handling
• Also supports legacy PCI interrupt handling in-band
• Advanced power management
• Active State PM
• PCI compatible PM

Additional Features
• Evolutionary PCI-compatible software model
• PCI configuration and enumeration software can be used to enumerate PCI Express hardware
• PCI Express system will boot “PCI” OS
• PCI Express supports “PCI” device drivers
• New additional configuration address space requires OS and driver update
• Advanced Error Reporting (AER)
• PCI Express Link Controls

PCI Express Topology
[Figure: example PCI Express topology. The CPU and Memory attach to the Root Complex (Bus 0, internal). PCIe links from the Root Complex lead to PCIe Endpoints, to a Switch built from Virtual PCI Bridges that fans out to a PCIe Endpoint and a Legacy Endpoint, and to a PCIe Bridge to PCI/PCI-X with a PCI/PCI-X bus behind it. Legend: PCI Express Device Downstream Port, PCI Express Device Upstream Port.]

Transaction Types, Address Spaces
Requests are translated to one of four transaction types by the Transaction Layer:
1. Memory Read or Memory Write. Used to transfer data from or to a memory mapped
location.
– The protocol also supports a locked memory read transaction variant
2. I/O Read or I/O Write. Used to transfer data from or to an I/O location.
– These transactions are restricted to supporting legacy endpoint devices
3. Configuration Read or Configuration Write. Used to discover device capabilities, program
features, and check status in the 4KB PCI Express configuration space.
4. Messages. Handled like posted writes. Used for event signaling and general purpose
messaging.

Three Methods For Packet Routing
• Each request or completion header is tagged as to its type, and each of the
packet types is routed based on one of three schemes:
• Address Routing
• ID Routing
• Implicit Routing
• Memory and IO requests use address routing
• Completions and Configuration cycles use ID routing
• Message requests have selectable routing based on a 3-bit code in the message
routing sub-field of the header type field
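
The routing rules above amount to a lookup from TLP type to routing scheme. The sketch below encodes them, with the 3-bit message routing codes filled in from the PCI Express message routing definitions as best recalled here; the short type names (MRd, CfgWr, and so on) and the dictionary layout are illustrative.

```python
ROUTING_BY_TYPE = {
    "MRd": "address", "MWr": "address",   # Memory requests
    "IORd": "address", "IOWr": "address", # I/O requests
    "CfgRd": "id", "CfgWr": "id",         # Configuration cycles
    "Cpl": "id", "CplD": "id",            # Completions
}

# 3-bit message routing sub-field (assumed values, see lead-in).
MESSAGE_ROUTING = {
    0b000: "implicit: routed to Root Complex",
    0b001: "address",
    0b010: "id",
    0b011: "implicit: broadcast from Root Complex",
    0b100: "implicit: local, terminate at receiver",
    0b101: "implicit: gathered and routed to Root Complex",
}


def routing_scheme(tlp_type: str, msg_code: int = None) -> str:
    """Return the routing scheme used for a TLP of the given type."""
    if tlp_type == "Msg":
        if msg_code not in MESSAGE_ROUTING:
            raise ValueError("reserved or missing message routing code")
        return MESSAGE_ROUTING[msg_code]
    return ROUTING_BY_TYPE[tlp_type]


print(routing_scheme("MRd"), routing_scheme("CplD"), routing_scheme("Msg", 0b100))
```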

Programmed I/O Transaction
[Figure: Processors on the FSB above a Root Complex with DDR SDRAM; Switches A, B, and C and several Endpoints below it. The MRd travels downstream from the Root Complex through the switches to an Endpoint, and the CplD returns along the same path.]

Requester:
- Step 1: Root Complex (requester) initiates Memory Read Request (MRd)
- Step 4: Root Complex receives CplD

Completer:
- Step 2: Endpoint (completer) receives MRd
- Step 3: Endpoint returns Completion with data (CplD)

DMA Transaction
[Figure: same topology as the Programmed I/O slide. The MRd travels upstream from an Endpoint through the switches to the Root Complex, and the CplD returns along the same path.]

Requester:
- Step 1: Endpoint (requester) initiates Memory Read Request (MRd)
- Step 4: Endpoint receives CplD

Completer:
- Step 2: Root Complex (completer) receives MRd
- Step 3: Root Complex returns Completion with data (CplD)

Peer-to-Peer Transaction
[Figure: same topology as the Programmed I/O slide. The MRd travels upstream from the requesting Endpoint through the switches and the Root Complex, then downstream to the completing Endpoint; the CplD returns along the same path.]

Requester:
- Step 1: Endpoint (requester) initiates Memory Read Request (MRd)
- Step 4: Endpoint receives CplD

Completer:
- Step 2: Endpoint (completer) receives MRd
- Step 3: Endpoint returns Completion with data (CplD)

TLP Origin and Destination
[Figure: PCI Express Device A and PCI Express Device B joined by a Link. Each device has a Device Core and a PCI Express Core Logic Interface (TX and RX) containing the Transaction Layer, Data Link Layer, and Physical Layer. The TLP is transmitted from the Transaction Layer of one device and received at the Transaction Layer of the other.]

TLP Structure (Non-FLIT Mode)
Information in core section of TLP comes from Software Layer / Device Core

Fields in bit transmit direction:
Start | Sequence | Header | Data Payload | ECRC | LCRC | End
1B    | 2B       | 3-4 DW | 0-1024 DW    | 1DW  | 1DW  | 1B

• Created by Transaction Layer: Header, Data Payload, ECRC
• Appended by Data Link Layer: Sequence, LCRC
• Appended by Physical Layer: Start, End

*Slightly different at 8GT/s-32GT/s*
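
The field sizes above give a quick estimate of per-TLP framing overhead in Non-FLIT mode. The sketch below adds them up and computes the payload fraction of the bytes on the link; it assumes the 1B Start and End framing shown above (so it does not apply exactly at 8GT/s and higher) and ignores DLLP traffic. The function names are illustrative.

```python
def tlp_overhead_bytes(header_dw: int = 3, ecrc: bool = False) -> int:
    """Non-payload bytes per TLP: Start + Sequence + Header + optional ECRC + LCRC + End."""
    if header_dw not in (3, 4):
        raise ValueError("header is 3 or 4 DW")
    return 1 + 2 + 4 * header_dw + (4 if ecrc else 0) + 4 + 1


def link_efficiency(payload_bytes: int, header_dw: int = 3, ecrc: bool = False) -> float:
    """Fraction of transmitted bytes that are payload."""
    return payload_bytes / (payload_bytes + tlp_overhead_bytes(header_dw, ecrc))


for payload in (4, 64, 256, 4096):
    print(f"{payload:5d} B payload: {link_efficiency(payload, header_dw=4):.1%}")
```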

Flit Mode Overview – New Mode in PCIe 6.0
• Required for data rates of 64.0 GT/s and future higher ones
• Totally new TLP Headers are used in Flit Mode
• Discovered in hardware at Link initialization
• Once negotiated, applies to all data rates
• TLP Translation occurs when forwarding between Flit Mode and Non-Flit Mode Links
• FLIT size: 256B = 236B TLP, 6B DLP, 8B CRC, 6B FEC
• Interleaved FEC – single Symbol (8-bit) correct - Covers TLP Bytes, DLP, and CRC
• 8B of CRC: covers TLP and DLP (242B)
• No Sync hdr, no Framing Token (TLP reformat), no TLP/DLLP CRC
• Guaranteed Ack and credit exchange => low Latency, low storage
• Per-TLP Framing overhead becomes per-Flit overhead
• Small packets are now more efficient
• Large packets are now less efficient
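
The FLIT size breakdown can be turned into byte ranges with a short sketch. It takes the four field sizes from the bullet above, assumes they are laid out back to back in the order listed, and derives the fixed fraction of each FLIT available to TLP bytes; the names are illustrative.

```python
# (field name, size in bytes), in transmit order as listed on the slide.
FLIT_FIELDS = [("TLP", 236), ("DLP", 6), ("CRC", 8), ("FEC", 6)]

FLIT_SIZE = sum(size for _, size in FLIT_FIELDS)


def flit_layout() -> dict:
    """Map each field to its (first byte, last byte) position within the FLIT."""
    layout = {}
    start = 0
    for name, size in FLIT_FIELDS:
        layout[name] = (start, start + size - 1)
        start += size
    return layout


TLP_FRACTION = 236 / FLIT_SIZE  # share of every FLIT that carries TLP bytes

print(FLIT_SIZE, flit_layout(), f"{TLP_FRACTION:.1%}")
```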

TLP Structure – Flit vs Non-Flit
Non-Flit TLP:
Start | Sequence | Header | Data Payload | ECRC | LCRC | End
1B    | 2B       | 3-4 DW | 0-1024 DW    | 1DW  | 1DW  | 1B

Flit (16 rows of 16 bytes):
TLP D000-D015
TLP D016-D031
TLP D032-D047
TLP D048-D063
TLP D064-D079
TLP D080-D095
TLP D096-D111
TLP D112-D127
TLP D128-D143
TLP D144-D159
TLP D160-D175
TLP D176-D191
TLP D192-D207
TLP D208-D223
TLP D224-D235 | DLP 0-3
DLP 4-5 | CRC 0-7 | FEC 0-5

New TLP Header Formats for Flit Mode
• TLP Header is composed of a 3 to 7 DW TLP Header Base, followed by 0 to 7 additional DWs of “Orthogonal Header Content” (OHC)
• Single DW exceptions: NOP TLP and Reserved TLPs
• Fully-decoded 8b Packet Type field
• All 256 Type values are defined or earmarked to permit proper framing and forwarding
• First DW of the Header Base includes all info required to determine the full size of the TLP
• TLP Header Base, OHC, Payload, and Trailer
• Exception: Link-Local TLP Prefixes (rarely used)
• End-End TLP Prefixes are integrated into the Header
• Some are in architected OHC fields (e.g., PASID)
• Remaining ones become OHC-E
• Transaction Digest is replaced by a 0 to 5 DW “Trailer”
• Poison / nullify behavior is defined; two ways to poison

[Figures: First DW of FLIT Mode Header Base; 64-bit Address Routed TLP – FLIT Mode; Example FLIT Mode OHC DWs]

DLLP Origin and Destination
[Figure: PCI Express Device A and PCI Express Device B joined by a Link, each with a Device Core and a PCI Express Core Logic Interface (TX and RX) containing the Transaction Layer, Data Link Layer, and Physical Layer. The DLLP is transmitted from the Data Link Layer of one device and received at the Data Link Layer of the other.]

DLLP Structure (Non-FLIT Mode)
Fields in bit transmit direction:
Start | DLLP | CRC | End
1B    | 4B   | 2B  | 1B

• Created by Data Link Layer: DLLP, CRC
• Appended by Physical Layer: Start, End

DLLP types:
o ACK / NAK Packets
o Flow Control Packets
o Power Management Packets
o Vendor Defined Packets

Data Link Layer Payload (DLP) in FLIT Mode
• Each FLIT has 6 Bytes for DLP
• First 2 bytes: FLIT reliability information
• Ack/Nak/retry - latency optimized
• 10 bit sequence num – low latency Ack/Nak
• Prior FLIT NOP TLPs only to avoid retry on error
• 4 Bytes for DLLP Payload
• Standard DLLP Payload (similar to PCIe 5.0)
• Optimized perf critical credits: NP Hdr, P Hdr, P Data - 2 DLLP equivalent (cpl expected infinite)
• Equivalent of 3 DLLPs: 1 Ack, 2 credits (P/NP)
• DLP not replayed – same as DLLP before

Every FLIT has Ack/Nak and credits => smaller queue sizes and low latency even on retry of FLIT

Non-Flit DLLP (Reference):
Start | DLLP | CRC | End
1B    | 4B   | 2B  | 1B

Flit Structure:
TLP D000-D223
TLP D224-D235 | DLP 0-3
DLP 4-5 | CRC 0-7 | FEC 0-5

Ordered-Set Origin and Destination
[Figure: PCI Express Device A and PCI Express Device B joined by a Link, each with a Device Core and a PCI Express Core Logic Interface (TX and RX) containing the Transaction Layer, Data Link Layer, and Physical Layer. The Ordered-Set is transmitted from the Physical Layer of one device and received at the Physical Layer of the other.]

Ordered-Set Structure (Non-FLIT Mode)
[Figure: ordered-set format, a COM followed by a series of Identifiers]

o Training Sequence One (TS1)
• 16 character set: 1 COM, 15 TS1 data characters
o Training Sequence Two (TS2)
• 16 character set: 1 COM, 15 TS2 data characters
o SKIP
• 4 character set: 1 COM followed by 3 SKP identifiers
o Fast Training Sequence (FTS)
• 4 characters: 1 COM followed by 3 FTS identifiers
o Electrical Idle (IDLE)
• 4 characters: 1 COM followed by 3 IDL identifiers
o Electrical Idle Exit (EIEOS) (new to 2.0 spec)
• 16 characters

Ordered-Sets in FLIT Mode
• 64GT/s OS works with repetitions & handshake
• More complex rules and more variable formats compared to non-FLIT
• Transmitted between FLITs
• Formats permit receiver to distinguish OS from start of FLIT

• See PHY Logical presentation for details!

PCI Express Flow Control
Credit-based flow control is point-to-point based, not end-to-end

[Figure: the Transmitter sends TLPs into the Receiver's VC Buffer; the Receiver reports the buffer space available back to the Transmitter in Flow Control DLLPs (FCx).]

Receiver sends Flow Control Packets (FCP) which are a type of DLLP (Data Link Layer Packet)
to provide the transmitter with credits so that it can transmit packets to the receiver
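
The credit mechanism can be sketched as a toy model of one link. It is heavily simplified: one credit per TLP and a single buffer, with no separate header and data credits or per-type accounting. The class and method names are illustrative.

```python
class CreditedLink:
    """Toy model of point-to-point credit-based flow control on one link."""

    def __init__(self, buffer_slots: int):
        self.credits = buffer_slots  # credits advertised by the receiver at initialization
        self.rx_buffer = []

    def send(self, tlp: str) -> bool:
        """Transmitter side: a TLP may only be sent if a credit is available."""
        if self.credits == 0:
            return False             # transmitter must wait; nothing is sent and retried
        self.credits -= 1
        self.rx_buffer.append(tlp)
        return True

    def receiver_drain(self, count: int = 1) -> list:
        """Receiver side: free buffer space, then return credits (models a Flow Control DLLP)."""
        drained = self.rx_buffer[:count]
        del self.rx_buffer[:count]
        self.credits += len(drained)
        return drained


link = CreditedLink(buffer_slots=2)
print(link.send("TLP-a"), link.send("TLP-b"), link.send("TLP-c"))
print(link.receiver_drain(1), link.send("TLP-c"))
```

Because a transmitter never sends without a credit, the receiver's buffer cannot overflow, which is why no PCI-style retry is needed.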

ACK/NAK Protocol Overview
[Figure: ACK/NAK protocol between Device A (transmitter) and Device B (receiver). In Device A, a TLP from the Transaction Layer gets a Sequence number and LCRC in the Data Link Layer, is stored in the Replay Buffer, and is sent over the Link through a Mux. In Device B, the Data Link Layer runs the Error Check on the received Sequence / TLP / LCRC, passes the TLP to the Transaction Layer, and returns an ACK / NAK DLLP to Device A.]

ACK/NAK Protocol – FLIT Mode Impacts
• FEC Correction is applied *BEFORE* any error detection is performed

• If FLIT CRC fails, retry is triggered (same as LCRC failure)

• Optimization: Retry error FLIT only with existing Go-Back-N retry
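
Go-Back-N retry with a replay buffer can be sketched as below. This is a simplified model, not the specification's state machine: sequence numbers are plain integers with no wraparound, and a NAK is assumed to carry the sequence number of the last packet received correctly (use -1 when none was). The class and method names are illustrative.

```python
class ReplayBuffer:
    """Transmitter-side replay buffer with Go-Back-N retry."""

    def __init__(self):
        self.next_seq = 0
        self.pending = []  # (sequence number, packet), oldest first

    def transmit(self, packet: str) -> int:
        """Assign a sequence number and keep a copy until it is acknowledged."""
        seq = self.next_seq
        self.next_seq += 1
        self.pending.append((seq, packet))
        return seq

    def _purge(self, acked_seq: int) -> None:
        # Everything up to and including acked_seq arrived intact.
        self.pending = [(s, p) for s, p in self.pending if s > acked_seq]

    def on_ack(self, seq: int) -> None:
        self._purge(seq)

    def on_nak(self, last_good_seq: int) -> list:
        """Drop what was received, then resend everything still pending."""
        self._purge(last_good_seq)
        return list(self.pending)


tx = ReplayBuffer()
for name in ("a", "b", "c", "d"):
    tx.transmit(name)
tx.on_ack(0)
print(tx.on_nak(1))  # packets 2 and 3 are replayed
```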

ECRC Overview
Start Sequence Header Data Payload ECRC LCRC End

• “End-to-End CRC” AKA the “I Don’t Trust Switches” feature


• Part of the TLP, therefore it’s covered by the LCRC
• Covers “invariant” parts of the TLP (almost all bits)
• Intended for the ultimate recipient, but allowed to be checked along the way
• Switches pass value unmodified (Multi-cast complicates)
• Loosely defined behavior when mismatched
• Log and report the error like any other (including AER)
• Requests w/bad ECRC are “strongly recommended” to return Unsupported Request (UR) status
• NOTE: PCIe 6.0 made this Must@FLIT (i.e. required for devices implementing Flit mode)
• Even credit updates are only “strongly recommended” on Tx/Rx of bad ECRC packet
