Introduction to CPU
Contents
1 Concept and Classification
2 Development Trends and Characteristics
3 Selection Method of x86 CPU
CPU Basic Concept
◼ The CPU (Central Processing Unit) is a very-large-scale integrated circuit, often called the brain of a computer. It is the computing core and control unit of a computer, and the most important component of the entire computer system.
◼ The CPU mainly consists of an arithmetic unit (ALU, Arithmetic Logic Unit), a control unit, a cache, and a bus that carries data, control and status signals.
◼ The CPU, memory and I/O devices are the three core components of an electronic computer.
CPU functions:
✓ Translating instructions
✓ Processing the data of software
CPU components:
✓ Control unit
✓ Arithmetic unit
✓ Cache
Figure: input device, CPU (controller, ALU, cache) and output device, linked by data signals and control signals.
CPU Classification - By instruction set
Figure: instruction sets grouped as CISC (x86, ……) and RISC (ARM, PowerPC, MIPS, DSP, ……).
◼ CISC (Complex Instruction Set Computer). The early CPUs were all CISC architectures, designed to complete the required computational tasks with as few machine language instructions as possible. This architecture increases the complexity of the CPU structure and the demands on the CPU manufacturing process, but it makes compiler development easier.
◼ RISC (Reduced Instruction Set Computer). The RISC architecture requires software to specify each individual operational step. This architecture reduces the complexity of the CPU and allows a more powerful CPU to be produced at the same level of technology, but it places higher requirements on compiler design.
Architecture - SMP vs NUMA
Comparison of SMP and NUMA:
➢ Technical characteristics. SMP: the performance improvement depends on the speed of the CPU. NUMA: difficult to achieve.
➢ Coupling. SMP: coupled computing nodes share all resources. NUMA: the nodes have independent memory and are interconnected by internal interconnect modules.
➢ Expansion capability. SMP: low. NUMA: middle.
➢ Scale. SMP: 2~4 CPUs is optimal. NUMA: supports hundreds of CPUs.
➢ Bottleneck. SMP: memory access conflicts and limited bus bandwidth. NUMA: non-local memory access is slow.
Figure: SMP and NUMA topologies (CPUs and memory).
◼ SMP (Symmetric Multi-Processor) refers to multiple CPUs working symmetrically in a server, with no primary, secondary or subordinate relationship among them. All CPUs share the same physical memory, and every CPU takes the same amount of time to access any given memory address, so SMP is also known as Uniform Memory Access (UMA).
◼ NUMA (Non-Uniform Memory Access) is a non-uniform memory access architecture. A NUMA server has multiple CPUs, each with its own local memory, I/O slots, and so on. Each CPU can access the memory of the entire system, but accessing local memory is much faster than accessing remote memory (the memory of other CPUs in the system), which is where the name non-uniform memory access comes from.
◼ Intel Xeon processors support the NUMA architecture. xFusion servers enable the NUMA function by default.
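The cost of remote access on NUMA can be illustrated with a small weighted-average calculation. This is only a sketch: the function name and the latency figures (80 ns local, 140 ns remote) are hypothetical values chosen for illustration, not measurements of any particular server.

```python
def average_memory_latency(local_fraction, local_ns, remote_ns):
    """Average access latency when a given fraction of accesses hit local memory."""
    if not 0.0 <= local_fraction <= 1.0:
        raise ValueError("local_fraction must be between 0 and 1")
    return local_fraction * local_ns + (1.0 - local_fraction) * remote_ns


# Hypothetical latencies, for illustration only.
LOCAL_NS = 80.0
REMOTE_NS = 140.0

for fraction in (1.0, 0.75, 0.5):
    latency = average_memory_latency(fraction, LOCAL_NS, REMOTE_NS)
    print(f"{fraction:.0%} local accesses -> {latency:.1f} ns on average")
```

The more of a workload's memory accesses stay on the local node, the closer its average latency gets to the local figure, which is why NUMA-aware placement of processes and memory matters.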
2 Development Trends and Characteristics
Brief history of x86 processor
The x86 processor has about 50 years of history:
➢ Word length has grown from 4 bits to 64 bits;
➢ Clock frequency has gradually increased;
➢ The scale of transistor integration has gradually expanded;
➢ Performance has gradually improved.
Timeline of microprocessor generations:
1st generation: 4-bit and low-end 8-bit microprocessor era, 1971-1973
2nd generation: high-end 8-bit microprocessor era, 1973-1978
3rd generation: 16-bit microprocessor era, 1978-1980
4th generation: 32-bit microprocessor era, 1983-1993
5th generation: Pentium era, 1993-1996
6th generation: Enhanced Pentium microprocessor era, started in 1997
7th generation: 64-bit microprocessor era, started in 2001
8th generation: multi-core microprocessor era, started in 2005
Intel x86 CPU Roadmap
Year  Platform  Processor     Process
2012  Romley    Sandy Bridge  32 nm
2013  Romley    Ivy Bridge    22 nm
2014  Grantley  Haswell       22 nm
2016  Grantley  Broadwell     14 nm
2017  Purley    Skylake       14 nm
2019  Purley    Cascade Lake  14 nm
2020  Whitley   Cooper Lake   14 nm
2021  Whitley   Ice Lake      10 nm
Platform microarchitectures: Romley: Sandy Bridge; Grantley: Haswell; Purley: Skylake; Whitley: Cooper Lake.
Cadence along the roadmap, left to right: Tock, Tick, Tock, Tick, Tock, Tick, Tock, Tick.
◼ Tick-Tock: a Tick (process year) updates the manufacturing process; a Tock (architecture year) updates the microarchitecture.
Intel CPU Naming rules
Intel® Xeon® CPU naming rules
Naming format: Intel® Xeon® processor E# – # # # # v#
◼ Brand: Intel® Xeon® processor
◼ Product line: E3, E5, E7
◼ Product family (the four-digit number):
➢ Wayness, the maximum number of CPUs in a node: 1, 2, 4, 8
➢ Socket type: 2, 4, 6, 8
➢ Processor SKU: for example 10, 20, 30, and so on
◼ Alpha suffix (after the four-digit numeric set): L marks 'Low Power' SKUs
◼ Version: v2, v3, v4, etc.
Socket type designators and actual sockets:
➢ 8: LS (Westmere EX)
➢ 6: R (Sandy Bridge)
➢ 4: B2 (Sandy Bridge)
➢ 2: H2 (Sandy Bridge)
Changes on the Purley platform:
◼ The ring interconnect (Grantley) changed to a mesh interconnect (Purley);
◼ E5 and E7 were merged; the Purley platform has only four levels: Platinum, Gold, Silver and Bronze;
◼ Bronze is the 31xx series, the entry-level CPU (up to 8 cores); Silver is the 41xx series (up to 12 cores) for medium loads; Gold is the 51xx (up to 14 cores) and 61xx (up to 22 cores) series for general-purpose computing; Platinum is the 81xx series (up to 28 cores) for critical business and high-performance needs.
◼ Gold-level CPUs have the best price/performance ratio.
Figures: ring interconnect; mesh interconnect
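The naming rules above can be sketched as two small parsers. The function names, the regular expression and the returned field names are hypothetical. The sketch assumes the four digits read left to right as wayness, socket type and two-digit SKU, in the order the naming diagram lists them, and it takes the tier from the first digit of a Purley x1xx model number.

```python
import re

# Legacy format "E# - #### v#"; the suffix letter and version are optional.
LEGACY_PATTERN = re.compile(r"E(\d)-(\d)(\d)(\d{2})([A-Z]?)(?:\s*v(\d+))?")

# Purley tier by the first digit of the model number (31xx, 41xx, 51xx, 61xx, 81xx).
PURLEY_TIERS = {"3": "Bronze", "4": "Silver", "5": "Gold", "6": "Gold", "8": "Platinum"}


def parse_legacy_xeon(name):
    """Split a name such as 'E5-2690 v4' into its fields, or return None."""
    match = LEGACY_PATTERN.fullmatch(name.strip())
    if match is None:
        return None
    line, wayness, socket, sku, suffix, version = match.groups()
    return {
        "product_line": "E" + line,
        "wayness": int(wayness),            # maximum CPUs in a node
        "socket_designator": int(socket),   # 2, 4, 6 or 8
        "sku": sku,
        "low_power": suffix == "L",
        "version": int(version) if version else None,
    }


def purley_tier(model_number):
    """Return the tier for a Purley model number such as '6148', or None."""
    match = re.fullmatch(r"(\d)1\d{2}", model_number.strip())
    if match is None:
        return None
    return PURLEY_TIERS.get(match.group(1))


print(parse_legacy_xeon("E5-2690 v4"))
for model in ("3104", "4110", "6148", "8180"):
    print(model, purley_tier(model))
```

Only the x1xx numbers named on the slide are recognised; other model numbers return None rather than a guessed tier.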
Cache
➢ Cache memory is temporary storage located between the CPU core and main memory. The CPU has L1, L2 and L3 caches.
Figure: the cache sits between the processor core and DRAM; the cache is fast with small capacity, DRAM is slow with large capacity.
◼ Cache balance shifted from shared-distributed (prior architectures) to private-local (Skylake Server architecture):
➢ Shared-distributed (Haswell): the shared-distributed L3 is the primary cache
➢ Private-local (Skylake): the private L2 becomes the primary cache, with the shared L3 used as an overflow cache
◼ Shared L3 changed from inclusive to non-inclusive:
➢ Inclusive (Haswell): L3 has copies of all lines in L2
➢ Non-inclusive (Skylake): lines in L2 may not exist in L3
◼ Consequences for the MLC (mid-level cache, L2) and LLC (last-level cache, L3):
➢ 1. Memory reads fill directly to the MLC, no longer to both the MLC and LLC;
➢ 2. When an MLC line needs to be removed, both modified and unmodified lines are written back;
➢ 3. Data shared across cores are copied into the LLC for servicing future MLC misses.
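The fill and eviction behaviour of a non-inclusive L3 can be sketched with a toy model. This is a deliberate simplification and not how the hardware is implemented: the class name is hypothetical, each cache is a small LRU set, a read miss fills only the L2 (MLC), and a line evicted from the L2 is placed in the L3 (LLC). Keeping the L3 copy after an L3 hit is an assumption of the sketch.

```python
from collections import OrderedDict


class NonInclusiveCache:
    """Toy model: a private L2 (MLC) backed by a non-inclusive L3 (LLC)."""

    def __init__(self, l2_lines, l3_lines):
        self.l2 = OrderedDict()  # least recently used line first
        self.l3 = OrderedDict()
        self.l2_lines = l2_lines
        self.l3_lines = l3_lines

    def read(self, address):
        """Return where the line was found: 'L2', 'L3' or 'memory'."""
        if address in self.l2:
            self.l2.move_to_end(address)
            return "L2"
        source = "L3" if address in self.l3 else "memory"
        self._fill_l2(address)  # reads fill the L2 only, not the L3
        return source

    def _fill_l2(self, address):
        if len(self.l2) >= self.l2_lines:
            victim, _ = self.l2.popitem(last=False)
            self._fill_l3(victim)  # evicted L2 lines overflow into the L3
        self.l2[address] = True

    def _fill_l3(self, address):
        if address in self.l3:
            self.l3.move_to_end(address)
            return
        if len(self.l3) >= self.l3_lines:
            self.l3.popitem(last=False)
        self.l3[address] = True


cache = NonInclusiveCache(l2_lines=2, l3_lines=4)
for address in ("A", "B", "C", "A", "A"):
    print(address, "served from", cache.read(address))
```

In this run the first read of A comes from memory and leaves no copy in the L3. A only appears in the L3 once it has been pushed out of the L2 by B and C, which is the overflow role described for the Skylake L3.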
Frequency
➢ The CPU frequency involves the basic frequency, the external clock and the multiplier. Basic Frequency = External Clock × Multiplier, which is the formula for calculating the CPU frequency on the x86 architecture.
➢ Intel Turbo Boost: the Intel CPU can run above its nominal frequency (TSC), with the extra frequency allocated on demand when more performance is needed.
Figures: CPU Basic Frequency (external oscillator, external clock, multiplier, basic frequency); Turbo Boost Technology (cores)
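The basic frequency formula can be checked with a one-line calculation. The function name and the example values (a 100 MHz external clock and a multiplier of 24) are assumptions chosen for illustration.

```python
def base_frequency_mhz(external_clock_mhz, multiplier):
    """Basic Frequency = External Clock x Multiplier."""
    return external_clock_mhz * multiplier


# Hypothetical example values: 100 MHz external clock, multiplier of 24.
frequency = base_frequency_mhz(100, 24)
print(f"{frequency} MHz = {frequency / 1000:.1f} GHz")
```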
Multicore and Hyper-Threading
➢ Multi-core processors integrate multiple CPUs (cores) into a single integrated circuit chip.
➢ Hyper-Threading is a technology that allows one CPU core to execute multiple control flows (threads) at the same time.
Figures: Multicore Technology; Hyper-Threading Technology
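Multicore and Hyper-Threading together determine how many logical processors the operating system sees. The sketch below assumes two hardware threads per core when Hyper-Threading is enabled; the function name and the two-socket, 28-core example are illustrative.

```python
import os


def logical_processors(sockets, cores_per_socket, threads_per_core=2):
    """Logical processors = sockets x cores per socket x threads per core."""
    return sockets * cores_per_socket * threads_per_core


# Hypothetical server: two sockets with 28 cores each.
print("Hyper-Threading on: ", logical_processors(2, 28))
print("Hyper-Threading off:", logical_processors(2, 28, threads_per_core=1))

# For comparison, the logical processor count reported on the current machine.
print("This machine:", os.cpu_count())
```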
Skylake CPU PCIe Expansion Feature
➢ Each CPU has 4 IO modules (IOUs). IOU0 is used to connect to the PCH. IOU1~IOU3 are used to connect PCIe devices. Each IOU provides 16 PCIe lanes, which can be combined into x4, x8 and x16 links as needed.
Figure: CPU IO module
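The ways 16 lanes can be divided into x4, x8 and x16 links can be enumerated with a short sketch. It lists the arithmetic possibilities only, ignoring link order; which combinations a given board actually supports depends on the platform and firmware, and the function name is hypothetical.

```python
def lane_splits(total_lanes=16, widths=(16, 8, 4)):
    """Enumerate unordered ways to split an IOU's lanes into link widths."""

    def split(remaining, max_width):
        if remaining == 0:
            return [[]]
        results = []
        for width in widths:
            # Only non-increasing widths, so each combination appears once.
            if width <= max_width and width <= remaining:
                for rest in split(remaining - width, width):
                    results.append([width] + rest)
        return results

    return split(total_lanes, max(widths))


DEVICE_IOUS = 3      # IOU1~IOU3 connect PCIe devices
LANES_PER_IOU = 16

for combo in lane_splits(LANES_PER_IOU):
    print(" + ".join(f"x{width}" for width in combo))
print("Lanes available for devices per CPU:", DEVICE_IOUS * LANES_PER_IOU)
```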
3 Selection Method of x86 CPU
CPU Performance Query
➢ SPEC CPU is the CPU subsystem evaluation software introduced by the SPEC organization. Test results include SPECint_rate_base2006, SPECfp_rate_base2006, SPECint®_rate2006 and SPECfp®_rate2006.
➢ tpmC values are widely used at home and abroad to measure the transaction processing capability of computer systems. tpmC is short for "transactions per minute", as measured by the TPC-C benchmark.
tpmC performance estimation method
Figure: SPEC CPU test report
tpmC estimation principles
➢ Principle 1: Under the same configuration, the tpmC performance of different manufacturers' equipment is equivalent;
➢ Principle 2: The tpmC value is proportional to the SPECint_rate_base value;
➢ Principle 3: The tpmC value is proportional to the number of CPUs.
Estimation method
➢ 1. Query the tpmC value of a CPU on the tpmC official website;
➢ 2. Query the SPECint_rate_base value of that CPU, and divide tpmC by SPECint_rate_base to get a scale factor;
➢ 3. Multiply the SPECint_rate_base value of the CPU whose tpmC value is required by the scale factor from step 2 to obtain its tpmC value.
SPEC CPU Query Link: http://www.spec.org/cgi-bin/osgresults?conf=cpu2006&op=form
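The three estimation steps can be sketched as a short calculation. The function and parameter names are hypothetical, and the figures in the example are made-up round numbers, not published tpmC or SPEC results.

```python
def estimate_tpmc(reference_tpmc, reference_specint_rate, target_specint_rate):
    """Estimate tpmC for a target CPU from a reference CPU's published values."""
    if reference_specint_rate <= 0:
        raise ValueError("reference_specint_rate must be positive")
    # Step 2: scale factor = tpmC / SPECint_rate_base of the reference CPU.
    scale_factor = reference_tpmc / reference_specint_rate
    # Step 3: multiply the target CPU's SPECint_rate_base by the scale factor.
    return target_specint_rate * scale_factor


# Made-up values: reference CPU with tpmC 1,000,000 and SPECint_rate_base 500,
# target CPU with SPECint_rate_base 800.
estimate = estimate_tpmc(1_000_000, 500, 800)
print(f"Estimated tpmC: {estimate:,.0f}")
```

The result is only as good as Principle 2: it assumes tpmC scales linearly with SPECint_rate_base, so it is a rough sizing figure rather than a benchmark result.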
CPU Parameter Query
➢ Detailed parameters of the CPU can be found on the official Intel website.
Figures: CPU Information Summary Table; CPU Detailed parameter table
Query Link: https://ark.intel.com/#@Processors
CPU Transition of Different Generations
➢ The Transition Guide answers the question of which new-generation CPU model can replace a given previous-generation CPU model.
① Log in to the Transition Guide ② Select the CPU to be replaced ③ View the recommended CPUs
Query link: https://xeonprocessoradvisor.intel.com/exodus/page?eventType=1&targetPageId=1290&defaultFlag=1
CPU Parameter Description
➢ Key CPU information can be found in the CPU compatibility list:
Architecture, Package, Frequency, Voltage, Word Length, Power, Microarchitecture, Model, Core Number
Compatibility query link: https://support.xfusion.com/compatibility-query/#/en/rack-server
Thank you. Fusion X, Digital Infinity
Copyright©2022 xFusion Digital Technologies Co., Ltd.
All Rights Reserved.