
UNIT 5

Multiprocessor: A multiprocessor is a computer system with two or more central processing units
(CPUs) that share full access to a common RAM. The main objective of using a multiprocessor is to
boost the system’s execution speed; other objectives are fault tolerance and application matching.
There are two types of multiprocessors: the shared memory multiprocessor and the distributed
memory multiprocessor. In a shared memory multiprocessor, all the CPUs share a common memory,
but in a distributed memory multiprocessor, every CPU has its own private memory.

Most computer systems are single processor systems, i.e., they have only one processor. However,
multiprocessor or parallel systems are increasing in importance nowadays. These systems have
multiple processors working in parallel that share the computer clock, memory, bus, peripheral
devices, etc.

Figure : Multiprocessor architecture

Types of Multiprocessors
There are mainly two types of multiprocessors: symmetric and asymmetric multiprocessors.
Details about them are as follows:

Symmetric Multiprocessors

In these types of systems, each processor runs an identical copy of the operating system and they all
communicate with each other. All the processors are in a peer-to-peer relationship, i.e., no
master-slave relationship exists between them.

An example of the symmetric multiprocessing system is the Encore version of Unix for the Multimax
Computer.

Asymmetric Multiprocessors

In asymmetric systems, each processor is given a predefined task. There is a master processor that
gives instructions to all the other processors. An asymmetric multiprocessor system thus has a
master-slave relationship.

Asymmetric multiprocessing was the only type available before symmetric multiprocessors were
created. Even now, it is the cheaper option.

Advantages of Multiprocessor Systems

There are multiple advantages to multiprocessor systems. Some of these are:

More reliable Systems

In a multiprocessor system, even if one processor fails, the system will not halt. This ability to
continue working despite hardware failure is known as graceful degradation. For example, if there
are 5 processors in a multiprocessor system and one of them fails, the other 4 processors keep
working. The system only becomes slower; it does not grind to a halt.

Enhanced Throughput

If multiple processors are working in tandem, the throughput of the system increases, i.e., the
number of processes executed per unit of time increases. With N processors, the throughput
increases by a factor just under N.
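
As a rough illustration of why the gain stays just under N, the standard Amdahl's-law bound (not
stated in these notes, but widely used) says that if a fraction s of the work is inherently serial,
the speedup with N processors is 1 / (s + (1 - s)/N). A minimal sketch in Python:

# Amdahl's-law estimate of multiprocessor speedup (illustrative sketch).
# serial_fraction is an assumed value, not a measured one.
def amdahl_speedup(n_processors: int, serial_fraction: float) -> float:
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / n_processors)

# With 5% serial work, 8 processors give roughly 5.9x, not 8x.
print(amdahl_speedup(8, 0.05))   # ~5.93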

More Economic Systems

Multiprocessor systems are cheaper than single processor systems in the long run because they
share the data storage, peripheral devices, power supplies etc. If there are multiple processes that
share data, it is better to schedule them on multiprocessor systems with shared data than have
different computer systems with multiple copies of the data.

Disadvantages of Multiprocessor Systems

There are some disadvantages as well to multiprocessor systems. Some of these are:

Increased Expense

Even though multiprocessor systems are cheaper in the long run than using multiple computer
systems, they are still quite expensive up front. It is much cheaper to buy a simple single
processor system than a multiprocessor system.
Complicated Operating System Required

There are multiple processors in a multiprocessor system that share peripherals, memory, etc. So it
is much more complicated to schedule processes and allocate resources to processes than in single
processor systems. Hence, a more complex operating system is required in multiprocessor
systems.

Large Main Memory Required

All the processors in the multiprocessor system share the memory. So a much larger pool of memory
is required as compared to single processor systems.

Characteristics of Multiprocessor
The major characteristics of multiprocessors are as follows −
• Parallel Computing − This involves the simultaneous application of multiple
processors. These processors are developed using a single architecture to execute a
common task. In general, processors are identical and they work together in such a
way that the users are under the impression that they are the only users of the system.
In reality, however, many users are accessing the system at a given time.
• Distributed Computing − This involves the usage of a network of processors. Each
processor in this network can be considered as a computer in its own right and has
the capability to solve a problem. These processors are heterogeneous, and generally,
one task is allocated to a single processor.
• Supercomputing − This involves the usage of the fastest machines to resolve big and
computationally complex problems. In the past, supercomputing machines were
vector computers, but at present, vector or parallel computing is the accepted
approach.
• Pipelining − This is a method wherein a specific task is divided into several subtasks
that must be performed in a sequence. The functional units help in performing each
subtask. The units are attached serially and all the units work simultaneously.
• Vector Computing − It involves the usage of vector processors, wherein operations
such as ‘multiplication’ are divided into many steps and are then applied to a stream
of operands (“vectors”).
• Systolic − This is similar to pipelining, but units are not arranged in a linear order.
The steps in systolic are normally small and more in number and performed in a
lockstep manner. This is more frequently applied in special-purpose hardware such as
image or signal processors.

SIMD (Single Instruction Multiple Data) and MIMD (Multiple Instruction Multiple Data)
The two basic classifications of parallel processing are SIMD, which stands for Single
Instruction Multiple Data, and MIMD, which stands for Multiple Instruction Multiple Data.
SIMD enables the processing of many data items with a single instruction and is applicable
to uniform operations such as image processing. MIMD, on the other hand, permits different
instructions to be performed on different data items, making it more versatile for
general-purpose, simulation, and multitasking applications. Comparing SIMD and MIMD helps
in deciding which architecture to use for a given kind of computational problem.
SIMD (Single Instruction Multiple Data) is a specialized type of computer architecture in
which the processor performs the same calculation on a series of data items at one time.
This architecture is ideal for applications that apply the same operation to large data sets,
such as multimedia and scientific simulations. SIMD can be implemented on different types of
hardware, such as CPUs with SIMD extensions like Intel SSE or AVX, and on GPU hardware.
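
As a small illustration of the SIMD style in software (a sketch, assuming NumPy is available;
NumPy dispatches element-wise operations to compiled loops that use SSE/AVX where the CPU
supports them):

import numpy as np

# SIMD-style computation: one logical operation (the vectorized add)
# is applied to every element of the arrays at once.
a = np.arange(1_000_000, dtype=np.float32)
b = np.ones(1_000_000, dtype=np.float32)

c = a + b          # single operation over a million data points

# Scalar (one datum at a time) equivalent, for contrast:
# for i in range(len(a)): c[i] = a[i] + b[i]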

Advantages of SIMD
• Efficiency: SIMD is especially useful for operations that apply the same computation to
large data sets such as images or matrices.
• Parallelism: Applying the same instruction to numerous data points at once cuts down
the time needed to process large amounts of data.
• Simpler control: Because only one instruction is issued for all of the data points, there
is little overhead for managing tasks and keeping them synchronized.
Disadvantages of SIMD
• Limited Flexibility: SIMD is most useful when the required operation is the same for all
the data points. It is not very efficient for computations that require different
operations depending on the data at hand.
• Data Alignment Issues: SIMD is well suited to data arrays that are structured and
aligned properly. If the data is not laid out or aligned correctly, there can be
inefficiency or even inaccuracies.
MIMD
MIMD, or Multiple Instruction, Multiple Data, is a type of parallel processing in which many
processors execute different instructions on different data at the same time. This architecture
allows for a great degree of adaptability, and the system can be used for a wide range of
applications, from realistic modeling to multi-threaded programs. MIMD is the model used by
current multi-core processors and distributed computing platforms.
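
A minimal software sketch of the MIMD idea (assuming Python's standard multiprocessing module;
the task functions and inputs are illustrative only): each process runs a different instruction
stream on different data, in parallel.

from multiprocessing import Process

def sum_squares(data):                 # one instruction stream...
    print("sum of squares:", sum(x * x for x in data))

def count_evens(data):                 # ...and a completely different one
    print("even count:", sum(1 for x in data if x % 2 == 0))

if __name__ == "__main__":
    p1 = Process(target=sum_squares, args=([1, 2, 3, 4],))
    p2 = Process(target=count_evens, args=([5, 6, 7, 8],))
    p1.start(); p2.start()             # both run at the same time
    p1.join(); p2.join()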

Advantages of MIMD
• Flexibility: MIMD is well suited to multitasking since it can perform different tasks
concurrently, which makes it a good fit for computations involving lots of processing
power, such as multi-threaded workloads.
• Scalability: MIMD architecture is highly scalable, since more processors can always be
added to take on additional work without interfering with existing operations.
• Task variety: In an MIMD system, each processor can work on a completely different
operation, which is a desirable property in distributed systems and simulations.
Disadvantages of MIMD
• Complexity: Inter-processor cooperation and coordination in MIMD systems is delicate
and demanding; care is needed to avoid problems such as deadlocks and race
conditions.
• Higher cost: MIMD systems require more coordination because of their increased
complexity and thus can be costly to implement and manage compared to SIMD
systems.
• Overhead: Maintaining a variety of tasks and instruction streams across the processors
adds overhead in both time and working resources.

Memory in Multiprocessor System


• Different Shared Memory Multiprocessor Models

The most popular parallel computers are those that execute programs in MIMD
mode. There are two major types of parallel computers: shared memory
multiprocessors and message-passing multicomputers. The main difference between
multiprocessors and multicomputers lies in memory sharing and the structure used for
interprocessor communication.
The processors in a multiprocessor system communicate with each other through
shared variables in common memory. Each computer node in a multicomputer system
has local memory, unshared with other nodes. Inter-process communication is done
through message passing among nodes.
Three shared memory multiprocessor models are as follows −
1. UMA Model
UMA stands for Uniform Memory Access. In this model, the physical memory
is uniformly shared by all the processors. All processors have the same access time
to all memory words, which is why it is known as uniform memory access. Each
processor can use a private cache. Peripherals are also shared.
The UMA model is suitable for time-sharing applications by multiple users. It can be
used to speed up the execution of a single large program in time-critical
applications. When all processors have equal access to all peripheral devices, the
system is known as a symmetric multiprocessor. In this case, all the processors are
equally capable of running programs, including the kernel.
2. NUMA Model
NUMA stands for Non-Uniform Memory Access. A NUMA multiprocessor is a
shared memory system in which the access time varies with the location of the memory
word.
The shared memory is physically distributed among the processors as local memories.
The set of all local memories forms a global address space accessible by all
processors.
It is faster for a processor to access its own local memory. Access to remote
memory attached to other processors takes longer because of the added delay
through the interconnection network.
3. Cache Only Memory Architecture (COMA)
The COMA model is a special case of the NUMA model in which all the distributed
main memories are converted to cache memories.
Interconnection Structures:
The interconnection between the components of a multiprocessor system can have different physical
configurations, depending on the number of transfer paths available between the processors
and memory in a shared memory system, and among the processing elements in a loosely coupled
system.
Inter-processor Arbitration:

The processor, main memory and I/O devices can be interconnected by means of a common
bus. A bus is a set of lines (wires) defined to transfer all bits of a word from a specified source
to a specified destination. Thus, the bus provides a communication path for the transfer of data.

The bus includes data lines, address lines and control lines. Such a bus is known as the system bus.
There are different types of arbitration: serial (daisy chain) arbitration, parallel arbitration, and
dynamic arbitration.

1. Serial (Daisy Chain) arbitration

In this type of arbitration, processors access the bus based on priority. In serial arbitration, bus
access priority is resolved based on a serial (daisy chain) connection of the processors, similar
to daisy chain priority interrupt logic. The processors connected to the system bus are assigned
priority according to their position along the priority control line.

Figure : Serial (Daisy Chain) Arbitration

When multiple devices concurrently request the use of the bus, the device with the
highest priority is granted access to it. Each processor has its own bus arbiter logic with
priority-in and priority-out lines. The priority-out (PO) of each arbiter is connected to
the priority-in (PI) of the next-lower-priority arbiter. The PI of the highest-priority unit
is maintained at a logic value of 1, so the highest-priority unit in the system will always
receive access to the system bus when it requests it. The processor whose arbiter has
PI = 1 and PO = 0 is the one that accesses the system bus.
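
A minimal sketch of this PI/PO priority propagation in software (illustrative only; real arbiters
are combinational hardware, and the function name here is hypothetical):

def daisy_chain_grant(requests):
    """requests[i] is True if processor i (i = 0 is highest priority) wants the bus."""
    pi = 1                            # PI of the highest-priority arbiter is tied to 1
    for i, req in enumerate(requests):
        po = 0 if req else pi         # a requester stops priority from propagating
        if pi == 1 and po == 0:       # PI = 1 and PO = 0 identifies the winner
            return i
        pi = po
    return None                       # no processor requested the bus

print(daisy_chain_grant([False, True, True]))   # 1: highest-priority requester wins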

Advantages

• Simple and cheap method.
• Requires the least number of lines.

Disadvantages

• Higher delay.
• The priority of each processor is fixed.
• Not reliable.

2. Parallel Arbitration

This technique uses an external priority encoder and decoder, as shown in the figure below.
Each bus arbiter in the parallel scheme has a bus request output line and a bus acknowledge
input line. When a processor wants to access the system bus, its arbiter enables its request
line. The processor takes control of the bus if its acknowledge input line is enabled.

Figure : Parallel Arbitration

The figure shows the request lines from four arbiters going into a 4 x 2 priority encoder. The
output of the encoder generates a 2-bit code, which represents the highest-priority unit
among those requesting the bus. The 2-bit code from the encoder output drives a 2 x 4
decoder, which enables the proper acknowledge line to grant bus access to the highest-
priority unit. The encoder operates according to the priority encoder truth table.
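
A small sketch of the encoder/decoder pair's logic (illustrative; arbiter 0 is assumed to have
the highest priority):

def priority_encode(requests):
    """4 x 2 priority encoder: return the 2-bit code of the highest-priority request."""
    for i in range(4):
        if requests[i]:
            return i                  # codes 0..3 fit in 2 bits
    return None                       # no request pending

def decode_grant(code):
    """2 x 4 decoder: raise exactly one acknowledge line from the 2-bit code."""
    ack = [0, 0, 0, 0]
    if code is not None:
        ack[code] = 1
    return ack

requests = [0, 1, 0, 1]               # arbiters 1 and 3 request the bus
print(decode_grant(priority_encode(requests)))   # [0, 1, 0, 0]: arbiter 1 is granted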

Advantages

• Each arbiter has a separate pair of bus request and bus grant signals, so arbitration is faster.

Disadvantages

• Requires more bus request and grant lines.

3. Dynamic Arbitration

The two bus arbitration procedures discussed above use a static priority algorithm, where the
priority of each device is fixed by the way it is connected to the bus. In contrast, a dynamic
priority algorithm gives the system the capability to change the priority of the devices while
the system is in operation. Some arbitration procedures that use dynamic priority algorithms
are: time slice, polling, LRU, and FIFO.

Time Slice: This algorithm allocates a fixed-length slice of bus time to each processor
sequentially, in round-robin fashion. The service provided to each processor with this scheme
is independent of its location along the bus. No preference is given to any particular device,
since each is allotted the same amount of time to communicate over the bus.

Polling: In a bus system that uses polling, the bus-grant signal is replaced by a set of lines
called poll lines, which are connected to all units. The poll lines are used by the bus controller
to define an address for each device connected to the bus. The bus controller sequences
through the addresses in a prescribed manner. When a processor recognizes its address, it
activates the bus-busy line and then accesses the bus. After a number of bus cycles, the
polling process continues by choosing a different processor. The polling sequence is normally
programmable; as a result, the selection priority can be altered under program control.

LRU: The LRU (least recently used) algorithm gives the highest priority to the requesting
device that has not used the bus for the longest interval. The priorities are adjusted after a
number of bus cycles according to the LRU algorithm. With this procedure, no processor
is favoured over any other since the priorities are dynamically changed to give every device
an opportunity to access the bus.

FIFO: In the first-come, first-serve scheme, requests are served in the order received. To
implement this algorithm, the bus controller establishes a queue arranged according to the
time that the bus requests arrive. Each processor must wait for its turn to use the bus on a
first-in, first-out (FIFO) basis.
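
As an illustration of dynamic priority, here is a minimal sketch of an LRU bus arbiter (a
software model only; the class name and interface are hypothetical):

class LRUArbiter:
    def __init__(self, n):
        self.order = list(range(n))       # front = least recently used (highest priority)

    def grant(self, requests):
        for proc in list(self.order):
            if requests[proc]:
                self.order.remove(proc)
                self.order.append(proc)   # the winner becomes most recently used
                return proc
        return None                       # no requests this cycle

arb = LRUArbiter(4)
print(arb.grant([1, 1, 0, 0]))   # 0 (has not used the bus yet)
print(arb.grant([1, 1, 0, 0]))   # 1 (processor 0 was just served)
print(arb.grant([1, 1, 0, 0]))   # 0 again (now the least recently used requester)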

Advantages

• The priority can be changed by altering the sequence stored in the controller.
• More reliable.

Inter-Processor Communication and Synchronization:


• In computer science, inter-process communication (IPC) refers to the mechanisms an
operating system provides to allow processes to manage shared data.
• Typically, applications that use IPC are categorized as clients and servers, where the
client requests data and the server responds to client requests. Many applications
are both clients and servers, as commonly seen in distributed computing.
• Methods for doing IPC are divided into categories which vary based on software
requirements, such as performance and modularity requirements, and system
circumstances, such as network bandwidth and latency.
• In order to cooperate, concurrently executing processes must communicate and
synchronize. Inter-process communication is based on the use of shared variables
(variables that can be referenced by more than one process) or message passing.
Process Synchronization:
Process synchronization means coordinating processes' access to shared system resources so
that concurrent access to shared data is handled safely, minimizing the chance of inconsistent
data. Maintaining data consistency demands mechanisms to ensure synchronized execution
of cooperating processes. Process synchronization was introduced to handle problems that
arise when multiple processes execute concurrently. Synchronization is often necessary when
processes communicate. To make this concept clearer, consider the batch operating system
again. A shared buffer is used for communication between the leader process and the executor
process. These processes must be synchronized so that, for example, the executor process
never attempts to read data from the buffer if the buffer is empty.
Depending on the solution, an IPC mechanism may provide synchronization or leave it up to
processes and threads to communicate amongst themselves (e.g. via shared memory).
While synchronization will include some information (e.g. whether or not the lock is
enabled, a count of processes waiting, etc.) it is not primarily an information-passing
communication mechanism.
Examples of synchronization primitives are:
• Semaphore
• Spinlock
• Barrier
• Mutual exclusion (mutex)
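
A minimal sketch of the leader/executor buffer synchronization described above, using a counting
semaphore and a mutex (Python threading is used purely for illustration; the names are
hypothetical). The semaphore counts filled slots, so the executor blocks rather than reading
from an empty buffer:

import threading

buffer = []
items = threading.Semaphore(0)       # counts filled slots in the shared buffer
lock = threading.Lock()              # mutual exclusion on the buffer itself

def leader():                        # producer
    for job in ["job1", "job2", "job3"]:
        with lock:
            buffer.append(job)
        items.release()              # signal: one more item is available

def executor():                      # consumer
    for _ in range(3):
        items.acquire()              # blocks while the buffer is empty
        with lock:
            job = buffer.pop(0)
        print("executing", job)

t1 = threading.Thread(target=leader)
t2 = threading.Thread(target=executor)
t1.start(); t2.start()
t1.join(); t2.join()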

Concept of Pipeline:
Pipelining is a technique of feeding instructions to the processor through a pipeline,
allowing instructions to be stored and executed in an orderly process. It is also known
as pipeline processing.
In pipelining, the execution of multiple instructions is overlapped. The pipeline is divided
into stages, and these stages are connected with one another to form a pipe-like structure.
Instructions enter at one end and exit at the other.
Pipelining increases the overall instruction throughput.
In a pipelined system, each segment consists of an input register followed by a
combinational circuit.
The register holds the data and the combinational circuit performs operations on it. The
output of the combinational circuit is applied to the input register of the next segment.

Types of Pipeline:
It is divided into 2 categories:
1. Arithmetic Pipeline:
Arithmetic pipelines are found in most computers. They are used for floating point
operations, multiplication of fixed point numbers, etc. For example, the input to a
floating point adder pipeline is:
X = A*2^a
Y = B*2^b
Here A and B are mantissas (the significant digits of the floating point numbers), while a and
b are exponents.
The floating point addition or subtraction is done in 4 steps:
1. Compare the exponents.
2. Align the mantissas.
3. Add or subtract the mantissas.
4. Normalize and produce the result.
Registers are used for storing the intermediate results between the above operations.
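
A minimal software sketch of these four stages (illustrative only; it handles addition with
decimal arithmetic rather than binary hardware, and normalization is simplified):

def fp_add(A, a, B, b):
    # Stage 1: compare the exponents
    if a < b:
        (A, a), (B, b) = (B, b), (A, a)    # make X the larger-exponent operand
    # Stage 2: align the mantissas by shifting the smaller operand right
    B = B / (2 ** (a - b))
    # Stage 3: add the mantissas
    mantissa = A + B
    # Stage 4: normalize and produce the result (mantissa kept below 1.0)
    while abs(mantissa) >= 1.0:
        mantissa /= 2
        a += 1
    return mantissa, a                     # result = mantissa * 2^a

print(fp_add(0.5, 3, 0.75, 2))             # (0.875, 3), i.e. 0.875 * 2^3 = 7.0
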
2. Instruction Pipeline:
Here, a stream of instructions is executed by overlapping the fetch, decode and execute
phases of the instruction cycle. This technique is used to increase the throughput of the
computer system.

An instruction pipeline reads instructions from memory while previous instructions are
being executed in other segments of the pipeline, so multiple instructions can be executed
simultaneously. The pipeline is more efficient if the instruction cycle is divided into
segments of equal duration.
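
As a rough quantitative illustration (the standard formula, not stated in these notes): with k
equal-duration stages and n instructions, a pipeline needs k + (n - 1) cycles instead of the
n*k cycles a non-pipelined unit would need, so the speedup approaches k for large n:

def pipeline_speedup(k: int, n: int) -> float:
    # Ideal speedup of a k-stage pipeline over non-pipelined execution
    # of n instructions, ignoring hazards and stalls.
    return (n * k) / (k + n - 1)

print(pipeline_speedup(4, 100))   # ~3.88, approaching the 4-stage limit of 4
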
Pipeline Conflicts:
There are some factors that cause the pipeline to deviate from its normal performance. Some
of these factors are given below:
1. Timing Variations:
All stages cannot take the same amount of time. This problem generally occurs in instruction
processing, where different instructions have different operand requirements and thus
different processing times.
2. Data Hazards:
When several instructions are in partial execution and they reference the same data, a
problem arises. We must ensure that the next instruction does not attempt to access data
before the current instruction has finished with it, because this would lead to incorrect
results.
3. Branching:
In order to fetch and execute the next instruction, we must know what that instruction is. If
the present instruction is a conditional branch, its result determines the next instruction, so
the next instruction may not be known until the current one is processed.
4. Interrupts:
Interrupts inject unwanted instructions into the instruction stream and thus disrupt
instruction execution.
5. Data Dependency:
This arises when an instruction depends upon the result of a previous instruction, but that
result is not yet available.
Advantages of Pipelining
1. The cycle time of the processor is reduced.
2. It increases the throughput of the system.
3. It makes the system reliable.

Disadvantages of Pipelining
1. The design of a pipelined processor is complex and costly to manufacture.
2. The instruction latency is higher.

Vector Processing:
There is a class of computational problems that are beyond the capabilities of a
conventional computer.
These problems require a vast number of computations on multiple data items that would
take a conventional computer (with a scalar processor) days or even weeks to complete. Such
complex workloads, which operate on multiple data items at the same time, require a better
way of instruction execution, which was achieved by vector processors.
Scalar CPUs can manipulate only one or two data items at a time, which is not very efficient.
Also, simple instructions like "ADD A to B, and store into C" are not practically efficient.
Addresses are used to point to the memory location where the data to be operated on will be
found, which leads to the added overhead of data lookup. Until the data is found, the CPU
would be sitting idle, which is a big performance issue.
Hence the concept of the instruction pipeline, in which the instruction passes through
several sub-units in turn. These sub-units perform various independent functions: for
example, the first one decodes the instruction, the second sub-unit fetches the data and the
third sub-unit performs the arithmetic itself. Therefore, while the data is being fetched for
one instruction, the CPU does not sit idle; it works on decoding the next instruction, like an
assembly line.
A vector processor not only uses an instruction pipeline but also pipelines the data, working
on multiple data items at the same time.
In a vector processor, a single instruction can specify multiple data operations, which saves
time: the instruction is decoded once and then operates on a stream of data items.
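
A minimal sketch contrasting scalar execution (one instruction issued per element) with the
vector style (the operation is specified once and applied to a whole stream of operands); pure
Python is used here only to illustrate the idea:

def scalar_add(A, B):
    C = []
    for i in range(len(A)):            # one add "instruction" issued per element
        C.append(A[i] + B[i])
    return C

def vector_add(A, B):
    # Conceptually decoded once; hardware would stream the element
    # pairs through a pipelined adder.
    return [x + y for x, y in zip(A, B)]

print(vector_add([1, 2, 3], [10, 20, 30]))   # [11, 22, 33]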

Applications of Vector Processors:


Computers with vector processing capabilities are in demand in specialized applications. The
following are some areas where vector processing is used:
1. Petroleum exploration.
2. Medical diagnosis.
3. Data analysis.
4. Weather forecasting.
5. Aerodynamics and space flight simulations.
6. Image processing.
7. Artificial intelligence.

Array Processing
Array Processors:
They perform computations on large arrays of data and are thus used to improve the
performance of the computer.

Why use an Array Processor:

An array processor increases the overall instruction processing speed.
Most array processors operate asynchronously from the host CPU, which improves the
overall capacity of the system.
An array processor has its own local memory, providing extra memory for systems
with low memory.
There are basically two types of array processors:

1. Attached Array Processors:
An attached array processor is a processor which is attached to a general purpose computer
and its purpose is to enhance and improve the performance of that computer in numerical
computational tasks. It achieves high performance by means of parallel processing with
multiple functional units.

2. SIMD Array Processors:

SIMD is the organization of a single computer containing multiple processors operating in
parallel. The processing units are made to operate under the control of a common control
unit, thus providing a single instruction stream and multiple data streams.
A general block diagram of an array processor is shown below. It contains a set of identical
processing elements (PEs), each of which has a local memory M. Each processing element
includes an ALU and registers. The master control unit controls all the operations of the
processing elements. It also decodes the instructions and determines how each instruction is
to be executed.
The main memory is used for storing the program, and the control unit is responsible for
fetching the instructions. Vector instructions are sent to all PEs simultaneously, and the
results are returned to memory.
The best known SIMD array processor is the ILLIAC IV computer developed by Burroughs.
SIMD processors are highly specialized computers. They are suitable only for numerical
problems that can be expressed in vector or matrix form; they are not suitable for other
types of computations.
RISC And CISC Processor
Multicore Processor
Multicore processors and multiprocessor systems are both designed to
improve computer performance, but they do so in different ways. Multicore
processors contain multiple processing units (cores) on a single chip, while
multiprocessor systems use multiple separate processors (CPUs).
A multicore processor is an integrated circuit that has two or more processor cores
attached for enhanced performance and reduced power consumption. These
processors also enable more efficient simultaneous processing of multiple tasks,
such as parallel processing and multithreading. A dual core setup is similar to
having multiple, separate processors installed on a computer. However, because
the two cores are plugged into the same socket, the connection between them
is faster.

The use of multicore processors is one approach to boost processor performance
without exceeding the practical limitations of semiconductor design and fabrication.
Using multiple cores also ensures safe operation in areas such as heat generation.

Working of Multicore Processor


The heart of every processor is an execution engine, also known as a core. The
core is designed to process instructions and data according to the direction of
software programs in the computer's memory. Over the years, designers found that
every new processor design had limits. Numerous technologies were developed to
accelerate performance, including the following ones:

• Clock speed. One approach was to make the processor's clock faster. The clock
is the "drumbeat" used to synchronize the processing of instructions and data
through the processing engine. Clock speeds have accelerated from several
megahertz to several gigahertz (GHz) today. However, transistors use up power
with each clock tick. As a result, clock speeds have nearly reached their limits
given current semiconductor fabrication and heat management techniques.

• Hyper-threading. Another approach involved the handling of multiple
instruction threads. Intel calls this hyper-threading. With hyper-threading,
processor cores are designed to handle two separate instruction threads at the
same time. When properly enabled and supported by both the computer's
firmware and operating system (OS), hyper-threading techniques enable one
physical core to function as two logical cores. Still, the processor only
possesses a single physical core. The logical abstraction of the physical
processor added little real performance to the processor other than to help
streamline the behavior of multiple simultaneous applications running on the
computer.

• More chips. The next step was to add processor chips -- or dies -- to the
processor package, which is the physical device that plugs into the
motherboard. A dual-core processor includes two separate processor cores.
A quad-core processor includes four separate cores. Today's multicore
processors can easily include 12, 24 or even more processor cores. The
multicore approach is almost identical to the use of multiprocessor
motherboards, which have two or four separate processor sockets. The effect
is the same. Today's high processor performance comes from processor
products that combine fast clock speeds and multiple hyper-threaded cores.

Multicore processors have multiple processing units incorporated in them. They connect directly with
their internal cache, as well as with the system bus and memory.
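
A small sketch of putting those cores to work from software (assuming Python's standard
multiprocessing module; the task function is illustrative):

from multiprocessing import Pool, cpu_count

def heavy(n):                                  # stand-in for a CPU-bound task
    return sum(i * i for i in range(n))

if __name__ == "__main__":
    print("cores available:", cpu_count())
    with Pool() as pool:                       # one worker process per core by default
        results = pool.map(heavy, [10**5] * 8) # tasks spread across the cores
    print(len(results), "tasks completed in parallel")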

There are several major use cases for multicore processors, including the
following five:

1. Virtualization. A virtualization platform, such as VMware, is designed to
abstract the software environment from the underlying hardware.
Virtualization is capable of abstracting physical processor cores into
virtual processors or central processing units (vCPUs) which are then
assigned to virtual machines (VMs). Each VM becomes a virtual server
capable of running its own OS and application. It is possible to assign
more than one vCPU to each VM, allowing each VM and its application
to run parallel processing software if desired.

2. Databases. A database is a complex software platform that frequently
needs to run many simultaneous tasks such as queries. As a result,
databases are highly dependent on multicore processors to distribute
and handle these many task threads. The use of multiple processors in
databases is often coupled with extremely high memory capacity that
can reach 1 terabyte or more on the physical server.

3. Analytics and HPC. Big data analytics, such as machine learning, and
high-performance computing (HPC) both require breaking large,
complex tasks into smaller and more manageable pieces. Each piece of
the computational effort can then be solved by distributing each piece of
the problem to a different processor. This approach enables each
processor to work in parallel to solve the overarching problem far faster
and more efficiently than with a single processor.

4. Cloud. Organizations building a cloud will almost certainly adopt
multicore processors to support all the virtualization needed to
accommodate the highly scalable and highly transactional demands of
cloud software platforms such as OpenStack. A set of servers with
multicore processors can allow the cloud to create and scale up more
VM instances on demand.

5. Visualization. Graphics applications, such as games and data-rendering
engines, have the same parallelism requirements as other
HPC applications. Visual rendering is math- and task-intensive, and
visualization applications can make extensive use of multiple processors
to distribute the calculations required. Many graphics applications rely
on graphics processing units (GPUs) rather than CPUs. GPUs are
tailored to optimize graphics-related tasks. GPU packages often contain
multiple GPU cores, similar in principle to multicore processors.
Examples of multicore processors
Most modern processors designed and sold for general-purpose x86
computing include multiple processor cores. Examples of the latest Intel 12th-
generation multicore processors include the following:

• Intel Core i9 12900 family provides 8 cores and 24 threads.

• Intel Core i7 12700 family provides 8 cores and 20 threads.

• Top Intel Core i5 12600K processors offer 6 cores and 16 threads.

Examples of the latest AMD Zen multicore processors include:

• AMD Zen 3 family provides 4 to 16 cores.

• AMD Zen 2 family provides up to 64 cores.

• AMD Zen+ family provides 4 to 32 cores.
