COA Unit 5 (MULTIPROCESSOR)

Definition of Multiprocessor:
A multiprocessor is a computer system having two or more processing units (CPUs) that are connected and share a common main memory. These processors operate under a single operating system and work together to execute different tasks in parallel, thereby increasing the speed of computation and overall system performance. The principal characteristic of a multiprocessor is its ability to share a set of main memory and some I/O devices.

Advantages of Multiprocessor Systems
+ Increased Reliability: If one processor fails, the system can still function using the remaining processors, enhancing fault tolerance and system dependability.
+ Improved Throughput: Multiple processors execute tasks in parallel, leading to higher overall system throughput and faster execution of jobs.
+ Parallel Processing: Tasks can be divided into sub-tasks and processed concurrently, which accelerates processing and improves performance for large and complex applications.
+ Scalability: The system's processing power can be increased by adding more processors, making it adaptable to higher workloads without drastic changes to the architecture.
+ Efficient Resource Sharing: Peripheral devices and data storage can be shared among processors, allowing for better hardware utilization and resource management.
+ Better Multitasking: A multiprocessor system can effectively handle multiple jobs or user requests simultaneously, which is particularly important in server or database environments.

Disadvantages
1. High cost
2. Synchronization is problematic
3. Power consumption is high
4. High complexity

Types of Multiprocessor
1. Shared memory (tightly coupled): All processors share a global physical memory space and run under a single operating system.
2. Distributed memory (loosely coupled): Each processor has its own local memory and communicates via message passing.
3. Symmetric Multiprocessing (SMP): All processors run the same OS and have equal access to memory and I/O.
4. Asymmetric Multiprocessing (AMP): One master processor controls the system; the others perform specific tasks.

Characteristics of Multiprocessor Systems
+ Multiple Processors: Two or more CPUs within a single computer system.
+ Shared Main Memory: All processors have access to a common, shared memory space, enabling fast data communication.
+ Parallel Processing Capability: Multiple tasks can be performed simultaneously, improving system throughput.
+ High Reliability: Failure of one processor does not halt the system; the remaining processors continue to function.
+ Resource Sharing: Peripherals, memory, and other hardware resources are shared among all processors.
+ Increased Throughput: Overall system performance and processing speed are significantly increased.
+ Scalability: Additional processors can be added to further enhance performance.

Structure of Multiprocessor: Five Types of Interconnection Structures
A multiprocessor structure refers to how multiple CPUs (processors) are connected to memory, I/O devices, and each other within a system. The interconnection structure directly influences system performance, scalability, and complexity. Here are the five main types.

1. Common (Time-Shared) Bus Structure
All processors, memory modules, and I/O devices share a single communication bus, and only one processor can use the bus at any given time. When a processor wants to communicate with memory or another processor, it must check whether the bus is free; if the bus is in use, it waits for the bus to become available. The sketch below simulates this wait-for-the-bus behaviour.
Use Cases: Small multiprocessor systems, or systems where cost is more important than scalability.
Advantages:
+ Simple to implement.
+ A single common bus keeps implementation cost very low.
Disadvantages:
+ Data transfer rate is slow, since the shared bus is a bottleneck.
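As a rough illustration of the behaviour above, here is a minimal C sketch: the shared bus is modelled as a single pthreads lock, so only the "processor" holding the lock may transfer and the others block until it is released. The names bus_lock and do_transfer are illustrative, not real hardware signals.

```c
/* Conceptual model of a time-shared bus: one lock, exclusive use. */
#include <pthread.h>
#include <stdio.h>

static pthread_mutex_t bus_lock = PTHREAD_MUTEX_INITIALIZER;

static void do_transfer(long cpu)
{
    pthread_mutex_lock(&bus_lock);    /* wait until the bus is free   */
    printf("CPU %ld is using the bus\n", cpu);
    pthread_mutex_unlock(&bus_lock);  /* release the bus              */
}

static void *cpu_thread(void *arg)
{
    do_transfer((long)arg);
    return NULL;
}

int main(void)                        /* compile with -pthread        */
{
    pthread_t t[4];
    for (long i = 0; i < 4; i++)
        pthread_create(&t[i], NULL, cpu_thread, (void *)i);
    for (long i = 0; i < 4; i++)
        pthread_join(t[i], NULL);
    return 0;
}
```

Real systems resolve this contention in hardware through bus arbitration, discussed later in this unit.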
2. Crossbar Switch Structure
A matrix of switches is placed at the intersections of the processor buses and the memory-module paths, so every processor can be connected directly to every memory module through a crosspoint switch. Multiple simultaneous connections are possible, improving performance.
Advantages: High throughput: supports multiple simultaneous data transfers, maximizing bandwidth.
Disadvantages: Hardware complexity and cost grow rapidly as the number of processors and memory modules increases (an n x n switch matrix for n processors and n memory modules); difficult and expensive to implement for large systems.
Use Cases: High-performance systems where simultaneous memory access is crucial.

3. Multiport Memory Structure
Each memory module has multiple ports (connections), one for each processor, allowing several processors to connect directly to each memory module via dedicated buses and enabling parallel access to different memory modules by different processors. Priority logic resolves conflicts when more than one processor requests the same memory module; if several CPUs request the same module at the same time, priority is granted in a fixed order such as CPU-1, CPU-2, CPU-3, CPU-4.
Use Cases: Systems requiring very high-speed memory access by a limited number of processors.
Advantages: A high transfer rate can be achieved because of the multiple paths.
Disadvantages: Requires expensive memory control logic and a large number of cables and connections.

4. Multistage Switching Network
The building block of a multistage network is the 2x2 crossbar switch, which has two inputs (A and B) and two outputs (0 and 1); control inputs CA and CB establish the connection between the input and output terminals.
Examples: Omega, Butterfly, and similar multistage networks.
A multistage network reduces hardware complexity compared to a full crossbar but may introduce blocking for some access patterns.
Advantages: More scalable and cost-effective than a full crossbar; allows multiple simultaneous connections.
Disadvantages: Some memory access patterns may cause blocking (when two processors' requests contend for the same switch path), and the network is still more complex than a single bus system.

5. Hypercube Interconnection Structure
+ Processors are arranged as the nodes of an n-dimensional cube (for dimension n there are 2^n nodes).
+ Each processor is directly connected to n others (its "neighbors"); communication travels along the cube edges.
+ Efficient communication paths; well suited to loosely coupled systems.
+ Example: For n = 3, each processor has 3 neighbors and there are 8 processors in total (see the sketch below).
+ Advantages: Efficient for communication-intensive applications.
+ Disadvantages: Connection complexity grows with the number of dimensions (more cabling and interface logic).
(Figure: hypercube structures for n = 1, 2, 3.)
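The neighbor rule follows directly from the binary node labels: two nodes are connected exactly when their labels differ in one bit, so neighbor d of a node is obtained by XOR-ing its label with (1 << d). A minimal C sketch (variable names are illustrative) that lists each node's neighbors for n = 3:

```c
/* Hypercube neighbors: flip one bit of the node label per dimension. */
#include <stdio.h>

int main(void)
{
    int n = 3;                            /* 3 dimensions: 2^3 = 8 nodes */
    for (int node = 0; node < (1 << n); node++) {
        printf("node %d neighbours:", node);
        for (int d = 0; d < n; d++)
            printf(" %d", node ^ (1 << d));   /* flip bit d */
        printf("\n");
    }
    return 0;
}
```

The number of bit positions in which two labels differ (their Hamming distance) is also the number of hops a message needs between those two nodes.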
Interprocessor Arbitration
Interprocessor arbitration is the mechanism used in multiprocessor systems to control and coordinate access to a shared resource, usually the system bus, among multiple processors. When several processors request simultaneous access to the common bus (for memory, I/O, or communication), arbitration decides which processor is granted access at a given time, to avoid conflicts and ensure orderly data transfer.
+ Purpose: To resolve conflicts and enforce that only one processor accesses the shared bus at a time, preventing data corruption and system instability.
+ Arbitration methods: Static and Dynamic. Static arbitration is either Serial or Parallel.

1. Serial (Daisy Chain) Arbitration: Processors are connected in series, and priority is assigned by position in the chain; the highest-priority processor that is requesting gets bus access first. It is simple but can cause higher latency for lower-priority processors.
(Figure: daisy-chain connection of bus arbiters from highest to lowest priority.)
+ Advantages: Simple and cheap method; needs the least number of lines.
+ Disadvantages: Higher delay; the priority of each processor is fixed; not reliable.

2. Parallel Arbitration: Uses a priority encoder and decoder circuit, external to the processors, to determine the highest-priority bus request simultaneously. This method is faster and more flexible than serial arbitration.
+ Advantage: Each processor has a separate pair of bus-request and bus-grant signals, so it is faster.
+ Disadvantage: Requires more bus-request and bus-grant signals.

+ Dynamic Arbitration: Priorities can change dynamically at run time. Dynamic arbitration procedures that use dynamic priority algorithms include Time Slice, Polling, LRU, and FIFO (a small sketch follows this list).

1. Time Slice Algorithm
Purpose: Allocates a fixed amount of time (called a "time slice" or "quantum") to each process or processor in a round-robin manner.
How it works: Each requester gets the bus or CPU for a set period; if it does not finish in that time, it is interrupted and the next requester uses the resource.

2. Polling Algorithm
Purpose: Used to check the status of multiple devices or processors in sequence to see whether they need the bus or attention.
How it works: The controller "polls" each processor/device one by one in a fixed order and grants access only if that processor has made a request.

3. LRU (Least Recently Used) Algorithm
Purpose: Manages which resource (such as cache blocks or bus access slots) should be replaced or given priority, based on recent usage.
How it works: The processor or cache block that has not been used for the longest period is selected for replacement or is given the lowest priority.

4. FIFO (First-In, First-Out) Algorithm
Purpose: Schedules resource allocation in the order in which requests arrive.
How it works: The earliest request (first in) is served first, and new requests are queued at the end.

Advantages of dynamic arbitration:
1. The priority can be changed by altering the sequence stored in the controller.
2. More reliable.
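A small C sketch contrasting the two ideas above, assuming requests arrive as a bit-vector; grant_fixed and grant_rotating are illustrative helpers, not a real arbiter circuit. Fixed priority models the daisy chain (the lowest-numbered requester always wins), while rotating priority approximates the fairness of time-slice style schemes.

```c
/* Two arbitration policies over a request bit-vector. */
#include <stdio.h>

#define NCPU 4

/* Fixed (daisy-chain-like) priority: CPU-0 beats CPU-1 beats CPU-2 ... */
int grant_fixed(unsigned requests)
{
    for (int cpu = 0; cpu < NCPU; cpu++)
        if (requests & (1u << cpu))
            return cpu;
    return -1;                        /* no processor is requesting */
}

/* Rotating priority: search starts just after the last winner. */
int grant_rotating(unsigned requests, int last)
{
    for (int i = 1; i <= NCPU; i++) {
        int cpu = (last + i) % NCPU;
        if (requests & (1u << cpu))
            return cpu;
    }
    return -1;
}

int main(void)
{
    unsigned req = (1u << 1) | (1u << 3);   /* CPU-1 and CPU-3 request  */
    printf("fixed: CPU-%d wins\n", grant_fixed(req));          /* CPU-1 */
    printf("rotating after CPU-1: CPU-%d wins\n",
           grant_rotating(req, 1));                            /* CPU-3 */
    return 0;
}
```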
Inter Process Communication
Processes need to communicate with each other in many situations. Inter-Process Communication (IPC) is a mechanism that allows processes to communicate. It helps processes synchronize their activities, share information, and avoid conflicts while accessing shared resources.

Types of Process
+ Independent process: Not affected by the execution of other processes. Independent processes do not share any data or resources with other processes, so no inter-process communication is required.
+ Co-operating process: Interacts with other processes and shares data or resources. A co-operating process can be affected by other executing processes.

The communication between co-operating processes can be seen as a method of cooperation between them. IPC allows different processes running on a computer to share information with each other, using techniques such as shared memory, message passing, or files, and it ensures that processes can work together without interfering with each other. Co-operating processes therefore require an IPC mechanism that allows them to exchange data and information. The two fundamental models of Inter Process Communication are:
+ Shared Memory
+ Message Passing
(Figure: the two IPC models, (a) shared memory and (b) message passing, used by processes to communicate and to synchronize.)

Interprocessor Synchronization
Interprocessor synchronization in Computer Organization and Architecture (COA) is the set of techniques and mechanisms that ensure multiple processors or cores in a multiprocessor system coordinate their actions when accessing shared resources or performing parallel tasks.

Why is Interprocessor Synchronization Needed?
+ Prevent Conflicts: When two or more processors attempt to access or modify the same data in shared memory concurrently, conflicts (race conditions, data corruption) can occur.
+ Maintain Data Integrity: Ensures only one processor accesses a shared resource/critical section at a time, preventing inconsistent or incorrect results.
+ Orderly Execution: Coordinates the execution order among processors so programs run correctly and efficiently.
+ Avoid Deadlocks: Helps avoid deadlocks and priority inversion by properly managing resource allocation among processors.

Methods & Mechanisms
1. Mutual Exclusion (Mutex):
+ Only one processor can enter the critical section (a section of code accessing shared resources) at a time.
+ Implemented via mutexes, semaphores, locks, or hardware instructions.
2. Semaphores:
+ Variables (binary or counting) used to signal and control access to resources.
+ Use operations such as WAIT (P) and SIGNAL (V) to manage process access (see the sketch after the next list).
3. Message Passing:
+ In distributed systems, processors synchronize by passing messages through established communication channels.
+ Ensures messages are received and processed in the correct order, and resources are not accessed until it is safe.

Problems Without Synchronization
+ Race Conditions: Unpredictable outcomes from simultaneous memory access.
+ Data Inconsistency/Loss: Multiple writes to the same data without coordination can corrupt results.
+ Deadlocks: Two or more processors block each other by waiting indefinitely for resources.
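A minimal sketch of the WAIT (P) / SIGNAL (V) discipline using POSIX semaphores. Assumptions: two pthreads stand in for two processors, and worker / shared_counter are illustrative names; compile with -pthread.

```c
/* Binary semaphore guarding a critical section on a shared counter. */
#include <pthread.h>
#include <semaphore.h>
#include <stdio.h>

static sem_t mutex;                /* binary semaphore (initial value 1) */
static long shared_counter = 0;

static void *worker(void *arg)
{
    (void)arg;
    for (int i = 0; i < 100000; i++) {
        sem_wait(&mutex);          /* WAIT (P): enter critical section   */
        shared_counter++;          /* shared read-modify-write           */
        sem_post(&mutex);          /* SIGNAL (V): leave critical section */
    }
    return NULL;
}

int main(void)
{
    pthread_t t1, t2;
    sem_init(&mutex, 0, 1);
    pthread_create(&t1, NULL, worker, NULL);
    pthread_create(&t2, NULL, worker, NULL);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    printf("counter = %ld (expected 200000)\n", shared_counter);
    return 0;
}
```

Without the sem_wait/sem_post pair, the two read-modify-write sequences can interleave and the final count comes out low, which is exactly the race condition described above.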
Cache Coherence
Cache coherence refers to maintaining consistency of the data stored in the local caches of processors in a multiprocessor system when they share a common memory space. When multiple processors each have their own cache and keep copies of shared data from main memory, an update to that data in one cache must be reflected in the other caches to prevent incorrect or inconsistent results.

Why is Cache Coherence Needed?
+ Processors in a multiprocessor system frequently access shared data.
+ If one processor updates its cached copy of a shared variable, others might still hold outdated values.
+ Without coherence mechanisms, programs can read stale or incorrect data.

Shared Memory Multiprocessors
1. UMA (Uniform Memory Access)
Description: UMA stands for Uniform Memory Access. In this architecture, all processors share the main memory uniformly, meaning every processor has equal access time and speed to any memory location. Memory access time is independent of which processor is accessing which memory module; there is no "local" or "remote" memory.
(Figure: UMA: processors 1..n connected through a system interconnect (bus, crossbar, or multistage network) to shared memories 1..m.)

2. NUMA (Non-Uniform Memory Access)
Description: NUMA stands for Non-Uniform Memory Access. Here the system is divided into several nodes; each node contains processors and its own local memory. A processor can access its own local memory much faster than memory located in another node (remote memory). Thus memory access time depends on the physical location of the data relative to the processor.
(Figure: NUMA: processors 1..n, each with its local memory, joined by an interconnection network.)

3. COMA (Cache-Only Memory Architecture)
Description: COMA stands for Cache-Only Memory Architecture. In COMA systems, the local memories of each node are treated as large caches, not as main memory. There is no home node for any data; memory lines automatically migrate to wherever they are needed most, making the entire memory space function as one giant cache.
(Figure: COMA: node memories acting as caches, joined by an interconnection network.)

Concept of Pipelining
Pipelining is a technique in a computer's CPU where the process of executing instructions is divided into small steps, and these steps happen at the same time for different instructions. This helps the CPU work faster by working on many instructions at once, each at a different stage, just like an assembly line in a factory.
+ In pipelining, each instruction is broken down into sub-operations, such as fetching, decoding, executing, and storing. Each sub-operation is performed in a dedicated segment, and as soon as one stage completes its part for an instruction, it passes it to the next stage and starts on the next instruction.
+ This overlapping of instruction execution increases the CPU's throughput, meaning more instructions are completed in a given period, similar to how an assembly line increases the number of finished products.
+ Pipelining is fundamental to modern CPUs and comes in types such as instruction pipelining (handling the stages of fetching, decoding, executing, etc.) and arithmetic pipelining (dividing arithmetic operations into pipeline stages).

Imagine a factory where a product is made in steps, like assembling a toy. Instead of one person making the whole toy from start to finish, the work is divided into different steps, and each person specializes in one step. While one person is painting a toy, another person is already putting together the next toy's parts. This way, many toys are being worked on at the same time, just at different stages. The worked example below puts numbers on this overlap.
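As a worked example (the stage names and counts are assumed for illustration): with a k = 4 stage pipeline (F = fetch, D = decode, E = execute, W = write back) and one clock per stage, n = 4 instructions finish in k + n - 1 = 7 clocks instead of the n x k = 16 clocks a non-pipelined CPU would need:

```
Clock     1   2   3   4   5   6   7
Instr 1   F   D   E   W
Instr 2       F   D   E   W
Instr 3           F   D   E   W
Instr 4               F   D   E   W
```

For large n the speedup S = (n x k) / (k + n - 1) approaches k, the number of stages.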
Types of Pipelining
1. Instruction Pipelining
+ Definition: The execution of instructions is divided into stages (such as fetch, decode, execute, memory access, write back), so several instructions can be processed at different stages simultaneously.
+ Use: Improves the throughput of instruction processing in CPUs; it is the most widely implemented form of pipelining in processors.
+ Example: While one instruction is being executed, the next one is decoded, and another is fetched from memory.
2. Arithmetic Pipelining
+ Definition: Arithmetic operations (such as floating-point addition, multiplication, division) are separated into smaller steps, each performed in a pipeline stage, allowing multiple arithmetic computations to overlap.
+ Use: Used in mathematical operation units for high-speed computation (e.g., floating-point operations in scientific calculators or processors).
+ Example: Breaking floating-point addition down into compare exponents, align mantissas, add/subtract mantissas, and normalize the result.

Advantages:
+ Increases the number of instructions completed in a given time (better throughput).
+ Keeps all CPU parts busy at once, avoiding idle time.
+ Allows faster clock cycles by breaking tasks into small steps.
+ Makes the CPU run faster without changing the speed of a single instruction.
+ Easy to scale and improve performance.

Pipeline Hazards
Pipeline hazards are conditions in pipelined processors that prevent the next instruction from executing in its scheduled clock cycle, causing delays or stalls in the instruction pipeline. There are three main types of pipeline hazards relevant for Computer Organization and Architecture:

1. Structural Hazards
Occur when two or more instructions in the pipeline require the same hardware resource at the same time (e.g., a single memory or ALU unit needed simultaneously). This leads to resource conflicts and forces the pipeline to stall or wait.
Example: One instruction is being fetched from memory while another needs to read or write data in the same memory.

2. Data Hazards
Happen when an instruction depends on the result of a previous instruction that has not yet completed its execution. These dependencies cause stalls until the required data is available. Data hazards have three subtypes:
+ RAW (Read After Write): An instruction needs to read a location that a previous instruction is writing to. For example, ADD R1, R2, R3 followed immediately by SUB R4, R1, R5 must wait until R1 has been written.
+ WAR (Write After Read): An instruction writes to a location that a previous instruction still needs to read.
+ WAW (Write After Write): Multiple instructions write to the same location out of order.

3. Control Hazards (Branch Hazards)
Result from instructions that alter the program flow (branches, jumps). The pipeline may fetch the wrong instructions while waiting for the branch decision to be resolved, causing delays or the flushing of incorrect instructions.

Vector Processing
+ Definition: Vector processing is the technique of executing an operation on an entire array (vector) of data using a single instruction.
+ Vector Registers: Special registers called vector registers are used to store arrays of data; each vector register can hold multiple elements.
+ There is a class of computational problems that is beyond the capabilities of a conventional computer: problems that require a vast number of computations and would take a conventional computer days or even weeks to complete.

Vector Processing Applications
Problems that can be efficiently formulated in terms of vectors and matrices:
+ Long-range weather forecasting
+ Petroleum exploration
+ Seismic data analysis
+ Medical diagnosis
+ Aerodynamics and space flight simulations
+ Artificial intelligence and expert systems
+ Mapping the human genome
+ Image processing

Vector Processor (Computer)
+ Able to process vectors and matrices much faster than conventional computers.
+ In a vector processor a single instruction can ask for multiple data operations, which saves time: the instruction is decoded once and then keeps operating on different data items (see the sketch below).
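To make the "decoded once, many data operations" point concrete, here is a C sketch of the element-wise work a single vector add instruction performs. The ADDV mnemonic in the comment is generic pseudo-assembly, not from any specific machine.

```c
/* On a vector machine the whole loop below would be one instruction,
 * e.g. ADDV V3, V1, V2; in plain C we spell out the element-wise work
 * that the hardware would overlap. */
#include <stdio.h>

#define N 8                              /* vector length (illustrative) */

void vector_add(const float *a, const float *b, float *c, int n)
{
    for (int i = 0; i < n; i++)          /* the whole loop = one vector op */
        c[i] = a[i] + b[i];
}

int main(void)
{
    float a[N] = {1, 2, 3, 4, 5, 6, 7, 8};
    float b[N] = {8, 7, 6, 5, 4, 3, 2, 1};
    float c[N];
    vector_add(a, b, c, N);
    for (int i = 0; i < N; i++)
        printf("%.0f ", c[i]);           /* prints: 9 9 9 9 9 9 9 9 */
    printf("\n");
    return 0;
}
```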
What is Array Processing?
Array processing is a technique in computer architecture where a single instruction operates on multiple data elements arranged in arrays (such as matrices or vectors) simultaneously. It is a form of parallel processing used to improve performance in data-heavy applications such as scientific computing, image processing, and machine learning, and it is based on the SIMD (Single Instruction, Multiple Data) model, where the same operation is applied to multiple data points in parallel.

1. SIMD (Single Instruction, Multiple Data) Array Processing
Definition: In SIMD array processing, a single instruction is broadcast to all processing elements (PEs), and each PE performs the same operation on a different data element simultaneously.
Key Characteristics:
+ One control unit governs all PEs.
+ All PEs execute the same instruction at the same time.
+ Suitable for tasks like image processing, matrix operations, and vector calculations.
+ Fast and efficient for regular data patterns.

2. Attached Array Processing
Definition: In attached array processing, the array processor is used as a coprocessor or secondary processor connected to a general-purpose host computer. The main processor handles general tasks, while the attached array processor handles intensive numeric computations.
Key Characteristics:
+ The array processor acts as a dedicated computation unit.
+ Connected to the main CPU via a bus or channel.
+ Processes floating-point operations and matrix-heavy computations.
+ Often used in scientific computing and engineering simulations.

RISC (Reduced Instruction Set Computer)
Definition: RISC processors use a small, highly optimized set of simple instructions. Each instruction performs a single task and typically executes in one clock cycle.
Key Features:
+ Simpler instructions and decoding.
+ Fixed-length instructions (usually one word).
+ Large number of general-purpose registers.
+ Loads and stores are separate instructions.
+ Simple addressing modes.
+ Highly pipelined for faster performance.
+ Lower power consumption.
Advantages:
+ Fast instruction execution.
+ Easier to design and optimize (hardware and compiler).
+ Efficient for portable devices and high-performance systems.
Disadvantages:
+ More instructions required for complex operations, possibly resulting in bigger code size.
+ May need more RAM to hold the additional instructions.
Examples: ARM, MIPS, SPARC processors.

CISC (Complex Instruction Set Computer)
Definition: CISC processors have a large and complex instruction set; each instruction can perform multiple low-level operations (such as loading from memory, arithmetic, and storing).
Key Features:
+ Complex and variable-length instructions.
+ Fewer general-purpose registers (more operations happen in memory itself).
+ Complex addressing modes.
+ A single instruction may perform multiple tasks (e.g., loading, adding, and storing all at once).
+ Instructions can take several clock cycles to execute.
+ Microprogrammed control logic is commonly used.
Advantages:
+ Fewer instructions needed for each task (compact code).
+ Makes efficient use of memory.
+ Established software ecosystem (widely used in desktop PCs, e.g., Intel x86).
Disadvantages:
+ Slower execution per instruction due to complexity.
+ More complicated hardware design and decoding.
+ Higher power consumption.
Examples: Intel x86, AMD processors.

CISC vs RISC
1) CISC architecture gives more importance to hardware; RISC architecture gives more importance to software.
2) CISC has complex instructions; RISC has reduced (simple) instructions.
3) CISC can access memory directly in its instructions; RISC requires registers (separate load/store).
4) Coding for a CISC processor is simple; coding for a RISC processor requires a larger number of lines.
5) CISC's complex instructions take multiple cycles to execute; RISC's simple instructions take a single cycle to execute.
6) In CISC the complexity lies in the microprogram; in RISC the complexity lies in the compiler.
An illustrative translation of the same statement under each style follows.
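The sketch below shows how one C statement might map to each style; the mnemonics in the comments are generic pseudo-assembly for illustration, not the actual output of any particular compiler.

```c
/* The same statement, CISC-style vs RISC-style (illustrative only). */
int b = 2, c = 3;
int a;

void add_example(void)
{
    a = b + c;
    /* CISC style: one memory-to-memory instruction, several cycles:
     *     ADD   a, b, c       ; fetch b and c, add, store to a
     *
     * RISC style: separate load/store instructions, roughly 1 cycle each:
     *     LOAD  R1, b         ; register <- memory
     *     LOAD  R2, c
     *     ADD   R3, R1, R2    ; register-to-register arithmetic
     *     STORE a, R3         ; memory <- register
     */
}
```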
What is a Multicore Processor?
A multicore processor is a single integrated circuit (IC) chip that contains two or more independent processing units called cores. Each core is capable of reading and executing program instructions on its own, allowing the processor to handle multiple tasks simultaneously. For example, a dual-core processor has two cores, a quad-core has four, and so on.

Characteristics of Multicore Processors
+ Multiple Cores on One Chip: Each core is a fully functional processing unit with its own registers, arithmetic logic unit (ALU), and cache.
+ Parallel Processing: Cores can execute different instructions at the same time, enabling true parallelism and improved performance.
+ Shared/Separate Cache: Cores may have their own (L1) caches and also share larger caches (L2, L3) to speed up data access and inter-core communication.
+ Efficiency: By sharing the same chip and components, multicore processors improve energy and space efficiency over using multiple single-core chips.
+ Compatibility: Widely used in desktops, laptops, smartphones, servers, and embedded systems.

Advantages of Multicore Processors
+ Better Performance: Can perform more operations in parallel, especially with software optimized for multiple threads (see the sketch below).
+ Improved Multitasking: Several applications or processes can run simultaneously without slowing down the system.
+ Energy Efficiency: Multicore chips consume less power and generate less heat than a system with multiple separate processors.
+ Reliability: If one core fails during an operation, the other cores can continue functioning, adding to system robustness.
+ Efficient Resource Sharing: Shorter communication paths and shared caches reduce latency and improve speed.
+ Scalability: Future versions with more cores can be introduced for higher performance without major architectural changes.
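A minimal POSIX C sketch that asks the system how many cores are online and spawns one worker thread per core. Assumptions: task and tid are illustrative names, and whether each thread actually lands on its own core is left to the OS scheduler; compile with -pthread.

```c
/* One worker thread per online core, so independent tasks can run
 * in parallel across the cores of a multicore chip. */
#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

static void *task(void *arg)
{
    printf("worker %ld running (possibly on its own core)\n", (long)arg);
    return NULL;
}

int main(void)
{
    long cores = sysconf(_SC_NPROCESSORS_ONLN);   /* online core count */
    printf("cores online: %ld\n", cores);

    pthread_t tid[64];
    if (cores > 64) cores = 64;                   /* keep the array safe */
    for (long i = 0; i < cores; i++)
        pthread_create(&tid[i], NULL, task, (void *)i);
    for (long i = 0; i < cores; i++)
        pthread_join(tid[i], NULL);
    return 0;
}
```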
