CC Assignment 1

The document compares multicore CPU architectures and many-core GPU architectures, highlighting differences in core count, design, performance focus, and best use cases. It also discusses the evolution of distributed computing technologies leading to cloud computing, the concept of scalability in distributed systems, and the role of Hadoop in big data processing. Additionally, it explains Xen architecture and differentiates between full, para, and OS-level virtualization.

What are the key differences between multicore CPU architectures and many-core GPU architectures?

| Aspect | CPU (Multicore) | GPU (Many-core) |
|---|---|---|
| Core Count | Few cores (4-64) | Thousands of cores |
| Core Design | Complex, powerful cores | Simple, streamlined cores |
| Performance Focus | Low latency, single-thread performance | High throughput, parallel processing |
| Cache System | Large multilevel caches (L1/L2/L3) | Small caches, high-bandwidth memory |
| Memory Strategy | Minimize latency for random access | Hide latency through thread switching |
| Execution Model | Task parallelism | Data parallelism (SIMD) |
| Threading | 1-2 threads per core | Thousands of lightweight threads |
| Control Logic | Complex branch prediction, out-of-order execution | Simple control units shared across groups |
| Context Switching | Sophisticated scheduling | Rapid context switching |
| Branch Handling | Excellent with complex branching | Poor performance with divergent branches |
| Best Use Cases | Sequential tasks, general computing, complex algorithms | Graphics rendering, ML/AI, scientific computing |
| Optimization Goal | Minimize time per task | Maximize tasks completed per unit time |
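The task-parallelism versus data-parallelism distinction in the table can be sketched in a few lines of Python. This is only an illustration on a CPU; the function names are made up for the example, and a real GPU would run the data-parallel step in lockstep across thousands of hardware threads:

```python
from concurrent.futures import ThreadPoolExecutor

# Task parallelism (CPU-style): different, independent tasks run concurrently.
def parse_log(line):            # hypothetical task 1
    return line.split(",")

def checksum(data):             # hypothetical task 2
    return sum(ord(c) for c in data) % 256

with ThreadPoolExecutor(max_workers=2) as pool:
    f1 = pool.submit(parse_log, "a,b,c")
    f2 = pool.submit(checksum, "a,b,c")
    parsed, digest = f1.result(), f2.result()

# Data parallelism (GPU-style, SIMD): one operation applied to many data items.
values = list(range(8))
scaled = [v * 2 for v in values]   # same instruction, many data elements
```

The first half mirrors a CPU scheduling a few heavyweight tasks; the second half mirrors the "same instruction, many data" pattern a GPU is optimized for.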

What are the major distributed computing technologies that led to cloud computing?

Grid Computing (1990s-2000s)

Enabled resource sharing across geographically distributed systems


Projects like SETI@home demonstrated massive distributed processing
Laid groundwork for resource pooling and remote computation

Cluster Computing

Connected multiple machines to work as a single system


Technologies like Beowulf clusters made parallel computing accessible
Established concepts of fault tolerance and load distribution

Service-Oriented Architecture (SOA)

Introduced loose coupling between services


Web services standards (SOAP, REST, XML) enabled remote service calls
Created foundation for microservices and API-based architectures

Virtualization Technologies

VMware, Xen, KVM enabled hardware abstraction


Multiple virtual machines on single physical hardware
Essential for resource isolation and multi-tenancy

Web Services & Middleware

CORBA, RMI, .NET Remoting for distributed object communication


Message queuing systems (MQSeries, RabbitMQ)
Application servers for scalable web applications

Peer-to-Peer (P2P) Networks

BitTorrent, Napster showed distributed content delivery


Demonstrated scalability without central control
Influenced CDN and edge computing concepts

Utility Computing

IBM's concept of computing as a metered service


Pay-per-use models for computing resources
Direct precursor to cloud pricing models

Internet Infrastructure Evolution

High-speed broadband adoption


Improved network reliability and bandwidth
Made remote computing practical for enterprises

Explain the concept of scalability in distributed computing and how it can be achieved.
Scalability in distributed computing refers to the system's ability to handle increasing
workloads or expand by adding more resources (e.g., servers, nodes) without degrading
performance.

Types:

1. Horizontal Scalability – Adding more machines/nodes.


2. Vertical Scalability – Adding more power (CPU, RAM) to existing machines.

Achieving Scalability:

Load Balancing: Distribute tasks evenly across nodes.


Data Partitioning (Sharding): Split data across multiple databases or servers.
Caching: Store frequently accessed data in memory for faster access.
Asynchronous Processing: Use queues and background jobs to handle tasks
efficiently.
Decentralization: Avoid bottlenecks by removing single points of failure.

Scalability ensures the system remains efficient and responsive as demand grows.
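Two of the techniques above, sharding and load balancing, can be sketched in Python. The shard and node names below are hypothetical; real systems often use consistent hashing instead of a plain modulo so that adding a shard moves fewer keys:

```python
import hashlib
from itertools import cycle

# Hash-based sharding: route each key deterministically to one of N shards.
SHARDS = ["db0", "db1", "db2", "db3"]   # hypothetical shard names

def shard_for(key: str) -> str:
    digest = int(hashlib.md5(key.encode()).hexdigest(), 16)
    return SHARDS[digest % len(SHARDS)]

# Round-robin load balancing: spread incoming requests evenly across nodes.
nodes = cycle(["node-a", "node-b", "node-c"])

def next_node() -> str:
    return next(nodes)
```

Because `shard_for` is deterministic, every client routes the same key to the same shard without any central coordinator, which is exactly the decentralization property the list above calls for.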

Describe the role of Hadoop in distributed computing and its advantages for big data processing.

Core Function

Framework for storing and processing massive datasets across commodity hardware clusters
Handles data distribution, fault tolerance, and parallel processing automatically

Key Components

HDFS: Distributed file system storing data across multiple nodes


MapReduce: Programming model for parallel data processing
YARN: Resource management and job scheduling
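The MapReduce model listed above can be sketched as a toy word count in plain Python. Real Hadoop jobs implement Mapper and Reducer classes (typically in Java), and the shuffle step below is performed by the framework itself:

```python
from collections import defaultdict

# Map phase: emit (word, 1) pairs from each input line.
def mapper(line):
    for word in line.lower().split():
        yield word, 1

# Shuffle: group intermediate pairs by key (Hadoop does this automatically).
def shuffle(pairs):
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

# Reduce phase: sum the counts for each word.
def reducer(word, counts):
    return word, sum(counts)

lines = ["big data big clusters", "data locality"]
pairs = [p for line in lines for p in mapper(line)]
result = dict(reducer(w, c) for w, c in shuffle(pairs).items())
# result: {'big': 2, 'data': 2, 'clusters': 1, 'locality': 1}
```

In a real cluster, each mapper runs on the node that holds its block of input (data locality), and reducers pull only the shuffled key groups assigned to them.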

Advantages for Big Data Processing

Scalability

Scales horizontally by adding commodity servers


Handles petabytes of data across thousands of nodes

Cost-Effective

Uses inexpensive commodity hardware instead of specialized systems


Open-source reduces licensing costs

Fault Tolerance

Automatic data replication (default 3 copies)


Continues processing even when nodes fail
Self-healing through automatic recovery

Flexibility

Handles structured, semi-structured, and unstructured data


No predefined schema required (schema-on-read)

Parallel Processing

Distributes computation across cluster nodes


Processes data where it's stored (data locality)
Reduces network traffic and improves performance

High Throughput

Optimized for batch processing large volumes


Better for throughput than low-latency operations

Ecosystem Integration

Rich ecosystem (Hive, Pig, Spark, HBase)


Integrates with various data sources and analytics tools

Data Locality

Moves computation to data rather than data to computation


Minimizes network overhead and improves efficiency

Describe Xen architecture with a neat diagram and explain its working.

Xen Hypervisor Architecture

Architecture Overview

Xen is a Type-1 (bare-metal) hypervisor that runs directly on hardware, managing multiple virtual machines called "domains."

Key Components

Xen Hypervisor

Thin layer running directly on hardware


Manages CPU scheduling, memory allocation, and interrupt handling
Provides isolation between domains
Minimal footprint for better performance

Domain Types

Domain0 (Dom0) - Control Domain

Privileged domain with special access to hardware


Runs control stack and device drivers
Manages other domains (create, destroy, migrate)
Handles I/O operations for guest domains
Only domain with direct hardware access

Guest Domains (DomU)

Unprivileged virtual machines


Run guest operating systems (Linux, Windows)
No direct hardware access
Communicate with Dom0 for I/O operations

Working
The Xen hypervisor sits directly on the hardware.
Dom0 is responsible for device I/O, VM management, and control.
Guest domains (DomU) run isolated virtual machines with their applications.
This setup enables efficient virtualization, isolation, and resource management
across multiple OSes.
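To make Dom0's role concrete, a guest domain is typically defined by a small configuration file that Dom0's toolstack reads. The values below are a hypothetical sketch in the xl.cfg style:

```
# Minimal Xen guest (DomU) configuration sketch -- values are illustrative.
name    = "demo-guest"                  # domain name shown by `xl list`
memory  = 1024                          # MiB of RAM for the guest
vcpus   = 2                             # virtual CPUs
disk    = ['phy:/dev/vg0/demo,xvda,w']  # backing block device (hypothetical path)
vif     = ['bridge=xenbr0']             # network via a Dom0 bridge
```

Dom0 would then start the guest with `xl create` and inspect running domains with `xl list`, while the hypervisor itself handles the CPU and memory scheduling described above.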

Differentiate full-virtualized, para-virtualized, and OS-level virtualization.

Virtualization Types Comparison


| Aspect | Full Virtualization | Para-virtualization | OS-Level Virtualization |
|---|---|---|---|
| Definition | Complete hardware simulation | Modified guest OS aware of virtualization | Shared kernel with isolated user spaces |
| Guest OS Modification | No modification required | Requires OS kernel modification | No separate guest OS |
| Hypervisor Type | Type-1 or Type-2 | Type-1 (bare-metal) | Container runtime |
| Hardware Abstraction | Complete hardware emulation | Paravirtualized drivers | No hardware virtualization |
| Performance | Lower (emulation overhead) | Higher (reduced overhead) | Highest (native performance) |
| Isolation Level | Strong (hardware-level) | Strong (hypervisor-managed) | Process-level isolation |
| Resource Overhead | High | Medium | Low |
| Boot Time | Slow (full OS boot) | Moderate (modified OS boot) | Fast (container startup) |
| Memory Usage | High (separate OS instances) | Medium (shared components) | Low (shared kernel) |
| OS Diversity | Multiple different OS types | Multiple OS types (modified) | Same OS kernel only |
| Examples | VMware vSphere, VirtualBox | Xen (paravirt mode), VMware ESX | Docker, LXC, OpenVZ |
| Hardware Requirements | Virtualization extensions helpful | Standard hardware | Standard hardware |
| Security | Strong VM isolation | Strong domain isolation | Container-level isolation |
| Scalability | Limited (resource intensive) | Better than full virtualization | High (lightweight) |
| Use Cases | Legacy applications, multiple OS | High-performance virtualization | Microservices, DevOps |
| Migration | VM migration supported | Domain migration supported | Container portability |
| Management Complexity | High | Medium | Low |
| Startup Overhead | High | Medium | Minimal |
| Network Performance | Lower (emulated network) | Better (paravirt network) | Native network performance |
| Storage Performance | Lower (emulated storage) | Better (paravirt storage) | Native storage performance |
