Module 1 - Data Center Environment

Uploaded by

vvce22cse0131
DATA CENTER ENVIRONMENT

EMC Proven Professional. Copyright © 2012 EMC Corporation. All Rights Reserved. Module 2: Data Center Environment 1
Contents
▪ Application
▪ Database Management System (DBMS)
▪ Host (Compute)
▪ Connectivity
▪ Storage
▪ Disk Drive Components
▪ Disk Drive Performance
▪ Host Access to Data
▪ Direct-Attached Storage
▪ Storage Design Based on Application

Application

• A software program that provides the logic for computing operations
• Commonly deployed applications in a data center
   ─ Business applications: email, enterprise resource planning (ERP), decision support system (DSS)
   ─ Management applications: resource management, performance tuning, virtualization
   ─ Data protection applications: backup, replication
   ─ Security applications: authentication, antivirus

Database Management System (DBMS)

• A database is a structured way to store data in logically organized tables that are interrelated
   ─ Helps to optimize the storage and retrieval of data
• A DBMS controls the creation, maintenance, and use of databases
   ─ Processes an application's request for data
   ─ Instructs the OS to retrieve the appropriate data from storage
• Popular DBMS examples include MySQL, Oracle RDBMS, and SQL Server

Host (Compute)

• A resource that runs applications with the help of underlying computing components
   ─ Examples: servers, mainframes, laptops, desktops, tablets, server clusters
• Consists of hardware and software components
• Hardware components
   ─ Include CPU, memory, and input/output (I/O) devices
• Software components
   ─ Include OS, device drivers, file system, volume manager, and so on

Operating Systems and Device Driver
• In a traditional environment, the OS resides between the applications and the hardware
   ─ Responsible for controlling the environment
• In a virtualized environment, the virtualization layer works between the OS and the hardware
   ─ The virtualization layer controls the environment
   ─ The OS works as a guest and controls only the application environment
   ─ In some implementations, the OS is modified to communicate with the virtualization layer
• A device driver is software that enables the OS to recognize a specific device
• Device drivers are hardware-dependent and operating-system-specific

Memory Virtualization
• Memory is an expensive component of a host
• It determines both the size and the number of applications that can run on a host
• Memory virtualization enables multiple applications to run on a host without impacting each other
   − An operating system feature that virtualizes the physical memory (RAM) of a host
• The virtual memory manager (VMM) manages the virtual memory
   − The VMM manages the virtual-to-physical memory mapping and fetches data from disk storage

• An OS feature that presents more memory to the application than is physically available
   ─ Additional memory space comes from disk storage
   ─ Space used on the disk for virtual memory is called swap space, swap file, or page file
   ─ Inactive memory pages are moved from physical memory to the swap file
   ─ Provides efficient use of available physical memory
   ─ Data access from the swap file is slower; using flash drives for swap space gives the best performance

Fig: Operating system swapping pages in and out between memory and the disk drive

• In a virtual memory implementation, the memory of a system is divided into contiguous blocks of fixed-size pages.
• Paging moves inactive physical memory pages onto the swap file and brings them back to physical memory when required.
   − Enables efficient use of the available physical memory among different applications

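The swap-out and swap-in behavior of paging can be sketched with a toy model. This is only an illustration, not how any particular OS implements it: the class name `PagedMemory`, the two-frame memory size, and the least-recently-used choice of which page counts as "inactive" are all assumptions made for the example.

```python
from collections import OrderedDict

class PagedMemory:
    """Toy model: a fixed number of physical frames backed by a swap file."""

    def __init__(self, num_frames):
        self.num_frames = num_frames
        self.frames = OrderedDict()  # page -> data, ordered least- to most-recently used
        self.swap = {}               # pages currently held in the swap file
        self.swap_ins = 0
        self.swap_outs = 0

    def access(self, page):
        if page in self.frames:
            # Page is already in physical memory; mark it as recently used.
            self.frames.move_to_end(page)
            return
        if page in self.swap:
            # Swap in: bring the page back from the swap file.
            data = self.swap.pop(page)
            self.swap_ins += 1
        else:
            data = f"data-{page}"    # first touch of this page
        if len(self.frames) >= self.num_frames:
            # Swap out: move the least recently used (inactive) page to the swap file.
            victim, victim_data = self.frames.popitem(last=False)
            self.swap[victim] = victim_data
            self.swap_outs += 1
        self.frames[page] = data

mem = PagedMemory(num_frames=2)
for page in [0, 1, 2, 0]:
    mem.access(page)
print(list(mem.frames), list(mem.swap), mem.swap_ins, mem.swap_outs)
```

With two frames and the access pattern 0, 1, 2, 0, page 0 is swapped out when page 2 arrives and swapped back in on the last access, which in turn pushes page 1 out.
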
Logical Volume Manager (LVM)
• Software that runs on the compute system and manages logical and physical storage
• An intermediate layer between the file system and the physical disk
   → Partitioning: divides a larger-capacity disk into virtual, smaller-capacity volumes
   → Concatenation: aggregates several smaller disks to form a larger virtual volume
• The LVM provides optimized storage access and simplifies storage resource management
   ─ Hides details about the physical disk and the location of data on the disk
   ─ Enables administrators to change the storage allocation even when the application is running

LVM Example: Partitioning and Concatenation

Fig: Hosts, logical volumes, and physical volumes under partitioning and concatenation

• The basic LVM components are physical volumes, volume groups, and logical volumes
   ‒ Each physical disk connected to the host system is a physical volume (PV)
   ‒ A unique physical volume identifier (PVID) is assigned to each physical volume when it is initialized for use by the LVM
   ‒ A volume group is created by grouping together one or more physical volumes
   ‒ Logical volumes are created within a given volume group

• A logical volume appears as a physical device to the operating system
   - It can be made up of noncontiguous physical extents
• A file system is created on a logical volume
• These logical volumes are then assigned to the application

File System
• A file is a collection of related records or data stored as a unit with a name
• A file system is a hierarchical structure of files
   ─ Provides users with the functionality to create, modify, delete, and access files
   ─ Enables easy access to data files residing within a disk drive, a disk partition, or a logical volume
   ─ Consists of logical structures and software routines that control access to files
   ─ Access to files on the disks is controlled by the permissions assigned to the file by the owner, which are also maintained by the file system

• A file system organizes data in a structured hierarchical manner through directories, which are containers for storing pointers to multiple files.
• Examples of common file systems are:
   ─ FAT32 (File Allocation Table) for Microsoft Windows
   ─ NT File System (NTFS) for Microsoft Windows
   ─ UNIX File System (UFS) for UNIX
   ─ Extended File System (ext2/ext3) for Linux

• Process of mapping user files to the disk storage subsystem with an LVM
   1. Files are created and managed by users and applications.
   2. These files reside in the file systems.
   3. The file systems are mapped to file system blocks.
   4. The file system blocks are mapped to logical extents of a logical volume.
   5. These logical extents in turn are mapped to the disk physical extents either by the operating system or by the LVM.
   6. These physical extents are mapped to the disk sectors in a storage subsystem.
• If there is no LVM, then there are no logical extents
• Without an LVM, file system blocks are directly mapped to disk sectors

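Steps 3 to 6 of the mapping chain can be sketched as simple arithmetic. The block size, extent size, volume names (`pv0`, `pv1`), and the `extent_map` table below are hypothetical values chosen for illustration; a real LVM keeps this mapping in its own metadata.

```python
SECTOR_SIZE = 512                # bytes per disk sector
BLOCK_SIZE = 4096                # assumed file system block size
EXTENT_SIZE = 4 * 1024 * 1024    # assumed size of a logical/physical extent

# Hypothetical logical volume: logical extent i -> (physical volume, physical extent).
# The physical extents do not have to be contiguous or on the same disk.
extent_map = [("pv0", 10), ("pv0", 11), ("pv1", 3)]

def locate_block(block_number):
    """Map a file system block to (physical volume, physical extent, first sector, sector count)."""
    byte_offset = block_number * BLOCK_SIZE
    # File system block -> logical extent of the logical volume
    logical_extent, offset_in_extent = divmod(byte_offset, EXTENT_SIZE)
    # Logical extent -> physical extent on a physical volume
    pv, physical_extent = extent_map[logical_extent]
    # Physical extent -> disk sectors on that physical volume
    first_sector = (physical_extent * EXTENT_SIZE + offset_in_extent) // SECTOR_SIZE
    return pv, physical_extent, first_sector, BLOCK_SIZE // SECTOR_SIZE

print(locate_block(0))
print(locate_block(2049))
```

Without an LVM, the two middle steps disappear and the block offset is converted directly into a sector number.
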
Fig: Users create and manage files (1); files reside in the file system (2), which is mapped to file system blocks (3), LVM logical extents (4), disk physical extents (5), and disk sectors (6)

• A file system can be
   1. Nonjournaling file system or
   2. Journaling file system

Nonjournaling
▪ Uses separate writes to update its data and metadata
▪ If the system crashes during the write process, the metadata or data might be lost or corrupted
▪ When the system reboots, the file system attempts to update the metadata structures by examining and repairing them; this takes a long time on large file systems
▪ If there is insufficient information to re-create the wanted or original structure,

Journaling
• Uses a separate area called a log or journal.
• This journal might contain all the data to be written (physical journal) or
just the metadata to be updated (logical journal)
• Before changes are made to the file system, they are written to this
separate area
• After the journal has been updated, the operation on the file system can
be performed
• If the system crashes during the operation, there is enough information in
the log to “replay” the log record and complete the operation
• Journaling results in a quick file system check because it looks only at the
active, most recently accessed parts of a large file system
• In addition, because information about the pending operation is saved,
the risk of files being lost is reduced.
• A disadvantage of journaling file systems is that they are slower than other file systems, because every change is written to the journal before it is applied.

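The write-then-replay idea behind journaling can be sketched as follows. This is a deliberately simplified model under stated assumptions: the class `JournaledStore`, its `crash_before_apply` flag, and the in-memory dictionaries are hypothetical stand-ins for on-disk structures, and the journal here holds the full data (a physical journal in the terms above).

```python
class JournaledStore:
    """Toy journaling model: changes are logged first, then applied."""

    def __init__(self):
        self.journal = []  # pending operations, recorded before the data area is touched
        self.data = {}     # stands in for the file system's on-disk structures

    def write(self, name, contents, crash_before_apply=False):
        # Step 1: record the change in the journal.
        self.journal.append((name, contents))
        if crash_before_apply:
            return  # simulated crash: the data area was never updated
        # Step 2: perform the operation on the file system, then clear the journal.
        self._apply_and_clear()

    def _apply_and_clear(self):
        for name, contents in self.journal:
            self.data[name] = contents
        self.journal.clear()

    def recover(self):
        """After a restart, replay the pending journal records to complete them."""
        replayed = len(self.journal)
        self._apply_and_clear()
        return replayed

fs = JournaledStore()
fs.write("a.txt", "v1")
fs.write("b.txt", "v2", crash_before_apply=True)
before = dict(fs.data)
replayed = fs.recover()
print(before, replayed, fs.data)
```

Recovery only has to look at the pending journal records rather than scan the whole file system, which is why the check after a crash is quick.
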
Compute Virtualization

It is a technique of masking or abstracting the physical compute hardware and enabling multiple operating systems (OSs) to run concurrently on a single or clustered physical machine(s).

• Enables creation of multiple virtual machines (VMs), each running an OS and application
   ─ A VM is a logical entity that looks and behaves like a physical machine
• The virtualization layer resides between the hardware and the VMs
   ─ Also known as the hypervisor
• VMs are provided with standardized hardware resources

Fig: Virtualization layer (hypervisor) running on x86 architecture hardware (CPU, NIC card, memory, hard disk)

Need for Compute Virtualization

Fig: x86 architecture hardware (CPU, NIC card, memory, hard disk) before virtualization, and with a virtualization layer (hypervisor) after virtualization

Before Virtualization
• Runs a single operating system (OS) per machine at a time
• Couples s/w and h/w tightly
• May create conflicts when multiple applications run on the same machine
• Underutilizes resources
• Is inflexible and expensive

After Virtualization
• Runs multiple operating systems (OSs) per physical machine concurrently
• Makes OS and applications h/w independent
• Isolates VMs from each other, hence no conflict
• Improves resource utilization
• Offers flexible infrastructure at low cost

Advantages

• Creating a VM takes less time than setting up a physical server; organizations can provision servers faster and with ease.
• Individual VMs can be restarted, upgraded, or even crashed without affecting the other VMs on the same physical machine.
• VMs can be copied or moved from one physical machine to another without causing application downtime.

Desktop Virtualization

It is a technology that enables detachment of the user state, the operating system (OS), and the applications from endpoint devices.

• Enables organizations to host and centrally manage desktops
   ─ Desktops run as virtual machines within the data center and are accessed over a network
• Desktop virtualization benefits
   ─ Flexibility of access due to enablement of thin clients
   ─ Improved data security
   ─ Simplified data backup and PC maintenance

Fig: PCs and thin clients accessing desktop VMs over a LAN/WAN

Connectivity
• Interconnection between hosts or between a host and peripheral devices, such as storage
• Physical components of connectivity are:
   ─ Host interface card, port, and cable

Fig: Host with adapter and cable

• A host interface device or host adapter connects a host to other hosts and storage devices.
• Examples of host interface devices are the host bus adapter (HBA) and the network interface card (NIC).
• A host bus adapter is an application-specific integrated circuit (ASIC) board that performs I/O interface functions between the host and storage, relieving the CPU from additional I/O processing workload.
• A port is a specialized outlet that enables connectivity between the host and external devices.
• An HBA may contain one or more ports to connect the host to the storage device.
• Cables connect hosts to internal or external devices using copper or fiber-optic media.

Host Bus Adapter

• Protocol
   ─ Enables communication between host and storage
   ─ Implemented using interface devices at both the source and the destination
   ─ Popular storage interface protocols:
      - Integrated Device Electronics/Advanced Technology Attachment (IDE/ATA)
      - Small Computer System Interface (SCSI)
      - Fibre Channel (FC)
      - Internet Protocol (IP)

IDE/ATA and Serial ATA

• Integrated Device Electronics (IDE)/Advanced Technology Attachment (ATA)
   ─ Most popular interface used with modern hard disks
   ─ Good performance at low cost
   ─ Inexpensive storage interconnect
   ─ Used for internal connectivity
• Serial Advanced Technology Attachment (SATA)
   ─ Serial version of the IDE/ATA specification that has replaced parallel ATA
   ─ Inexpensive storage interconnect, typically used for internal connectivity
   ─ Provides data transfer rates up to 6 Gb/s (SATA revision 3.0)

SCSI and Serial SCSI
• Parallel Small Computer System Interface (SCSI)
   ─ Popular standard for connecting host and peripheral devices
      - Commonly used for storage connectivity in servers
   ─ Higher cost than IDE/ATA, therefore not popular in PC environments
   ─ Available in a wide variety of related technologies and standards
   ─ Supports multiple simultaneous data access
   ─ Used primarily in "higher end" environments
   ─ Supports up to 16 devices on a single bus
   ─ Ultra-640 version provides data transfer speeds up to 640 MB/s
• Serial Attached SCSI (SAS)
   ─ Point-to-point serial protocol replacing parallel SCSI
   ─ Supports data transfer rates up to 6 Gb/s (SAS 2.0)

Fibre Channel (FC)
   ─ Widely used protocol for high-speed communication to the storage device
   ─ Provides gigabit network speed
   ─ Provides serial data transmission that operates over copper wire and/or optical fiber
   ─ Latest version of the FC interface, 16FC, allows transmission of data at up to 16 Gb/s
Internet Protocol (IP)
   ─ Traditionally used to transfer host-to-host traffic
   ─ Successful option for host-to-storage communication
   ─ Offers advantages in terms of cost and maturity
   ─ Provides the opportunity to leverage an existing IP-based network for storage communication
      - Examples: iSCSI and FCIP protocols

Storage
• Core component in a data center
• Storage options
• Magnetic tape
   ─ Low-cost solution for long-term data storage
      - Preferred option as a backup destination in the past
   ─ Limitations
      - Sequential data access
      - Single application access at a time
      - Physical wear and tear
      - Storage/retrieval overheads

• Optical discs
   ─ Popularly used as a distribution medium in small, single-user computing environments
   ─ Limited in capacity and speed
   ─ Write once and read many (WORM): CD-ROM, DVD-ROM
   ─ Other variations: CD-RW, Blu-ray discs
• Disk drive
   ─ Most popular storage medium
   ─ Large storage capacity
   ─ Random read/write access
• Flash drives
   ─ Use semiconductor media
   ─ Provide high performance and low power consumption

Disk Drive Components

• The key components of a hard disk drive are the platter, spindle, read-write head, actuator arm assembly, and controller board
• I/O operations in an HDD are performed by rapidly moving the arm across the rotating flat platters coated with magnetic particles.
• Data is transferred between the disk controller and the magnetic platters through the read-write (R/W) head, which is attached to the arm.
• Data can be recorded and erased on magnetic platters any number of times.

Platter
• One or more flat circular disks.
• Data is recorded in binary codes.
• Sealed in a case called the head disk assembly (HDA).
• Data is encoded by polarizing the magnetic area of the disk surface.
• The number of platters and the storage capacity of each platter determine the total storage capacity.
Spindle
• Connects all platters.
• The spindle is connected to a motor.
• Common speeds are 7,200 rpm, 10,000 rpm, or 15,000 rpm.
• Platter diameter: 3.5" (90 mm)

Read/Write Head
• Reads or writes data from or to a platter.
• The R/W head changes the magnetic polarization on the surface of the platter when writing data.
• While reading, the head detects the magnetic polarization.
• The head never touches the surface of the platter while the platter is spinning.
• The microscopic air gap between the R/W head and the platter surface is known as the head flying height.
• When the spindle stops, the head rests on a special area called the landing zone.
• A head crash leads to data loss.

Actuator Arm Assembly

• R/W heads are mounted on the actuator arm assembly
• There are two R/W heads per platter, one above and one below

Fig: Actuator arm assembly

Controller
• It is mounted on a printed circuit board (PCB) at the bottom of the disk drive
• It has a microprocessor, internal memory, circuitry, and firmware
• The firmware controls the power to the spindle motor and the speed of the motor
• It also manages the communication between the drive and the host.
• In addition, it controls the R/W operations by moving the actuator arm and switching between different R/W heads, and performs the optimization of data access.

Physical Disk Structure

Fig: Physical disk structure showing the spindle, platters, tracks, sectors, and a cylinder

• Data is recorded on tracks
• Tracks are numbered starting from zero at the outer edge of the platter
• The number of tracks per inch (TPI) measures track density
• Each track is divided into the smallest addressable units, called sectors
• Tracks and sectors are written by the manufacturer
• There are thousands of tracks on a platter, depending on its recording density and dimensions
• A sector typically holds 512 bytes of user data
• A sector also stores other information, such as the sector number, head number or platter number, and track number, to locate data physically on the drive

• There is a capacity difference between an unformatted disk and a formatted disk.
• Example: a disk with 500 GB of unformatted capacity holds only 465.7 GB of user data; the remaining 34.3 GB is used for metadata.

Zone Bit Recording
• Platters are made of concentric tracks; outer tracks are physically longer and can hold more data than inner tracks.
• On older disk drives, data density was low on outer tracks.
• This led to inefficient use of the available space.

Fig: Zoned bit recording

• Zone bit recording uses the disk efficiently.
• Tracks are grouped into zones based on their distance from the center of the disk.
• Zones are numbered 0, 1, 2, … starting from the outermost zone.
• An appropriate number of sectors per track is assigned to each zone, so that data density is uniform.

Logical Block Addressing

• Earlier drives used physical addresses consisting of the cylinder, head, and sector (CHS) number to refer to specific locations on the disk (as shown in fig (a))
• The host OS had to be aware of the geometry of each disk used
• Logical block addressing (LBA) simplifies addressing by using a linear address to access physical blocks of data
• The disk controller translates LBA to a CHS address, and the host needs to know only the size of the disk drive in terms of the number of blocks
• The logical blocks are mapped to physical sectors on a 1:1 basis

Example:
• The drive in the previous figure has
   ─ 8 sectors per track, 8 heads,
   ─ and 4 cylinders (here, tracks are referred to by cylinders)
• This means 8 × 8 × 4 = 256 blocks can be formed, numbered 0 to 255
• Assuming 512 bytes are stored in a block, a 500 GB drive provides access to more than 976,000,000 blocks

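The translation between CHS and LBA for the example geometry can be sketched as below. The function names are made up for illustration, and the formula is the conventional one in which cylinders and heads count from 0 while sectors count from 1; an actual drive controller may use a different internal layout (for example with zone bit recording).

```python
HEADS = 8
SECTORS_PER_TRACK = 8
CYLINDERS = 4

def chs_to_lba(cylinder, head, sector):
    """Conventional CHS -> LBA formula; sectors are numbered from 1."""
    return (cylinder * HEADS + head) * SECTORS_PER_TRACK + (sector - 1)

def lba_to_chs(lba):
    """Inverse translation, as a disk controller would do for an incoming LBA."""
    cylinder, rest = divmod(lba, HEADS * SECTORS_PER_TRACK)
    head, sector_index = divmod(rest, SECTORS_PER_TRACK)
    return cylinder, head, sector_index + 1

total_blocks = CYLINDERS * HEADS * SECTORS_PER_TRACK   # 256 blocks, LBA 0 to 255
blocks_500gb = 500 * 10**9 // 512                      # blocks on a 500 GB drive

print(total_blocks, chs_to_lba(3, 7, 8), lba_to_chs(255), blocks_500gb)
```

The host only needs `total_blocks` (or `blocks_500gb` for the larger drive); the geometry constants stay hidden inside the controller.
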
Disk Drive Performance

• A disk drive is an electromechanical device
   ─ It impacts the overall performance of the storage system
• Disk service time
   ─ Time taken by a disk to complete an I/O request; it depends on:
      - Seek time
      - Rotational latency
      - Data transfer rate

Disk service time = seek time + rotational latency + data transfer time

Seek Time

• Time taken to position the read/write head
• The lower the seek time, the faster the I/O operation
• Seek time specifications include
   ─ Full stroke: time taken by the R/W head to move across the entire width of the disk, from the innermost track to the outermost track
   ─ Average: average time taken by the R/W head to move from one random track to another, normally listed as the time for one-third of a full stroke
   ─ Track-to-track: time taken by the R/W head to move between adjacent tracks

Fig: Radial movement of the R/W head

• Each of these specifications is measured in milliseconds
• The seek time of a disk is typically specified by the drive manufacturer
• The average seek time on a modern disk is typically in the range of 3 to 15 milliseconds
• Seek time has more impact on I/O operations to random tracks than to adjacent tracks
• To minimize the seek time, data can be written to only a subset of the available cylinders
• This results in lower usable capacity than the actual capacity of the drive
• For example, a 500 GB disk drive set up to use only the first 40 percent of the cylinders is effectively treated as a 200 GB drive. This is known as short-stroking the drive.

Rotational Latency

• The time taken by the platter to rotate and position the data under the R/W head
• Depends on the rotation speed of the spindle and is measured in milliseconds
• Average rotational latency
   ─ One-half of the time taken for a full rotation

• Average rotational latency for a 15,000 rpm (250 revolutions per second) drive is 0.5/250 s = 2 milliseconds
• Average rotational latency is approximately 5.5 ms for a 5,400 rpm (90 revolutions per second) drive

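The rotational latency figures and the disk service time formula can be checked with a few lines of arithmetic. The function names, the 5 ms seek time, and the 0.8 ms data transfer time in the usage line are assumed values picked only for the example.

```python
def average_rotational_latency_ms(rpm):
    """Half of the time for one full rotation, in milliseconds."""
    revolutions_per_second = rpm / 60
    return 0.5 / revolutions_per_second * 1000

def disk_service_time_ms(seek_ms, rpm, transfer_ms):
    """Disk service time = seek time + rotational latency + data transfer time."""
    return seek_ms + average_rotational_latency_ms(rpm) + transfer_ms

for rpm in (5400, 7200, 10000, 15000):
    print(rpm, round(average_rotational_latency_ms(rpm), 2))

# Assumed example: 5 ms average seek and 0.8 ms transfer on a 15,000 rpm drive
service_ms = disk_service_time_ms(seek_ms=5.0, rpm=15000, transfer_ms=0.8)
print(round(service_ms, 2))
```
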
Data Transfer Rate

• Average amount of data per unit time that the drive can deliver to the HBA
• Read operation:
─ Data first moves from disk platters to R/W heads, then it moves to the drive’s
internal buffer
─ Data moves from the buffer through the interface to the host HBA
• Write operation:
─ Data moves from the HBA to the internal buffer of the disk drive through the
drive’s interface
─ The data then moves from the buffer to the R/W heads
─ Finally, it moves from R/W heads to the platters

• The data transfer rates during R/W operations are measured in terms of internal and external transfer rates
   ─ Internal transfer rate: speed at which data moves from a platter's surface to the internal buffer of the disk
   ─ External transfer rate: rate at which data moves through the interface to the HBA

Fig: Data path HBA, interface, buffer, head disk assembly within the disk drive; the external transfer rate is measured between the HBA and the buffer, the internal transfer rate between the buffer and the head disk assembly

Disk I/O Controller Utilization

• A disk can be viewed as a black box consisting of two elements:
   1. Queue: the location where an I/O request waits before it is processed by the I/O controller
   2. Disk I/O controller: processes the I/Os waiting in the queue one by one
• The I/O requests arrive at the controller at the rate generated by the application. This rate is also called the arrival rate.
• These requests are held in the I/O queue, and the I/O controller processes them one by one
• The I/O arrival rate, the queue length, and the time taken by the I/O controller to process each request determine the I/O response time.
• If the controller is busy or heavily utilized, the queue size will be large and the response time will be high.

• Based on the fundamental laws of disk drive performance, the relationship between controller utilization and average response time is given as

Average response time (TR) = Service time (TS) / (1 - Utilization)

where TS is the time taken by the controller to serve an I/O.

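A short calculation shows how quickly the response time grows as utilization approaches 100 percent. The function name and the 8 ms service time are assumptions for the example; only the formula TR = TS / (1 - Utilization) comes from the slide.

```python
def average_response_time(service_time_ms, utilization):
    """TR = TS / (1 - U), valid for utilization in the range [0, 1)."""
    if not 0 <= utilization < 1:
        raise ValueError("utilization must be at least 0 and less than 1")
    return service_time_ms / (1 - utilization)

SERVICE_TIME_MS = 8.0  # assumed controller service time per I/O

response = {u: average_response_time(SERVICE_TIME_MS, u) for u in (0.1, 0.5, 0.7, 0.9, 0.96)}
for utilization, tr in response.items():
    print(f"utilization {utilization:.0%}: response time {tr:.1f} ms")
```

Going from 50 to 70 percent utilization adds roughly 11 ms in this example, while going from 70 to 90 percent adds more than 50 ms.
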
• The graph indicates that the response time changes are nonlinear as the utilization increases.
• When the average queue sizes are low, the response time remains low.
• The response time increases slowly with added load on the queue and rises steeply when the utilization exceeds 70 percent.

Fig: Utilization versus response time

Host Access to Data

• Data is accessed and stored by applications using the underlying infrastructure
• The key components of this infrastructure are the operating system (or file system), connectivity, and storage.
• The storage device can be internal and/or external to the host.
• In either case, the host controller card accesses the storage devices using predefined protocols
   - IDE/ATA and SCSI are popularly used in small and personal computing environments for accessing internal storage
   - FC and iSCSI protocols are used for accessing data from an external storage device

• Understanding access to data over a network is important because it lays the foundation for storage networking technologies
• Data can be accessed over a network in one of the following ways:
   1. block level
   2. file level or
   3. object level.

Block level access

• The file system is created on a host, and data is accessed on a network at the block level
• Raw disks or logical volumes are assigned to the host for creating the file system

File-level access

• The file system is created on a separate file server or at the storage side, and the file-level request is sent over a network
• Because data is accessed at the file level, this method has higher overhead than data accessed at the block level

Object-level access

• Data is accessed over a network in terms of self-contained objects with a unique object identifier

Direct-Attached Storage (DAS)

• DAS is an architecture in which storage is connected directly to the hosts.
   Examples: the internal disk drive of a host, or a directly connected external storage array
• DAS is suitable for localized data access in a small environment, such as personal computing and workgroups
• DAS is classified as internal or external, based on the location of the storage device with respect to the host.

Internal DAS architectures

• The storage device is internally connected to the host by a serial or parallel bus.
• The physical bus has distance limitations: high-speed connectivity can be sustained only over a short distance.
• In addition, most internal buses support only a limited number of devices, and they occupy a large amount of space inside the host, making maintenance of other components difficult.

External DAS architectures

• The host connects directly to the external storage device, and data is
accessed at the block level
• In most cases, communication between the host and the storage device
takes place over the SCSI or FC protocol
• Compared to internal DAS, an external DAS overcomes the distance and
device count limitations and provides centralized management of
storage devices.

DAS Benefits and Limitations

Benefits
• DAS requires a lower initial investment than storage networking architectures.
• The DAS configuration is simple and can be deployed easily and rapidly.
• The setup is managed using host-based tools, such as the host OS, which makes storage management tasks easy for small environments.
• Because DAS has a simple architecture, it requires fewer management tasks and fewer hardware and software elements to set up and operate.

Limitations

• DAS does not scale well. A storage array has a limited number of ports, which restricts
the number of hosts that can directly connect to the storage.

• When capacities are reached, the service availability may be compromised.


• DAS does not make optimal use of resources due to its limited capability to share
front-end ports.

• In DAS environments, unused resources cannot be easily reallocated, resulting in islands of over-utilized and under-utilized storage pools.

Storage Design Based on Application Requirements and Disk Drive Performance

• Storage requirements for an application are estimated from:
─ The size and number of file systems
─ The database components used by the application
─ The I/O size, I/O characteristics, and the number of I/Os generated by the application at peak workload
─ The I/O response time
─ The design of the storage systems
• The disk service time (TS) for an I/O is a key measure of disk
performance
• TS, along with disk utilization rate (U), determines the I/O response
time for an application.
• The total disk service time (TS) is the sum of the seek time (T),
rotational latency (L), and internal transfer time (X):
TS = T + L + X
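
The slides say that TS and U together determine the I/O response time but do not state the relationship. The sketch below assumes a simple single-queue model in which response time R = TS / (1 - U). That formula, the function name, and the 8 ms service time are assumptions for illustration and are not taken from the slides.

```python
def response_time_ms(ts_ms: float, utilization: float) -> float:
    """Estimate I/O response time under an assumed single-queue model.

    Uses R = TS / (1 - U), which is an assumption, not a formula
    given in the slides.
    """
    if not 0.0 <= utilization < 1.0:
        raise ValueError("utilization must be in the range [0, 1)")
    return ts_ms / (1.0 - utilization)


# Hypothetical service time of 8 ms at increasing utilization.
for u in (0.5, 0.7, 0.9):
    print(f"U = {u:.0%}: R = {response_time_ms(8.0, u):.1f} ms")
```

Under this model, response time grows slowly at low utilization and sharply as U approaches 1, which is why a disk is usually not planned to run at full utilization.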

Example: Consider the following specifications provided for a disk:
• The average seek time is 5 ms in a random I/O environment; therefore, T = 5 ms.
• The disk rotation speed is 250 revolutions per second (15,000 rpm). Rotational latency (L) is one-half of the time taken for a full rotation, so L = 0.5/250 s = 2 ms.
• The internal data transfer rate is 40 MB/s. The internal transfer time (X) depends on the block size of the I/O; for a block size of 32 KB, X = 32 KB / 40 MB/s = 0.8 ms.
• The time taken by the I/O controller to serve an I/O of block size 32 KB is TS = 5 ms + 2 ms + 0.8 ms = 7.8 ms.
• Therefore, the maximum number of I/Os serviced per second (IOPS) is
1/TS = 1/(7.8 × 10^-3 s) ≈ 128 IOPS
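
The arithmetic in this example can be checked with a short script. This is a minimal sketch: the function and parameter names are hypothetical, and it assumes 1 MB = 1000 KB, which is the convention that reproduces the 0.8 ms transfer time.

```python
def disk_service_time_ms(seek_ms, revs_per_sec, transfer_mb_per_s, io_size_kb):
    """Return TS = T + L + X in milliseconds."""
    # L: half of one full rotation, converted from seconds to milliseconds.
    latency_ms = 0.5 / revs_per_sec * 1000.0
    # X: block size divided by transfer rate (assumes 1 MB = 1000 KB).
    transfer_ms = io_size_kb / (transfer_mb_per_s * 1000.0) * 1000.0
    return seek_ms + latency_ms + transfer_ms


ts_ms = disk_service_time_ms(seek_ms=5.0, revs_per_sec=250.0,
                             transfer_mb_per_s=40.0, io_size_kb=32.0)
max_iops = 1000.0 / ts_ms  # 1/TS, with TS converted from ms to seconds

print(f"TS = {ts_ms:.1f} ms, max IOPS = {max_iops:.0f}")
```

Changing io_size_kb shows how larger I/Os raise the transfer time and lower the IOPS a single disk can service.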

• Disks required to meet an application's capacity need (DC):
DC = total capacity required / capacity of a single disk

• Disks required to meet an application's performance need (DP):
DP = IOPS generated by the application at peak workload / IOPS serviced by a single disk (S)

• IOPS serviced by a disk (S) depends on the disk service time (TS):
─ TS is the time taken for an I/O to complete; therefore, the IOPS serviced by a disk (S) is equal to 1/TS
─ For performance-sensitive applications, (S) =

Disks required for an application = max(DC, DP)
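
A minimal sketch of the max(DC, DP) sizing rule. It assumes DC is the total required capacity divided by the capacity of one disk, and DP is the peak IOPS divided by the IOPS one disk can service (S). Rounding up to whole disks, the function name, and the workload figures are assumptions for illustration.

```python
import math


def disks_required(capacity_needed_gb, disk_capacity_gb, peak_iops, iops_per_disk):
    """Return (DC, DP, max(DC, DP)), rounding each count up to whole disks."""
    dc = math.ceil(capacity_needed_gb / disk_capacity_gb)  # capacity need
    dp = math.ceil(peak_iops / iops_per_disk)              # performance need
    return dc, dp, max(dc, dp)


# Hypothetical workload: 2,000 GB and 1,500 IOPS at peak,
# on 300 GB disks that each service 128 IOPS.
dc, dp, total = disks_required(2000, 300, 1500, 128)
print(f"DC = {dc}, DP = {dp}, disks required = {total}")
```

Here the performance need (12 disks) exceeds the capacity need (7 disks), so performance drives the disk count.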

Module 2: Summary
Key points covered in this module:
• Key data center elements
• Application and compute virtualization
• Disk drive components and performance
• Host access to storage

Exercise: Design Storage Solution for New Application
• Scenario
─ Characteristics of the new application:
  - Requires 1 TB of storage capacity
  - Peak I/O workload is 4900 IOPS
  - Typical I/O size is 4 KB
─ Specifications of the available disk drives:
  - 15K rpm drive with storage capacity = 100 GB
  - Average seek time = 5 ms
  - Data transfer rate = 40 MB/s
─ Because it is a business-critical application, response time must be within an acceptable range
• Task
─ Calculate the number of disks required for the application
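
One way to check an answer to this exercise is sketched below. Several inputs are assumptions rather than part of the scenario: decimal units (1 TB = 1000 GB, 1 MB = 1000 KB), rounding up to whole disks, and a utilization cap of 0.7 standing in for the "acceptable response time" requirement, since the exercise does not give a figure.

```python
import math

# Scenario values from the exercise.
CAPACITY_NEEDED_GB = 1000.0  # 1 TB, assuming 1 TB = 1000 GB
PEAK_IOPS = 4900.0
IO_SIZE_KB = 4.0
DISK_CAPACITY_GB = 100.0
RPM = 15000.0
SEEK_MS = 5.0
TRANSFER_MB_PER_S = 40.0     # assuming 1 MB = 1000 KB

# Assumed utilization cap for acceptable response time; not given in the exercise.
UTILIZATION_CAP = 0.7

# Disk service time TS = T + L + X, in milliseconds.
latency_ms = 0.5 / (RPM / 60.0) * 1000.0
transfer_ms = IO_SIZE_KB / (TRANSFER_MB_PER_S * 1000.0) * 1000.0
ts_ms = SEEK_MS + latency_ms + transfer_ms

# IOPS one disk services when held to the assumed utilization cap.
iops_per_disk = UTILIZATION_CAP * 1000.0 / ts_ms

dc = math.ceil(CAPACITY_NEEDED_GB / DISK_CAPACITY_GB)
dp = math.ceil(PEAK_IOPS / iops_per_disk)
disks = max(dc, dp)

print(f"TS = {ts_ms:.1f} ms, S = {iops_per_disk:.1f} IOPS per disk")
print(f"DC = {dc}, DP = {dp}, disks required = {disks}")
```

Under these assumptions the performance need (50 disks) dominates the capacity need (10 disks). Setting UTILIZATION_CAP to 1.0 lowers DP to 35, which shows how strongly the response-time requirement affects the design.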
