PowerHA SystemMirror
Basic Architecture
IBM Power Systems
© Copyright IBM Corporation 2010
Implementation Expert Course: PowerHA
Software Layers of a PowerHA Node
• Application
  - Uses the services made highly available by PowerHA (formerly HACMP)
• PowerHA
  - Makes services highly available for applications
  - Coordinates resource availability across the cluster
• RSCT
  - Provides reliable communication between nodes
  - Coordinates subsystems
• AIX
  - Operating system services
• LVM
  - Logical storage management
• TCP/IP
  - Manages communications at a logical layer
PowerHA SystemMirror Cluster Process Flow
PowerHA And SNMP
• The PowerHA MIB is defined in the hacmp.defs and hacmp.my files.
• The clstrmgrES daemon maintains the current values of the MIB objects and provides them to snmpd.
• Many programs can get PowerHA status from snmpd using the SNMP protocol, either on the same system or across the network.
Two-Node HACMP Topology Diagram
[Figure: two System p cluster nodes serving network clients over an IP network through service and standby network adapters; the nodes exchange IP heartbeats and a serial heartbeat, and both attach to a shared disk.]
Cluster Nodes
• Since the cluster is treated as a single entity, we refer to the individual computers as nodes.
• Each node is an independent system.
• Inter-node communication is defined when the cluster is initialized.
PowerHA SystemMirror Cluster Communication
• TCP/IP based communication
  - All network adapters either:
    - Use separate logical subnets, or
    - Use a single subnet with heartbeat over IP aliasing
• Non-TCP/IP based communication
  - Serial (RS232) connection
  - Target mode
  - Disk heartbeat
• A non-TCP/IP based communication network is highly recommended
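The "separate logical subnets" rule for a node's adapters can be checked mechanically. The following is a minimal sketch, not a PowerHA tool: the interface names en0 and en1, the helper shared_subnets, and the /24 prefixes are assumptions made for illustration.

```python
import ipaddress

def shared_subnets(interfaces):
    """Return the subnets that appear on more than one interface of a node.

    interfaces maps a (hypothetical) interface name to an "address/prefix" string.
    """
    seen = {}
    for name, cidr in interfaces.items():
        net = ipaddress.ip_interface(cidr).network
        seen.setdefault(net, []).append(name)
    return {str(net): names for net, names in seen.items() if len(names) > 1}

# Two adapters on different logical subnets: nothing to report.
ok_node = {"en0": "192.168.0.1/24", "en1": "1.1.1.1/24"}
# Two adapters on the same logical subnet: reported as a conflict.
bad_node = {"en0": "192.168.0.1/24", "en1": "192.168.0.7/24"}

ok_result = shared_subnets(ok_node)
bad_result = shared_subnets(bad_node)
print(ok_result)
print(bad_result)
```

An empty result means every adapter sits on its own logical subnet; the single-subnet design with heartbeat over IP aliasing would need a different check.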
Service IP aliases
• "Service Address" or "Service Label" is the connection to the computer
• AIX allows many addresses on a single adapter
• Does not affect the original configuration
• Allows separation of services
• Faster to move if necessary
IP replacement method
Node    At system boot  With HACMP running  After adapter failure  After host failure  Adapter type
Node A  192.168.0.1     192.168.0.6         na                     na                  Boot / Service
Node A  1.1.1.1         1.1.1.1             192.168.0.6            na                  Standby
Node B  192.168.0.2     192.168.0.2         192.168.0.2            192.168.0.2         Boot
Node B  1.1.1.2         1.1.1.2             1.1.1.2                192.168.0.6         Standby
• Two logical IP networks (Netmask 255.255.255.0)
• One physical network
• Clients always access 192.168.0.6
• MAC address takeover or ARP cache update is also needed
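The address movements in the table can be traced with a small model. This is only a sketch of the bookkeeping, assuming each node has one boot adapter and one standby adapter; the function names are hypothetical, None stands for "na", and the ARP or MAC address handling mentioned above is not modeled.

```python
SERVICE_IP = "192.168.0.6"

def start_cluster_services(node):
    """With HACMP running, the service address replaces the boot address."""
    return {**node, "boot": SERVICE_IP}

def fail_service_adapter(node):
    """On adapter failure, the standby adapter's address is replaced by the service address."""
    return {**node, "boot": None, "standby": SERVICE_IP}

def fail_host(failed, survivor):
    """On host failure, the surviving node's standby adapter takes the service address."""
    return {name: None for name in failed}, {**survivor, "standby": SERVICE_IP}

node_a = {"boot": "192.168.0.1", "standby": "1.1.1.1"}
node_b = {"boot": "192.168.0.2", "standby": "1.1.1.2"}

running_a = start_cluster_services(node_a)
after_adapter_a = fail_service_adapter(running_a)
down_a, takeover_b = fail_host(after_adapter_a, node_b)

print(running_a)
print(after_adapter_a)
print(down_a, takeover_b)
```

In every state exactly one adapter holds 192.168.0.6, which is why clients can always use that address.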
IP alias method
Node / adapter     At system boot  With HACMP running  After adapter failure  After host failure
Node A, adapter 1  192.168.0.1     192.168.0.1         na                     na
                   10.1.1.150      10.1.1.150
                                   10.1.1.1
Node A, adapter 2  1.1.1.1         1.1.1.1             1.1.1.1                na
                                                       10.1.1.150
                                                       10.1.1.1
Node B, adapter 1  192.168.0.2     192.168.0.2         192.168.0.2            192.168.0.2
                   10.1.1.160      10.1.1.160          10.1.1.160             10.1.1.160
Node B, adapter 2  1.1.1.2         1.1.1.2             1.1.1.2                1.1.1.2
                                                                              10.1.1.1
• Initially configured addresses (Boot IP)
• Persistent IP addresses - useful for applications like Tivoli
• Service IP addresses - used by clients to access the cluster
- multiple are allowed
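With aliasing, addresses are added next to the boot address instead of replacing it. The sketch below models that with a list of addresses per adapter, using Node A's addresses from the slide; the adapter names and helper functions are hypothetical, and the first list entry is assumed to be the boot address.

```python
def bring_service_online(adapters, name, service_ip):
    """Add the service address as an alias; the boot address stays in place."""
    updated = {k: list(v) for k, v in adapters.items()}
    updated[name].append(service_ip)
    return updated

def fail_adapter(adapters, failed, target):
    """Move every alias (all entries after the boot address) from failed to target."""
    updated = {k: list(v) for k, v in adapters.items()}
    aliases = updated[failed][1:]
    updated[failed] = []  # the failed adapter holds no addresses ("na")
    updated[target].extend(aliases)
    return updated

# Boot address first, then the persistent alias 10.1.1.150.
at_boot = {"adapter1": ["192.168.0.1", "10.1.1.150"], "adapter2": ["1.1.1.1"]}
running = bring_service_online(at_boot, "adapter1", "10.1.1.1")
after_failure = fail_adapter(running, "adapter1", "adapter2")

print(running)
print(after_failure)
```

Because the surviving adapter keeps its own boot address, nothing on it has to be reconfigured before the aliases arrive.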
Persistent Node IP label
A persistent node IP label is a useful administrative "tool" that lets you contact a node even if the HACMP cluster services are down on that node. A persistent node IP label:
• Always stays on the same node (is node-bound)
• Co-exists on a network interface card that already has a service IP label defined
• Does not require installing an additional physical network interface card on that node
• Is not part of any resource group
There can be one persistent node IP label per network per node.
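The "one persistent node IP label per network per node" rule lends itself to a simple consistency check. The sketch below assumes a planning worksheet held as (node, network, address) tuples; the node names, the network name net_ether_01, and the address 10.1.1.151 are hypothetical.

```python
from collections import Counter

def persistent_label_conflicts(labels):
    """Return the (node, network) pairs that carry more than one persistent label."""
    counts = Counter((node, network) for node, network, _address in labels)
    return sorted(pair for pair, n in counts.items() if n > 1)

planned = [
    ("nodeA", "net_ether_01", "10.1.1.150"),
    ("nodeB", "net_ether_01", "10.1.1.160"),
]
conflicts_ok = persistent_label_conflicts(planned)
# A second label for nodeA on the same network breaks the rule.
conflicts_bad = persistent_label_conflicts(
    planned + [("nodeA", "net_ether_01", "10.1.1.151")]
)
print(conflicts_ok)
print(conflicts_bad)
```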
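Disk Heartbeat (Heartbeat via Disk)
• New feature in HACMP 5.x
• Can use any of the following shared disk arrays (Fibre Channel, SCSI, or SSA)
• The disk used is part of an enhanced concurrent volume group; the only requirement is that this VG be defined on both nodes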
PowerHA SystemMirror Volume Groups
• Two types:
  - Shared
  - Non-shared
• Shared volume groups can "migrate"
• Non-shared volume groups are node-bound
• Application data must be on a shared volume group to be "moved"
• Application code may be on either type of disk
PowerHA SystemMirror Application Server Scripts
• "Application server", a name given to a series of scripts:
  - Start the application
  - Stop the application
  - Monitor the application (optional)
  - Restart the application (optional)
• Applications must be able to be started from a previously unknown state by a script
• Applications must be able to be stopped by a script
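Starting "from a previously unknown state" usually means the start script cannot trust leftovers from an earlier run. The sketch below shows one possible pre-start check built around a PID file. It is an illustration under assumptions, not a PowerHA interface: the PID-file convention, the function names, and the injectable is_alive probe are all hypothetical, and process_alive relies on POSIX signal semantics.

```python
import os
import tempfile

def process_alive(pid):
    """True if a process with this PID exists (POSIX signal-0 probe)."""
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True

def decide_start(pid_file, is_alive=process_alive):
    """Return "already-running" or "start", removing a stale PID file first."""
    if os.path.exists(pid_file):
        try:
            with open(pid_file) as fh:
                pid = int(fh.read().strip())
        except ValueError:
            pid = None  # unreadable contents are treated as stale
        if pid is not None and pid > 0 and is_alive(pid):
            return "already-running"
        os.remove(pid_file)  # leftover from a crash or an unclean stop
    return "start"

# Demonstration with stubbed liveness probes so the outcome is deterministic.
with tempfile.TemporaryDirectory() as tmp:
    pid_file = os.path.join(tmp, "app.pid")
    clean = decide_start(pid_file)
    with open(pid_file, "w") as fh:
        fh.write("12345")
    stale = decide_start(pid_file, is_alive=lambda pid: False)
    stale_file_removed = not os.path.exists(pid_file)
    with open(pid_file, "w") as fh:
        fh.write("12345")
    live = decide_start(pid_file, is_alive=lambda pid: True)

print(clean, stale, stale_file_removed, live)
```

A real start script would go on to launch the application only when the decision is "start", so that running it twice does no harm.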
PowerHA SystemMirror Resource Groups
• Logical constructs that group related attributes together
• The "container" used by HACMP to "move" resources
• Participating node list
  - Default node priorities
  - Home node
• Have policies on:
  - Startup
  - Fallover
  - Fallback
  - Distribution policy
  - Dependent resource groups
PowerHA SystemMirror Startup Resource Group
• Resource group startup occurs:
  - During initial cluster startup
  - On initial acquisition of the resource group
  - May be modified by a "settling" timer
• Online on Home Node Only (OHNO)
  - Starts only on the highest priority node
• Online on First Available Node (OFAN)
  - Will start on any one node
• Online on All Available Nodes (OAAN)
  - The resource group will start on all nodes
• Online Using Distribution Policy (OUDP)
  - One resource group per network or node, depending on the distribution policy
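The startup policies can be read as a function from the node list and the set of available nodes to the nodes where the resource group comes online. The sketch below is a simplified model, not PowerHA's implementation: it ignores the settling timer, treats "first available" as the first available node in priority order, and leaves OUDP out because its result depends on the configured distribution policy.

```python
def startup_nodes(policy, node_list, available):
    """Return the nodes on which the resource group starts.

    node_list is ordered by priority; its first entry is the home node.
    """
    up = [n for n in node_list if n in available]
    if policy == "OHNO":
        # Only the home (highest priority) node may start the group.
        return up[:1] if up and up[0] == node_list[0] else []
    if policy == "OFAN":
        return up[:1]
    if policy == "OAAN":
        return up
    raise ValueError(f"policy not modeled in this sketch: {policy}")

nodes = ["A", "B"]
ohno_home_down = startup_nodes("OHNO", nodes, {"B"})
ohno_home_up = startup_nodes("OHNO", nodes, {"A", "B"})
ofan_home_down = startup_nodes("OFAN", nodes, {"B"})
oaan_all = startup_nodes("OAAN", nodes, {"A", "B"})
print(ohno_home_down, ohno_home_up, ofan_home_down, oaan_all)
```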
PowerHA SystemMirror Resource Group Takeover
• Resource group fallover occurs:
  - When the current node can no longer support the resource group and it is "moved" to another node
    - A failure has occurred
    - Graceful shutdown with takeover of the current node
• Fallover to Next Priority Node (FNPN)
  - Resource group is moved to the next node in the resource group's node list
• Fallover using Dynamic Node Priority (FDNP)
  - Resource group is moved to the next node in the resource group's node list as recalculated based on the dynamic node criteria policy
• Bring Offline on Error Node (BOEN)
  - Resource group is set to an offline state on this node only
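The three fallover policies differ only in how the takeover node is chosen. The sketch below models that choice under stated assumptions: "next node" is taken to be the first available node in priority order other than the failed one, and the dynamic node priority criterion is represented by a hypothetical free_capacity mapping where the largest value wins.

```python
def fallover_target(policy, node_list, failed, available, free_capacity=None):
    """Return the node that takes over the resource group, or None if it goes offline."""
    candidates = [n for n in node_list if n in available and n != failed]
    if policy == "BOEN" or not candidates:
        return None  # the group is simply offline on the failed node
    if policy == "FNPN":
        return candidates[0]
    if policy == "FDNP":
        # Hypothetical criterion: prefer the candidate with the most free capacity.
        return max(candidates, key=lambda n: free_capacity[n])
    raise ValueError(f"unknown fallover policy: {policy}")

nodes = ["A", "B", "C"]
fnpn = fallover_target("FNPN", nodes, "A", {"B", "C"})
fdnp = fallover_target("FDNP", nodes, "A", {"B", "C"}, {"B": 10, "C": 40})
boen = fallover_target("BOEN", nodes, "A", {"B", "C"})
print(fnpn, fdnp, boen)
```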
PowerHA SystemMirror Resource Group FallBack
• Resource group fallback occurs when:
  - The resource group is not on its home node
  - A higher priority node becomes available
  - Can be modified by a fallback timer
• Fallback to a Higher Priority Node (FHPN)
  - When the higher priority node is available and/or the optional timer expires, the resource group moves
• Never Fallback (NFB)
  - Regardless of whether a higher priority node becomes available, the resource group will not move
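Fallback can be modeled the same way: given where the resource group is now and which nodes are available, decide where it should run. This sketch ignores the optional fallback timer and assumes the node list is ordered by priority; the function name is hypothetical.

```python
def fallback_target(policy, node_list, current, available):
    """Return the node the resource group should run on after a node rejoins."""
    if policy == "NFB":
        return current  # never moves, whatever becomes available
    if policy == "FHPN":
        for node in node_list:
            if node == current:
                break  # nothing with higher priority is available
            if node in available:
                return node
        return current
    raise ValueError(f"unknown fallback policy: {policy}")

nodes = ["A", "B"]
fhpn_moves = fallback_target("FHPN", nodes, "B", {"A", "B"})
fhpn_stays = fallback_target("FHPN", nodes, "B", {"B"})
nfb_stays = fallback_target("NFB", nodes, "B", {"A", "B"})
print(fhpn_moves, fhpn_stays, nfb_stays)
```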
PowerHA Cluster Working Flow (Online on Home Node Only)
[Figure: Online on Home Node Only (simple standby operation, "Cascading"), combined with Fallover to Next Priority Node and Fallback to a Higher Priority Node]
• Initial state: A owns the resource group; B is backup for A
• System A fails: System B takes over the resource group
• System A returns to the cluster: System B releases the resource group
• System B fails: no activities; System B later returns to the cluster
• Final state: A owns the resource group; B is backup for A
HACMP Resource Group (Online on Home Node Only)
PowerHA Cluster Working Flow (Online on First Available Node)
[Figure: Online on First Available Node ("Rotating"), combined with Fallover to Next Priority Node and Never Fallback]
• Initial state: A owns the resource group; B is backup for A
• System A fails: System B takes over the resource group
• System A returns to the cluster: B still owns the resource group
• System B fails: System A takes over the resource group
• System B returns to the cluster: A still owns the resource group
HACMP Resource Group (Online on First Available Node)
PowerHA Cluster Working Flow (Online on All Available Nodes)
[Figure: Online on All Available Nodes ("Concurrent"), combined with Bring Offline on Error Node and Never Fallback]
• Initial state: A and B both own the resource group
• System A fails: no activities; System A later returns to the cluster
• System B fails: no activities; System B later returns to the cluster
• Final state: A and B both own the resource group
HACMP Resource Group (Online on All Available Nodes)
Failover possibilities
Three-node Mutual Takeover Cluster
• Increased resiliency vs. 2-node cluster
• Redundant connections to storage and networks
• Server capacity must be sized to handle additional workload in failover scenarios
  - Ideally, each node should be sized to run all workloads (in case 2 of 3 nodes fail)
• Some increase in complexity of cluster configuration and management
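The sizing guideline above can be turned into a quick worksheet check. The sketch below flags nodes that could not carry every workload alone; the capacity and workload figures are made-up numbers in arbitrary units, and the function name is hypothetical.

```python
def undersized_nodes(capacity, workload):
    """Return nodes whose capacity is below the sum of all workloads.

    Such a node could not run the whole cluster alone if the other two failed.
    """
    total = sum(workload.values())
    return sorted(node for node, cap in capacity.items() if cap < total)

# Illustrative numbers only.
capacity = {"A": 10, "B": 10, "C": 6}
workload = {"A": 4, "B": 3, "C": 2}
too_small = undersized_nodes(capacity, workload)
print(too_small)
```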
“n + 1” HA Cluster
• Increased resiliency vs. 2-node cluster
• Some efficiency gain (only one server "on standby")
• Server capacity must be considered
  - Ideally, Server D capacity should be sized to handle all workloads from Servers A, B, and C
  - Some clients size Server D smaller, assuming that the risk of Servers A, B, and C all failing at once is small
• Some increase in complexity of cluster configuration and management
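The trade-off in sizing Server D can be made concrete by asking how many simultaneous failures the standby can absorb in the worst case, which is when the largest workloads fail together. The following is a sketch with made-up workload figures in arbitrary units; the function name is hypothetical.

```python
def failures_covered(standby_capacity, workloads):
    """Return how many simultaneous server failures the standby can always absorb.

    The worst case for k failures is that the k largest workloads fail together.
    """
    used = 0
    covered = 0
    for load in sorted(workloads.values(), reverse=True):
        if used + load > standby_capacity:
            break
        used += load
        covered += 1
    return covered

# Illustrative numbers only.
workloads = {"A": 4, "B": 3, "C": 2}
full_size = failures_covered(9, workloads)
reduced_size = failures_covered(5, workloads)
print(full_size, reduced_size)
```

With these figures a standby of 9 units covers all three servers failing at once, while a standby of 5 units is only guaranteed to cover a single failure.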
HA Clustering and Virtualization
• Two physical servers are still needed so that the server itself is not a single point of failure (SPoF)
• Base example shown at right:
  - Two-node clusters configured to fail over across physical boundaries
  - Distribute primary nodes evenly across servers so that a single server failure results in failover of only 50% of primary nodes
• Other common cluster configurations can also be used in a virtual environment
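The placement guidance above can be checked for a planned layout. The sketch below assumes each two-node cluster is described by the physical server hosting its primary node and the one hosting its standby node; the server names, cluster layout, and function names are hypothetical.

```python
from collections import Counter

def primary_share(placements):
    """Return, per physical server, the fraction of primary nodes it hosts."""
    counts = Counter(p["primary"] for p in placements)
    total = len(placements)
    return {server: n / total for server, n in counts.items()}

def crosses_physical_boundary(placements):
    """True if every cluster keeps its primary and standby on different servers."""
    return all(p["primary"] != p["standby"] for p in placements)

# Four two-node clusters spread over two physical servers.
placements = [
    {"primary": "server1", "standby": "server2"},
    {"primary": "server2", "standby": "server1"},
    {"primary": "server1", "standby": "server2"},
    {"primary": "server2", "standby": "server1"},
]
share = primary_share(placements)
separated = crosses_physical_boundary(placements)
print(share, separated)
```

A share of 0.5 per server corresponds to the 50% figure on the slide: losing either server triggers failover for half of the primary nodes.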
Node A is active at the production site and the other two nodes are standby nodes.
[Figure: Production Site (Node A, Node B) and Recovery Site (Node C), each with a client IP network, linked by a Cisco Data Center Interconnect network; SAP R/3 / DB2 UDB data is replicated with Metro Mirror (PPRC) between DS8000 and DS6000 storage. Arrows mark failover and failback.]
The resource group and application will fail over to Node B on detection of a Node A failure.
The resource group and application will fail over to Node C at the remote site on detection of Node A and Node B failures.
The resource group and application will fall back to the primary site once it is repaired.
Back to normal: Node A is active at the production site and the other two nodes are standby nodes.
PowerHA SystemMirror With SAN Volume Controller
[Figure: Production Site and Recovery Site, 60 km apart, each with a client IP network and running CRM, BW, and XI, linked by a Cisco Data Center Interconnect network; each site has a 9205 and an SVC in front of DS8000 storage, with Metro Mirror between the sites. Arrows mark failover and failback.]
Thank You!