IBM PowerHA SystemMirror for Linux
Version 7.2.2
Note
Before using this information and the product it supports, read the information in “Notices.”
This edition applies to IBM PowerHA SystemMirror 7.2.2 for Linux and to all subsequent releases and modifications
until otherwise indicated in new editions.
© Copyright IBM Corporation 2017.
US Government Users Restricted Rights – Use, duplication or disclosure restricted by GSA ADP Schedule Contract
with IBM Corp.
Contents

About this document
  Highlighting
  Case-sensitivity in Linux
  ISO 9000

PowerHA SystemMirror for Linux concepts
  High availability clustering for Linux
    High availability and hardware availability
    Benefits of PowerHA SystemMirror
    High availability clusters
  Physical components of a PowerHA SystemMirror cluster
    PowerHA SystemMirror nodes
    Networks
    Clients
  PowerHA SystemMirror cluster nodes, networks, and heartbeating concepts
    Nodes
    Cluster networks
    IP address takeover
    IP address takeover by using IP aliases
    Heartbeating over TCP/IP and disk
    Split policy
    Tiebreaker option for split policies
  PowerHA SystemMirror resources and resource groups
  PowerHA SystemMirror cluster software
  PowerHA SystemMirror cluster configurations
    Standby configurations
    Takeover configurations

Planning for PowerHA SystemMirror for Linux

Installing PowerHA SystemMirror for Linux
  Planning the installation of PowerHA SystemMirror for Linux
  Installing PowerHA SystemMirror for Linux
  Cluster snapshot for PowerHA SystemMirror for Linux

Configuring PowerHA SystemMirror for Linux
  Creating a cluster for PowerHA SystemMirror for Linux
  Adding a node to a cluster for PowerHA SystemMirror for Linux
  Configuring resources for PowerHA SystemMirror for Linux
  Configuring resource groups for PowerHA SystemMirror for Linux
  Support for Shared storage

Troubleshooting PowerHA SystemMirror for Linux
  Troubleshooting PowerHA SystemMirror clusters
  Using PowerHA SystemMirror cluster log files
  Using the Linux log collection utility
  Solving common problems
    PowerHA SystemMirror startup issues
    PowerHA SystemMirror disk issues
    PowerHA SystemMirror resource and resource group issues
    PowerHA SystemMirror Fallover issues
    PowerHA SystemMirror additional issues

PowerHA SystemMirror graphical user interface (GUI)
  Planning for PowerHA SystemMirror GUI
  Installing PowerHA SystemMirror GUI
  Logging in to the PowerHA SystemMirror GUI
  Navigating the PowerHA SystemMirror GUI
  Troubleshooting PowerHA SystemMirror GUI

Smart Assists for PowerHA SystemMirror
  PowerHA SystemMirror for SAP HANA
    Planning for SAP HANA
    Configuring for SAP HANA
  PowerHA SystemMirror for SAP NetWeaver
    Planning for SAP NetWeaver
    Configuring for SAP NetWeaver
  Troubleshooting PowerHA SystemMirror Smart Assist issues
    PowerHA SystemMirror is not able to harvest some values during Wizard execution
    Replication mode that is configured in Smart Assist wizard is different from replication mode reflected in SAP HANA setup
    Smart Assist policy fails to activate
    Smart Wizard does not detect or show one or more Ethernet interfaces in the list

Notices
  Privacy policy considerations
  Trademarks

Index
Highlighting
The following highlighting conventions are used in this document:
Bold
Identifies commands, subroutines, keywords, files, structures, directories, and other items whose names are predefined by the system. Bold highlighting also identifies graphical objects, such as buttons, labels, and icons that you select.
Italics
Identifies parameters for actual names or values that you supply.
Monospace
Identifies examples of specific data values, examples of text similar to what you might see displayed, examples of portions of program code similar to what you might write as a programmer, messages from the system, or text that you must type.
Case-sensitivity in Linux
Everything in the Linux operating system is case-sensitive, which means that it distinguishes between
uppercase and lowercase letters. For example, you can use the ls command to list files. If you type LS, the
system responds that the command is not found. Likewise, FILEA, FiLea, and filea are three distinct file
names, even if they reside in the same directory. To avoid causing undesirable actions to be performed,
always ensure that you use the correct case.
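For example, at a shell prompt (the file names are examples, and the exact error message depends on your shell):
$ ls
filea  FiLea  FILEA
$ LS
-bash: LS: command not found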
ISO 9000
ISO 9000 registered quality systems were used in the development and manufacturing of this product.
PowerHA SystemMirror monitors the cluster resources for failures, and when a problem is detected,
PowerHA SystemMirror moves the application (along with resources that ensure access to the
application) to another node in the cluster.
Surveys of the causes of downtime show that actual hardware failures account for only a small
percentage of unplanned outages. Other contributing factors include:
• Operator errors
• Environmental problems
• Application and operating system errors.
Reliable and recoverable hardware simply cannot protect against failures of all these different aspects of
the configuration. Keeping these varied elements, and therefore the application, highly available requires:
• Thorough and complete planning of the physical and logical procedures for access and operation of the resources on which the application depends. These procedures help to avoid failures in the first place.
• A monitoring and recovery package that automates the detection and recovery from errors.
• A well-controlled process for maintaining the hardware and software aspects of the cluster configuration while keeping the application available.
In a high availability cluster, multiple server machines cooperate to provide a set of services or resources
to clients.
PowerHA SystemMirror extends the clustering model by defining relationships among cooperating
processors where one processor provides the service offered by a peer should the peer be unable to do so.
As shown in the following figure, a PowerHA SystemMirror cluster is made up of the following physical
components:
• Nodes
• Networks
• Clients
The PowerHA SystemMirror software allows you to combine physical components into a wide range of
cluster configurations, providing you with flexibility in building a cluster that meets your processing and
availability requirements. This figure shows one example of a PowerHA SystemMirror cluster. Other
PowerHA SystemMirror clusters could look very different, depending on the number of processors, the
choice of networking and disk technologies, and so on.
(Figure: Example PowerHA SystemMirror cluster, showing nodes connected by two public LANs (Public LAN 1 and Public LAN 2), a private LAN, and disk buses.)
In a PowerHA SystemMirror cluster, each node is identified by a unique name. A node has access to a set
of resources: networks, network addresses, and applications. Typically, a node runs a server or a back end
application that accesses data on the shared external disks.
Networks
As an independent, layered component of the Linux operating system, the PowerHA SystemMirror
software is designed to work with any TCP/IP-based network.
Types of networks
The PowerHA SystemMirror software defines two types of communication networks: TCP/IP-based networks, which use communication interfaces based on the TCP/IP subsystem, and disk-based networks.
TCP/IP-based network
Connects two or more server nodes, and optionally allows client access to these cluster nodes, using the TCP/IP protocol. PowerHA SystemMirror uses only unicast communication for heartbeat.
Disk heartbeat
Provides communication between PowerHA SystemMirror cluster nodes to monitor the health of the nodes, networks, and network interfaces, and to prevent cluster partitioning.
Clients
A client is a processor that can access the nodes in a cluster over a local area network.
Clients each run a "front end" or client application that queries the server application running on the
cluster node. The PowerHA SystemMirror software provides a highly available environment for critical
data and applications on the cluster nodes. The PowerHA SystemMirror software does not make the
clients themselves highly available.
Nodes
A node is a processor that runs both Linux and the PowerHA SystemMirror software.
Nodes might share a set of resources such as networks, network IP addresses, and applications. The PowerHA SystemMirror software supports up to 4 nodes in a cluster. In a PowerHA SystemMirror
cluster, each node is identified by a unique name. In PowerHA SystemMirror, a node name and a
hostname must be the same. Nodes serve as core physical components of a PowerHA SystemMirror
cluster.
Cluster networks
Cluster nodes communicate with each other over communication networks.
If one of the physical network interface cards on a node on a network fails, PowerHA SystemMirror
preserves the communication to the node by transferring the traffic to another physical network interface
card on the same node. If a "connection" to the node fails, PowerHA SystemMirror transfers resources to
another node to which it has access.
In addition, the clustering software sends heartbeats between the nodes over the cluster networks to
periodically check on the health of the cluster nodes themselves. If the clustering software detects no heartbeats from a node, the node is considered failed and its resources are automatically transferred to another node.
It is highly recommended that you configure multiple communication paths between the nodes in the cluster.
Having multiple communication networks prevents cluster partitioning, in which the nodes within each
partition form their own entity. In a partitioned cluster, it is possible that nodes in each partition could
allow simultaneous non-synchronized access to the same data. This can potentially lead to different views
of data from different nodes.
A logical network is a portion of a physical network that connects two or more logical network interfaces
or devices. A logical network interface or device is the software entity that is known by an operating
system. There is a one-to-one mapping between a physical network interface/device and a logical
network interface/device. Each logical network interface can exchange packets with each logical network
interface on the same logical network.
If a subset of logical network interfaces on the logical network needs to communicate with each other
(but with no one else) while sharing the same physical network, subnets are used. A subnet mask defines
the part of the IP address that determines whether one logical network interface can send packets to
another logical network interface on the same logical network.
All logical network interfaces in a PowerHA SystemMirror network can communicate PowerHA
SystemMirror packets with each other directly. Each logical network is identified by a unique name. A
PowerHA SystemMirror logical network might contain one or more subnets.
IP address takeover
IP address takeover is a mechanism for recovering a service IP label by moving it to another network
interface card (NIC) on another node, when the initial NIC fails.
IPAT occurs if the physical network interface card on one node fails and if there are no other accessible
physical network interface cards on the same network on the same node. Therefore, swapping IP labels of
these NICs within the same node cannot be performed and PowerHA SystemMirror will use IPAT to
recover the service IP address by using a NIC on a backup node. IP address takeover keeps the IP
address highly available by recovering the IP address after failures. PowerHA SystemMirror uses a
method called IPAT via IP aliases.
When a resource group containing the service IP label falls over from the primary node to the target
node, the service IP labels are added (and removed) as alias addresses on top of the base IP addresses on
an available NIC. This allows a single NIC to support more than one service IP label placed on it as an
alias. Therefore, the same node can host more than one resource group at the same time.
When there are multiple interfaces on the same node connected to the same network, and those interfaces are not combined into an Ethernet aggregation, all boot addresses must be on different subnets. Also, any persistent addresses or service addresses must be on different subnets from the boot addresses.
Because IP aliasing allows coexistence of multiple service labels on the same network interface, you can
use fewer physical network interface cards in your cluster. Upon fallover, PowerHA SystemMirror equally
distributes aliases between available network interface cards.
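For example, on the node that currently hosts the resource group you can display the base address and any service IP aliases that are placed on an interface (the interface name eth0 is an example; use the interface names in your environment):
ip -4 addr show dev eth0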
In order for a PowerHA SystemMirror cluster to recognize and respond to failures, it must continually
check the health of the cluster. Some of these checks are provided by the heartbeat function.
Each cluster node sends heartbeat messages at specific intervals to other cluster nodes, and expects to
receive heartbeat messages from the nodes at specific intervals. If messages stop being received,
PowerHA SystemMirror recognizes that a failure has occurred. Heartbeats can be sent over:
• TCP/IP networks
• A physical volume (disk) that is accessible from all cluster nodes
The heartbeat function is configured to use specific paths between nodes. This allows heartbeats to
monitor the health of all PowerHA SystemMirror networks and network interfaces, as well as the cluster
nodes themselves.
The heartbeat paths are set up automatically by RSCT; you have the option to configure disk paths as
part of PowerHA SystemMirror configuration.
After a cluster split, the subdomain with the majority of nodes survives and the other subdomains are dissolved. If there is a tie, when exactly half the nodes of a domain are online and the other half are inaccessible, PowerHA SystemMirror, by using RSCT, must determine which subdomain has operational quorum and will survive, and which subdomain will be dissolved.
You can use PowerHA SystemMirror to configure a split policy that specifies the response to a cluster
split event.
A tiebreaker disk or an NFS file is used when the sites in the cluster can no longer communicate with each other. This communication failure results in the cluster splitting the sites into two independent partitions. If failure occurs because the cluster communication links are not responding, both partitions attempt to lock the tiebreaker disk or the NFS file. The partition that acquires the tiebreaker disk continues to function, while the other partition reboots, or has cluster services restarted, depending on whether any critical resources are configured.
The disk or NFS-mounted file that is identified as the tiebreaker must be accessible to all nodes in the
cluster.
By identifying resources and defining resource group policies, the PowerHA SystemMirror software
makes numerous cluster configurations possible, providing tremendous flexibility in defining a cluster
environment tailored to individual requirements.
Cluster resources can include both hardware resources and software resources:
• Service IP labels or addresses
• Applications
The PowerHA SystemMirror software handles the resource group as a unit, thus keeping the
interdependent resources together on one node and keeping them highly available.
Types of cluster resources
This section provides a brief overview of the resources that you can configure in PowerHA SystemMirror
and include into resource groups to let PowerHA SystemMirror keep them highly available.
Applications:
The purpose of a highly available system is to ensure that critical services are accessible to users.
Applications usually need no modification to run in the PowerHA SystemMirror environment. Any
application that can be successfully restarted after an unexpected shutdown is a candidate for PowerHA
SystemMirror.
For example, all commercial DBMS products provide a checkpoint on the state of the disk in some sort of
transaction journal. In the event of a server failure, the fallover server restarts the DBMS, which
reestablishes database consistency and then resumes processing.
Note: The start and stop scripts are the main points of control for PowerHA SystemMirror over an
application. It is very important that the scripts you specify operate correctly to start and stop all aspects
of the application. If the scripts fail to properly control the application, other parts of the application
recovery might be affected. For example, if the stop script you use fails to completely stop the application
and a process continues to access a disk, PowerHA SystemMirror will not be able to recover it on the
backup node.
Add your application to a PowerHA SystemMirror resource group only after you have thoroughly tested
your application start and stop scripts.
The resource group that contains the application should also contain all the resources that the application
depends on, including service IP addresses. Once such a resource group is created, PowerHA
SystemMirror manages the entire resource group and, therefore, all the interdependent resources in it as a
single entity. PowerHA SystemMirror coordinates the application recovery and manages the resources in an order that ensures that all interdependent resources are activated before other resources.
PowerHA SystemMirror includes application monitoring capability, whereby you can define a monitor to
detect the unexpected termination of a process or to periodically poll the termination of an application
and take automatic action upon detection of a problem.
A service IP label is used to establish communication between client nodes and the server node. Services,
such as a database application, are provided using the connection made over the service IP label.
A service IP label can be placed in a resource group as a resource that allows PowerHA SystemMirror to
monitor its health and keep it highly available, either within a node or between the cluster nodes by
transferring it to another node in the event of a failure.
A persistent node IP label is a useful administrative tool that lets you contact a node even if the PowerHA
SystemMirror cluster services are down on that node.
When you define persistent node IP labels, PowerHA SystemMirror attempts to put an IP address on the node. Assigning a persistent node IP label to a network on a node allows you to have a node-bound IP address on a cluster network that you can use for administrative purposes to access a specific node in the cluster. A persistent node IP label is an IP alias that can be assigned to a specific node on a cluster network.
You can configure the cluster so that certain applications stay on the same node, or on different nodes not
only at startup, but during fallover and fallback events. To do this, you configure the selected resource
groups as part of a location dependency set.
A participating node list defines a list of nodes that can host a particular resource group.
You define a node list when you configure a resource group. The participating node list can contain some
or all nodes in the cluster.
Typically, this list contains all nodes sharing the same data and disks.
Default node priority is identified by the position of a node in the node list for a particular resource
group.
The first node in the node list has the highest node priority. This node is also called the home node for a
resource group. The node that is listed before another node has a higher node priority than the current
node.
Home node:
The home node (the highest priority node for this resource group) is the first node that is listed in the
participating node list for a nonconcurrent resource group.
The home node is a node that normally owns the resource group.
The term home node is not used for concurrent resource groups because they are owned by multiple
nodes.
PowerHA SystemMirror ensures the availability of cluster resources by moving resource groups from one
node to another when the conditions in the cluster change.
Cluster startup
When cluster services are started, resource groups are activated on different cluster nodes according to the resource group startup policy that you selected.
Node failure
Resource groups that are active on this node fall over to another node.
Node recovery
When cluster services are started on the home node after a failure, the node reintegrates into the
cluster, and may acquire resource groups from other nodes depending on the fallback policy for
the group.
Resource failure and recovery
A resource group might fall over to another node, and be reacquired, when the resource becomes
available.
Cluster shutdown
When you stop cluster services you can choose to have the resource groups move to a backup
node or be taken offline on the current node.
The following list describes each policy:
Startup
Startup is the activation of a resource group on a node (or multiple nodes). Resource group
startup occurs when cluster services are started.
Fallover
Fallover is the movement of a resource group from the node that currently owns the resource
group to another active node after the current node experiences a failure.
Fallover occurs only with nonconcurrent resource groups. Concurrent resource groups are active on all nodes concurrently, so the failure of a single node means that only the instance of the resource group on that node is affected.
Fallback
Fallback is the movement of resources to the home node when it is reintegrated into the cluster after a failure. No fallback means that resources continue to run on the same node, even after the home node is reintegrated following a failure.
Each combination of these policies allows you to specify varying degrees of control over which
node, or nodes, control a resource group.
Startup, fallover, and fallback are specific behaviors that describe how resource groups behave at
different cluster events. It is important to keep in mind the difference between fallover and
fallback. These terms appear frequently in discussion of the various resource group policies.
Standby configurations
Standby configurations are the traditional redundant hardware configurations in which one or more standby nodes stand idle, or run a less critical application, waiting for a server or primary node to fail or leave the cluster, at which point a standby node takes over the critical application.
Concurrent resource groups are activated on all nodes concurrently and therefore cannot be used in a
standby configuration.
Example: Standby configurations with online on first available node startup policy
Figure 2. One-for-one standby configuration where IP label returns to the home node
In this setup, the cluster resources are defined as part of a single resource group. A node list is then
defined as consisting of two nodes. The first node, Node A, is assigned a takeover (ownership) priority of
1. The second node, Node B, is assigned a takeover priority of 2.
At cluster startup, Node A (which has a priority of 1) assumes ownership of the resource group. Node A
is the "server" node. Node B (which has a priority of 2) stands idle, ready should Node A fail or leave the
cluster. Node B is, in effect, the "standby".
If the server node leaves the cluster, the standby node assumes control of the resource groups owned by
the server, starts the highly available applications, and services clients. The standby node remains active
until the home node rejoins the cluster (based on the fallback policy configured). At that point, the
standby node releases the resource groups it has taken over, and the server node reclaims them. The
standby node then returns to an idle state.
The standby configuration from the previously described example can be easily extended to larger
clusters. The advantage of this configuration is that it makes better use of the hardware. The
disadvantage is that the cluster can suffer severe performance degradation if more than one server node
leaves the cluster.
The following figure illustrates a three-node standby configuration using the resource groups with these
policies:
• Startup policy: Online on First Available Node
• Fallover policy: Fallover to Next Priority Node in the List
• Fallback policy: Fallback to Home Node
In this configuration, two separate resource groups (A and B) and a separate node list for each resource
group exist. The node list for Resource Group A consists of Node A and Node C. Node A has a takeover
priority of 1, while Node C has a takeover priority of 2. The node list for Resource Group B consists of
Node B and Node C. Node B has a takeover priority of 1; Node C again has a takeover priority of 2. (A
resource group can be owned by only a single node in a nonconcurrent configuration.)
Since each resource group has a different node at the head of its node list, the cluster's workload is
divided, or partitioned, between these two resource groups. Both resource groups, however, have the
same node as the standby in their node lists. If either server node leaves the cluster, the standby node
assumes control of that server node's resource group and functions as the departed node.
In this example, the standby node has three network interfaces (not shown) and separate physical
connections to each server node's external disk. Therefore, the standby node can, if necessary, take over
for both server nodes concurrently. The cluster's performance, however, would most likely degrade while
the standby node was functioning as both server nodes.
Takeover configurations
In the takeover configurations, all cluster nodes do useful work, processing part of the cluster's workload.
There are no standby nodes. Takeover configurations use hardware resources more efficiently than
standby configurations since there is no idle processor. Performance can degrade after node failure,
however, since the load on remaining nodes increases.
One-sided takeover
This configuration has two nodes actively processing work, but only one node providing highly available
services to cluster clients. That is, although there are two sets of resources within the cluster (for example,
two server applications that handle client requests), only one set of resources needs to be highly
available.
Figure 4. One-sided takeover configuration with resource groups in which IP label returns to the home node
This set of resources is defined as a PowerHA SystemMirror resource group and has a node list that
includes both nodes. The second set of resources is not defined as a resource group and, therefore, is not
highly available.
At cluster startup, Node A (which has a priority of 1) assumes ownership of Resource Group A. Node A,
in effect, “owns” Resource Group A. Node B (which has a priority of 2 for Resource Group A) processes
its own workload independently of this resource group.
If Node A leaves the cluster, Node B takes control of the shared resources. When Node A rejoins the
cluster, Node B releases the shared resources.
If Node B leaves the cluster, however, Node A does not take over any of its resources, since Node B's
resources are not defined as part of a highly available resource group in whose chain this node
participates.
This configuration is appropriate when a single node is able to run all the critical applications that need
to be highly available to cluster clients.
Mutual takeover
The mutual takeover for nonconcurrent access configuration has multiple nodes, each of which provides
distinct highly available services to cluster clients. For example, each node might run its own instance of
a database and access its own disk.
Furthermore, each node has takeover capacity. If a node leaves the cluster, a surviving node takes over
the resource groups owned by the departed node.
The mutual takeover for nonconcurrent access configuration is appropriate when each node in the cluster
is running critical applications that need to be highly available and when each processor is able to handle
the load of more than one node.
The following figure illustrates a two-node mutual takeover configuration for nonconcurrent access. In
the figure, a lower number indicates a higher priority.
Figure 5. Mutual takeover configuration for nonconcurrent access
The key feature of this configuration is that the cluster's workload is divided, or partitioned, between the
nodes. Two resource groups exist, in addition to a separate resource chain for each resource group. The
nodes that participate in the resource chains are the same. It is the differing priorities within the chains
that designate this configuration as mutual takeover.
The chains for both resource groups consist of Node A and Node B. For Resource Group A, Node A has a
takeover priority of 1 and Node B has a takeover priority of 2. For Resource Group B, the takeover
priorities are reversed. Here, Node B has a takeover priority of 1 and Node A has a takeover priority of 2.
At cluster startup, Node A assumes ownership of the Resource Group A, while Node B assumes
ownership of Resource Group B.
If either node leaves the cluster, its peer node takes control of the departed node's resource group. When
the "owner" node for that resource group rejoins the cluster, the takeover node relinquishes the associated
resources; they are reacquired by the integrating home node.
The following figure illustrates a two-node mutual takeover configuration for concurrent access:
In this example, both nodes are running an instance of a server application that accesses the database on
the shared disk. The application's proprietary locking model is used to arbitrate application requests for
disk resources.
Running multiple instances of the same server application allows the cluster to distribute the processing
load. As the load increases, additional nodes can be added to further distribute the load.
Planning for PowerHA SystemMirror for Linux
All the relevant Red Hat Package Manager (RPM) packages are installed automatically when you start the PowerHA SystemMirror installation script.
To install PowerHA SystemMirror for Linux, the following RPMs are used internally:
• powerhasystemmirror
• powerhasystemmirror.adapter
• powerhasystemmirror.policies
• powerhasystemmirror.policies.one
• powerhasystemmirror.policies.two
• powerhasystemmirror.sappolicy
PowerHA SystemMirror internally uses Reliable Scalable Cluster Technology (RSCT) for the clustering technology. RSCT Version 3.2.2.4 is included in the PowerHA SystemMirror package. The required versions of the RSCT RPMs are installed automatically by default when you install PowerHA SystemMirror. During installation of PowerHA SystemMirror, if RSCT is detected, and the level of RSCT is lower than the required RSCT package, then the currently installed RSCT package is upgraded.
PowerHA SystemMirror for Linux installs the following RSCT RPMs:
• rsct.basic
• rsct.core
• rsct.core.utils
• rsct.opt.storagerm
• src
• If you define the disk tiebreaker resources, the disk on which IBM.TieBreaker resources are stored must not be used to store file systems.
• Internet Protocol version 6 (IPv6) configuration is not supported in PowerHA SystemMirror for Linux.
• You can check the firewall status by running the systemctl status firewalld.service command. The firewall must be disabled or you must open the following ports:
657/tcp
2001/tcp
2002/tcp
16191/tcp
657/udp
12143/udp
12347/udp
12348/udp
When a node is configured with multiple connections to a single network, the network interfaces serve
different functions in the PowerHA SystemMirror.
A service interface is a network interface that is configured with the PowerHA SystemMirror service IP
label. The service IP label is used by clients to access application programs. The service IP is only
available when the corresponding resource group is online.
A persistent node IP label is an IP alias that can be assigned to a specific node on a cluster network. A
persistent node IP label always stays on the same node (node-bound), and coexists on an NIC that
already has a service or boot IP label defined. A persistent node IP label does not require installing an
extra physical NIC on that node.
For PowerHA SystemMirror, you must configure a persistent IP label for each cluster node. This is useful
to access a particular node in a PowerHA SystemMirror cluster for running reports or for diagnostics.
This provides the advantage that PowerHA SystemMirror can access the persistent IP label on the node despite individual NIC failures, provided spare NICs are present on the network.
If you assign IP aliases to NICs, it allows you to create more than one IP label on the same network
interface. During an IP address takeover by using the IP aliases, when an IP label moves from one NIC to
another, the target NIC receives the new IP label as an IP alias and keeps the original IP label and
hardware address.
Configuring networks for IP address takeover (IPAT) by using IP aliases simplifies the network
configuration in the PowerHA SystemMirror. You can configure a service address and one or more boot
addresses for NICs.
PowerHA SystemMirror uses a technology referred to as IPAT by using IP aliases for keeping IP
addresses highly available.
If you are planning for IP address takeover by using IP aliases, review the following information:
• Each network interface must have a boot IP label that is defined in the PowerHA SystemMirror. The interfaces that are defined in the PowerHA SystemMirror are used to keep the service IP addresses highly available.
• The following subnet requirements apply if multiple interfaces are present on a node that is attached to the same network:
– All boot addresses must be defined on different subnets.
– Service addresses must be on a different subnet from all boot addresses and persistent addresses.
• Service address labels that are configured for IP address takeover by using IP aliases can be included in all nonconcurrent resource groups.
• The netmask for all IP labels in a PowerHA SystemMirror for Linux network must be the same.
During a node fallover event, the service IP label that is moved is placed as an alias on the NIC of the target node in addition to any other service labels configured on that NIC.
If your environment has multiple adapters on the same subnet, all the adapters must have the same
network configuration and the adapters must be part of the PowerHA SystemMirror configuration.
The cluster node on which you want to install PowerHA SystemMirror for Linux must be running on
either one of the following versions of the Linux operating system:
• SUSE Linux Enterprise Server (SLES) 12 SP1 (64-bit)
• Red Hat Enterprise Linux (RHEL) 7.2 (64-bit) or 7.3 (64-bit)
Packaging
You can download PowerHA SystemMirror for Linux from the IBM website.
Prerequisites
You must fulfill the software and hardware requirements for PowerHA SystemMirror for Linux. Before
you install the PowerHA SystemMirror on a Linux system, you must meet the following prerequisites:
• Root authority is required to install PowerHA SystemMirror.
• The following scripting packages are required on each SUSE Linux Enterprise Server and Red Hat Enterprise Linux (RHEL) system:
– KSH93
– PERL
Checking prerequisites
To verify whether all prerequisites are met, complete the following steps:
1. Log in as root user.
2. After you have downloaded the tar file from the IBM website, extract the tar file by entering the
following command:
tar -xvf <tar file>
3. Enter the following command:
cd PHA7220Linux64
The hashtag (<#>) is a number; the highest number identifies the most recent log file.
6. If your system did not pass the prerequisites check, correct any problems before you start the
installation.
To install PowerHA SystemMirror for Linux, you must use the installation script. The installation script
runs the following actions:
Prerequisites check
The installation script runs a complete prerequisite check to verify that all required software is available and is at the required level. If your system does not pass the prerequisite check, the
installation process does not start. To continue with the installation process, you must install the
required software.
Installing PowerHA SystemMirror for Linux
If an IBM Reliable Scalable Cluster Technology (RSCT) peer domain exists, ensure that the node
on which you are running the script is offline in the domain. Otherwise, the installation is
canceled. To install the PowerHA SystemMirror for Linux, complete the following steps:
1. Log in as root user.
2. Download the tar file from the Entitled Systems Support website and extract the tar file by
entering the following command:
tar -xvf <tar file>
3. Run the following installation script:
./installPHA
Note: You do not need to specify any of the options that are available for the installPHA
command.
4. The installation program checks prerequisites to verify that all the required software is available and is at the required level. If your system does not pass the prerequisites check,
the installation does not start, and you must correct any problems before you restart the
installation. Information about the results of the prerequisites check is available in the
following log file:
/tmp/installPHA.<#>.log
5. After the system passes the prerequisite check, read the information in the license agreement
and the license information. You can scroll forward line-by-line by using the Enter key, and
page-by-page with the space bar. After reviewing the license information, to indicate
acceptance of the license terms and conditions, press the Y key. Any other input cancels the
installation.
6. After you accept the license agreement, the installation proceeds. You must check the following log file for information about the installation:
/tmp/installPHA.<#>.log
The hashtag (<#>) is a number; the highest number identifies the most recent log file.
The cluster snapshot utility saves a record of all data that defines a specific cluster configuration. Snapshot restoration is the process of recreating a specific cluster configuration by using the cluster snapshot utility. You can restore a snapshot on nodes even if a cluster does not exist.
You can use the clmgr snapshot command with actions such as add, modify, query, or delete to use the snapshot utility feature.
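For example, the following commands follow the pattern that is described above (the snapshot name mysnap is an example; see the clmgr command reference for the exact attributes that your version supports):
clmgr add snapshot mysnap
clmgr query snapshot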
Related information:
clmgr command
Configuring PowerHA SystemMirror for Linux
After you install PowerHA SystemMirror for Linux, you can configure the product by creating a cluster
and by adding nodes to the cluster.
The configuration path significantly automates the discovery and selection of configuration information
and chooses default behaviors.
Before you configure a cluster, PowerHA SystemMirror for Linux must be installed on all nodes and
connectivity must exist between the node where you are performing the configuration and all other
nodes that need to be included in the cluster.
Network interfaces must be both physically and logically configured with the Linux operating system so
that communication occurs from one node to each of the other nodes. The host name and IP address
must be configured on each interface.
Note: All host names, service IP addresses, persistent IP addresses, and labels must be configured in the /etc/hosts file.
All node IP addresses must be added to the /etc/cluster/rhosts file before you configure the cluster to
verify that information is collected from the systems that belong to the cluster. PowerHA SystemMirror
uses all configured interfaces on the cluster nodes for cluster communication and monitoring. All
configured interfaces are used to keep cluster IP addresses highly available.
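For example, for a two-node cluster the entries might look like the following (all host names, labels, and IP addresses are examples; the persistent and service labels are placed on different subnets from the boot addresses, as described in the planning information):
/etc/hosts:
192.168.10.11   nodeA
192.168.10.12   nodeB
192.168.20.21   pip_node1
192.168.30.10   sip
/etc/cluster/rhosts:
192.168.10.11
192.168.10.12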
Configuring a cluster
In this scenario, if you are creating a cluster that is named clMain, and if the participating nodes are
nodeA and nodeB, enter the following command:
clmgr create cluster clMain NODES=nodeA,nodeB
After creating the clMain cluster with nodeA and nodeB, you can also add nodeC to the clMain cluster by
using the clmgr add node nodeC command.
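After the nodes are added, you can confirm the cluster configuration by using the query action of the clmgr command:
clmgr query cluster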
Related information:
clmgr command
You must first define resources that are made available by the PowerHA SystemMirror for Linux for an
application and then group them together in a resource group. You can add all the resources at once or
separately.
A persistent node IP label is an IP alias that can be assigned to a specified node on a network. A
persistent IP label has the following features:
• Always remains on the same node.
• Co-exists with other IP labels that are present in an interface.
• Does not require an additional physical interface in a node.
• Always remains in an interface of the PowerHA SystemMirror network.
You can assign a persistent node IP label to a network on a node that allows you to have a node-bound
address on a cluster network that you can use for administrative purposes to access a specific node in the
cluster.
In this example, the persistent node IP label named pip_node1 is assigned to the node1 node and uses
interfaces that are defined in the net_ether_01 network.
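As a hedged sketch of the corresponding clmgr command (the persistent_ip object name and the NETWORK and NODE attribute names are assumptions based on general clmgr conventions; check the clmgr command reference for the exact syntax):
clmgr add persistent_ip pip_node1 NETWORK=net_ether_01 NODE=node1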
Consider the following limitations for using the persistent node IP label or IP address:
• You can define only one persistent IP label on each node.
• Persistent node IP labels are available at the boot time of a node.
• You must configure persistent node IP labels individually on each node.
• To change or show persistent node IP labels, you must use the Modify and Query ACTION with the persistent node IP label in the clmgr command.
Service IP labels and IP addresses are used to establish communication between client nodes and the
server node. Services such as a database application are provided by using the connection that is made
over the service IP label. The /etc/hosts file on all nodes must contain all IP labels and associated IP addresses that you define for the cluster, including service IP labels and addresses.
In this example, the service IP label named sip is established and is made available by using the
net_ether_01 network.
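As a hedged sketch of the corresponding clmgr command (the service_ip object name and the NETWORK attribute name are assumptions based on general clmgr conventions; check the clmgr command reference for the exact syntax):
clmgr add service_ip sip NETWORK=net_ether_01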
Note: Enter the service IP label that you want to keep highly available. The name of the service IP label
or IP address must be unique within the cluster and distinct from the resource group names and it must
relate to the application and also to any corresponding device. For example, sap_service_address.
A PowerHA SystemMirror application is a cluster resource that is used to control an application that you want to make highly available. It contains scripts for starting, stopping, and monitoring an application.
In this example, an application named app_1 will be a non-concurrent application because the RESOURCETYPE flag is set to 1. The app_1 application uses the STARTSCRIPT flag for starting the application, the MONITORMETHOD flag for monitoring the application, and the STOPSCRIPT flag for stopping the application.
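As a hedged sketch of such a command (the script paths are examples; the attribute names are the ones that are described in this section):
clmgr add application app_1 \
    STARTSCRIPT="/scripts/start_app1.sh" \
    STOPSCRIPT="/scripts/stop_app1.sh" \
    MONITORMETHOD="/scripts/monitor_app1.sh" \
    RESOURCETYPE=1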
The application process that is invoked from the STARTSCRIPT must be detached from the calling script by using either of the following methods:
• Redirect all file handles to a file and start the application process in the background. For example:
/path/to/application >/outputfile 2>&1 &
• Create a wrapper application that uses the setsid() C-function to detach the application process from the calling STARTSCRIPT.
When you configure an application, you perform the following actions:
• Associate a meaningful name with the application. For example, the application that you are using with PowerHA SystemMirror is named sapinst1. You use this name to refer to the application when you define it as a resource. When you set up the resource group that contains this resource, you define an application as a resource.
• Configure application start, stop, and monitoring scripts for that application.
• Review the vendor documentation for specific product information about starting and stopping a particular application.
• Verify that the scripts exist and have executable permissions on all nodes that participate as possible owners of the resource group where this application is defined.
The clmgr add application command includes the following attributes:
Application
Enter an ASCII text string that identifies the application. You use this name to refer to the
application when you add it to the resource group. The application name can include
alphanumeric characters and underscores.
STARTSCRIPT
Enter the full path name of the script followed by arguments that are called by the event scripts
of the cluster to start the application. Although this script must have the same name and location
on every node, the content and function of the script can be different. You can use the same script
and runtime conditions to modify the runtime behavior of a node.
STOPSCRIPT
Enter the full path name of the script that is called by the cluster event scripts to stop the
application. This script must be in the same location on each cluster node that can start the
application. Although this script must have the same name and location on every node, the
content and function of the script can be different. You can use the same script and runtime
conditions to modify runtime behavior of a node.
In this example, the RG1 resource group is a non-concurrent resource group that uses the default policies. The resource group contains the sip IP address and the app_1 application.
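As a hedged sketch of the command that creates such a resource group (the node names are examples, and the NODES, SERVICE_LABEL, and APPLICATIONS attribute names are assumptions based on general clmgr conventions):
clmgr add resource_group RG1 \
    NODES=node1,node2 \
    SERVICE_LABEL=sip \
    APPLICATIONS=app_1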
If you are defining the resources in your resource groups, ensure that you are aware of the following information:
• A resource group might include multiple service IP addresses. According to the resource group management policies in PowerHA SystemMirror, when a resource group is moved, all service labels in the resource group are moved as aliases to the available interfaces.
• When you define a service IP label or IP address on a cluster node, the service IP label can be used in any non-concurrent resource group.
Table 1. Add a resource group fields

STARTUP
Defines the startup policy of the resource group:
ONLINE ON FIRST AVAILABLE NODE
The resource group activates on the first node that becomes available.
ONLINE ON ALL AVAILABLE NODES
The resource group is made online on all nodes. This is similar to the behavior of a concurrent resource group. If you select this option for the resource group, ensure that resources in this group can be made online on multiple nodes simultaneously.

FALLOVER
Select a value from the list that defines the fallover policy of the resource group:
FALLOVER TO NEXT PRIORITY NODE IN THE LIST
In the case of a fallover, the resource group that is online on only one node at a time follows the default node priority order that is specified in the nodelist attribute of the resource group (it moves to the highest priority node that is currently available).
BRING OFFLINE
Select this option to make the resource group offline.

FALLBACK
Select a value from the list that defines the fallback policy of the resource group:
NEVER FALLBACK
A resource group does not fall back when a higher priority node joins the cluster.

SERVICE_LABEL
List the service IP labels to be taken over when this resource group is taken over. These include addresses that rotate or that might be taken over.

APPLICATION
Specify the application to include in the resource group.
You can modify resource group attributes or the different resources such as application, service IP label,
or IP address by using the clmgr modify resource_group command.
Note: The maximum number of resource groups that are supported is 32.
You can make a filesystem highly available by using the start, stop, and monitor scripts of an application resource. The sample scripts to support this functionality and the readme file are available in the PowerHA SystemMirror for Linux package in the sample_scripts folder.
To support the shared storage functionality, you can perform the following steps (see the sketch after these steps):
1. Create a volume group or the required filesystem by using Linux commands.
2. Create a custom application resource with relevant start, stop, and monitor scripts by using the clmgr commands.
• The start script activates the volume group and mounts the filesystem.
• The monitor script monitors both the volume group status and the mounting of the filesystem.
• The stop script deactivates the volume group and unmounts the filesystem.
3. Add the application resource to the relevant resource group by using the clmgr commands.
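For example, minimal start, monitor, and stop scripts for a shared volume group and filesystem might look like the following sketch. The volume group datavg, logical volume datalv, and mount point /sharedfs are examples, and the monitor return codes follow the convention of 1 for ONLINE and 2 for OFFLINE that is described in the troubleshooting information:
#!/bin/sh
# Start script (sketch): activate the volume group and mount the filesystem
vgchange -a y datavg && mount /dev/datavg/datalv /sharedfs

#!/bin/sh
# Monitor script (sketch): return 1 if the filesystem is mounted (ONLINE), 2 otherwise (OFFLINE)
if mountpoint -q /sharedfs; then
    exit 1
else
    exit 2
fi

#!/bin/sh
# Stop script (sketch): unmount the filesystem and deactivate the volume group
umount /sharedfs && vgchange -a n datavg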
Tuning a cluster for optimal performance can help you avoid common PowerHA SystemMirror cluster problems.
PowerHA SystemMirror writes messages that it generates to the system console and to several log files.
Each log file contains a different subset of messages that are generated by the PowerHA SystemMirror
software. When viewed as a group, the log files provide a detailed view of all cluster activity.
The following list describes the log files into which the PowerHA SystemMirror software writes messages
and the types of cluster messages they contain. The list also provides recommendations for using
different log files.
Note: Only the default directories are listed. If you redirect any log files, you must check the appropriate location.
system log messages
Contains time-stamped and formatted messages from all subsystems, including scripts and
daemons.
The log file name is /var/log/messages.
/var/pha/log/clcomd/clcomddiag.log
Contains time-stamped, formatted, diagnostic messages generated by the clcomd daemon.
Recommended Use: Information in this file is for IBM Support personnel.
/var/pha/log/hacmp.out
Contains time-stamped, formatted messages generated by PowerHA SystemMirror events.
/var/pha/log/clmgr/clutils.log
Contains information about the date, time, and commands that are generated by using the clmgr command.
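For example, to watch cluster event processing as it is written to the default hacmp.out location that is listed above, you can run:
tail -f /var/pha/log/hacmp.out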
The clsnap command collects the log files from all nodes of the cluster and saves the log data as a compressed .tar file in the /tmp/ibmsupt/hacmp directory for every cluster node. The logs of the clsnap command are created in the /tmp/ibmsupt/hacmp.snap.log file. The clsnap command performs the following operations:
• To collect logs from the local node, enter the following command:
clsnap -L
• To collect logs from the specified node, enter the following command:
clsnap -n <node-name>
• To collect logs in a specified directory, enter the following command:
clsnap -d <dir>
Problem
Cluster creation fails with a reason that it cannot connect to other cluster nodes.
Solution
A number of situations can cause this problem. Review the following information to identify a possible
solution for this problem:
• Check whether the IPv4 address entry exists in the /etc/cluster/rhosts file on all nodes of the cluster.
• If any entry was recently added or updated, refresh the Cluster Communication Daemon subsystem (clcomd) by using the refresh -s clcomdES command on all nodes of the cluster and then create the cluster again.
• Check whether the Cluster Communications (clcomdES) daemon process is running by using the ps -aef | grep clcomd command. If the clcomdES daemon process is not listed in the process table, start the clcomd daemon manually by using the startsrc -s clcomdES command.
• Check and ensure that other nodes are not part of any cluster.
• Check that the host names of the different nodes are present in the /etc/hosts file.
• Check the firewall status by running the systemctl status firewalld.service command. The firewall should be disabled or the following ports must be opened:
657/tcp
2001/tcp
2002/tcp
16191/tcp
657/udp
12143/udp
12347/udp
12348/udp
– To open the port, enter the following command:
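On systems that use firewalld, a command of the following form can be used. This is an illustrative sketch with one of the required ports; repeat it for each port in the list above and reload the firewall to apply the change:
firewall-cmd --permanent --add-port=657/tcp
firewall-cmd --reload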
Problem
Solution
You can review the following information to identify possible solutions for this problem:
1. Check for messages that are generated when you run the StartCommand command for that resource in the system log file (/var/log/messages), and in the ps -ef process table. If the StartCommand command is not run, proceed with the next step; otherwise, investigate why the application is online.
2. Either more than half of the nodes in the cluster are online or exactly half of the nodes are online and
the tiebreaker function is reserved. If less than half of the nodes are online, start the additional nodes.
If exactly half of the nodes are online, check the attribute of the active tiebreaker. You can check the
active tiebreaker by running the clmgr query cluster command.
3. In some scenarios, a resource moves to the Sacrificed state when PowerHA SystemMirror cannot find a placement for the resource. PowerHA SystemMirror cannot start this resource because there is no single node on which the resource can be started.
To resolve this problem, ensure that the network that is used by the service IP resource in the resource group includes at least one of the nodes. This node must be part of the resource group nodelist. To check whether different nodes are assigned on a network, run the clmgr query network command. If different nodes are assigned on the network, delete the network and add it again with the correct node entries by using the clmgr add interface command. To display detailed information about resource groups and their resources, run the clRGinfo -e command. This solution might resolve the issue.
If the application resource is in the Sacrificed state, check whether resource groups that have an AntiCollocated relationship between them are being placed on the same node. To resolve this issue, move one of the resource groups to the other node.
Problem
Solution
A resource group is composed of several resources. If none of the resources of the resource group is starting, perform the following steps:
1. Identify which of the resources must start first by evaluating the relationship status between them.
2. Check all requests against the resource group, and evaluate all relationships in which the resource
group is defined as a source.
Problem
Solution
If a cluster goes into the Warning state, check for the following conditions and resolve them:
• Not all the nodes of the cluster are in the Online state.
• Not all the nodes of the cluster are reachable.
Problem
Solution
This error occurs if you use the cloned operating system images. To fix this issue, you must reset the
cluster configuration by running the /opt/rsct/install/bin/recfgct -F command on the node that is specified
in the error message. This action resets the RSCT node ID.
Problem
Solution
Check if the disk is shared among all the nodes by comparing the Universally Unique Identifier (UUID)
of the disk on all the nodes of the cluster. You can additionally perform the following steps:
• Use the lsscsi command to list the SCSI devices in the system.
• Run the /usr/lib/udev/scsi_id -gu <SCSI_disk#> command on all nodes to check the disk ID attribute, and ensure that the disk ID attribute is the same across all nodes of the cluster.
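For example, run the following commands on each node and compare the output across the cluster (replace /dev/sdX with the shared disk device on that node):
lsscsi
/usr/lib/udev/scsi_id -gu /dev/sdX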
Problem
Check whether the disk is a shared disk among all nodes by comparing the physical volume identifier (PVID) of the disk on all nodes of the cluster. You must ensure that the PVID of that disk is the same on all nodes.
Problem
PowerHA SystemMirror software is not able to detect the shared disk across nodes of the cluster.
Solution
• Use the lsrsrc IBM.Disk command to view the common disk between two nodes and ensure that the cluster is present. The lsrsrc command works only if the cluster is located on the common disk.
• You must choose the DeviceName attribute that corresponds to the nodelist attribute, which has the total number of nodes of the cluster.
Problem
PowerHA SystemMirror sets the resource to the Failed Offline state because the previous attempt to start
the resources failed.
Solution
After this step, the resource is in the Offline state, and PowerHA SystemMirror starts the resource again if the resource group is in the Online state.
Problem
PowerHA SystemMirror sets the resource group to the Failed Offline state because a previous attempt to start the resource group failed.
Solution
If the resources of a resource group do not start and the resource group shows the Failed Offline state, the binder was unable to find a placement for the resources, and the resource group then shows the Sacrificed state.
Also, check whether other resources are in the Failed Offline state, and set those resources to the Online or Offline state.
Problem
PowerHA SystemMirror shows that the state of the resource group is Stuck Online.
Solution
PowerHA SystemMirror fails to selectively move the affected resource group to another cluster node when a node crashes or restarts. Check whether either more than half of the nodes in the cluster are online, or exactly half of the nodes are online and the tiebreaker is reserved. If fewer than half of the nodes are online, start additional nodes. If exactly half of the nodes are online, check the attribute of the active tiebreaker.
Problem
An application resource does not start, or its state is not reported correctly.
Solution
v Ensure that the path and arguments of all the scripts are correct.
v Manually run the start script, redirect all output to a file, and observe the behavior of the script. For example, run /usr/bin/application >/outputfile.
v Ensure that the start, stop, and monitor scripts return the correct values. The monitor script must return the value 1 for the ONLINE status or 2 for the OFFLINE status.
v The application process must be detached from the calling script by using either of the following methods (a sketch follows this list):
– Redirect all file handles to a file and start the application process in the background, for example, /path/to/application >/outputfile 2>&1 &
– Create a wrapper application that uses the setsid() C function to detach the application process from the calling StartScript.
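The following is a minimal sketch of a start script and a monitor script that follow these rules. The path /path/to/application and the process name myapp are placeholders, not values from your configuration; adapt them to your environment.

#!/bin/ksh
# StartScript (sketch): start the application detached from the calling script,
# with all file handles redirected so that the script returns immediately.
/path/to/application >/outputfile 2>&1 &
exit 0

#!/bin/ksh
# MonitorScript (sketch): return 1 when the application is ONLINE, 2 when it is OFFLINE.
if pgrep -f myapp >/dev/null 2>&1 ; then
    exit 1    # ONLINE
else
    exit 2    # OFFLINE
fi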
The PowerHA SystemMirror GUI provides the following advantages over the PowerHA SystemMirror
command line:
v Monitor the status for all clusters, sites, nodes, and resource groups in your environment in a single,
unified view. If any clusters are experiencing a problem, those clusters are always displayed at the top
of the view, so you will be sure to see them.
v Group clusters into zones to better organize your enterprise. Zones can be used to restrict user access
to clusters, and can be based on function, geographical location, customer, or any other characteristics
that make sense for your business.
v Management features allow authorized users to perform actions on their clusters, such as starting and stopping cluster services and resource groups, moving resource groups to new nodes, creating new clusters, creating new resource groups with resources, and more.
v User permissions provide security controls, so that users can be restricted to only the capabilities that
they are specifically authorized to have.
v Scan event summaries and read a detailed description for each event. If the event occurred because of
an error or issue in your environment, you can read suggested solutions to fix the problem.
v Search and compare log files side by side. Some commonly used log files are displayed in an easier-to-read format to make it easier to identify important information.
v View properties for a cluster such as the PowerHA SystemMirror version and name of sites and nodes.
The nodes in the clusters where you run the installation scripts must be running one of the following versions of the Linux operating system:
v SUSE Linux Enterprise Server (SLES) 12 SP1 (64-bit)
v Red Hat Enterprise Linux (RHEL) 7.2 (64-bit) or 7.3 (64-bit).
Note:
v Before using the PowerHA SystemMirror GUI, you must install and configure secure shell (SSH) on
each node.
v OpenSSL and OpenSSH must be installed on the system that is used as the PowerHA SystemMirror
GUI server.
You can install the PowerHA SystemMirror GUI on the Linux operating system by using the installPHAGUI script. For more information, see "Installing PowerHA SystemMirror GUI" on page 38.
OpenSSL and OpenSSH must be installed on the system that is used as the PowerHA SystemMirror GUI server. OpenSSL is used to create secure communication between the PowerHA SystemMirror GUI server and the nodes in the cluster. For more information, see the OpenSSL website and the OpenSSH website.
If the path of the SFTP subsystem that is specified in the /etc/ssh/sshd_config file is not correct, you must enter the correct path, and then restart the sshd subsystem.
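For example, on many Linux distributions the SFTP subsystem entry in the /etc/ssh/sshd_config file looks similar to the following line; the exact path to the sftp-server binary varies by distribution, so verify it on your system before you change the file:

Subsystem sftp /usr/libexec/openssh/sftp-server

# Restart the sshd subsystem after you correct the path
systemctl restart sshd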
You will need the following information to properly configure SSH for the PowerHA SystemMirror GUI:
Note: You need to connect to only one node in the cluster. After the node is connected, the PowerHA
SystemMirror GUI automatically adds all other nodes in the cluster.
v Host name or IP address of at least one node of the cluster.
v User ID and corresponding password which will be used for SSH authentication on that node.
v SSH password or SSH key location.
Prerequisites
The following prerequisites must be met before you install the PowerHA SystemMirror GUI on a Linux
system:
v The PowerHA SystemMirror package must be installed on your system.
v The KSH93 scripting package is required on each SUSE Linux Enterprise Server (SLES) and Red Hat Enterprise Linux (RHEL) system.
Note: If any previous installation of the agent is detected, it will be automatically uninstalled before
installing the new version.
4. To install the server only, run the following script:
./installPHAGUI -s -c
After the PowerHA SystemMirror GUI is installed, you must run the /usr/es/sbin/cluster/ui/server/bin/
smuiinst.ksh command to complete the installation process. The smuiinst.ksh command automatically
downloads and installs the remaining files that are required to complete the PowerHA SystemMirror GUI
installation process. These downloaded files are not shipped in the RPMs because the files are licensed
under the General Public License (GPL).
The PowerHA SystemMirror GUI server must have internet access or an HTTP proxy that is configured
to allow access to the internet to run the smuiinst.ksh command. If you are using an HTTP proxy, you
must run the smuiinst.ksh -p command to specify the proxy information, or you must specify the proxy
information by using the http_proxy environment variable. If the PowerHA SystemMirror GUI server
does not have internet access, complete the following steps:
1. Copy the smuiinst.ksh file from the PowerHA SystemMirror GUI server to a system that runs a UNIX-compatible operating system and that has internet access.
2. Run the smuiinst.ksh -d /directory command, where /directory is the location where you want to download the files. For example, ./smuiinst.ksh -d /tmp/smui_rpms.
3. Copy the downloaded files (/tmp/smui_rpms) to a directory on the PowerHA SystemMirror GUI server.
4. From the PowerHA SystemMirror GUI server, run the smuiinst.ksh -i /directory command, where /directory is the location where you copied the downloaded files (/tmp/smui_rpms). A consolidated example follows these steps.
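The following is a consolidated sketch of the offline procedure, assuming /tmp/smui_rpms as the download directory and gui-server as a hypothetical host name of the PowerHA SystemMirror GUI server:

# On a UNIX-compatible system that has internet access
./smuiinst.ksh -d /tmp/smui_rpms

# Copy the downloaded files to the PowerHA SystemMirror GUI server
scp -r /tmp/smui_rpms root@gui-server:/tmp/smui_rpms

# On the PowerHA SystemMirror GUI server
/usr/es/sbin/cluster/ui/server/bin/smuiinst.ksh -i /tmp/smui_rpms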
Note: The first time you log in to the PowerHA SystemMirror GUI, you must add clusters to the GUI or
create new clusters.
To add existing clusters to the PowerHA SystemMirror GUI, complete the following steps:
To create new clusters for the PowerHA SystemMirror GUI, complete the following steps:
Health summary
In the PowerHA SystemMirror GUI, you can quickly view all events for a cluster in your environment. The following figure identifies the different areas of the PowerHA SystemMirror GUI that are used to view events and status.
Figure 7. Health summary
Navigation pane
This area displays all the zones, clusters, sites, nodes, and resource groups in a hierarchy that was discovered by the PowerHA SystemMirror GUI. You can click each cluster to view its resources.
Note: The clusters are displayed in alphabetic order. However, any clusters that are in a Critical or Warning state are listed at the top of the list.
Note: The Network tab is not included in PowerHA SystemMirror for Linux Version 7.2.2.
Health Summary
This menu provides cluster administrative features for the selected item. You can select Add Cluster, Create Zone, Remove Cluster, or Create Cluster from the Health Summary menu.
Event filter
In this area, you can click the icons to display all events in your environment that correspond to a specific state. You can also search for specific event names.
Event timeline
This area displays events across a timeline of when each event occurred, so that you can view the progression of events that led to a problem. You can zoom in and out of the time range by using the + or - keys or by using the mouse scroll wheel.
Event list
This area displays the name of each event, the time when the event occurred, and a description of the event. The information that is displayed in this area corresponds to the events that you selected from the event timeline area. The most recent event is displayed first. You can click an event to display more detailed information, such as possible causes and suggested actions.
Action Menu
This area displays the following menu options:
User Management
The PowerHA SystemMirror GUI allows an administrator to create and manage users by using the User Management menu. The administrator can assign built-in roles to new users.
Note: You can add only user names that are defined on the host that runs the PowerHA SystemMirror GUI server.
Role Management
The Role Management tab displays information about the available roles for each user. An administrator can create custom roles and provide permissions to different users. The PowerHA SystemMirror GUI provides the following roles:
v ha_root
v ha_mon
v ha_op
v ha_admin
Zone Management
You can create zones, which are groups of clusters. An administrator can create zones and assign any number of clusters to a zone. You can also add new zones or edit existing zones.
View Activity Log
You can view information about all activities that were performed in the PowerHA SystemMirror GUI and that resulted in a change by using the View Activity Log tab. This view provides various filters to search for specific activities for cluster, role, zone, or user management changes.
Log files
You can use the following log files to troubleshoot the PowerHA SystemMirror GUI (an example of viewing them follows this list):
smui-server.log
This log file is located in the /usr/es/sbin/cluster/ui/server/logs/ directory. The
smui-server.log file contains information about the PowerHA SystemMirror GUI server.
smui-agent.log
This log file is located in the /usr/es/sbin/cluster/ui/agent/logs/ directory. The smui-agent.log
file contains information about the agent that is installed on each PowerHA SystemMirror node.
notify-event.log
This log file is located in the /usr/es/sbin/cluster/ui/agent/logs/ directory. The
notify-event.log file contains information about all PowerHA SystemMirror events that are sent
from the agent to the PowerHA SystemMirror server.
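For example, you can monitor the server log while you reproduce a problem; the same approach works for the agent logs on each node:

tail -f /usr/es/sbin/cluster/ui/server/logs/smui-server.log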
SUSE:
auth requisite pam_nologin.so
auth include common-auth
account requisite pam_nologin.so
account include common-account
Note: The PAM configuration occurs when you install the PowerHA SystemMirror GUI server.
If you are not able to add clusters to the PowerHA SystemMirror GUI, complete the following steps:
1. Check for issues in the /usr/es/sbin/cluster/ui/server/logs/smui-server.log file.
a. If sftp-related signatures exist in the log file, such as Received exit code 127 while establishing
SFTP session, a problem exists with the SSH communication between the PowerHA SystemMirror
GUI server and the cluster you are trying to add.
If the path is not correct, you must enter the correct path in the /etc/ssh/sshd_config file, and
then restart the sshd subsystem.
2. Check for issues in the /usr/es/sbin/cluster/ui/agent/logs/agent_deploy.log file on the target
cluster.
3. Check for issues in the /usr/es/sbin/cluster/ui/agent/logs/agent_distribution.log file on the
target cluster.
If the PowerHA SystemMirror GUI is not updating the cluster status or displaying new events, complete
the following steps:
1. Check for issues in the /usr/es/sbin/cluster/ui/server/logs/smui-server.log file.
2. Check for issues in the /usr/es/sbin/cluster/ui/agent/logs/smui-agent.log file. If a certificate-related problem exists in the log file, the certificate on the target cluster and the certificate on the server do not match. An example of a certificate error follows:
WebSocket server - Agent authentication failed, remoteAddress:::ffff:10.40.20.186, Reason:SELF_SIGNED_CERT_IN_CHAIN
The SAP HANA high availability policy defines all SAP HANA components as resources and starts or
stops them in a pre-defined sequence to provide high availability for your SAP HANA system. The SAP
HANA components must be specified as automated resources in PowerHA SystemMirror by using the Smart Assist.
The following components of the software stack in an SAP HANA installation must be highly available:
Primary and Secondary Host with the following processes on each host:
v The hdbdaemon daemon manages the following subprocesses:
– hdbindexserver
– hdbnameserver
– hdbxsengine
– hdbwebdispatcher
– hdbcompileserver
– hdbpreprocessor
v The sapstartsrv process, which starts, stops, and monitors the hdbdaemon daemon.
v IP address to access the primary host
Prerequisites
The following are general SAP HANA prerequisites for the system replication scenario with data preload:
v The SAP HANA software version of the secondary system must be the same as the version of the primary system.
v The secondary system must have the same SAP system ID (SID) and same instance number as the
primary system.
v System replication between two systems on the same host is not supported.
In this example, the SAP HANA high availability feature is defined for the SAP ID TS2 and the instance name HDB02. The SAP ID TS2 and the instance name HDB02 are provided during the SAP HANA installation.
Table 2. SAP HANA parameters (continued)
SI Number | Parameter Description | Value type | Value (Example)
8 | Specify all site names of your SAP HANA nodes. Note: Value harvesting is provided for this parameter. | List of values, value type for each value: String | HANA_SITE_1, HANA_SITE_2
Depending on the option that you select, one of the following actions is performed (an example with the values that are used in this scenario follows this list):
Yes, activate as new policy
The Smart Assist policy is activated as a new policy. If you choose this option, the following command is invoked:
clmgr add smart_assist APPLICATION=SAP_HANA SID=<value> INSTANCE=<instance value>
Yes, activate by updating currently active policy
The Smart Assist policy is activated by updating the currently active policy. If you choose this option, the following command is invoked:
clmgr update smart_assist APPLICATION=SAP_HANA SID=<value> INSTANCE=<instance value>
No, save modifications and exit
The Smart Assist policy activation is not performed. Your modifications are saved and the wizard
is closed.
No, return to parameter overview
The Smart Assist policy activation is not performed. Your modifications are not saved and the
wizard returns to the Overview dialog.
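For example, with the SAP ID TS2 and the instance name HDB02 that are used in this scenario, activating the Smart Assist policy as a new policy invokes a command similar to the following:

clmgr add smart_assist APPLICATION=SAP_HANA SID=TS2 INSTANCE=HDB02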
To verify that you can start and stop the SAP HANA high availability solution, perform the following steps (a consolidated example follows this list):
v To start your SAP HANA system, enter the clmgr online resource_group ALL command.
v To display the different resource groups that were created by the wizard and their corresponding states, enter the clRGinfo command.
v To stop your SAP HANA system, enter the clmgr offline resource_group ALL command.
The high availability setup reduces the downtime of an SAP system in case of any software or hardware
failures. The high availability solution for SAP uses PowerHA SystemMirror to automate all SAP
components. PowerHA SystemMirror detects failed components and restarts or initiates a failover. This
setup also helps to reduce the operational complexity of an SAP environment and to avoid operator
errors resulting from this complexity. PowerHA SystemMirror supports high availability for ABAP and
JAVA SAP Central Services installations.
v User-defined resources, such as a persistent IP address, might be deleted when you configure Smart Assist policies.
v You must configure the NFS tiebreaker by using the clmgr command.
Prerequisites
The following are SAP NetWeaver prerequisites for configurations that involve JAVA, ABAP, and the application server:
v The SAP HANA software version of the secondary system must be the same as the version of the primary system.
v The SAP system ID (SID) and instance number of SAP NetWeaver must be the same on both cluster nodes (primary and failover node).
v You must authorize the <sid>adm user to run PowerHA SystemMirror cluster commands.
Examples
1. To configure the SAP ABAP (ASCS) HA solution, enter the following command:
clmgr setup smart_assist APPLICATION=SAP_ABAP SID=TS2 INSTANCE=SCS02
2. To configure the SAP JAVA (SCS) HA solution, enter the following command:
clmgr setup smart_assist APPLICATION=SAP_JAVA SID=TS2 INSTANCE=SCS02
3. To configure the SAP application server HA solution, enter the following command:
clmgr setup smart_assist APPLICATION=SAP_APPSERVER SID=TS2
SAP NetWeaver high availability is defined for the SAP ID TS2 and the instance name SCS02. The SAP ID TS2 and the instance name SCS02 are provided during the SAP NetWeaver installation.
To create the SAP NetWeaver Smart Assist setup, complete the following steps:
1. From the command line, run the clmgr setup smart_assist command.
2. Enter the flag value as follows:
APPLICATION
Enter the name of the application to configure by using Smart Assist, for example, SAP_ABAP, SAP_JAVA, or SAP_APPSERVER.
SID Enter the SAP system ID configured during SAP installation.
INSTANCE
Instance name is used for the instance directory that contains all necessary files for the SAP
Central Services instance.
Table 3. ABAP policy parameters (continued)
SI Number | Parameter Description | Value type | Value (Example)
17 | Do you want SA MP to automate your SAP router? | {yes|no} | yes
17.1 | Enter the desired prefix for the SAP router resources | String | SAP_ROUTER
17.2 | Enter the nodes where you want to automate the SAP router | List of values, value type for each value: Hostname | Node1, Node2
17.3 | Specify the virtual IPv4 address that clients will use to connect to the SAP router | IP version 4 address | 172.19.15.17
17.4 | Specify the netmask for the SAP router virtual IP address | IP version 4 address | 255.255.255.0
17.5 | Enter the network interface for the SAP router IP address | String (plus additional value checking) | Eth1
17.6 | Specify the fully qualified SAP router routing table filename | String | /usr/sap/TS2/SYS/global/saprouttab
18 | Do you want SA MP to automate the SAP Web dispatcher? | {yes|no} | yes
18.1 | Enter the desired prefix for the SAP Web dispatcher resources | String | SAP_AWISP
18.2 | Enter the nodes where you want to automate the SAP Web dispatcher | List of values, value type for each value: Hostname | Node1, Node2
18.3 | Specify the SAP system ID (SAPSID) for the SAP Web dispatcher | String (plus additional value checking) | W0
18.4 | Specify the instance owner username that will be used to execute the start, stop, and monitor commands for the SAP Web dispatcher resources | String | W0adm
18.5 | Specify the instance name of the SAP Web dispatcher instance, for example, 'W00' | String | W00
18.6 | Specify the virtual hostname for the SAP Web dispatcher | Hostname | Node1, Node2
18.7 | Specify the virtual IPv4 address that clients will use to connect to the SAP Web dispatcher | IP version 4 address | 172.19.15.18
18.8 | Specify the netmask for the SAP Web dispatcher virtual IP address | IP version 4 address | 255.255.255.0
18.9 | Specify the network interface on which the SAP Web dispatcher virtual IP address is activated on each node as an alias | String (plus additional value checking) | Eth1
Table 4. JAVA policy parameters (continued)
SI Number | Parameter Description | Value type | Value (Example)
15.1 | Specify the virtual hostname for each application server. Use the same order as for the nodes in one of the previous questions. If you installed one of the application servers without a virtual hostname, specify the system hostname instead. | List of values, value type for each value: Hostname | Node1, Node2
16 | Do you want to automate the SAP Host Agent? | {yes|no} | no
16.1 | Enter the nodes where you want to automate the SAP Host Agent | List of values, value type for each value: Hostname | Node1, Node2
17 | Do you want PHA to automate your SAP router? | {yes|no} | yes
17.1 | Enter the desired prefix for the SAP router resources | String | SAP_ROUTER
17.2 | Enter the nodes where you want to automate the SAP router | List of values, value type for each value: Hostname | Node1, Node2
17.3 | Specify the virtual IPv4 address that clients will use to connect to the SAP router | IP version 4 address | 172.19.15.17
17.4 | Specify the netmask for the SAP router virtual IP address | IP version 4 address | 255.255.255.0
17.5 | Enter the network interface for the SAP router IP address | String (plus additional value checking) | Eth1
17.6 | Specify the fully qualified SAP router routing table filename | String | /usr/sap/TS2/SYS/global/saprouttab
18 | Do you want PHA to automate the SAP Web dispatcher? | {yes|no} | yes
18.1 | Enter the desired prefix for the SAP Web dispatcher resources | String | SAP_WDISP
18.2 | Enter the nodes where you want to automate the SAP Web dispatcher | List of values, value type for each value: Hostname | Node1, Node2
18.3 | Specify the SAP system ID (SAPSID) for the SAP Web dispatcher | String (plus additional value checking) | JW1
18.4 | Specify the instance owner username that will be used to execute the start, stop, and monitor commands for the SAP Web dispatcher resources | String | Jw1adm
18.5 | Specify the instance name of the SAP Web dispatcher instance, for example, 'W00' | String | W00
18.6 | Specify the virtual hostname for the SAP Web dispatcher | Hostname | Node1, Node2
18.7 | Specify the virtual IPv4 address that clients will use to connect to the SAP Web dispatcher | IP version 4 address | 172.19.15.18
Depending on the option that you select, one of the following actions is performed:
Yes, activate as new policy
The Smart Assist policy is activated as a new policy. If you choose this option, the following command is invoked:
clmgr add smart_assist APPLICATION=SAP_HANA SID=<value> INSTANCE=<instance value>
Yes, activate by updating currently active policy
The Smart Assist policy is activated by updating the currently active policy. If you choose this option, the following command is invoked:
clmgr update smart_assist APPLICATION=SAP_HANA SID=<value> INSTANCE=<instance value>
No, save modifications and exit
The Smart Assist policy activation is not performed. Your modifications are saved and the wizard
is closed.
No, return to parameter overview
The Smart Assist policy activation is not performed. Your modifications are not saved and the
wizard returns to the Overview dialog.
Modify Smart Assist
Modifies the active policy and removes all the existing resources. Resources that are not deleted are not stopped.
Delete Smart Assist
Deactivates the active policy. All existing resources are deleted.
To verify that you can start and stop the SAP NetWeaver high availability solution, perform the following steps:
v To start your SAP NetWeaver system, enter the clmgr online resource_group ALL command.
v To display the different resource groups that were created by the wizard and their corresponding states, enter the clRGinfo command.
v To stop your SAP NetWeaver system, enter the clmgr offline resource_group ALL command.
Problem
PowerHA SystemMirror is not able to harvest some values during Wizard execution.
Solution
This problem has two possible causes:
v Check that all the nodes of the cluster are online. You can check the node status by using the clmgr query node -v command.
v Run the harvest command that is mentioned in the Smart Assist wizard manually on the Linux system. If the harvest command works, the failure is probably caused by a delay in the system response, because Smart Assist waits for 10 seconds for a response from the harvest command.
Problem
The replication mode or replication site name that is configured in the Smart Assist wizard does not match the SAP HANA replication setup.
Solution
The SAP HANA replication mode and replication site name that are configured in the Smart Assist wizard must be the same as the values that were provided during the SAP HANA replication setup.
Problem
The PowerHA SystemMirror policy fails to activate because a file or path cannot be found.
Solution
Problem
The Smart Assist wizard does not detect or show one or more Ethernet interfaces in the list.
Solution
Check whether the Ethernet interfaces are up and running. If the interfaces were brought up only recently, Smart Assist might take some time to detect them.
Notices
This information was developed for products and services offered in the US.
IBM may not offer the products, services, or features discussed in this document in other countries.
Consult your local IBM representative for information on the products and services currently available in
your area. Any reference to an IBM product, program, or service is not intended to state or imply that
only that IBM product, program, or service may be used. Any functionally equivalent product, program,
or service that does not infringe any IBM intellectual property right may be used instead. However, it is
the user's responsibility to evaluate and verify the operation of any non-IBM product, program, or
service.
IBM may have patents or pending patent applications covering subject matter described in this
document. The furnishing of this document does not grant you any license to these patents. You can send
license inquiries, in writing, to:
For license inquiries regarding double-byte character set (DBCS) information, contact the IBM Intellectual
Property Department in your country or send inquiries, in writing, to:
This information could include technical inaccuracies or typographical errors. Changes are periodically
made to the information herein; these changes will be incorporated in new editions of the publication.
IBM may make improvements and/or changes in the product(s) and/or the program(s) described in this
publication at any time without notice.
Any references in this information to non-IBM websites are provided for convenience only and do not in
any manner serve as an endorsement of those websites. The materials at those websites are not part of
the materials for this IBM product and use of those websites is at your own risk.
IBM may use or distribute any of the information you provide in any way it believes appropriate without
incurring any obligation to you.
Licensees of this program who wish to have information about it for the purpose of enabling: (i) the
exchange of information between independently created programs and other programs (including this
one) and (ii) the mutual use of the information which has been exchanged, should contact:
Such information may be available, subject to appropriate terms and conditions, including in some cases,
payment of a fee.
The licensed program described in this document and all licensed material available for it are provided
by IBM under terms of the IBM Customer Agreement, IBM International Program License Agreement or
any equivalent agreement between us.
The performance data and client examples cited are presented for illustrative purposes only. Actual
performance results may vary depending on specific configurations and operating conditions.
Information concerning non-IBM products was obtained from the suppliers of those products, their
published announcements or other publicly available sources. IBM has not tested those products and
cannot confirm the accuracy of performance, compatibility or any other claims related to non-IBM
products. Questions on the capabilities of non-IBM products should be addressed to the suppliers of
those products.
Statements regarding IBM's future direction or intent are subject to change or withdrawal without notice,
and represent goals and objectives only.
All IBM prices shown are IBM's suggested retail prices, are current and are subject to change without
notice. Dealer prices may vary.
This information is for planning purposes only. The information herein is subject to change before the
products described become available.
This information contains examples of data and reports used in daily business operations. To illustrate
them as completely as possible, the examples include the names of individuals, companies, brands, and
products. All of these names are fictitious and any similarity to actual people or business enterprises is
entirely coincidental.
COPYRIGHT LICENSE:
This information contains sample application programs in source language, which illustrate programming
techniques on various operating platforms. You may copy, modify, and distribute these sample programs
in any form without payment to IBM, for the purposes of developing, using, marketing or distributing
application programs conforming to the application programming interface for the operating platform for
which the sample programs are written. These examples have not been thoroughly tested under all
conditions. IBM, therefore, cannot guarantee or imply reliability, serviceability, or function of these
programs. The sample programs are provided "AS IS", without warranty of any kind. IBM shall not be
liable for any damages arising out of your use of the sample programs.
Each copy or any portion of these sample programs or any derivative work must include a copyright
notice as follows:
Portions of this code are derived from IBM Corp. Sample Programs.
This Software Offering does not use cookies or other technologies to collect personally identifiable
information.
If the configurations deployed for this Software Offering provide you as the customer the ability to collect
personally identifiable information from end users via cookies and other technologies, you should seek
your own legal advice about any laws applicable to such data collection, including any requirements for
notice and consent.
For more information about the use of various technologies, including cookies, for these purposes, see
IBM’s Privacy Policy at http://www.ibm.com/privacy and IBM’s Online Privacy Statement at
http://www.ibm.com/privacy/details the section entitled “Cookies, Web Beacons and Other
Technologies” and the “IBM Software Products and Software-as-a-Service Privacy Statement” at
http://www.ibm.com/software/info/product-privacy.
Trademarks
IBM, the IBM logo, and ibm.com are trademarks or registered trademarks of International Business
Machines Corp., registered in many jurisdictions worldwide. Other product and service names might be
trademarks of IBM or other companies. A current list of IBM trademarks is available on the web at
Copyright and trademark information at www.ibm.com/legal/copytrade.shtml.
Java and all Java-based trademarks and logos are trademarks or registered trademarks of Oracle and/or
its affiliates.
Linux is a registered trademark of Linus Torvalds in the United States, other countries, or both.
Red Hat, the Red Hat "Shadow Man" logo, and all Red Hat-based trademarks and logos are trademarks
or registered trademarks of Red Hat, Inc., in the United States and other countries.
UNIX is a registered trademark of The Open Group in the United States and other countries.
Index

Special characters
/var/pha/log/clmgr/clutils.log 29
/var/pha/log/hacmp.out 29

A
application 8

C
client 4
cluster
    client 4
    IP address takeover 6
    network 3, 4
    node 3, 4
    physical components 2
    resource groups 9
    reviewing message log files 29
communication interface 5
configuration
    standby 11
        example 11
    takeover 12
        mutual 13
        one-sided 13
        two-node mutual 14

E
Events 40
example
    standby configuration 11

H
heartbeating 6
    point-to-point network 7
    TCP/IP network 6

I
Installing 38
IP address takeover 6
issues
    PowerHA SystemMirror startup 30, 32, 33, 34, 35, 56

L
log
    reviewing cluster message 29
Log files 40
Logging in 39
logical network 5

M
message log
    reviewing 29
monitoring
    persistent node IP labels 8

N
Navigating 40
network 3, 4
    communication interfaces 5
    heartbeating 6
    logical 5
    physical 5
node 3, 4

P
persistent node IP label 8
physical network 5
Planning 37
PowerHA SystemMirror 3
PowerHA SystemMirror startup issues 30, 32, 33, 34, 35, 56

R
resource
    applications 8
    service IP address 8
    service IP label 8
resource group 9
    fallback 9
    fallover 9
    startup 9
reviewing
    message log files 29

S
service IP address 8
service IP label 8
software 10
    components 10
Split configurations 7

T
Troubleshooting 42