Ensuring Business Continuity with Policy-Based Replication and Policy-Based HA
Vasfi Gucer
Bernd Albrecht
Erwan Auffret
Byron Grossnickle
Carsten Larsen
Thomas Vogel
IBM Redbooks
October 2024
SG24-8569-00
Note: Before using this information and the product it supports, read the information in “Notices” on
page ix.
Notices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . vii
Trademarks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . viii
Preface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ix
Authors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ix
Now you can become a published author, too! . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xi
Comments welcome. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xi
Stay connected to IBM Redbooks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xi
Chapter 1. Introduction. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.1 Recovery Time Objectives and Recovery Point Objectives . . . . . . . . . . . . . . . . . . . . . . 2
1.2 Synchronous, asynchronous, and policy-based replication . . . . . . . . . . . . . . . . . . . . . . 3
1.2.1 Synchronous replication . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
1.2.2 Asynchronous replication . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
1.2.3 Asynchronous replication with snapshots . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
1.2.4 IBM policy-based replication . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
1.3 Data consistency . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
1.4 Policy-based HA . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
1.5 Summary of storage business continuity strategies . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
1.6 IBM FlashSystem grid . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
Notices
This information was developed for products and services offered in the US. This material might be available
from IBM in other languages. However, you may be required to own a copy of the product or product version in
that language in order to access it.
IBM may not offer the products, services, or features discussed in this document in other countries. Consult
your local IBM representative for information on the products and services currently available in your area. Any
reference to an IBM product, program, or service is not intended to state or imply that only that IBM product,
program, or service may be used. Any functionally equivalent product, program, or service that does not
infringe any IBM intellectual property right may be used instead. However, it is the user’s responsibility to
evaluate and verify the operation of any non-IBM product, program, or service.
IBM may have patents or pending patent applications covering subject matter described in this document. The
furnishing of this document does not grant you any license to these patents. You can send license inquiries, in
writing, to:
IBM Director of Licensing, IBM Corporation, North Castle Drive, MD-NC119, Armonk, NY 10504-1785, US
This information could include technical inaccuracies or typographical errors. Changes are periodically made
to the information herein; these changes will be incorporated in new editions of the publication. IBM may make
improvements and/or changes in the product(s) and/or the program(s) described in this publication at any time
without notice.
Any references in this information to non-IBM websites are provided for convenience only and do not in any
manner serve as an endorsement of those websites. The materials at those websites are not part of the
materials for this IBM product and use of those websites is at your own risk.
IBM may use or distribute any of the information you provide in any way it believes appropriate without
incurring any obligation to you.
The performance data and client examples cited are presented for illustrative purposes only. Actual
performance results may vary depending on specific configurations and operating conditions.
Information concerning non-IBM products was obtained from the suppliers of those products, their published
announcements or other publicly available sources. IBM has not tested those products and cannot confirm the
accuracy of performance, compatibility or any other claims related to non-IBM products. Questions on the
capabilities of non-IBM products should be addressed to the suppliers of those products.
Statements regarding IBM’s future direction or intent are subject to change or withdrawal without notice, and
represent goals and objectives only.
This information contains examples of data and reports used in daily business operations. To illustrate them
as completely as possible, the examples include the names of individuals, companies, brands, and products.
All of these names are fictitious and any similarity to actual people or business enterprises is entirely
coincidental.
COPYRIGHT LICENSE:
This information contains sample application programs in source language, which illustrate programming
techniques on various operating platforms. You may copy, modify, and distribute these sample programs in
any form without payment to IBM, for the purposes of developing, using, marketing or distributing application
programs conforming to the application programming interface for the operating platform for which the sample
programs are written. These examples have not been thoroughly tested under all conditions. IBM, therefore,
cannot guarantee or imply reliability, serviceability, or function of these programs. The sample programs are
provided “AS IS”, without warranty of any kind. IBM shall not be liable for any damages arising out of your use
of the sample programs.
The following terms are trademarks or registered trademarks of International Business Machines Corporation,
and might also be trademarks or registered trademarks in other countries.
IBM®, IBM FlashSystem®, HyperSwap®, Redbooks®, Redbooks (logo)®
UNIX is a registered trademark of The Open Group in the United States and other countries.
VMware, and the VMware logo are registered trademarks or trademarks of VMware, Inc. or its subsidiaries in
the United States and/or other jurisdictions.
Other company, product, or service names may be trademarks or service marks of others.
Preface
In today's digital age, downtime is not an option. Businesses rely on constant access to
critical data to maintain productivity and ensure customer satisfaction. IBM® Storage
Virtualize offers functionalities to safeguard your data against various threats. Policy-based
replication and policy-based high availability (HA) protect against site failures by
automatically failing over to a secondary site, helping ensure business continuity.
This IBM Redbooks® publication delves into the powerful tools of IBM policy-based replication
and IBM policy-based high availability, empowering you to create a robust disaster recovery
plan that minimizes downtime and maximizes data protection.
Whether you are a seasoned IT professional or just starting to explore business continuity
solutions, this book provides a comprehensive guide to navigating these essential
technologies and building a resilient IT infrastructure.
Authors
This book was produced by a team of specialists from around the world.
Joanne E Borrett, Chris Bulmer, Chris Canto, Daniel Dent, Lucy Harris, Russell
Kinmond, Bill Passingham, Evelyn Perez, Nolan Rogers, David Seager
IBM UK
Biser Vasilev
IBM Bulgaria
Ivo Gomilsek
IBM Austria
Find out more about the residency program, browse the residency index, and apply online at:
ibm.com/redbooks/residencies.html
Comments welcome
Your comments are important to us!
We want our books to be as helpful as possible. Send us your comments about this book or
other IBM Redbooks publications in one of the following ways:
Use the online Contact us review Redbooks form found at:
ibm.com/redbooks
Send your comments in an email to:
[email protected]
Mail your comments to:
IBM Corporation, IBM Redbooks
Dept. HYTD Mail Station P099
2455 South Road
Poughkeepsie, NY 12601-5400
Explore new Redbooks publications, residencies, and workshops with the IBM Redbooks
weekly newsletter:
https://www.redbooks.ibm.com/subscribe
Stay current on recent Redbooks publications with RSS Feeds:
https://www.redbooks.ibm.com/rss.html
Chapter 1. Introduction
Business continuity ensures that an organization can deliver services even during disruptions.
Although some applications might tolerate temporary outages, major disasters can cause
significant downtime and data loss, leading to immense costs for recovery. Organizations
should minimize data loss and downtime to lessen business impact and financial strain.
From a storage perspective, business continuity involves maintaining data consistency and
availability for uninterrupted application access. Two key concepts contribute to this: Disaster
recovery (DR) and high availability (HA). DR focuses on replicating data to remote locations
for recovery, and HA prioritizes continuous data accessibility.
Disasters can range from entire site outages to data corruption or theft. Data protection
typically involves local or remote data backups. IBM Storage Virtualize offers functionalities to
safeguard your data against various threats. Policy-based replication and policy-based HA
protect against site failures by automatically failing over to a secondary site, helping ensure
business continuity. Although not covered in this book, Storage Virtualize offers additional
features like Snapshots and Safeguarded Snapshots to protect against data corruption or
cyberattacks.
In a disaster recovery environment, where a production site runs the applications and
replicates data to a recovery site, the data on the recovery site can be older than the data on
the production site, depending on the replication mode. The time gap between these two
versions represents the amount of data that is potentially lost in a disaster. It is referred to as
the Recovery Point Objective (RPO).
The time needed to recover access to the latest available data is the Recovery Time Objective
(RTO). It is typically the time needed to reload the latest available data and to mount volumes
to servers on the recovery site; it corresponds to the application downtime.
When cycle-based asynchronous replication is used, the cycle period defines the recovery
point. See Figure 1-1.
When synchronous replication is used, the recovery point is reduced to zero because the
available version of the data on the recovery site is equivalent to the latest version on the
production site. There is no data loss in the event of a disaster. See Figure 1-2 on page 3.
The RTO is not related to the type of replication between production and recovery sites.
Whenever a disaster occurs, even with a synchronous replication, recovered data still needs
to be presented to servers on the recovery site. The recovery time can be reduced when
servers are pre-attached to the recovery system, but the servers must still mount the
recovered data. This is generally done manually or scripted, and there is still a downtime for
business-critical applications.
With policy-based HA, servers are pre-mapped to volumes that are instantly accessible. If
there is a disaster, they automatically fail over to the surviving site to access the data. The
recovery time in that case is reduced to zero because there is no downtime. See Figure 1-3.
A typical DR implementation involves a production site where applications run and access
local data. Additionally, a secondary or recovery site stores copies of this production data.
This helps ensure that even if the primary site becomes unavailable, you can access and
restore critical data from the secondary location.
This method helps ensure that both production and recovery systems maintain identical data
copies. However, there are tradeoffs:
Increased write response times. Because writes involve sending data to the recovery site
and waiting for confirmation, application performance can be impacted.
Impact of round-trip time (RTT). The longer the distance between the production and
recovery sites (measured in milliseconds or ms), the higher the write response times
because of the additional data travel and confirmation cycle.
Therefore, synchronous replication is best suited for scenarios with very low RTT, ideally less
than 1 millisecond to minimize performance drawbacks.
In earlier versions of IBM Storage Virtualize, synchronous replication was facilitated by the
Metro Mirror remote copy service. However, for geographically dispersed sites where
distance creates high RTT, synchronous replication becomes impractical. The other option is
asynchronous replication.
Additionally, because the hosts do not wait for the replication to finish on the recovery site,
there might be a gap between the data on the production and recovery sites if a disaster
occurs before the replication finishes.
By dissociating application server activity and data replication, this method optimizes overall
system efficiency. Applications and replication operate independently, minimizing
performance bottlenecks. See Figure 1-5.
In this mode, replicated data on the recovery site can be older than the production data
because data is likely to have changed since the last completed cycle. In the preceding
example, the recovery site started receiving "A" while "B" was being written on the production
site. The recovery site receives "B" in the next cycle.
The write change rate on the data determines the size of the snapshots and therefore the
amount of data to be replicated. Some areas of the data volumes can change several times
between cycles, but only the latest changes are replicated, which reduces the amount of data
to transfer.
The frequency of the cycles dictates the age of the latest available copy on the recovery site.
The cycles should be frequent enough to minimize the time gap between a disaster event and
the latest completed cycle.
In earlier versions of IBM Storage Virtualize, asynchronous replication was managed by two
primary remote copy services:
Global Mirror. This service facilitated basic asynchronous replication.
Global Mirror with Change Volumes (cycling-mode). This advanced version offered
asynchronous replication with snapshots, similar to the functionality described in this
section.
Therefore, the system always strives to provide the best possible recovery point based on the
current workload and available bandwidth. With journaling mode, this is achieved by using a
journal to record every write operation on the production volumes. The system monitors this
journal and triggers replication operations dynamically, eliminating the need for predefined
replication cycles.
Journals are used in journaling mode and snapshots are used in cycling mode, maintaining
consistency at all times. To help ensure consistent data on the recovery site, the system
automatically creates a snapshot before it initiates the resynchronization process. This
snapshot guarantees that the order of writes on the recovery site mirrors the production site,
maintaining data integrity.
To achieve a high frequency replication and maintain the most recent data on the recovery
site, the bandwidth between the two sites must be sufficient to handle the write throughput of
the production site.
Journaling mode is the preferred replication method because it offers a lower RPO. However,
the system might switch to cycling mode if it cannot sustain the write volume required for
journaling because of bandwidth limitations. In cycling mode, the system captures periodic
snapshots of the production volume and replicates only the changes since the last snapshot.
This reduces the amount of transferred data but increases the potential recovery point.
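As a hedged illustration (the lsvolumegroupreplication command is the same one that is exposed through the REST API later in this book, but the exact output fields vary by release), the replication state and RPO compliance of a volume group can be checked from the CLI:
# List the replication status of all volume groups
lsvolumegroupreplication
# Show the status of one volume group; fields such as location2_within_rpo and
# link1_status indicate whether the recovery copy is within its RPO
lsvolumegroupreplication <volume_group_name>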
Data consistency applies within each volume and across volumes. Some applications need
blocks from different volumes to assemble usable data, so consistency must be maintained
between volumes that are associated with the same application.
The order in which data changes are applied on the recovery site, whether within a volume or
across volumes of a volume group, is crucial to maintaining data consistency. IBM Storage
Virtualize policy-based replication uses an in-memory journal while in journaling mode. The
journal tracks, in sequence, the changes that are made on volumes within the volume groups.
The journal in journaling mode acts as a buffer for write I/Os on the production site. This
allows data to be written locally without waiting for the entire replication process to finish.
Hosts can continue operations without delays caused by replication.
In cycling mode, to help ensure data consistency during resynchronization, the system
automatically creates a snapshot of the volumes before initiating the process. This snapshot
provides a known, consistent state of the data. If the resynchronization fails, the system can
revert to this snapshot, guaranteeing data integrity on the recovery site.
1.4 Policy-based HA
In storage infrastructure, high availability (HA) helps ensure that applications on hosts can
access their data continuously even if there is a failure in the primary storage system. This
is achieved by maintaining a fully synchronized copy of the data on a peer system, which
allows application access through either system so that data access is maintained even
during a disaster.
With storage partitions, users do not have to worry about manually mapping the hosts to the
volume group copy, because it is already prepared on both systems. Nor do they have to worry
about volume recognition by the hosts, because the volume UID is the same on both systems.
To ensure complete consistency, HA replicates not only data but also the configuration of the
storage partition (including host definitions and mappings) to the remote site whenever
changes occur. This configuration information is typically managed on a preferred system for
the selected partition and stored on both systems.
When the location is explicitly set for hosts, read and write operations are localized. This
means hosts at a specific site access the copy of the data available on the storage system at
the same site, assuming that the host location is configured correctly.
Additionally, management of the volume groups, storage partitions, and the policy is
centralized on an active management system. The active management system is usually the
preferred management location.
If an outage or other failure happens on the current active management system, the active
management system automatically fails over to the other system.
If a system fails at the local site, hosts at that site automatically switch to the system at the
remote site to access the data by using their ALUA-compliant multipath policy.
Legacy Metro Mirror: RPO 0; RTO > 0 (not HA); short distance and low RTT.
1.6 IBM FlashSystem grid
IBM Storage Virtualize, with the adoption of storage partitions and volume groups, dissociates
the business continuity requirements (HA, replication) from hardware systems and moves
further toward the creation of multiple software-defined virtual storage systems within a single
FlashSystem deployment.
By using the FlashSystem grid approach, users can create federated and scalable clusters of
independent storage devices and failure domains. From an application angle, through the use
of storage partitions, users can add HA and DR resilience to applications through manual or
automated nondisruptive data movement. The FlashSystem grid approach also enables
easier device migration and consolidation and rebalancing of storage capacity and
performance over several systems.
Clients can aggregate IBM FlashSystem or SVC systems and manage them as a single
scalable storage grid, which is engineered for high availability, replication, and nondisruptive
application data migration. Systems that are involved in replication and HA can participate in
the same FlashSystem grid.
The historical approach of clustering nodes with IBM Storage Virtualize was a “per I/O group”
one. Pairs of nodes were the bricks of a cluster solution design that was more “scale-up”
oriented. With the introduction of IBM FlashSystem grid, the clustering granularity is slightly
different. It is now the systems themselves that scale out and form a single solution with a
single point of management. There are fewer requirements for hardware compatibility, and
performance and capacity can scale linearly.
After storage partitions are configured, they can be moved from one system to another,
manually balanced by users over several systems and sites. They can also be stretched over
two sites for high availability. See Figure 1-8 on page 11.
At the time of this writing, FlashSystem grid features (partition mobility) are manageable with
IBM Storage Insights Pro only.
It is possible to use the CLI to create a FlashSystem grid and add or remove systems in a
FlashSystem grid.
This solution uses policies, such as provisioning and replication policies, to define the overall
replication behavior. Volume groups serve as the smallest unit, and the assigned replication
policy dictates how data is replicated. The replication policy states what systems to replicate
between and the desired recovery point objective (RPO). Different volume groups can have
differing policies to allow an organization to replicate to a maximum of 3 DR systems and
prioritize the value of sets of data. This prioritization occurs only if resources are restricted. If
no restriction exists, all volume groups are treated with equal value. Because the volumes are
in a volume group, the system maintains consistency between them.
The bandwidth limit on the partnership that is used for asynchronous policy-based replication
dictates how much data can be sent between systems. The bandwidth limit helps ensure that
a particular system does not overload the shared inter-site WAN link. The bandwidth limit
value does not take into consideration data compression, so if the data is being compressed
by native IP-based replication or FCIP-based replication, set the limit accordingly. The
bandwidth limit is per I/O group on a multi-I/O group system.
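As a hedged sketch (the -linkbandwidthmbits and -backgroundcopyrate parameters exist for Fibre Channel partnerships, but verify the exact syntax for your code level; the values are placeholders), the bandwidth limit can be adjusted on an existing partnership from the CLI:
# Allow 2000 Mbps on the partnership link, with up to 50 percent for background copy
chpartnership -linkbandwidthmbits 2000 -backgroundcopyrate 50 <remote_system_name>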
There are generally two types of methodologies for asynchronous replication that will be
explored in more depth in the following sections. These methods are journaling, which is used
by Global Mirror/Global Mirror with Consistency Protection, and cycling, which is used by
Global Mirror with Change Volumes.
In IBM Storage Virtualize, journaling is implemented as shown in Figure 2-1. In IBM's DR
systems, each controller or node employs a non-volatile bitmap and a volatile journal. A
volume is divided into fixed-size regions called grains, where each grain is a contiguous 128
KiB segment. Each bit in the bitmap represents the status of a corresponding grain.
When a write request arrives at a controller or node, it is mirrored to the other node. The
bitmap for the affected grains is marked as dirty, indicating a pending write operation. This
updated bitmap is then synchronized with the other controller or node. Subsequently, the
write is sequenced, assigned a unique identifier, and acknowledged back to the host system.
DR systems use sequence numbers to maintain data integrity during replication to ensure
that data is written to remote storage in the correct order. This guarantees a consistent
point-in-time reflection of the data at the recovery site though it might not be the most
up-to-date version. The journal can be volatile. However, the bitmap effectively tracks what
data was sent. So, if the journal is corrupted, the bitmap can be used to enable recovery. The
bitmap uses fewer resources than the journal and minimizes network traffic between nodes.
In IBM's DR systems, maintaining a consistent point-in-time copy at the recovery site relies on
sending data sequentially. However, if the link fails or replication stops, restarting requires the
bitmap to identify which data needs transmission.
Because the journal might be unavailable during this interruption, the order of transactions
and potentially some data might be lost. In such scenarios, change volumes are used to
create a potentially outdated recovery point on the target site while resynchronization occurs.
This ensures a recoverable state until data synchronization is complete.
Change volumes are always used during resynchronization, so include space for them when
designing policy-based replication.
Tip: As a rule, change volumes can use up to 10% of the storage capacity on both the
production and recovery systems.
Although not visible through the GUI, change volumes can be accessed using the lsvdisk
-showhidden command or the lsfcmap -showhidden command.
The change volumes that are associated with the relationship are used for the point-in-time
snapshots.
The advantage to this method is that it tolerates low-bandwidth links and problems with site
connectivity. The disadvantage to this method is that it is hard to maintain a very low RPO.
See Figure 2-2 on page 35.
Barring constraints, policy-based replication always prefers journaling mode regardless of the
stated RPO on the policy. This keeps all volume groups at the lowest possible RPO.
For example, take a system with two replication policies to the same target system. One
policy has an RPO alert of 5 minutes and the other has an RPO alert of 60 minutes. In this
example, the client has a peak workload in the evening that overloads their connection
bandwidth between the two sites. When constraints appear on the system, it can convert
some or all of the volume groups with the 60-minute RPO policy to cycling mode. The change
keeps them within their 60-minute RPO, so it can dedicate more bandwidth to the volume
groups with the stated 5-minute RPO and keep them in journaling mode. When the constraint
no longer exists, the system converts the affected volume groups from cycling mode back into
journaling mode.
Enabling access is the means for disaster recovery and can also be used for testing.
However, if testing is the primary purpose, one of the other two methods is better suited. See
Figure 2-4 and Figure 2-5 on page 38.
To initiate a recovery test on a volume group named BGTest, enter the following command on
the command-line interface (CLI) or REST API:
chvolumegroupreplication -startrecoverytest BGTest
Recovery volumes are offline, but when a recovery test is initiated, these volumes come
online and can be mounted to a server with read/write access for testing. See Figure 2-7 on
page 40.
When the recovery test is terminated, all changes to the volumes in the recovery volume
group are overwritten by what has changed on the source and the recovery volumes are
offline again. To stop the recovery test on a volume group named BGTest, enter the following
command:
chvolumegroupreplication -stoprecoverytest BGTest
By using the recovery test method, you can test on the target volumes and when the test is
done, less data needs to be updated when compared to the enable access method.
Recommendation: Take a snapshot of the target volume group before the recovery test is
initiated.
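A minimal CLI sketch of that recommendation, assuming the addsnapshot command for volume group snapshots that is available in recent Storage Virtualize releases (run it against the target volume group on the recovery system):
# Take a point-in-time snapshot of the target volume group before testing
addsnapshot -volumegroup BGTest
# Then start the recovery test
chvolumegroupreplication -startrecoverytest BGTest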
For more information, see Policy-Based Replication with IBM Storage FlashSystem, IBM SAN
Volume Controller and IBM Storage Virtualize, REDP-5704.
In Storage Virtualize version 8.6.1, IBM introduced storage partitions. This critical step laid
the groundwork for FlashSystem grid architecture, offering greater flexibility and scalability of
storage. For more information, see Chapter 1, “Introduction” on page 1.
See 3.3, “Behavior examples of policy-based HA” on page 47 for some examples.
Limits and restrictions: For more information about configuration limits and restrictions,
see V8.7.0.x Configuration Limits for IBM FlashSystem and SAN Volume Controller.
Statement of general direction: In the second half of 2024, IBM intends to further
enhance these features to support highly available storage with replication to a third
system.
Currently, a maximum of four storage partitions are supported per FlashSystem. However,
there is no limit on the number of volumes, volume groups, hosts, and host-to-volume
mappings you can configure within a partition. You can add more resources as needed, either
to existing partitions or by creating new ones. It is possible to merge partitions, if they have
the same replication policy. A partition cannot be split into separate partitions.
Figure 3-2 shows a FlashSystem example with a single IO group with 2 partitions and other
local volumes.
Figure 3-2 FlashSystem example - single IO group with two partitions and other local volumes
Best practice: Configure a second IP quorum as a backup for situations where the
primary quorum fails or requires maintenance.
SAN zoning: SAN zoning to isolate traffic must be configured manually. It is not
automated by policy-based HA.
Policy-based HA uses site awareness. Host site awareness is the (optional) ability to set a
location for each host such that when HA is established the I/Os are directed to the storage
system in the same location as the host.
Recommendation: For optimal performance and efficient high availability, assign site
attributes to all your hosts. This configuration step unlocks benefits for your applications
and simplifies storage management.
The site attribute is the name of the IBM FlashSystem or SVC. Figure 3-5 shows the data
flow, if site attributes are used.
In a split-brain scenario, only the affected partition on the secondary site and its access paths
become unavailable with policy-based HA. Local non-HA volumes on the secondary site
remain accessible.
During a failover event in policy-based HA, the multipathing driver on the host automatically
switches the paths to the active partition from the secondary site to the preferred site. To
ensure optimal performance during a failover, the public SAN must have sufficient bandwidth
to handle the additional workload from the non-preferred site. See Figure 3-9.
In case of cascading failures, where HA was not fully reestablished, disaster recovery-like
access can be enabled to the most recently synchronized copy of volumes within the
partition. HA is established only when all volumes in the partition are synchronized; it
becomes available again after synchronization finishes for all volume groups in the partition.
Management of the partition follows the active management system, which might or might not
be the same as the preferred system depending on the failure scenario. After a failure,
management and data access is routed through only one of the FlashSystem units until HA is
reestablished.
Although automatic restart occurs after 15 minutes, you can manually initiate the
partnership restart on both FlashSystem units to expedite HA reestablishment.
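A hedged sketch of that manual restart, run on both FlashSystem units (chpartnership is an existing command; confirm the usage for your release):
# Restart the stopped partnership to the partner system
chpartnership -start <remote_system_name>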
Note: If quorum applications are inaccessible from both systems, high availability will
enter a mode where the outcome of quorum races is predetermined if connectivity
between the systems is lost. In this state, the current management system for each
partition will win the quorum race for that partition. Even in this mode, high availability
will still trigger a suspend and failover if volumes go offline or latency becomes too high.
– Preferred site goes offline. The primary site is brought offline to prevent potential data
inconsistencies during the failover process. See Figure 3-13.
2. After HA is reestablished, the paths fail back to the preferred site. See Figure 3-15 on
page 54.
3. Also, the management fails back to the preferred site. See Figure 3-16.
Table 3-1 Comparing policy-based HA with SVC stretched cluster and HyperSwap
Supported on: Policy-based HA on SVC and NVMe products (1 I/O group systems); HyperSwap on 2 or more I/O group systems; Enhanced Stretched Cluster on 2 or more I/O group SVC systems only.
Maximum nodes per site: Policy-based HA 2 (1 I/O group); HyperSwap 4 (2 I/O groups); Enhanced Stretched Cluster 4 (half of each I/O group).
Mixed hardware models: Policy-based HA yes, unrestricted; HyperSwap limited (must cluster); Enhanced Stretched Cluster limited (must cluster).
* Volume Group Snapshots would be taken on both sides rather than making the snapshots HA.
Figure 4-1 shows the topology for the systems and the example configuration.
You can monitor and manage replication from the Volume Groups page.
Note: In the following example, the setup of policy-based replication is on two connected
FlashSystem systems. Policy-based replication can also be configured on SAN Volume
Controller (SVC).
In the example, the systems are SAN-zoned together with dedicated ports for
node-to-node communication. Other connectivity options include high speed Ethernet
networking.
Also, as part of the partnership setup, a certificate exchange must be performed. This
exchange ensures that each system has the necessary configuration access to the other
system by using the REST API.
Note: A prerequisite for creating a partnership through SAN-zoning is that the two systems
are correctly zoned together with dedicated ISL-links for node-to-node traffic.
8. The Create Partnership wizard options are shown in Figure 4-4 on page 61 and include
selecting Fibre Channel, selecting the Partner system name, and selecting the Use
policy-based replication checkbox. The bandwidth and background copy rate are also
specified.
The partnership must be created from both the production and recovery systems.
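For reference, a hedged CLI sketch of the equivalent partnership creation (run on each system; the values are placeholders, and the policy-based replication option and certificate exchange are configured separately, for example through the GUI wizard):
# Create the Fibre Channel partnership and set its bandwidth characteristics
mkfcpartnership -linkbandwidthmbits 10000 -backgroundcopyrate 50 <partner_system_name>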
9. After the partnership is created, the CopyServices → Partnership menu looks like the
system in Figure 4-5 on page 62.
Because the two systems are partners, the Partnerships panel on each system looks like
Figure 4-5.
At least one linked pool is required on each system for policy-based replication to function
properly. There are two approaches to creating linked pools. The first option involves creating
pools on each storage system and then linking them together manually.
Alternatively, the Setup policy-based replication wizard shown in Figure 4-7 guides you
through the process.
Select the storage pools to link together. Our lab configuration has only a single pool to select
on each system as shown in Figure 4-7. Click Link Pools to proceed.
The Link Pools Between systems wizard in Figure 4-7 on page 63 initially provides a link to
the recovery system on which you are directed to the Pools menu. You can then right-click the
pool and select Add Pool Link for Replication. Figure 4-8 shows how to add or remove pool
links directly on the target system.
A replication policy can be linked to multiple volume groups. However, each volume group can
have a maximum of one replication policy associated with it.
Figure 4-9 on page 65 shows the Create replication policy panel, in which you can perform the
following steps (a CLI sketch follows the list):
1. Select Create new policy.
2. Enter a name in the Name field.
3. In the Topology field, select 2 Site, Asynchronous.
4. Select the systems in the Location 1 and Location 2 fields.
5. Define how old the data on the recovery site can be before an alert is sent.
6. Click Create replication policy to proceed.
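A hedged CLI sketch of the same policy creation (the mkreplicationpolicy command exists, but treat the parameter names and values shown here as assumptions and check the CLI reference for your release; the RPO alert is expressed in seconds):
# Create a two-site asynchronous replication policy with a 5-minute RPO alert
mkreplicationpolicy -name async-5min -topology 2-site-async -location1system <system1> -location2system <system2> -rpoalert 300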
The recovery copies of volume groups are immutable, meaning they cannot be modified or
altered. Policy-based replication greatly simplifies the configuration, management, and
monitoring of replication between two systems.
To create a volume group, the Create Policy-Based Replication wizard prompts you to enter a
volume group name as shown in Figure 4-10 on page 66. Enter the name and click Create
Volume Group to proceed.
The Setup Policy-Based Replication wizard is finished. Select Go to Volumes to add volumes
to the volume group.
3. The Create Volumes window opens. Select the storage pool for the new volumes. The only
pool is StandardPool. Figure 4-14 shows the Create Volumes wizard. Select the pool and
click Define Volume Properties.
Four active volumes are defined in the volume group, and they are replicating to the
recovery system.
The content of the volume group is shown in Figure 4-16.
Figure 4-16 Four volumes created and added to the Volume group
The configuration is completed, and volumes within the volume group PBR-01-VolumeGP
are copying from FS9100 to FS7300.
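For comparison, a hedged CLI sketch that creates a volume in the linked pool and places it in the replicated volume group (the names match the example above, but treat the -volumegroup handling as an assumption because it varies by release):
# Create a 100 GiB volume in the pool StandardPool
mkvolume -name vol01 -pool StandardPool -size 100 -unit gb
# Add the volume to the replicated volume group
chvdisk -volumegroup PBR-01-VolumeGP vol01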
7. By opening the defined volume group and clicking the Policies tab, you can view the
status of the current replication policy, as shown in Figure 4-18.
8. You now have the option to click Manage replication policy where you can remove the
replication policy if needed.
If you are on the recovery system, the Manage replication policy window lists the option
to enable access to the recovery volumes.
Note: The newly created volumes can be mapped to one or more hosts from the Volumes
menu.
Enable access to the recovery copy suspends replication, which permits host access and
configuration changes for each independent copy.
Notice that you can select Restart replication and that the changes that are made to this
copy are not replicated until replication is restarted. The replication status is shown in
Figure 4-20 on page 71.
Whether you restart the replication from the production site or reverse the replication from
the recovery site (which might be acting as the current production site) depends on the
system from which you restart the replication.
Figure 4-21 on page 72 shows how you can restart the replication in which the recovery site
is the production system.
The action overwrites the volumes on the FS9100-10 system, which was the production
system. However, the situation might be that this system has been down for some time and
that the volumes on it are no longer current because the recovery system is now functioning
as the production system.
Figure 4-22 on page 73 shows that initial copy is ongoing from the FS7300-2 to the
FS9100-10.
Switching the direction of replication back to the FlashSystem 9100 requires the same actions
as before, which is to log on to the FS9100-10 and click Enable Access on the Volume Group
policy tab as shown in Figure 4-19 on page 70.
To switch copy direction in a safe way requires that the hosts accessing the volumes be shut
down when you reverse the copy direction back to FS9100-10. So you can expect a few
minutes of downtime when you switch the copy direction.
Ensure that the following prerequisites are met before you configure a volume that is part of a
Global Mirror relationship to use policy-based replication:
The relationship must be either Metro Mirror or Global Mirror.
The volume being migrated must be the primary volume within the Metro Mirror or Global
Mirror relationship.
Volumes that use Global Mirror can be manually migrated to use policy-based replication.
Remote Copy features such as 3-site partnerships cannot be directly migrated to
policy-based replication due to their more complex configuration requirements.
No associated change volumes can be linked to the primary volume.
Before running the movevdisk command to transfer a volume between I/O groups, several
conditions must be satisfied (a command sketch follows this list):
If the relationship is part of a consistency group, the volume cannot move between I/O
groups when the policy-based replication is defined. Use the movevdisk command to
move the volume as needed.
The relationship state must be consistent_synchronized. The data in the source and
target volumes of the relationship must be fully synchronized and consistent. Resolve any
pending changes or discrepancies before proceeding with the volume movement.
The relationship cannot be in a consistency group. Consistency groups are a collection of
relationships that need to maintain data consistency as a group. If the relationship is part
of a consistency group, it cannot be moved independently. Restrict the volume movement
to relationships that are not associated with any consistency group.
The relationship type must be Metro Mirror or Global Mirror. The movevdisk command is
designed specifically for volumes that are involved in Metro Mirror or Global Mirror
relationships. These relationship types enable synchronous or asynchronous replication of
data between primary and secondary volumes. Only volumes that are associated with
these types of relationships are eligible for the move operation.
The relationship must not have a change volume associated with the primary volume. In
some replication scenarios, a change volume is used to track modifications made to the
primary volume. If the relationship has an active change volume associated with the
primary volume, the move operation cannot proceed. Remove or detach the change
volume before you start the volume transfer.
The volume being moved must be the primary volume in the Metro Mirror or Global Mirror
relationship. When you move a volume, the volume must be the primary volume in the
Metro Mirror or Global Mirror relationship. The primary volume is the source volume where
the original data resides, and the secondary volume is the target for replication. Moving
the primary volume ensures the appropriate replication of data to the new I/O group.
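When these conditions are met, the move itself is a single command; a minimal sketch with placeholder names:
# Move the primary volume to the target I/O group
movevdisk -iogrp io_grp1 <volume_name>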
The preceding steps can be done manually or by using the Setup Policy-Based Replication
wizard as shown in Figure 4-26 on page 77.
2. Choose the current Remote Copy partnership from the left navigation and select
Actions → Partnership Properties, as shown in Figure 4-24 on page 76.
5. Configure a provisioning policy for each linked pool, create one or more replication
policies, create an empty volume group for each consistency group and independent
relationships and assign a replication policy as described in section “Setup policy-based
replication wizard” on page 62. This Setup Policy-Based Replication wizard can be
reached from menu Copy Services → Partnerships and Remote Copy, as shown in
Figure 4-26.
When the policy-based replication setup has completed, open the recovery system menu
Volumes → Volumes to verify that there are two copies of the replicated production volumes
as shown in Figure 4-27 on page 78.
If you choose to keep the disaster recovery copy, ensure that sufficient capacity is available
on the recovery system to accommodate both sets of copies.
Retaining existing volumes provides data protection in case of an outage, and you can verify
replicated data on the recovery system after configuring policy-based replication.
3. Select Actions → Stop Group. On the Stop Remote-Copy Consistency Group page,
choose the option Allow secondary read/write access to retain the secondary volumes
as a disaster recovery copy, as shown in Figure 4-29.
4. Click Stop Consistency Group. The state of the consistency group changes to Idling,
as shown in Figure 4-30.
7. On the Delete Relationship page, verify the number of relationships being deleted. Verify
that the checkbox Delete the relationship even when the data on the target system is
not consistent is cleared. This cleared checkbox allows the secondary volumes to be
retained for disaster recovery until a new recovery point is established using policy-based
replication, as shown in Figure 4-32.
8. Click Delete. After the relationships are deleted from the consistency group, select
Actions → Delete Group to complete the removal process, as shown in Figure 4-33 on
page 81.
The Remote Copy configuration is now deleted. The volumes from the Remote Copy
relationship still exist and are available for host mapping, or they can be deleted. Keeping
both the Remote Copy volumes and the policy-based replication volumes requires free space
in the storage pool.
The GUI continues to show the Remote Copy features. This can be disabled from the menu
Settings → GUI Preferences → GUI Features → Remote Copy Functions in copy
services, as shown in Figure 4-34.
For more information about Safeguarded Copy, see the Redpaper Data Resiliency Designs: A
Deep Dive into IBM Storage Safeguarded Copy, REDP-5737.
Monitoring the recovery point objective (RPO) is a crucial aspect of business continuity.
Storage Virtualize provides several ways to verify whether recovery point objectives are met
and to receive alerts if the RPO is not met.
When partnerships are created on all systems, they appear as “Configured”. It is possible to
create a policy for replication, if not already done, from that screen. See Figure 5-2 on
page 85.
Only one partnership can be defined between two systems, but a system can have multiple
partnerships. A system can be a partner with up to three remote systems. No more than four
systems can be in the same connected set. See Figure 5-3.
When a partnership is stopped, all the replicating volume groups that use this partnership are
suspended. The replication status is listed as a disconnected system on the Volume Group
page under the Policies tab. See Figure 5-5 on page 87.
Use the Pools page in the management GUI to manage storage pools and pool links
between production and disaster recovery locations by navigating to Pools → Pools.
See Figure 5-6 on page 88.
If the storage pools exist on the production and recovery systems, you can add a link between
the pools from either system. If a pool on one of the systems has existing links to another
partnered system, you must add the link from the unlinked system. The existing link between
pools for other partnerships is not affected. Alternatively, if child pools currently exist on the
production system only, you can use the management GUI on the recovery system to create
and link a child pool in a single step. The management GUI simplifies the process of creating
a linked pool on the recovery system. The management GUI automatically displays the
properties such as name, capacity, and provisioning policy from the production system. You
can use these values to create the new linked child pool on the recovery system without
logging in to the other system.
To create a link between storage pools from the production system, right-click the pool to link
and select Add Pool Link for Replication. The Add Pool Link page opens. See Figure 5-7
on page 89.
The Add Pool Link page displays options for the remote system on the left side of the page.
Local system details are displayed on the right.
From each drop-down menu, select the remote system to link, the remote pool to link, and the
local pool to link. Also, determine whether the provisioning policy is assigned to the remote
pool. If a provisioning policy is already assigned to the remote pool, then select the local pool
provisioning policy from the drop-down menu.
To modify pool links between pools in production and disaster recovery locations, use the
management GUI and select Pools → Pools, right-click the pool and select Modify Pool
Links for Replication. On the Modify pool link page, select whether you want to unlink the
selected pool from remote systems or move all links from the pool to another pool. See
Figure 5-8.
To view existing volume groups in the management GUI, from the production system or the
recovery system, select Volumes → Volume Groups. See Figure 5-9.
Volume groups have multiple attributes among which the following can be changed:
Name
Volumes
Optional replication policy
Optional snapshot policy
The available actions on a volume group are renaming it, deleting it, adding or removing
volumes, changing the replication policy, changing the snapshot policy, and managing local
and cloud snapshots.
The name of a volume group cannot be changed while a replication policy is assigned, and a
volume cannot be renamed while it is in a volume group with a replication policy assigned.
Volume groups can have only a single replication policy. A system can host multiple volume
groups, each of them using a different replication policy, but a volume group cannot have
multiple replication policies assigned. See Figure 5-10 on page 91 and Figure 5-11 on
page 91.
The following actions cannot be performed on a volume while the volume is in a volume group
with a replication policy assigned:
Resize (expand or shrink)
Migrate to image mode or add an image mode copy
Move to a different I/O group
Volume groups are not used exclusively for replication and can be managed on a stand-alone
system for local copies, for instance. Volumes from a volume group that are not associated
with a replication policy can be moved to another volume group. To move volumes from one
volume group to another, select the Volumes menu and in the list of volumes, right-click the
ones that you want to move. Then, select Move to Volume Group and choose the volume
group where you want to add the volume.
Removing a volume from a volume group is a configuration change and is done only on the
production system. The change is reflected on the recovery system where the volume is also
deleted from the volume group when it is no longer part of the recovery point.
If a local snapshot of the volume group was taken before the volumes were removed, that
snapshot cannot be restored because the number of volumes in the group no longer matches.
However, a clone of the volume group can be made to restore the volumes.
Deleting a volume in a volume group is also a configuration change and is reflected on the
recovery system.
To take an instant snapshot of a volume group using the GUI, navigate to Volumes →
Volume Groups, select the volume group to be copied, select the Local Snapshots tab, and
click Take Snapshot. Instant snapshots cannot be safeguarded. They have no expiration
date and can be restored, cloned, or deleted anytime. See Figure 5-14 on page 94.
A snapshot policy can be assigned to a volume group. To manage snapshot policies, navigate
to Policies → Snapshot Policies. To define a snapshot policy, specify a frequency, time, day
of the week, day of the month, and retention period for snapshots.
To assign a policy to a volume group by using the management GUI, select Volumes →
Volume Groups, select the volume group for which to assign a policy, and click the Assign
internal snapshot policy button. See Figure 5-15.
Select the policy to assign to the volume group and specify the start date and time. You can
make Safeguarded snapshots of the volume group. Safeguarded snapshots that are created
from the selected volume group are backups that cannot be changed or assigned to hosts.
See Figure 5-16 on page 95.
You can assign only one local snapshot policy to a volume group. You can assign a local
snapshot policy and a replication policy to a volume group.
It is possible to orchestrate the snapshots of volume groups through an external tool, such as
IBM Storage Copy Data Management (CDM) or IBM Copy Services Manager (CSM), to make
Safeguarded Copies. If the option to assign an external policy is not visible in the Volume
Group page, it might be hidden by the system GUI settings. Go to the Settings → GUI
preferences → GUI Features page. Select the switch labeled Display Safeguarded Backup
policy tile and External Schedule application settings to make the option visible. See
Figure 5-17 on page 96.
A new panel is then available in the volume group properties. See Figure 5-18.
Figure 5-18 A Volume Group page with external Safeguarded backup policies available
To restore a volume group from snapshots, the following requirements must be met:
When you initiate a restore operation, the volume group that is specified by the volume
group parameter must be the same parent group from which the snapshot was originally
created.
The composition of the volume group must be the same at the time of the restore as it was
at the time the snapshot was taken.
If volumes are added to or removed from the volume group in the time between the
snapshot being taken and the restore being requested, then the restore fails. Those
volumes must be removed from or added back into the volume group before the restore
can be performed.
Note: If the volume group on FlashSystem Site B is not mapped to any host, then data
on that volume group is unchanged. Therefore, the last replicated state from
FlashSystem Site A before enabling independent access (last recovery point) is
restored.
If a volume is deleted after the snapshot is taken, it is removed from the volume group and
is in the deleting state. By requesting a snapshot restore, assuming that all other
prerequisites have been met and the restore operation proceeds, the volume is added
back to the volume group and put into the active state.
If the volumes are expanded between the time that the snapshot is taken and when the
restore is requested, then the restore fails. Shrink the parent volumes back to the size they
were when the snapshot was taken before a restore can proceed.
This has implications if new snapshots were taken after a volume was expanded because
you cannot shrink the parent volumes without cleaning out any new snapshots and their
dependent volume groups.
A snapshot restore overwrites the data on the target volume group. Be sure to identify the
correct snapshot to restore and the correct target of the restoration.
A volume group with an active replication policy cannot be restored from a snapshot. The
replication policy must first be unassigned from the volume group.
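A hedged sketch of unassigning the policy before the restore (assuming the -noreplicationpolicy parameter of chvolumegroup; confirm it for your release). The policy can be reassigned after the restore completes, which triggers a resynchronization of the recovery copy:
# Unassign the replication policy so that the volume group can be restored from a snapshot
chvolumegroup -noreplicationpolicy <volume_group_name>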
To see when a volume group was last restored from a snapshot, go to the Properties option
on the Volumes → Volume Groups panel.
Any existing policies are displayed in a table. The table lists the existing replication policies
with their name, Location 1 system, Location 2 system, and the number of volume groups that
use the policy.
Note: As of this writing, Storage Virtualize version 8.7 supports only two-site
configurations for replication that uses both asynchronous technology and high availability
(HA). This means you can select only two locations, which are designated as location 1
and location 2.
When the replication policy is created, you can assign volume groups to it by clicking the
Overflow menu, which consists of three vertical dots, and selecting Assign to volume
groups. See Figure 5-22.
The system where the policy is assigned is the production system, and the other system in
the policy is the recovery system for the volume group. See Figure 5-23 on page 101.
Note that if a volume group is already associated to a replication policy, it does not appear in
the list of available volume groups.
When a volume group is associated with a replication policy, the synchronization of the
volumes starts. The system creates the recovery copy of the volume group and volumes on
the remote system automatically. There is no need to create them on the remote system.
The first synchronization of the volumes is done at the speed of the available partnership's
link bandwidth. The background copy rate setting for that partnership is not used if there is
only policy-based replication.
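A hedged CLI sketch of the assignment that starts this initial synchronization (assuming the -replicationpolicy parameter of chvolumegroup; the policy and volume group names are placeholders):
# Assign the replication policy; the recovery copy is created automatically on the partner system
chvolumegroup -replicationpolicy <policy_name> <volume_group_name>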
To manage replication policies in the management GUI that are assigned to existing volume
groups, select Volumes → Volume Groups. Select the volume group and select Policies.
5.5.1 Checking the RPO and status by using the management GUI
To view the replication status for a volume group, do the following steps:
1. In the management GUI, select Volumes → Volume Groups.
2. Choose the specific volume group that you want to monitor.
3. Click the Policies tab. The RPO status is listed under the Recovery copy illustration. See
Figure 5-24 on page 102.
In the Volume Groups list page, the RPO is also listed for every replicating volume group. See
Figure 5-25.
Within policy's RPO: Replication is within the RPO value set in the policy.
Outside policy's RPO: Data on the recovery copy is outside the RPO value set in the policy.
The replication status is also displayed and illustrated between the production and recovery
copies. Table 5-2 lists the possible states of Replication.
Independent access: Replication is stopped and each copy of the volume group is accessible for I/O. To resume replication, choose the system whose data and configuration you want to use, and make that system the production copy.
Replication suspended: Replication is suspended due to an error on one of the systems. Review the event log and address the errors. Replication automatically resumes when all errors are resolved.
The HTTPS server requires authentication of a valid username and password for each API
session. The /auth endpoint uses the POST method for the authentication request. See
Example 5-1.
The response to this request is a token (a string) which must be used for further requests. It is
referenced in the next example as <your_token>.
In Example 5-2, the token that was returned in the previous authentication request is used.
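The following is a minimal sketch of these two requests with curl. It assumes the default REST API port 7443 (the same port as the request URLs later in this section), a self-signed certificate (hence -k), and the X-Auth-Username, X-Auth-Password, and X-Auth-Token headers of the Storage Virtualize REST API; adapt the address and credentials to your environment.

# Authenticate and obtain a token
curl -k -X POST https://<your_flashsystem_ip>:7443/rest/v1/auth \
  -H "X-Auth-Username: <your_user>" -H "X-Auth-Password: <your_password>"

# Use the returned token to list the volume groups and their replication status
curl -k -X POST https://<your_flashsystem_ip>:7443/rest/v1/lsvolumegroupreplication \
  -H "X-Auth-Token: <your_token>"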
If a successful response is returned with a status of 200, the returned data can be used by a third-party tool.
Example 5-3 is the response to a previous request. It lists the status of the two volume groups
that are defined on our lab system.
Example 5-3 Viewing the volume groups and their replication status
{
"id": "1",
"name": "Test_CG-to-VG",
"replication_policy_id": "0",
"replication_policy_name": "NewRepPol",
"ha_replication_policy_id": "",
"ha_replication_policy_name": "",
"location1_system_name": "TronLives",
"location1_replication_mode": "production",
"location1_within_rpo": "",
"location2_system_name": "TotalRecall",
"location2_replication_mode": "recovery",
"location2_within_rpo": "yes",
"link1_status": "running",
"partition_id": "",
"partition_name": "",
"recovery_test_active": "no",
"draft_partition_id": "",
"draft_partition_name": ""
}
More details are listed if the ID of the volume group is specified in the request. Example 5-4
shows the response for the following request:
https://<your_flashsystem_ip>:7443/rest/v1/lsvolumegroupreplication/1.
Example 5-4 Details of the replication status for a given volume group
{
"id": "1",
"name": "Test_CG-to-VG",
"replication_policy_id": "0",
"replication_policy_name": "NewRepPol",
You can also view the event log for RPO-exceeded events. In Example 5-5, after
authentication is successful and a token is retrieved, you can list the alerts from the event log.
Event ID 052004 means “The recovery point objective (RPO) for the volume group has
been exceeded”.
Example 5-5 Requesting RPO alerts list from the event log
The response to this request provides a list of all alerts that are related to the event “The
recovery point objective (RPO) for the volume group has been exceeded.” Each alert is
listed as an entry similar to the following:
{
"sequence_number": "208",
"last_timestamp": "240527051415",
"object_type": "volume_group",
"object_id": "1",
"object_name": "Test_CG-to-VG",
"copy_id": "",
"status": "alert",
"fixed": "yes",
"event_id": "052004",
"error_code": "",
"description": "The recovery point objective (RPO) for the volume group
has been exceeded"
},
{
"sequence_number": "212",
"last_timestamp": "240529102925",
"object_type": "volume_group",
"object_id": "1",
"object_name": "Test_CG-to-VG",
"copy_id": "",
"status": "alert",
"fixed": "yes",
"event_id": "052004",
"error_code": "",
"description": "The recovery point objective (RPO) for the volume group
has been exceeded"
}
You can list more details for a specific event by specifying the sequence number in the
request URL. Example 5-7 shows a request for more details for sequence number 208.
The sequence number is provided from a previous response.
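As a minimal sketch, the request can be issued with curl in the same way as the earlier requests, with the sequence number appended to the lseventlog endpoint (address and token are placeholders):

# Request the details of the event with sequence number 208
curl -k -X POST https://<your_flashsystem_ip>:7443/rest/v1/lseventlog/208 \
  -H "X-Auth-Token: <your_token>"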
The returned response provides details about the event, such as the number of occurrences
and the duration of the event, which can be used for further optimization. See
Example 5-8.
{
"sequence_number": "208",
"first_timestamp": "240527051415",
Before you implement policy-based HA, ensure your environment meets the requirements for
it and that you understand the concepts of the solution. For more information, see Planning
high availability.
This guide explains how to configure a policy-based HA solution. You can set up policy-based
HA using the management GUI or the CLI.
Within a storage partition, all volumes are in volume groups, and host mappings can be
created only between volumes and hosts in the same partition.
In storage partitions configured for HA replication, there are two important properties: the
preferred management system and the active management system.
The preferred management system is configured and changed by the storage administrator.
Certain error situations can trigger a failover of the management system to the HA-partner.
When the system recovers, control automatically returns to the preferred management
system. The administrator can also change the preferred management system.
All configuration actions on a storage partition must be performed on the active management
system. The storage partition can be monitored on either system.
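The following is a hedged CLI sketch of checking these two properties for a partition. It assumes that the lspartition command reports them; the field names and values shown are illustrative only and should be verified against the lspartition documentation.

# Display the management system properties of a partition (output abbreviated)
lspartition <partition_name>
...
preferred_management_system_name FS9100-10
active_management_system_name FS9100-10
...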
You can configure additional volumes, volume groups, hosts, and host-to-volume mappings at
any time, either by adding to an existing partition or by creating a new one. A partition must
include all volumes that are mapped to any hosts included in the partition.
Figure 6-1 on page 112 shows the topology for the systems to be configured.
Note: You must have connectivity from all the servers that are running an IP quorum
application to the service IP addresses of all the nodes or node canisters.
6. Create a storage partition and give it a name. Click Create partition to proceed. See
Figure 6-5 on page 115.
7. Create an HA replication policy and give it a name. In the example, the partition is named
FS9100-FS7300-HA. Click Create replication policy. See Figure 6-6 on page 115.
9. Select an existing volume group, which contains four volumes. The resulting window
displaying this selection is shown in Figure 6-8. Click Next to proceed.
Note: Volumes are configured, but no hosts are added yet. Host creation and
configuration are discussed in “Creating hosts in policy-based HA and mapping
volumes” on page 121.
11.The wizard exits to the Storage Partition view, as shown in Figure 6-10.
The Storage Partition overview shows that the FS9100-10 and FS7300-2 are active in a
highly available relationship where the FS9100-10 is the preferred management system.
The next step involves creating new components on both FlashSystem storage systems:
New hosts. Define new hosts in the management software.
Volume groups. Create volume groups to manage related volumes for easier
administration.
Host-to-volume mappings. Establish mappings between the newly created hosts and the
volumes that they must access.
During host creation, the storage administrator can specify a location preference. This
ensures that local FlashSystem volumes on the same site as the host are prioritized for
access. This approach optimizes performance and minimizes network traffic.
Use the Storage Partition Overview panel to monitor connectivity between the two systems
and the IP quorum applications, and the health of the hosts and volumes associated with the
partition.
These new volumes are created on the preferred management system, which is the
FlashSystem 9100 (FS9100) in this policy-based HA configuration. Because of the replication
policy applied to the storage partition, any volumes that are created in the partition are
automatically mirrored to the partner system, which ensures data redundancy and high
availability.
Follow these steps to configure volumes to be used for storage partitions in policy-based HA:
1. From the Volumes menu within the storage partition, click Create Volumes as shown in
Figure 6-11.
3. The Create Volumes menu lists a summary of changes. Click Create volumes to proceed.
See Figure 6-13.
Figure 6-14 Volumes in the storage partition of the FS9100 and FS7300
Note: Volumes that are not required to be mirrored in a policy-based HA relationship can
be created in the Volumes → Volumes window outside of the Storage partition section of
the GUI.
Open the Volumes → Volumes window on the FS7300 HA-partner to verify that the new
replicated volumes also exist on the FS7300 as shown in Figure 6-15.
The volumes in Figure 6-15 are accessible from outside the Storage Partitions menu or within
the Storage Partitions menu.
Because these volumes belong to a volume group with an enabled replication policy, they are
mirrored to the HA partner system for redundancy and failover capabilities. This helps ensure
that data remains available if the primary system encounters an issue.
Note: You cannot create volumes that are attached to the replicating volume group from
the FS7300 because it is not the preferred management system. You can create
non-replicated volumes outside of the Storage Partitions menu on the FS7300 system.
Creating hosts
Follow these steps to add hosts in policy-based HA:
1. Enter the storage partition FS9100-FS7300.
2. Select Hosts as shown in Figure 6-16. No hosts are listed.
3. Click Add Host to proceed.
4. The Add Host wizard opens. Ensure that the Assign location checkbox is checked.
5. Select the preferred location of the host as shown in Figure 6-17. Select the location name
from the list of storage devices.
7. Review the settings and click Save to proceed. See Figure 6-19.
4. Review the volumes to be mapped and click Map Volumes as shown in Figure 6-23.
Storage partitions bring a new level of flexibility to Storage Virtualize by enabling the migration
of both the frontend and backend storage components. With a single command, you can
migrate all underlying storage and associated hosts, volumes, and host-to-volume mappings
to a new system.
This process involves updating storage paths for attached hosts. Fortunately, multipathing
drivers handle this automatically and can provide a non-disruptive migration with no impact on
applications or users. This can simplify the decommissioning of old equipment.
The svctask chpartition command is the only CLI command that is needed to start a
migration, which keeps the migration procedure simple and direct.
Storage Virtualize provides event-driven confirmation steps when user validation is needed
during migration. These events are provided on the migration source and on the migration
target system during the migration process and can include the following examples:
Check multipath on hosts before final path switch.
Confirm deletion of original copy.
When data synchronization is complete, the hosts see a new set of paths to identical volumes
on a different storage system.
For Storage Virtualize version 8.7.0, partitions can be migrated between systems that are
members of the same FlashSystem grid, or can be migrated between systems that are not
configured in a FlashSystem grid. Migration of storage partitions in 8.7.0 is supported only on
systems that support FlashSystem grid.
Migration enables relocation of storage partitions from a source system to a different system
location, and as a consequence, the following flow of events occurs:
All the objects that are associated with the storage partition are moved to the migration
target storage system.
Host I/O is served from the migration target storage system after the migration is done.
At the end of the migration process, the storage partition and all of its objects are removed
from the source system.
Note: The initial release of Storage Virtualize 8.7 does not include Storage Partition
Migration functionality within the graphical user interface (GUI). This feature is expected to
be available in an update. You can still use CLI for migration tasks.
Prerequisites
Before you can use nondisruptive Storage Partition Migration function, ensure that the
following prerequisites are met:
Review the HA requirements to ensure that the storage partition supports migration and
the host operating systems support this feature. For more information, see Planning high
availability.
Confirm that both systems are members of the same FlashSystem grid, or that neither
system is a member of a FlashSystem grid, and that both systems meet the requirements
for FlashSystem grid.
Use the lspartnershipcandidate command and make sure that both source and target
systems are correctly zoned and are visible to each other. For more information, see
lspartnershipcandidate.
Procedure
Perform the following steps:
1. Run the chpartition command with the -location option to migrate the storage partition
to its required system location. For more information, see chpartition.
2. The following example initiates a migration of storage partition to the designated location
system:
chpartition -location <remote_system> mypartition1
3. To check the migration status, run the lspartition command. For more information, see
lspartition.
4. After you successfully migrate the storage partition's data and configuration to the target
system, an event is posted for the storage administrator. This event verifies that the
affected hosts established new paths to the volumes on the target system. After you
confirm this by fixing the event, host I/O operations for the storage partition automatically
switch to using the paths on the target storage system. This event is raised and fixed at the
source storage system.
5. To finalize the migration process, another event prompts the storage administrator to
remove the copy of the data and configuration from the source storage system.
Important: Before confirming this event by fixing it, it is crucial to verify the
performance of the storage partition on the target system. This ensures a smooth
transition and optimal performance after migration. This event is raised and fixed at the
target storage system.
6. An informational event on the target storage system marks the completion of the storage
partition migration.
You can monitor the progress of the migration, including the amount of data that remains to be
copied, by using the lsvolumegroupreplication command. For more information, see
lsvolumegroupreplication.
You can also monitor the migration by using the migration_status field that is shown by the
lspartition command, which indicates whether any migration activity is active or queued for
that storage partition. For more information, see lspartition.
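The following is a short sketch of both checks; the migration status value shown is illustrative only:

# Remaining data to copy is reported per volume group
IBM_FlashSystem:FS9100-10:Team4>lsvolumegroupreplication
# Migration activity for the partition is reported in the migration_status field
IBM_FlashSystem:FS9100-10:Team4>lspartition mypartition1 | grep migration_status
migration_status migrating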
An ongoing storage partition migration can be stopped by specifying a new migration location
with the -override option. The migration to the new location is queued behind any
existing queued migrations. For more information about the storage partition migration
procedure, see Migrating storage partitions between systems.
Replication policies define replication details, including source and target system IDs, location
names and IDs, RPO alerts (if any), IO group selections, and the replication topology.
Partitions are used in policy-based HA to set a common management environment for hosts,
volumes, volume groups, and related snapshots. Partitions are available under the Storage
partitions menu item as shown in Figure 7-3.
For all partition configuration tasks, use the Storage Partitions menu item on the left panel.
All policy-based replication activities require a policy, which defines the general replication
settings. Identify the existing policies and check for the number of partitions that are
associated with each policy as shown in Example 7-2 on page 133.
The definition for source and target systems, the topology, and the number of partitions that
use this policy are shown in the policy details in Example 7-3.
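A minimal CLI sketch of these checks, assuming the lsreplicationpolicy command; the policy name is a placeholder, and the detailed view includes the topology and the location systems described above:

# List all replication policies
lsreplicationpolicy
# Show the details of one policy, including its topology and location systems
lsreplicationpolicy <policy_name>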
The partitions are managed by the policies. To verify this, check the policies used and the
status of the partitions as shown in Example 7-4. Note the setting for the active management
system. This system coordinates policy-based HA replication within the partition. In
Example 7-4, the FS7300-2 is the active management system.
Optimized data path management for long distances relies on host location settings. If host
locations are not defined, the replication originates from the system that receives the write
from the host, and all host traffic (read and write) is managed by the active management
system. In this scenario, the copy source volumes reside at the same site as the active
management system. As shown in Example 7-5, the FS7300-2 is the default access point for
all hosts without a location setting within this partition.
Changing the active management system immediately reverses the copy direction between
the storage systems and designates the new system as the default storage system for host
access, as illustrated in Example 7-6.
Policy-based HA configurations use optimized data paths through host-level location settings.
In the absence of a location setting, a host always accesses the active management system
for this partition, regardless of its physical location, for all read and write I/O. Changing the
active management system impacts both the copy direction for all partition volumes and the
default access point for hosts without a location setting. Public ISLs are used for this
redirected traffic.
With a defined host location, read and write I/O are performed, when possible, directly with
the storage system assigned to the defined location. Local reads and writes are performed
directly with the assigned storage and data is asynchronously replicated to the remote site.
The active management system ensures data consistency by acknowledging both copies.
This approach minimizes data traffic by eliminating unnecessary network transfers and uses
the private ISL for efficient replication.
Use the lshost command to list the host configuration and the location parameter as shown
in Example 7-7.
Example 7-7 Identify the storage location and verify the host location
IBM_FlashSystem:FS9100-10:Team4>lshost
id name port_count iogrp_count status site_id site_name host_cluster_id
host_cluster_name protocol owner_id owner_name portset_id portset_name
partition_id partition_name draft_partition_id draft_partition_name
ungrouped_volume_mapping location_system_name
0 PB_HA_1 1 4 online
scsi 64 portset64 0 FS9100-FS7300
no FS9100-10
1 PB_HA_2 1 4 online
scsi 64 portset64 0 FS9100-FS7300
no FS7300-2
IBM_FlashSystem:FS9100-10:Team4>lshost 0 | grep location
location_system_name FS9100-10
IBM_FlashSystem:FS9100-10:Team4>
As previously discussed, you must assign the new volume to both appropriate hosts. See
Example 7-9.
Note: A deleted volume is not physically removed from the storage system if a dependent
snapshot exists.
Follow the deletion process and confirm the removal even though host assignments are
already in place, as shown in Figure 7-6 on page 137.
The volume and data deletion and the host unassignment are run on both storage systems
without additional checks.
You can also use the CLI to delete the volume as shown in Example 7-10.
The single command successfully removed all host definitions, all host mappings, and all data
for this volume at both sites.
Note: The volumes to be added must first be migrated to a temporary partition with
properties identical to those of the target partition. This helps to ensure a smooth
merge by using the partition merging feature.
In this example, only the GUI method is used because the GUI significantly simplifies the
overall process.
The existing partition is a single policy-based HA partition with three volumes and two hosts
as shown in Figure 7-7. All data is replicated to the remote site.
There are three active volumes, already replicated to the remote site as shown in Figure 7-8
on page 139.
The following example describes how to add two existing volumes with data to the existing
policy-based HA partition named PB_HA_1 and includes the following steps:
Volume group preparation to create a new, common volume group for the two existing
volumes
Temporary partition creation to establish a new partition and assign the newly created
volume group during this process.
See Figure 7-9 for a visual guide to create a new storage partition.
3. Click Select volume groups as shown in Figure 7-11 and click Continue.
Figure 7-11 Create a new storage partition and select volume group
4. Select the appropriate volume group or groups as shown in Figure 7-12 on page 141 and
follow the prompts provided by the GUI wizard.
5. The new partition has two volumes and two hosts as shown in Figure 7-13.
The data volume is now managed by the newly created partition. To merge the two partitions,
both partitions must have the same properties. The original partition is running a 2-site highly
available configuration, but the newly created partition is running without the additional 2-site
protection. Assigning the appropriate policy-based HA policy eliminates this difference and
automatically creates the required volume copies on the recovery site.
Figure 7-14 Partition: Check for details for the new partition
7. Follow the process: select the appropriate policy, set your preferred management
system, download the IP quorum application (if not already done), and link the pools. As
shown in Figure 7-15, select the appropriate replication policy to activate the configured
settings.
3. Review the Merge configuration, as shown in Figure 7-19, to make sure that the
configuration meets your requirements.
Figure 7-19 Review the partition merge settings and start the merge process
4. Verify the new configuration. The partition now manages five volumes and four hosts. All
volumes and hosts are available at both sites as shown in Figure 7-20 on page 145.
5. Verify the details such as volume or host settings as shown in Figure 7-21.
Figure 7-21 Check for policy-based HA volumes in detail after the merge
Two existing volumes are assigned to an existing policy-based HA partition by using the
merge process. The system automatically created the required host definitions and volumes
at the target site. It merged all volumes, volume groups, and hosts from two independent
partitions in a single partition.
Setting the host location parameter to the local storage system optimizes the data flow and
ensures local read and write access at each site, which significantly reduces the
long-distance traffic between the two sites.
You can use the host location setting to optimize data paths for geographically distant
locations by using the following steps:
Identify storage system names. Locate the name of the storage system on which you are
currently working and the name of the remote storage system within your established
partnership.
Set host location. Based on the storage system names that you identified, assign the
appropriate location setting to your host.
Example 7-11 lists two storage systems. Their names, which also serve as location
identifiers, are SVC_SA2 and FS9110.
Example 7-11 Identify storage locations - optionally needed for the host definition
IBM_2145:SVC_SA2:superuser>lssystem | grep name
name SVC_SA2
...
IBM_2145:SVC_SA2:superuser>lspartnership
id name location partnership type cluster_ip
event_log_sequence link1 link2 link1_ip_id link2_ip_id
0000020421203BA2 SVC_SA2 local
0000020420A02B2A FS9110 remote fully_configured fc
In this example, the local system is SVC_SA2; the remote system is FS9110. Those names
can be used when you define the location during the host creation or modification process.
2. Assign the newly created volumes to the host or host cluster. See Example 7-13 on
page 147.
Note: Only hosts with a location setting use the optimized data path management. You can
modify existing host objects and assign the appropriate location to the host. In your
partition, select Hosts → Hosts, then select the host and modify the location.
The host location can be modified by using the CLI. See Example 7-14.
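As a hedged sketch only: the -location parameter name is an assumption and should be verified against the chhost command reference; the host and system names are taken from Example 7-7.

# Assign the FS7300-2 system as the location of host PB_HA_2
chhost -location FS7300-2 PB_HA_2
# Verify the setting
lshost PB_HA_2 | grep location
location_system_name FS7300-2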
The host can also be removed using the CLI. See Example 7-15.
After you remove the policy, you can assign a new policy-based HA policy to the partition,
which creates all volumes and hosts and initiates an initial data copy to the remote system.
System A: All volumes, volume groups, hosts and host mappings remain in place, with no
data loss or interruption to host access.
System B: The partition and all of its volumes, volume groups, hosts, and host mappings
are removed from the system. Any servers co-located with system B can access the
volumes through system A if connectivity allows.
Before you delete a partition, you must first ensure there are no active replication policies
linked to the partition.
4. After the replication policy is removed, the partition can be removed. Click Manage
partition and Delete as shown in Figure 7-23 on page 149.
5. Follow the wizard, enter the partition name you want to remove and click Delete storage
partition.
2. Delete the partition using the rmpartition command. See Example 7-18.
Note: If the partition has a replication policy and associated objects, the rmpartition
command requires either the -deletenonpreferredmanagementobjects or the
-deletepreferredmanagementobjects flag to succeed; otherwise, the command fails.
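A minimal sketch of the command with one of these flags; the partition name is a placeholder:

# Remove the partition together with its associated objects
rmpartition -deletenonpreferredmanagementobjects <partition_name>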
If there is no replication required anymore on the system, you might want to remove the
storage system partnership to the remote storage system as well.
Important: The required zoning changes between the storage systems and to the host
systems are not part of the migration process and must be completed before the migration.
If the partition is already in a policy-based remote copy relationship, the partition migration
cannot be done by using the same method. There are multiple options available, which are
discussed in the following sections.
At the time of writing, 3-site configurations are not supported. IBM has stated that it plans to
combine policy-based HA with policy-based replication to achieve replication to a third site in
the second half of 2024.
Note: This chapter is based on the white paper Configure policy-based replication over
high-speed Ethernet transport on IBM FlashSystem authored by Abhishek Jaiswal,
Aakanksha Mathur, Akshada Thorat, Akash Shah and Santosh Yadav from IBM India.
This chapter guides you through configuring a DR solution by using high-speed Ethernet
partnerships between two systems. It details the prerequisites and setup considerations for
establishing these partnerships and provides a comprehensive procedure that incorporates
RDMA technology. Visual aids such as topology diagrams and step-by-step instructions
through both GUI and CLI interfaces are included to facilitate the setup process.
To maintain real-time data copies at the remote DR site, this chapter explores the use of
policy-based replication and remote copy technologies. In this setup, one system acts as the
production system where hosts access the data, and the other system serves as the DR
system at a distant location.
Policy-based replication offers asynchronous data replication with a variable recovery point
greater than zero, aiming to achieve an optimal recovery point considering business needs. In
contrast to remote copy, it provides higher throughput and reduced latency between systems.
Notably, it eliminates complex configuration requirements at the DR site, saving user time and
ensuring streamlined failover procedures in DR scenarios.
You can use the svcinfo lssystem command to find the layer in which the system is. For
more information, see lssystem.
You can use the svctask chsystem command to change the layer of the system. There can be
up to two redundant fabrics established between the two systems. For more information, see
chsystem.
Note: The system must contain RDMA-capable Ethernet adapters for establishing
short-distance partnership using RDMA. The adapter part number of the supported
adapter is 01LJ587.
Software considerations
Make a note of the following software considerations to establish a short-distance partnership
using RDMA:
Short-distance partnerships that use RDMA are supported in IBM FlashSystem 8.6.2.0 and
later versions. Ensure that both systems have version 8.6.2.0 or later.
Systems where IBM HyperSwap solution is already deployed in an Ethernet environment
cannot be a part of a short-distance partnership that uses RDMA.
The candidate systems participating in the partnership must not be visible to each other
over FC connections. You can check this using the command svcinfo
lspartnershipcandidate.
Short-distance partnerships that use RDMA are supported on both layer 2 and layer 3
networks, as shown in Figure 8-1.
Figure 8-1 Configuration topology for short-distance partnership that uses RDMA
Note: Figure 8-1 on page 155 shows a supported configuration for asynchronous
replication. While cross-connectivity (all-to-all) is recommended for asynchronous
replication to ensure maximum redundancy, it is a requirement for High Availability (HA).
For HA, all nodes must be connected to each other.
If there is a single ISL, either the -link1 or the -link2 replication parameter can be used. If there
are two ISLs, both the -link1 and -link2 parameters can be used.
All the adapters in the configuration are RDMA-capable Ethernet adapters. To avoid network
congestion, provision the ISL between the two systems so that it can accommodate all the
traffic passing through it. See Figure 8-2.
Figure 8-3 Configuration of a short-distance partnership that uses RDMA using direct-attach
connections
The following section takes you through detailed configuration steps using GUI and CLI.
Figure 8-4 Creating a portset
2. In the Create Portset dialog, enter a name for the portset (for example, portset4, in this
instance). Portset name is a user-defined variable and you can give any name to the
portset.
3. Select the portset type as High speed replication.
4. In the Port Type section, select Ethernet and then click Create. See Figure 8-5.
3. Click Add IP Address. See Figure 8-8
4. Enter the information as shown in Figure 8-9 for the IP address that you are adding to the
selected port: IP address, subnet mask, VLAN, and gateway. Specify IPv4 as the type.
6. After selecting the portset (portset4), click Save. The IP address is assigned to the portset
as shown in Figure 8-11.
Creating a partnership
After assigning the IP addresses, perform the following steps to create a partnership.
1. Click Copy Service → Partnership and remote copy → Create Partnership.
2. Select 2-site partnership then click Continue, as shown in Figure 8-12.
3. Select IP (short distances using RDMA) as the partnership type, enter the partner
cluster IP address, and click Test Connection. See Figure 8-13.
5. Enter link bandwidth and background copy rate. Then select the high-speed replication
portsets for Portset Link1 and Portset Link2. In this example, portset4 is selected for
Portset Link1 and portset5 is selected for Portset Link2 as shown in Figure 8-15.
6. Click Create.
7. Notice that the partially configured partnership is created, as shown in Figure 8-16.
You can follow the preceding steps on the remote cluster to change the partnership status
to the fully configured state.
8. After completing the steps for the remote cluster, notice that the partnership status shows
Configured. See Figure 8-17.
In the Properties dialog, notice that you can see a detailed view of partnerships, such as
links, configuration status, type (short distance using RDMA), and so on. See Figure 8-19.
Figure 8-20 mkportset and lsportset command output
In Figure 8-20, there are two high-speed replication portsets, named portset4 and portset5.
The portset name is a user-defined attribute. The mkportset command has the following
syntax:
mkportset -type <portset_type>
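For example, the following is a hedged sketch of creating the two high-speed replication portsets that are used in this chapter. The -name parameter is an assumption to be verified against the mkportset reference; the portset type highspeedreplication matches the type named in the note later in this chapter.

# Create two high-speed replication portsets, one per replication link
svctask mkportset -name portset4 -type highspeedreplication
svctask mkportset -name portset5 -type highspeedreplication
# Verify the new portsets
svcinfo lsportset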
Assigning IP addresses
You can assign IP addresses to the defined portsets by using the mkip command. Select only
the RDMA ports for configuring IP addresses and mapping them to the high-speed replication
portsets. In this example, RDMA port 5 of node1 and node2 is selected for configuring IP
addresses. See Figure 8-21.
When you assign the IP address, you can either provide the high-speed replication portset
name or the portset ID. In the following example, the mkip command is used to assign an IP
address and map it to a high-speed replication portset. See Figure 8-22. For more
information, see mkip.
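A hedged sketch of such an assignment with illustrative addresses follows; the parameter names should be verified against the mkip reference:

# Assign an IP address to RDMA port 5 of node1 and map it to portset4 (link 1)
svctask mkip -node node1 -port 5 -portset portset4 -ip 192.168.10.11 -prefix 24 -vlan 100
# Assign an IP address to RDMA port 5 of node2 and map it to portset5 (link 2)
svctask mkip -node node2 -port 5 -portset portset5 -ip 192.168.20.11 -prefix 24 -vlan 200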
Creating a partnership
Establishing a short-distance partnership between production and recovery systems can be
done using the mkippartnership command. The command takes two options -link1 and
-link2. Users should provide an individual portset to each of these two options. Users can
provide a maximum of two links per partnership. For more information, see mkippartnership.
It is advisable to provide a portset for each of the two link options. A partnership can also be
created with a single link option, but using both replication links provides redundancy.
The example in Figure 8-23 shows a short-distance partnership creation with high-speed
replication portsets (as created in earlier steps). Users can provide either a high-speed
replication portset ID or a portset name while creating a partnership.
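A hedged sketch of the command with illustrative values follows; the bandwidth and background copy rate parameter names should be verified against the mkippartnership reference:

# Create the short-distance RDMA partnership over the two high-speed replication portsets
svctask mkippartnership -type ipv4 -clusterip <remote_cluster_ip> \
  -linkbandwidthmbits 10000 -backgroundcopyrate 50 -link1 portset4 -link2 portset5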
The created partnership is listed by using the lspartnership command. For more
information, see lspartnership.
To create a fully configured partnership, repeat the mkippartnership command on the remote
system. You can verify the partnership status by using the lspartnership command on each
system, as shown in Figure 8-24.
For short distance partnerships, run the sainfo lsnodeipconnectivity command to observe
RDMA connectivity. Figure 8-25 shows that the status is connected for both the links of the
fully configured partnership created in “Creating a partnership” on page 162.
Note: The native IP replication can be done by using a replication type portset, and the
short-distance partnership using RDMA is possible only with portsets of type
highspeedreplication. Compression and secured IP partnerships are not supported with
short-distance partnerships that use RDMA portsets.
When the connection between the two near DR sites uses reliable links, replication
performance reaches its optimal level. Because iWARP uses TCP, performance remains
relatively stable even during temporary link issues.
The graph data is for informational purposes only and does not reflect benchmark results.
8.3.8 Troubleshooting
This section lists a few troubleshooting tips to validate the configuration.
1. Validate the partnership status:
a. In the GUI, click Copy Service → Partnership and remote copy → Partnerships.
Notice that the status is displayed as Configured, as shown in Figure 8-27.
b. In the CLI, run the lspartnership command and check if the output shows the
partnership status as fully_configured with the other partnership attributes, as shown
in Figure 8-28.
2. Ensure connectivity between links. Ensure that all IP addresses associated with the link1
and link2 portsets on the production IBM FlashSystem storage system are connected with
all IP addresses associated with the link1 and link2 portsets on the recovery IBM
FlashSystem storage system. The same can be validated by using the following methods:
a. In the GUI, click Settings → Network → Ethernet Connectivity, as shown in
Figure 8-29.
Figure 8-30 Displaying connectivity between nodes attached through Ethernet network
3. For a more resilient configuration with maximum redundancy, ensure that the IP
addresses configured on the ports for both links are on different nodes.
4. If the partnership status is something other than fully_configured, further
troubleshooting is required to understand why the partnership is not in the required
ideal state, which is fully_configured.
a. Partnership is in the not_present state. An IP partnership can change to the
not_present state for multiple reasons, and it means that the replication services are
stopped. Check for alerts, warnings, or errors that are associated with the partnership.
In the CLI, run the lseventlog command, or in the GUI, click Monitoring and view the
list in the Events tab to find the events that pertain to the changes that occurred in the
system.
b. In the CLI, run the sainfo lsnodeipconnectivity command on all the nodes of the
system to understand if there are any issues with the sessions established. Ideally, the
session status is Connected, but other states can be Protocol mismatch, Degraded, and
Unreachable.
5. Although reviewing event logs and directed maintenance procedures (DMPs) from the
GUI is the recommended way to resolve issues on the IBM FlashSystem, you can also
examine the connectivity to the remote system.
a. Check the connectivity to the remote cluster by using the svctask ping command. For
IPv4 and IPv6, use these commands:
svctask ping -srcip4 <source_ip> <destination_ip>
svctask ping6 -srcip6 <source_ip> <destination_ip>
6. If you see the error codes 2021 or 2023 in the event logs, use the following links to help
determine the cause and the action to take to resolve the issue:
– Error code 2021: IBM Documentation for error code 2021
– Error code 2023: IBM Documentation for error code 2023
In the GUI, follow the Directed Maintenance Procedure (DMP) from the menu Monitoring →
Events to troubleshoot the issue. If you followed the DMP and the issue is still not resolved,
then another option is to open a support ticket with IBM for further assistance.
Abbreviations and acronyms
CDM Copy Data Management
CLI Command Line Interface
CSM Copy Services Manager
DMP Directed Maintenance Procedure
DR Disaster Recovery
FC Fibre Channel
FS9100 FlashSystem 9100
GM Global Mirror
GMCV Global Mirror with Change Volumes
GUI Graphical User Interface
HA High Availability
IBM International Business Machines
Corporation
ISL Inter-Switch Link
Mbps megabits per second
PBHA Policy-based HA
PBR Policy-based replication
QoS Quality of Service
RDMA Remote Direct Memory Access
RPO Recovery Point Objective
RTO Recovery Time Objective
RTT round-trip time
SVC SAN Volume Controller
mTLS mutual Transport Layer Security
The publications listed in this section are considered particularly suitable for a more detailed
discussion of the topics covered in this book.
IBM Redbooks
The following IBM Redbooks publications provide additional information about the topic in this
document. Note that some publications referenced in this list might be available in softcopy
only.
Policy-Based Replication with IBM Storage FlashSystem, IBM SAN Volume Controller and
IBM Storage Virtualize, REDP-5704
Unleash the Power of Flash: Getting Started with IBM Storage Virtualize Version 8.7 on
IBM Storage FlashSystem and IBM SAN Volume Controller, SG24-8561
You can search for, view, download or order these documents and other Redbooks,
Redpapers, Web Docs, draft and additional materials, at the following website:
ibm.com/redbooks
Online resources
These websites are also relevant as further information sources:
Configure policy-based replication over high-speed Ethernet transport on IBM
FlashSystem whitepaper:
https://www.ibm.com/downloads/cas/NP4RWMKX
SG24-8569-00
ISBN 0738461695
Printed in U.S.A.