Best Practice Guide
April 2020
Added
Major SANsymphony features – CDP
Added Best Practices for CDP, previously in the SANsymphony webhelp and FAQ 1516.
Added additional Best Practices not in either location.
Major SANsymphony features - Encryption
Added Best Practices for encrypted Virtual Disks and Disk Pools.
For previous changes made to this document please see page 45.
Overview
Each DataCore implementation is unique, and giving advice that applies to every
installation is therefore difficult. The information in this document should be considered a
set of ‘general guidelines’ rather than strict rules.
DataCore cannot guarantee that following these best practices will result in a perfect
solution – there are simply too many separate (and disparate) components that make up a
complete SANsymphony installation, most of which are out of DataCore Software’s control –
but these guidelines will significantly increase the likelihood of a more secure, stable and
high-performing SANsymphony installation.
This guide assumes that DataCore-specific terms - e.g. Virtual Disk, Disk Pool, CDP and
Replication, including their respective functions - are understood, and that the reader has
fundamental knowledge of Microsoft Windows and the SANsymphony suite of products. It
does not replace DataCore technical training, nor does it remove any need to use a
DataCore Authorized Training Partner.
Also see:
End of life notifications for DataCore Software products
https://datacore.custhelp.com/app/answers/detail/a_id/1329
Design objectives
The design objective of any SANsymphony configuration is an ‘end-to-end’ process (from the
user to the data) that delivers availability, redundancy and performance. None of these can
be considered in isolation; they are integrated parts of an entire infrastructure. The
information in this document provides some high-level recommendations that are often
overlooked or simply forgotten about.
Avoid complexity
The more complex the design, the more likely it is that unforeseen problems will occur. A
complex design can also make a system difficult to maintain and support, especially as it
grows in size. A simple approach is recommended whenever possible.
Documentation
Document the environment properly; keeping it up-to-date and accessible. Establish 'shared
knowledge' between at least two people who have been trained and are familiar with all
areas of the infrastructure.
User access
Make sure that the difference between a 'normal' server and a ‘DataCore Server’ is
understood. A DataCore Server should only be operated by a trained technician.
Also see:
DataCore Training Overview
http://www.datacore.com/Support/Training.aspx
Hemisphere Mode
Hemisphere mode can improve memory cache access both to and from the memory
controllers, but the server hardware usually has to have memory modules of the same
specification to take advantage of this setting. This should be set to Automatic.
Secure Boot
This must be disabled on new installations of SANsymphony.
Static High – If available, set the power management settings to Static High to disable any
additional CPU power saving.
CPUs
All x64 processors (except for Intel’s Itanium family) are supported for use in a DataCore Server.
DataCore recommend using ‘server-class’ CPUs rather than those intended for ‘workstation’ use.
Even so, faster (i.e. higher frequency) CPUs are always preferred over slower ones as they can
process more instructions per second. DataCore also prefer using the fastest cores when
possible rather than more-but-slower cores. Please consult your server vendor to see if any
additional CPU Sockets are necessary to be able to use all of the available PCIe/Memory-
Sockets on the server’s motherboard.
Hyper-Threading (Intel)
For Intel CPUs manufactured after 2014, DataCore recommend that Hyper-Threading is
enabled as testing has shown this can help increase the number of threads for
SANsymphony's I/O Scheduler, allowing more I/O to be processed at once. For earlier Intel
CPUs DataCore recommend that Hyper-Threading be disabled as testing has shown that
older CPUs would behave erratically with Hyper-threading enabled.
Note that even if a virtual DataCore Server is configured with the same number of vCPUs as
those of an equivalent physical DataCore Server, there is still no guarantee that all of these
vCPUs would be used at the same rate and throughput as physical CPUs.
Power
Use redundant and uninterruptable power supplies (UPS) whenever possible.
Use the ‘DataCore Server Memory Considerations’ document, available from the Support
Website, to calculate the memory requirement for the type, size and complexity of the
SANsymphony configuration, always allowing for future growth.
If a server’s CPUs use a NUMA architecture then all the physical memory modules should have
the same specification. See the NUMA Group Size Optimization and Hemisphere Mode
entries from the BIOS section on page 7 for more information.
BIOS
Collaborative power control should be disabled.
CPC Override/Mask should be enabled.
Hemisphere Mode should be set to Automatic.
Intel Turbo Boost should be disabled.
NUMA Group Size Optimization/Node Interleaving should be enabled and set to Flat
(if the option is available).
Advanced Encryption Standard instruction set support should be enabled.
Secure Boot should be disabled.
Power saving (C-states) should all be disabled but Static High should be enabled.
CPU
Generally
Use ‘server class’ processors.
Use fewer-but-faster cores rather than more-but-slower cores.
Enable Hyper-Threading (Intel) on CPUs from 2014 or newer.
Disable Hyper-Threading (Intel) on CPUs older than 2014.
System Memory
See ‘DataCore Server Memory Considerations’ from the Support Website:
http://datacore.custhelp.com/app/answers/detail/a_id/1543
Use ECC Memory.
Enable CPC Settings.
There is however a significant implication for high availability when using a single adaptor -
even if it has multiple ports in it - as most types of adaptor failures will usually affect all ports
on it rather than just one (or some) of them.
Using many adaptors that have a small number of ports on them will reduce the risk of
multiple port failures happening at the same time.
iSCSI connections
Fundamentally, SCSI load-balancing and failover functions are managed by Multipath I/O
protocols [2]; TCP/IP uses a completely different set of protocols for its own load-balancing and
failover functions. When SCSI commands, managed by Multipath I/O protocols but ‘carried’
by TCP/IP protocols are combined (i.e. iSCSI), then interaction between the two protocols for
the same function can lead to unexpected disconnections or even complete connection loss.
NIC teaming
NIC teaming is not recommended for iSCSI connections as it adds more complexity (without
any real gain in performance); and although teaming iSCSI Targets - i.e. Front-end or Mirror
ports - would increase the available bandwidth to that target, it still only allows a single
target I/O queue rather than, for example, two, separate NICs which would allow two,
independent target queues with the same overall bandwidth.
[1] This assumes that there are always an adequate number of ‘PCIe Lanes’ available in the PCI slot being
used for the adapter. Please refer to your server hardware vendor’s own documentation for this.
[2] Mirrored Virtual Disks that are configured to use multiple iSCSI Mirror paths on the DataCore Server
are, by default, auto-configured to be managed by Microsoft’s MPIO using the ‘Round Robin with
Subset’ Policy.
iSCSI connections
Use faster, separate network adaptors instead of NIC teaming.
Do not use NIC teaming or STP protocols with iSCSI connections. Use more, individual
network connections (with Multipath I/O software) to manage redundancy; see the example after this list.
Use independent network switches for redundant iSCSI networks.
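As an illustration, Windows’ own ‘mpclaim’ utility (installed with the Multipath I/O feature)
can confirm which disks MPIO is managing and which load-balance policy is in use. The disk
number below is an example:

    # List all MPIO-managed disks and their current load-balance policy
    mpclaim -s -d

    # Show the individual paths (and their states) for MPIO disk 0
    mpclaim -s -d 0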
A low-end RAID controller will deliver low-end performance. An integrated (onboard) RAID
controller that is often supplied with the DataCore Server may only be sufficient to handle
just the I/O expected for the boot drive. Controllers that have their own dedicated CPU and
cache are capable of managing much higher I/O workloads and many more physical disks.
Consult with your storage vendor about the appropriate controller to meet your expected
demands.
If both 'fast' and 'slow' disk types share the same disk controller in the storage array (e.g. an
SSD sharing the same disk controller as a SAS RAID5 set), then the slower disks on that
controller can hold up I/O to the faster disks. DataCore recommend having a separate disk
controller for each different disk speed type. If there is no choice but to mix different
disk speed types on the same disk controller - for example mixing SSD with SAS - then
make sure the SAS disks have 'no RAID' (or RAID0) configured and use
SANsymphony's Disk Pool mirroring feature, as this should be faster than hardware RAID
mirroring.
Also see: Storage Hardware Guideline for use with DataCore Servers
http://datacore.custhelp.com/app/answers/detail/a_id/1302
DataCore Appliances may be installed with an OEM version of Windows that has been
optimized to cater for SANsymphony’s system resource needs.
Synchronize all DataCore Server system clocks with each other and
connected Hosts
While the system clock has no influence on I/O - from Hosts or between DataCore servers –
there are some operations that are, potentially, time-sensitive.
It is also recommended to synchronize all of the host’s system clocks as well as any SAN or
Network switch hardware clocks (if applicable) with the DataCore Servers. This can be
especially helpful when using DataCore’s VSS on a host but also generally to help with any
troubleshooting where a host’s own system logs need to be checked against those of a
DataCore Server. Many ‘SAN events’ often occur over very short periods of time (e.g. Fibre
Channel or iSCSI disconnect and reconnection issues between Hosts and DataCore Servers).
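For example, the built-in ‘w32tm’ utility can point all servers at a common time source. The
NTP server name below is a placeholder for your own environment:

    # Configure the Windows Time service to use an explicit NTP source
    w32tm /config /manualpeerlist:"ntp1.example.local" /syncfromflags:manual /update

    # Restart the time service and force an immediate synchronization
    net stop w32time
    net start w32time
    w32tm /resync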
Power Options
Select the High Performance power plan under Control Panel\Hardware\Power Options.
Where not set, the SANsymphony installer will attempt to set this. Remember some power
options are also controlled directly from within the server BIOS; see the section on page 7.
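The active plan can also be checked and set from the command line. The GUID below is the
Windows default identifier for the High Performance plan:

    # Show the currently active power plan
    powercfg /getactivescheme

    # Activate the High Performance plan
    powercfg /setactive 8c5e7fda-e8bf-4a96-9a85-a6e23a8c635c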
The SANsymphony installer will change the memory dump type to Kernel Memory Dump to
ensure that, if any crash analysis is required from the DataCore Server, the correct type of
dump file is generated.
A DataCore Server that has a small boot disk and large amounts of physical memory may
end up with a Page File that fills the boot disk after the installation. In this case, it is still
recommended to keep the Kernel Memory Dump setting but manually enter a custom value
for the page file size as large as is practically possible (for your boot disk) by unchecking the
‘Automatically manage paging file size for all drives’ option.
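A minimal PowerShell sketch of that change (the 16GB size here is only an example; choose a
value appropriate to your boot disk, and note that a reboot is required):

    # Disable 'Automatically manage paging file size for all drives'
    Get-CimInstance -ClassName Win32_ComputerSystem |
        Set-CimInstance -Property @{ AutomaticManagedPagefile = $false }

    # Set a custom page file size (values are in MB)
    Get-CimInstance -ClassName Win32_PageFileSetting |
        Set-CimInstance -Property @{ InitialSize = 16384; MaximumSize = 16384 }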
3. Add a new REG_DWORD with a Name of 'DumpType' and a Data value of '2'.
2. If the current setting is 'Automatic', enter option '5' and then, when prompted,
enter 'M' (for manual) and press Enter.
4. Click 'OK' and verify that the Windows Update Setting value is now 'Manual'.
5. Enter '15' to exit the 'sconfig' utility and close the window.
Generally though, DataCore recommend that you always apply the latest security updates as
they become available.
DataCore recommend that you do not apply 'Preview' rollups unless specifically asked
to do so by Technical Support.
DataCore recommend that you do not apply third-party drivers distributed via
Windows Update (e.g. Fibre Channel drivers).
Occasionally, Microsoft will make hotfixes available before they are distributed via
normal Windows Update. If a hotfix is not listed in the 'SANsymphony Component
Software Requirements' section of “DataCore™ SDS Prerequisites” and it is not being
distributed as part of a normal Windows software update then do not apply it.
If you are in any doubt please contact Technical Support for advice.
Windows Updates often involve updates to .NET, WMI, MPIO and other key Windows
features on which SANsymphony relies. Stopping the DataCore Server and the DataCore
Executive service ensures that the patch installation has no negative effect on the
performance of the server being patched.
After patching, DataCore recommend that the server be rebooted (even if Windows Updates
do not prompt for it) and then check for further updates. If no additional updates are
required, start the DataCore Executive Service and change the startup type back to
"Automatic" (or "Automatic (Delayed Start)") as it was before the patch process.
Third-party software
It is recommended not to install third-party software on a DataCore Server. SANsymphony
requires significant amounts of system memory as well as CPU processing; it will also prevent
certain system devices (e.g. Disk devices) from being accessed by other software
components that may be installed on the DataCore Server which may lead to unexpected
errors from those other software components.
The purpose of the DataCore Server should not be forgotten: running the DataCore Server
as, for example, a Domain Controller or a Mail Server/Relay alongside SANsymphony must
not be done, as this will affect the overall performance and stability of the DataCore
Server. DataCore recognize that ‘certain types’ of third-party software are required to
integrate the DataCore Server onto the user’s network (for example, anti-virus software or
hardware and network management agents).
In these few cases, and as long as these applications or agents do not need exclusive access
to components that SANsymphony needs to function correctly (i.e. Disk, Fibre Channel or
iSCSI devices), then it is possible to run these alongside SANsymphony.
Always consult the third-party software vendor for any additional memory requirements
their products may require and refer to the ‘Known Issues - Third-party Hardware and
Software’ document for any potential problems with certain types of third-party software
that have already been found to cause issues or need additional configuration. DataCore
Support may ask for third-party products to be removed in order to assist with
Troubleshooting.
Also see:
Changing Cache Size
http://www.datacore.com/SSV-Webhelp/Changing_Cache_Size.htm
Never upgrade ‘in-place’ to a newer version of the Windows operating system, for example
upgrading from Windows 2008 to Windows 2012, or from Windows 2012 to
Windows 2012 R2; even if the newer version is considered qualified by DataCore, the upgrade
will stop the existing SANsymphony installation from running. Instead of an in-place upgrade,
the DataCore Server’s operating system must be installed ‘as new’.
R2 versions of a particular Windows Operating System also need to be qualified for use on a
DataCore Server. Any ‘R2’ versions of Windows that have passed qualification for a specific
version of SANsymphony will be listed in both the SANsymphony Software release notes and
the SANsymphony minimum requirements page.
Also see:
How to reinstall or upgrade the DataCore Server's Windows Operating System
http://datacore.custhelp.com/app/answers/detail/a_id/1537
The SANsymphony Software release notes are available either as a separate download or
come bundled with the SANsymphony software.
Also see:
Software Downloads and Documentation - SANsymphony release notes
http://datacore.custhelp.com/app/answers/detail/a_id/1419
TCP/IP Networking
SANsymphony’s Console, the VMware vCenter Integration component Replication and
Performance Recording function (when using a remote SQL Server) all use their own
separate TCP/IP session.
To avoid unnecessary network congestion and delay as well as losing more than one of these
functions at once should any problems occur with one or more network interfaces, we
recommend using a separate network connection for each function.
The controller node is responsible for managing what is displayed in the SANsymphony
Console for all DataCore Servers in the Server Group – for example; receiving status updates
for the different objects in the configuration for those other DataCore Servers (e.g. Disk Pools,
Virtual Disks and Ports etc.), including the posting of any Event messages for those same
objects within the SANsymphony console.
The controller node is also responsible for the management and propagation of any
configuration changes made in the SANsymphony Console regardless of which DataCore
Server’s configuration is being modified, and makes sure that all other DataCore Servers in
the Server Group always have the most recent and up-to-date changes.
The ‘election’ of which DataCore Server is to become the controller node is decided
automatically by the SANsymphony software between all the Servers in the Group and
cannot be manually configured.
It is also important to understand that the controller node does not manage any Host, Mirror
or Back-end I/O (i.e. in-band connections) for other DataCore Servers in the Server Group. In-
band I/O is handled by each DataCore Server independently of the other Servers in the Server
Group, regardless of whether it is the elected controller or not. Nor does the controller node
send or receive Replication data configured for another DataCore Server in the same Server
Group, although it will manage all Replication configuration changes and Replication status
updates regardless of whether it is the Source Replication Server or not.
This includes:
When applying SANsymphony configuration updates to all servers in the same Server
Group.
Any UI updates while viewing the SANsymphony Console, including state changes
and updates for all the different objects within the configuration (e.g. Disk Pools,
Virtual Disks, Snapshots and Ports etc.).
Configuration updates and state information to and from remote Replication Groups
Configuration updates when using SANsymphony’s VMware vCenter Integration
component.
SQL updates when using a remote SQL server for Performance Recording
The Connection Interface’s default setting (‘All’) means that SANsymphony will use any
available network interface on the DataCore Server for its host name resolution; this is
determined by the Windows operating system and how it has been configured and
connected to the existing network.
It is possible to change this setting, and choose an explicit network interface (i.e. IP Address)
to use for host name resolution instead, but this requires that the appropriate network
connections and routing tables have been set up correctly and are in place. SANsymphony
will not automatically retry other network connections if it cannot resolve to a hostname
using an explicit interface.
We recommend leaving the setting to ‘All’ and use the appropriate ‘Hosts’ file or DNS
settings to control host name resolution.
This means that all DataCore Servers in the same Server Group must have a routable TCP/IP
connection to each other so that if the controller node ‘moves’ to a different server, then the
new controller node must also be able connect to all of the remaining DataCore Servers in
the group [1].
On a Workstation
Workstations which only have the SANsymphony Console component installed cannot
become ‘controller nodes’ and never directly send or receive configuration information for
any Server Group they connect to. Just like an ‘unelected’ node, the workstation connects
to the controller node to make configuration changes or to display information in its own
SANsymphony Console (see Understanding the Controller Node concept on the previous
page).
This means that even if the workstation is on a separate network segment from the DataCore
Servers (e.g. in a different vLAN) it must still be able to send and receive TCP/IP traffic to and
from all the DataCore Servers in that vLAN.
We also recommend that each NIC that is teamed is in its own separate network and that
‘failover’ mode is used rather than ‘load balancing’: there is no specific performance
requirement for inter-node TCP/IP communication, and using ‘failover’ mode means that
configuring and managing the network connections and switches is simpler. It also
makes troubleshooting any future connection problems easier.
[1] ‘Re-election’ of the controller node takes place if the node is shut down or if it becomes unavailable on
the network to the rest of the Server Group for any reason.
iSCSI connections
See page 12
Replication
See page 35
DataCore do recommend, however, using Host Name resolution over just using IP addresses,
as it is easier to manage any IP address changes that might occur, planned or unexpected,
by simply updating any ‘Hosts’ file or DNS entries instead of ‘reconfiguring’ a
Replication group or remote SQL server connection for Performance Recording (i.e. manually
disconnecting and reconnecting), which is disruptive.
When using a ‘Hosts’ file, do not add any entries for the local DataCore Server but only for the
‘remote’ DataCore Servers, and do not add multiple, different entries for the same server (e.g.
each entry having a different IP address and/or server name for the same server) as this will
cause problems when trying to (re)establish network connections. The server name entered
into the ‘Hosts’ file should match the ‘Computer name’ for the node in the DataCore
Management Console.
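For example, the ‘Hosts’ file on a DataCore Server named DCS1, with one local partner and
one remote replication partner, might contain only the remote entries (all names and
addresses here are invented):

    # C:\Windows\System32\drivers\etc\hosts on DCS1
    # Remote DataCore Servers only - no entry for DCS1 itself,
    # and exactly one entry per server
    10.1.1.12      DCS2
    192.168.99.21  DCS-DR1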
If using an ‘external’ firewall solution or another method to secure the IP networks between
servers then refer to the ‘Windows Security Settings Disclosure’ for the full list of TCP Ports
required by the DataCore Server and ensure that connections are allowed through.
Also see:
Windows Security Settings Disclosure
http://www.datacore.com/SSV-Webhelp/windows_security_settings_disclosure.htm
iSCSI connections
See iSCSI connections on page 12 for more information on iSCSI.
Use either ‘Hosts’ or DNS settings to control all host name resolution for the DataCore
Server.
Use a managed ‘Hosts’ file (or DNS) instead of just using IP addresses.
Any Windows updates and security fixes that are currently available from Microsoft’s
Windows Update Service should be applied whenever possible.
For Firewalls and other network security requirements please refer to ‘Windows
Security Settings Disclosure’ via the online help: http://www.datacore.com/SSV-
Webhelp/windows_security_settings_disclosure.htm
These recommendations are for optimal performance rather than minimal capacity: a larger
storage allocation unit (SAU) means less additional work for the Disk Pool to keep its own
internal indexes up-to-date, which results in better overall performance within the Disk Pool,
especially for very large configurations.
While a larger SAU size often means more initial capacity is allocated by the Disk Pool, newer
Host writes are less likely to need yet more new SAUs and will instead be written to one of
the already-allocated SAUs.
The following applies to all types of Disk Pools, including normal, shared, SMPA and Bulk
Pools.
Whenever an SAU is allocated, reclaimed or moved to a new physical disk within the same
Disk Pool, the Catalog is updated.
It is important that Catalog updates happen as fast as possible so as not to interfere with other
I/O within the Disk Pool. For example, if the Catalog is being updated for one SAU allocation
and another Catalog update for a different SAU is required, then this other Catalog update
will have to wait for a short time before its own index can be updated. This can be noticeable
when a lot of SAUs need to be allocated within a very short time; and while the Disk Pool
will try to be as efficient as possible when handling multiple updates for multiple SAUs, there
is an additional overhead while the Catalog is updated for each new allocation before the I/O
written to the SAU is considered complete. This can, in extreme cases, result in unexpected
I/O latency during periods of significant SAU allocation.
Therefore we recommend that the Catalog be located on the fastest disk possible within the
Disk Pool. As of SANsymphony 10.0 PSP9, the location of the Catalog is proactively
maintained per Disk Pool so that it resides on the fastest storage.
DataCore recommend therefore that all Disk Pools have 'dedicated' physical disks used just
for storing the primary and secondary Disk Pool Catalogs and that these physical disks are
as fast as possible.
As the Catalog is located within the first 1GB of the physical disk used to store it, and as any
physical disk in a Disk Pool must have enough free space to allocate at least one SAU, this
'dedicated' physical disk should be at least 2GB in size: 1GB for the Catalog itself and 1GB for
the largest SAU possible within the Disk Pool (see the section on Storage Allocation Unit size
in this chapter).
In all releases, there is only ever a maximum of two copies of the Catalog in a Disk Pool at any
time.
Also see:
http://www.datacore.com/SSV-webhelp/Creating_Disk_Pools.htm#Mirroring_Pool_Disks
How the Catalog location is managed when physical disks are removed
A backup copy of the Catalog is kept on a different physical disk added to the Disk Pool. If the
physical disk that holds the backup copy of the Catalog is removed then a new backup copy
of the Catalog will be written to the 'next available' physical disk in the Disk Pool. The location
of the primary copy remains unchanged.
It is not currently possible for a user to move the Catalog to a physical disk of their choice in a
Disk Pool.
How the Catalog location is managed during physical disk I/O failures
If there is an I/O error when trying to update or read from the primary Catalog, then the
backup Catalog will become the new primary Catalog; if another physical disk is available, it
will become the new backup Catalog location.
Each SAU represents a number of contiguous Logical Block Addresses (LBAs) equal to its size
and, once allocated, will be used for further reads and writes within the LBA range it
represents for a Virtual Disk's storage source. Any time a write I/O is sent by the Host to an
LBA that cannot be satisfied within the existing SAUs for that Virtual Disk, a new SAU is
allocated by the Disk Pool.
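To illustrate the mapping (a conceptual sketch only, not how SANsymphony itself is
implemented), the SAU covering a given LBA can be computed like this:

    # Conceptual sketch: which SAU index covers a given Host LBA?
    $sectorBytes = 512         # logical block size presented to the Host
    $sauBytes    = 128MB       # SAU size chosen at Disk Pool creation (example)
    $lba         = 10000000    # LBA of an incoming write (example)

    $sauIndex = [math]::Floor(($lba * $sectorBytes) / $sauBytes)
    "LBA $lba falls inside SAU index $sauIndex; a new allocation is only needed if that SAU is not already allocated"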
The amount of space taken in Disk Pool's Catalog (see previous section) for each allocated
SAU is the same regardless of the size of the SAU that was chosen when the Disk Pool was
created. As the Catalog has a theoretical maximum size, it means that the larger the SAU size
chosen, the larger the amount of physical disk that can be added into a Disk Pool.
As each SAU is allocated, the Disk Pool's Catalog is updated. The smaller the SAU size,
the more likely it is that any new write I/O will be outside of the LBA range of already-
allocated SAUs, and so the more likely it is that the Catalog will need to be updated. In the
previous section - The Disk Pool Catalog - it is recommended to have the Catalog on the
fastest disk possible so as to be able to complete any Catalog updates as fast as possible;
likewise, it is recommended to use the largest SAU size possible - currently 1GB - to
make future writes less likely to fall outside of the range of already-allocated SAU LBAs.
In addition, the more SAUs there are in a Disk Pool the more work has to be done to analyze
those SAUs, either for Migration Planning or reclamation within the Disk Pool, or for GUI
updates. Hence, the larger the SAU the lower the resources required to carry out those tasks.
There are however exceptions to this recommendation – see section 'Disk Pools and
Snapshot' in the Snapshot chapter on page 39 and also refer to DataCore's Host
Configuration guide which may also offer particular advice for your Host's operating system
and Disk Pools (e.g. Excessive SAU usage when formatting Linux filesystems).
The impact on reclamation should also be considered when choosing an SAU size. Previously
allocated SAUs are only reclaimed when they consist entirely of 'zero' writes; any non-zero
write will cause the SAU to remain allocated. The larger the SAU, the higher the likelihood
that it contains at least one non-zero write and so remains allocated.
It is recommended that mirrored virtual disks use Disk Pools with the same SAU size for each
of their storage sources.
The number of disks in a Pool should be assigned to satisfy the expected performance
requirements with an allowance of 50% (or more) for future growth. It is not uncommon for a
Disk Pool that has been sized appropriately to eventually suffer because of increased load
over time.
Also see: What is the maximum amount of Physical Storage that a Disk Pool can manage?
http://datacore.custhelp.com/app/answers/detail/a_id/968
Auto-Tiering considerations
Whenever write I/O from a Host causes the Disk Pool to allocate a new SAU, it will always be
from the highest available tier as listed in the Virtual Disk's tier affinity unless there is no
space available, in which case the allocation will occur on the 'next' highest tier and so on.
The allocated SAU will then only move to a lower tier if the SAU's data temperature drops
sufficiently. This can mean that higher tiers in a Disk Pool always end up being full, forcing
further new allocations to lower Tiers unnecessarily.
Use the Disk Pool's 'Preserve space for new allocations' setting to ensure that a Disk Pool will
always try to move any previously-allocated, low-temperature SAUs down to the lower tiers
without having to rely on temperature migrations alone. DataCore recommend initially
setting this value to the maximum of 20% and adjusting it according to your I/O patterns
after a period of time.
Also see: Changing the tier space preserved for new allocations
http://www.datacore.com/SSV-Webhelp/Automated_Storage_Tiering.htm
Auto-Tiering considerations
Use the Disk Pool's 'Preserve space for new allocations' setting to ensure that a Disk
Pool will always try to move any previously-allocated, low-temperature SAUs down to
the lower tiers without having to rely on temperature migrations alone. DataCore
recommend initially setting this value to the maximum of 20% and adjusting it
according to your I/O patterns after a period of time.
Replication
Also see:
What is Replication?
http://www.datacore.com/SSV-Webhelp/Replication.htm
Replication settings
Data Compression
When enabled, the data is not compressed while it is in the buffer but within the TCP/IP
stream as it is being sent to the remote DataCore Server. This may help increase potential
throughput sent to the remote DataCore Server where the link between the source and
destination servers is limited or a bottleneck. It is difficult to know for certain if the extra time
needed for the data to be compressed (and then decompressed on the remote DataCore
Server) will result in quicker replication transfers compared to no Data Compression being
used at all.
A simple comparison test should be made after a reasonable period of time by disabling
compression temporarily and observing what (if any) differences there are in transfer rates or
replication time lags.
See the section ‘Enabling/disabling data compression during data transfer’ from the
online help for more information: http://www.datacore.com/SSV-
Webhelp/Configuring_Server_Groups_for_Replication.htm
Any third-party, network-based compression tool can be used to replace, or add additional,
compression functionality on the links used to transfer the replication data between
the local and remote DataCore Servers; again, comparative testing is advised.
Transfer Priorities
Use the Replication Transfer Priorities setting - configured as part of a Virtual Disk’s storage
profile - to ensure the Replication data for the most important Virtual Disks is sent more
quickly than others within the same Server Group.
See the section ‘Replication Transfer Priority’ from the online help for more information:
http://www.datacore.com/SSV-Webhelp/Replication_Operations.htm
Therefore, the disk device that holds the Replication buffer should be able to manage at least
2x the write throughput of all replicated Virtual Disks combined. If the disk device
used to hold the Replication buffer is too slow, it may not be able to empty fast enough (so as
to be able to accommodate new Replication data). This will result in a full buffer and an
overall increase in the replication time lag (or latency) on the Replication Source DataCore
Server.
A full Replication buffer will prevent future Replication checkpoint markers from being
created until there is enough available space in the buffer and in extreme cases may also
affect overall Host performance for any Virtual Disks served to it that are being replicated.
Using a dedicated storage controller for the physical disk(s) used to create the Windows disk
device where the buffer is to be located will give the best possible throughput for the
replication process. Do not use the DataCore Server’s own boot disk, so as to not cause
contention for space and disk access.
It is technically possible to ‘loop back’ a Virtual Disk to the DataCore Server as a local SCSI disk
device and then use it as the Replication buffer’s location. This is not recommended: apart
from the extra storage capacity this requires, there may be unexpected behavior
when the SANsymphony software is ‘stopped’ (e.g. for maintenance), as the Virtual Disk being
used would suddenly no longer be available to the Replication process, potentially corrupting
the replication data being flushed while the SANsymphony software was stopping.
Creating a mirror from the Virtual Disk being ‘looped back’ may be considered a possible
solution to this, but in the case where the mirrored Virtual Disk used for the Replication buffer
also has to handle a synchronous mirror resynchronization (e.g. after an unexpected
shutdown of the DataCore mirror partner), the additional reads and writes used by the mirror
synchronization process, as well as the loss of the DataCore Server’s own write caching
(while the mirror is not healthy), will significantly reduce the overall speed of the Replication
buffer; this configuration is not recommended either.
Situations where the Replication link is ‘down’, and where the replication process will
continue to create and store replication data in the buffer until the link is re-established,
need to be considered too. For example, planning for an ‘acceptable’ amount of network
down-time for the Replication Group (e.g. 24 hours), and knowing (even approximately) how
much replication data could be generated in that time, would allow for an appropriate sizing
to prevent the Replication ‘In log’ state.
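A back-of-the-envelope sizing sketch for that example (all figures are placeholders;
substitute your own measured write rates):

    # Rough Replication buffer sizing for a planned 24-hour link outage
    $avgWriteMBps = 40    # combined average write rate of all replicated Virtual Disks
    $outageHours  = 24    # 'acceptable' network down-time to plan for

    $bufferGB = [math]::Ceiling(($avgWriteMBps * 3600 * $outageHours) / 1024)
    "Size the buffer to at least $bufferGB GB, plus headroom for growth"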
Planning for future growth of the amount of replication data must also be considered.
Creating GPT-type Windows disk devices and using Dynamic Disks will give the most
flexibility, in that it should then be trivial to expand an existing NTFS partition used for the
location of an existing Replication buffer if required.
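A minimal sketch using the Windows Storage cmdlets (the disk number and drive letter are
examples; note that on current Windows releases a basic GPT volume can also be grown
online with Resize-Partition, as an alternative to Dynamic Disks):

    # Prepare a new disk for the Replication buffer: GPT, single NTFS volume
    Initialize-Disk -Number 2 -PartitionStyle GPT
    New-Partition -DiskNumber 2 -DriveLetter R -UseMaximumSize
    Format-Volume -DriveLetter R -FileSystem NTFS -NewFileSystemLabel 'ReplBuffer'

    # Later, after extending the underlying storage, grow the volume in place
    $max = (Get-PartitionSupportedSize -DriveLetter R).SizeMax
    Resize-Partition -DriveLetter R -Size $max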
Be aware that determining the optimum size of the buffer for a particular configuration is
not always trivial and may take a few attempts before it is known.
Replication connections
TCP/IP link speed
The speed of the network link will affect how fast the replication process will be able to send
the replication data to the remote DataCore Server and therefore influence how fast the
buffer can empty. Therefore the link speed will have a direct effect on the sizing of the
Replication buffer. For optimum network bandwidth usage the network link speed should be
at least half the speed of the read access speed of the buffer.
WAN/LAN optimization
The replication process does not have any specific WAN or LAN optimization capabilities but
can be used alongside any third-party solutions to help improve the overall replication
transfer rates between the local and remote DataCore Servers.
All Replication configuration changes & updates made via the SANsymphony Console.
This includes Virtual Disk states and all Replication performance metrics (e.g. transfer
speeds and the number of files left to transfer). This TCP/IP traffic is always sent to and
from the ‘controller nodes’ of both the Source and Destination Replication Groups [1].
The Replication data between the Source and Destination Replication Groups. This
TCP/IP traffic is always sent from the DataCore Server selected when the Virtual Disk
was configured for Replication on the Source Server Group, regardless of which
DataCore Server is the ‘controller node’.
In both cases, the DataCore Server’s own Connection Interface setting is still used.
[1] See the section ‘TCP/IP Networking – The SANsymphony Server Group – The Controller node’ on page
25 for more explanation.
This means that if the ‘controller node’ is not the same DataCore Server that is configured for
a particular Virtual Disk’s Replication, then the two different TCP/IP traffic streams (i.e.
Configuration changes & updates and Replication data) will be split between two different
DataCore Servers on the Source with each DataCore Server using their own Connection
Interface setting.
As the elected ‘controller node’ can potentially be any DataCore Server in the same Server
Group it is very important to make sure that all DataCore Servers in the same Local
Replication Group can route all TCP/IP traffic to all DataCore Servers in the Remote
Replication Group and vice versa.
A Host Operating System’s page (or swap) file can also generate ‘large’ amounts of extra,
unneeded replication data which will not be useful after it has been replicated to the
remote DataCore Server. Use separate Virtual Disks if these operations are not required to be
replicated.
Some third-party backup tools may ‘write’ to any file that they have just backed up (for
example to set the ‘archive bit’ on a file it has backed up) and this too can potentially
generate extra amounts of replication data. Use time-stamp based backups to avoid this.
Encryption
See “Replicating encrypted virtual disks” on page 43.
Other
Exclude the replication buffer from any Anti-Virus software checks.
Host operations that generate large bursts of writes - such as Live Migration, vMotion,
host-based snapshots or even page/swap files, for example - that are not required to
be replicated should use separate, un-replicated Virtual Disks.
Use timestamp-based backups on Host files that reside on a Virtual Disk to avoid
additional replication data being created by using a file’s ‘archive-bit’ instead.
Where encrypted virtual disks are being replicated, see the “Encryption” section of
this document.
Snapshot
Also see:
http://www.datacore.com/SSV-webhelp/Snapshot.htm
In the Disk Pool chapter we recommended using the largest SAU size possible for a Disk Pool.
When using Snapshots, however, it is recommended to use the smallest SAU size possible.
This is because Virtual Disks often have multiple, differential snapshots created from them
that are deleted after a relatively short time. As each snapshot destination created from a
Virtual Disk is an independent storage source, using a large SAU size in this situation can
sometimes lead to excessive and unnecessary allocation of storage from a Disk Pool (and in
extreme cases cause the Disk Pool to run out of SAUs to allocate).
Multiple snapshots for a Virtual Disk can also mean multiple write I/Os used by the copy-on-
write process for each single write I/O sent to the Virtual Disk, requiring significant numbers
of Catalog updates. Finally, as each snapshot is deleted, any SAUs that were allocated to that
snapshot will need to be reclaimed; not only does this contribute extra I/O within the
Disk Pool (to zero out the previously allocated SAUs) but also even more Catalog updates as
the SAUs are 'removed' from their association with the snapshot's storage source.
As the recommendation here is for the smallest SAU - which conflicts with the
recommendation to use the largest SAU size for Disk Pools generally – and because smaller
SAUs increase the likelihood of new allocations (see the previous section on Storage
Allocation Unit size in the Disk Pools chapter), Snapshots should have their own dedicated
Disk Pools.
When using mirrored Virtual Disks, a snapshot destination can be created from either, or
both, storage sources on each DataCore Server. If possible, create all snapshots on the non-
preferred DataCore Server for a Virtual Disk. This is because the overall workload on the
preferred DataCore Server of a Virtual Disk will be significantly more than that of the non-
preferred side, which only has to manage mirror write I/O, whereas the preferred side
receives both reads and writes from the Host as well as having to manage the write I/O to
the mirror on the other DataCore Server.
Although it is possible to have up to 1024 snapshots per Virtual Disk, each active snapshot
relationship adds an additional load to the source virtual disk for the copy-on-write process.
As an example, if there are 10 snapshots enabled for a single Virtual Disk then any write I/O
sent to the Virtual Disk can end up generating 10 additional I/O requests - one for each
snapshot. As the copy-on-write process has to be completed before the DataCore Server can
accept the next write to the same location on the source Virtual Disk, the additional wait
time for all 10 I/Os to complete for that one initial I/O can be significant and result in
considerable latency on the Host.
For this reason, it is best to keep the number of snapshots for each source Virtual Disk to a
minimum.
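A worked example of that amplification (figures are illustrative only):

    # Worst-case extra latency per Host write with N enabled snapshots
    $snapshots     = 10    # active snapshots on the source Virtual Disk
    $copyOnWriteMs = 2     # example cost of one copy-on-write I/O, in ms

    # Each Host write to a not-yet-copied region may trigger one
    # copy-on-write per snapshot before the next write can be accepted
    $extraMs = $snapshots * $copyOnWriteMs
    "Up to $extraMs ms of additional wait per Host write in the worst case"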
CDP
Also see:
https://docs.datacore.com/SSV-WebHelp/continuous_data_protection_(cdp).htm
CDP requires adequate resources (memory, CPU, disk pool storage performance and disk
capacity) and should not be enabled on DataCore Servers with limited resources. Before
enabling, review the following FAQ:
Use separate, dedicated pools for CDP-enabled virtual disks, and for the history logs for those
virtual disks.
When creating a Disk Pool for CDP History Logs, use the same SAU size as the pool with the
source Virtual Disk in.
Disk Pools used should have sufficient free space at all times. Configure System Health
thresholds and email notification (via tasks) so that a notification is sent when Disk Pool free
space reaches the attention threshold.
Enabling CDP for a Virtual Disk increases the amount of write I/O to that virtual disk as it
causes writes to go to the History Log as well as the underlying physical disk. This may
increase I/O latency to the disk pools used by the Virtual Disk and the History Log and
decrease host I/O performance to virtual disks using these Disk Pools if not sized accordingly.
The default history log size (5% of the virtual disk size with a minimum size of 8 GB) may not
be adequate for all virtual disks. The history log size should be set according to I/O load and
retention time requirements. Once set, the retention period can be monitored and the
history log size can be increased if necessary. The current actual retention period for the
history log is provided in the Virtual Disk Details > Info Tab (see Retention period).
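A simple sizing sketch based on that guidance (example figures; measure your own write
rates before deciding):

    # Rough History Log sizing from write rate and required retention
    $avgWriteMBps   = 10    # average write rate to the CDP-enabled Virtual Disk
    $retentionHours = 24    # retention period required

    $logGB = [math]::Ceiling(($avgWriteMBps * 3600 * $retentionHours) / 1024)
    "History Log of at least $logGB GB needed (default is 5% of the Virtual Disk size, minimum 8 GB)"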
Enable CDP on the non-preferred Server to reduce the impact of the History Log filling.
Wait to enable CDP until after recoveries are completed and/or when large amounts of data
have been copied or restored to the Virtual Disk.
When copying large amounts of data at one time to newly created Virtual Disks, enable CDP
after copying the data to avoid a significant I/O load.
Do not enable or disable CDP on a Virtual Disk while served to Hosts. These activities are
disruptive and may result in slow performance, mirrors failing or loss of access (if not
mirrored).
Do not send large blocks of zero writes to a CDP-enabled Virtual Disk as this could result in
shortened retention time; it also does not allow the pool space to be reclaimed until destaged.
If pool reclamation is needed, disable CDP and wait for the History Log to destage, then run
the zeroing routines. CDP can then be re-enabled.
Rollbacks are designed to be enabled for short periods of time and then to be split or
reverted to once the desired data has been found or recovered. Where possible, do not send
writes to a rollback or keep it enabled for a long period of time.
Rollbacks should only be created for the purpose of finding a consistent condition prior to a
disruptive event, and then restoring the Virtual Disk data using the best rollback. Rollbacks
should be split once the required point-in-time has been found. Delete any rollbacks which
are no longer needed.
After an event that requires restoration of data, I/O to the affected Virtual Disk should be
immediately suspended and then rollbacks should be created. Suspending I/O will keep
older data changes from being destaged, which in turn will keep the rollback from expiring
or I/O to the Virtual Disk from failing (where a “Persistent” rollback has been created). Keep
I/O suspended until data recovery is complete.
Encryption
Also see:
How does it work – Encryption
https://datacore.custhelp.com/app/answers/detail/a_id/1725
Before implementing encryption, review the BIOS section of this Best Practice Guide (page
7).
Encryption and decryption is performed as data is written to or read from pool disks. As such,
any writing to or reading from encrypted SAUs has a small performance overhead while the
data is encrypted or decrypted.
Mixed pools containing both encrypted and unencrypted Virtual Disk sources, or
Dedicated pools containing either encrypted or unencrypted Virtual Disk sources.
Mixed pools
Pro: Provides ease of management, as both Virtual Disk types can reside in the same Disk Pool.
Con: Increased reclamation activity due to conversion of SAUs between types (encrypted and
unencrypted) as free space is managed.
Dedicated pools
Pro: Only one type of SAU (encrypted or unencrypted) is required.
Con: Requires manual management of Disk Pools to ensure they only contain Virtual Disks of
one type.
Upon creation of the first encrypted Virtual Disk in a pool, an encryption key will be
generated. This key should be exported and stored in a safe location (not on the
SANsymphony node itself). For SMPA pools, the same key is used for all SANsymphony nodes,
but for standard mirrored virtual disks, the encryption key for each Disk Pool will be unique.
See:
https://docs.datacore.com/SSV-WebHelp/data-at-rest-_pool_key_tool.htm
Where dedicated disk pools are required, the logstore and mapstore should be set to the
unencrypted pool, as on server shutdown these write to a hidden, unencrypted Virtual Disk
in the specified pool. Setting either of these features to a pool intended to be dedicated to
only encrypted Virtual Disks would therefore cause it to have both encrypted and
unencrypted storage sources in it.
https://docs.datacore.com/SSV-WebHelp/mirroring_and_mirror_recovery.htm#Logstore
https://docs.datacore.com/SSV-WebHelp/snapshot_operations.htm#mapstore_locations
Data is encrypted as it is written to Disk Pool storage, and not before. As such, when
encrypted virtual disks are replicated, the data written to the buffer, and therefore also sent
to the destination Virtual Disk, will not be encrypted. In cases where this data needs to be
encrypted as well, hardware level encryption will need to be implemented on the replication
buffer and the link between Server Groups. DataCore do not have any recommendations for
encryption other than to mention that the extra time required for the encryption/decryption
process of the replication data might add to the overall replication time lag. Comparative
testing is advised.
Previous Changes
2019
October
Added
BIOS Settings
Added recommendation to enable AES, and disable Secure Boot.
The operating system
Added statement on installing DataCore appliances on OEM versions of Windows.
DataCore Software and Microsoft software updates
Added recommended process for applying Windows Updates.
Storage Allocation Unit size
Added statements about how SAU size affects performance and reclamation.
Updated
Windows Hosts file / DNS settings
Clarified DNS vs Hosts file preference.
Clarified populating the Hosts file.
The Disk Pool Catalog
Updated to include the change to pool catalogue location handling as of 10.0 PSP9.
Removed
Which SANsymphony versions does this document apply to?
Removed reference to SANsymphony 9.0 PSP4 Update 4 as this release is now End-of-Life.
DataCore Software and Microsoft software updates
Removed section on Windows Service Packs.
2018
February
Updated
CPUs - This section has been re-written and simplified as some of the terminology was
confusing.
Windows service packs, updates, security and hot fixes - This section has been renamed
and updated for Windows 2012 and 2016 with many of the explanations simplified.
Disk Pools - Minor updates to the text, mainly for clarity concerning existing Disk Pools vs
new Disk Pools.
2017
August
Added
New sections:
The DataCore Server – BIOS - This information was previously documented in DataCore
Support's FAQ 1467
Disk Pools
Snapshot
Updated
The DataCore Server - CPU
This document has also been reviewed for SANsymphony 10.0 PSP 6.
2015
September
Added
A new section:
Replication recommendations
This section details best practices for the asynchronous Replication feature.
August
Updated
Initial re-publication.
Major updates to all sections:
High level design objectives
Hardware configuration recommendations for the DataCore Server
Software configuration recommendations for the DataCore Server
DataCore, the DataCore logo and SANsymphony are trademarks of DataCore Software Corporation. Other DataCore
product or service names or logos referenced herein are trademarks of DataCore Software Corporation. All other
products, services and company names mentioned herein may be trademarks of their respective owners.
ALTHOUGH THE MATERIAL PRESENTED IN THIS DOCUMENT IS BELIEVED TO BE ACCURATE, IT IS PROVIDED “AS IS”
AND USERS MUST TAKE ALL RESPONSIBILITY FOR THE USE OR APPLICATION OF THE PRODUCTS DESCRIBED AND
THE INFORMATION CONTAINED IN THIS DOCUMENT. NEITHER DATACORE NOR ITS SUPPLIERS MAKE ANY
EXPRESS OR IMPLIED REPRESENTATION, WARRANTY OR ENDORSEMENT REGARDING, AND SHALL HAVE NO
LIABILITY FOR, THE USE OR APPLICATION OF ANY DATACORE OR THIRD PARTY PRODUCTS OR THE OTHER
INFORMATION REFERRED TO IN THIS DOCUMENT. ALL SUCH WARRANTIES (INCLUDING ANY IMPLIED
WARRANTIES OF MERCHANTABILITY, NON-INFRINGEMENT, FITNESS FOR A PARTICULAR PURPOSE AND AGAINST
HIDDEN DEFECTS) AND LIABILITY ARE HEREBY DISCLAIMED TO THE FULLEST EXTENT PERMITTED BY LAW.
No part of this document may be copied, reproduced, translated or reduced to any electronic medium or machine-
readable form without the prior written consent of DataCore Software Corporation.