
S3 Connector Installation 7.10.8

November 6, 2023
LEGAL NOTICE

All brands and product names cited in Scality’s technical documentation are the property
of their respective owners.
The author and publisher have taken care in the preparation of the technical documen-
tation content but make no expressed or implied warranty of any kind and assume no
responsibility for errors or omissions. No liability is assumed for incidental or consequen-
tial damages in connection with or arising out of the use of the information or programs
contained therein.
Scality retains the right to make changes at any time without notice.
Scality assumes no liability, including liability for infringement of any patent or copyright,
for the license, sale, or use of its products except as set forth in the Scality licenses.
Scality assumes no obligation to update the information contained in its documentation.
All rights reserved. No part of this documentation may be reproduced, stored in a retrieval
system, or transmitted, in any form, or by any means, electronic, mechanical, photocopy-
ing, recording, or otherwise, without prior written consent from Scality, S.A.
Copyright ©2009-2023, Scality. All rights reserved.

About Scality

Scality, leader in software solutions for distributed file and object storage and multi-cloud
data control, builds the most powerful storage tools to make data easy to protect, search
and manage anytime, on any cloud. We give customers the freedom and control neces-
sary to be competitive in a data driven economy. Recognized as a leader in distributed
file and object storage by Gartner and IDC, we help you to be ready for the challenges of
the fourth industrial revolution.
Let us show you how. Follow us on Twitter @scality and @zenko. Visit us at www.scality.
com.
CONTENTS

1 Introduction 1

2 Deployment Architecture 3
2.1 Single-Site Deployments . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
2.2 Multi-Site Deployments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6

3 Prerequisites, Dependencies, and Setup 15


3.1 Hardware Requirements . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
3.2 Dependencies . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
3.3 Operating System Requirements . . . . . . . . . . . . . . . . . . . . . . . 17
3.4 Docker Images . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
3.5 SSH Connection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
3.6 Preparing Deployment as a Non-Root User . . . . . . . . . . . . . . . . . 19
3.7 Installing S3 Clients . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20

4 Installing the S3 Connector 22


4.1 Setting Up the Deployment Machine . . . . . . . . . . . . . . . . . . . . . 22
4.2 Defining the S3 Cluster Inventory . . . . . . . . . . . . . . . . . . . . . . . 23
4.3 Configuring the S3C Cluster . . . . . . . . . . . . . . . . . . . . . . . . . . 30
4.4 Offline Installation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 112
4.5 Extending an Installation to Multiple Metadata Clusters . . . . . . . . . . 113
4.6 Setting Up a Failover Deployment Machine in Multiple Site Architectures 119

5 Upgrading towards 7.10.8 121


5.1 Install Dependencies . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 121
5.2 Download and Extract the Offline Installer . . . . . . . . . . . . . . . . . . 122
5.3 Set ENV_DIR . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 124
5.4 Copy the Old Environment Files . . . . . . . . . . . . . . . . . . . . . . . . 125
5.5 Set the Path to the New Installer . . . . . . . . . . . . . . . . . . . . . . . 125
5.6 Modify the group_vars/all File . . . . . . . . . . . . . . . . . . . . . . . . 126
5.7 Update the Backbeat Variables . . . . . . . . . . . . . . . . . . . . . . . . 127
5.8 Import the SAML Identity Provider Certificate . . . . . . . . . . . . . . . . 128
5.9 Create Management Account Credentials . . . . . . . . . . . . . . . . . . 129



5.10 Update Requirements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 130
5.11 Upload Binaries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 130
5.12 Back up Backbeat Credentials . . . . . . . . . . . . . . . . . . . . . . . . . 130
5.13 Upgrade Stateful Components . . . . . . . . . . . . . . . . . . . . . . . . 131
5.14 Upgrade the Backbeat Queue Protocol . . . . . . . . . . . . . . . . . . . . 134
5.15 Rebalance the Kafka Topic Partitions . . . . . . . . . . . . . . . . . . . . 136
5.16 Upgrade Stateless Components . . . . . . . . . . . . . . . . . . . . . . . . 137
5.17 Cleanup . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 139
5.18 Restore Backbeat Credentials . . . . . . . . . . . . . . . . . . . . . . . . . 139



CHAPTER ONE: INTRODUCTION

Note: When running on a RHEL 8 operating system, Docker-related commands must be replaced by their ctrctl equivalent (see RHEL 8 Cheatsheet for S3C and Containers local administration with ctrctl tools for more information).

S3 Connector provides an AWS S3- and IAM-compatible interface to Scality RING, with
support for core AWS S3 bucket and object APIs, including multipart upload (MPU).
Scale-out capability enables concurrent same-bucket access from multiple S3 Connec-
tor instances for both read and write operations.
S3 Connector scales seamlessly from a distributed system atop a minimum cluster of
three standard x86 servers to systems comprising thousands of physical storage servers
with a total storage capacity that runs into hundreds of petabytes.
In addition, S3C supports the AWS IAM (Identity and Access Management) model of
accounts, users, groups and policies, and integration is available for Microsoft Active
Directory through ADFS and for Single Sign-On (SSO) services through the SAML 2.0
protocol.
S3 Connector uses Ansible as a federated deployment manager, and is installed using
pre-built container images. A single Ansible command can deploy separate container
images for:
• S3 Connector
• S3 Metadata components
• S3 Vault authentication components
• sproxyd components for connecting to back-end storage
• Nginx server for SSL termination and internal failover
• Logging components that install Elasticsearch, Logstash, and Kibana (ELK)
S3 Connector uses Nginx for internal component-level failover only. It is not intended as
an external traffic load balancer. A dedicated external load balancer is recommended

2023, Scality, Inc 1


for best overall throughput and optimal failover across S3 Connector machines and end-
points.
For security, every effort has been made to alleviate the need for elevated privileges.
Sudo privileges are only needed to install software dependencies and remove incompat-
ible software from the target servers.
All components, including S3 Connector can be run on industry-standard servers. As a
software storage back end, S3 Connector performs at an extremely high level, becoming
more reliable as the system grows, unlike traditional storage appliances and systems.
S3 Connector CRR provides disaster recovery capabilities in the event of a single-site
failure or network outage.
For all operations in all configurations, S3 Connector provides users with strong read-
after-write consistency. Once a PUT or DELETE operation or an operation that changes
object tags, ACLs, or metadata is received and processed, subsequent GET and LIST
operations return the updated value. There is no latency between written and returned
data. Returned data is therefore considered strongly consistent. This includes versioning-enabled and delete operations that may have been eventually consistent under Amazon S3.
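
For illustration, a minimal sketch of this behavior using the AWS CLI (client installation is covered in Installing S3 Clients); the endpoint, bucket, and object names are placeholders:

$ aws s3api put-object --endpoint-url http://s3.example.com --bucket demo-bucket --key hello.txt --body hello.txt
$ # An immediate read returns the object just written (strong read-after-write consistency)
$ aws s3api get-object --endpoint-url http://s3.example.com --bucket demo-bucket --key hello.txt /tmp/hello.txt
$ aws s3api list-objects --endpoint-url http://s3.example.com --bucket demo-bucket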



CHAPTER TWO: DEPLOYMENT ARCHITECTURE

The RING/S3C solution offers tremendous flexibility for a variety of use cases and budgets. You can deploy it at a single site or distribute it widely across several service locations, offering everything from a simple private cloud storage location to a robust geodistributed storage solution.
Single-Site Deployments offer simplicity, economy, and speed, while Multi-Site Deployments
offer redundancy, robustness, and scalability, and manage issues like network latency
and dynamic failover using different strategies to satisfy different expectations.

2.1 Single-Site Deployments

The baseline RING/S3C architecture involves a deployment in one data center. S3 Connector maintains an object namespace distributed across a quorum of nodes maintained in the servers.
Because the server space is sited in a single location, latency is less of an issue than in geodistributed “stretched” deployments. Single-site deployments are less complex, less latency-bound, and faster than multi-site deployments. Though the service is robust, with redundancy and failover built into the deployment, the deployment itself may remain vulnerable to single-point failures owing to site factors such as power and network failure.

2.1.1 Single-Site Five-Server Model

The five-server deployment is considered the “baseline” architecture, providing a quorum of five nodes. Each node contains a copy of the namespace, subject to frequent checking, with discrepancies resolved by the majority vote of the nodes. On a live system, one server can be lost without service degradation, and two servers can be lost without loss of data.



In this configuration, the RING will have six nodes, but the S3 Connector layer only uses
five nodes for connector instances.

2.1.2 Single-Site Three-Server Model

The single-site three-server deployment offers cost-competitive object storage at a basic service level, with a clear Fast Track upgrade path.
From the perspective of the S3 namespace maintained by the Metadata engine, the three-
server configuration is a conventional five-node virtual cluster, with at least one copy of
the namespace on each of the three physical servers. This design leverages advances
in storage density and reliability to affordably produce excellent single-site reliability and
performance.



2.1.3 Single-Site Six-Server Model

The six-server deployment provides a quorum of five nodes, plus either a stateless or warm standby node, for a RING installation with a robust, resilient S3 namespace.
The fundamental organizational unit is five servers, with an additional server either for
stateless access or failover. Each node contains a copy of the namespace, subject to
frequent checking, with discrepancies resolved by the majority vote of the nodes. On a
live system, one server can be lost without service degradation, and two servers can be
lost without loss of data.
A single-site deployment offers a straightforward configuration with a reliable S3 names-
pace servicing a RING private cloud. While adding nodes into stretched or asynchronous
multi-site deployments increases reliability and recoverability, additional nodes incur
greater overhead, consuming more transaction time, and are thus associated with slower
performance.



2.2 Multi-Site Deployments

Like single-site deployments, multi-site deployments provide full access from any con-
nector to any bucket, even buckets across a metro area network (MAN). As with single-site deployments, these stretched models distribute data across all host machines; however, they also split the S3 Metadata service across all stretched deployment sites.

2.2.1 Installation on Stretched RINGs for Two or Three Sites

The S3 Connector supports deployment across either two or three data centers (sites)
using a stretched model. The stretched deployment model is designed to continue ser-
vice and data availability even in the event of a data center outage, and to protect the
data in the event of a site outage. Some of these availability properties, data protection
schemes, storage overhead, and failover mechanisms differ between 2-site and 3-site
deployments. For information on administrator actions pertaining to site failover, see
“Site Recovery Procedures” in S3 Connector Operation.
The stretched model uses a single data RING, consisting of machines distributed and accessed across the sites. Data is protected using both a distributed erasure coding
and replication scheme according to best practices for 2-site and 3-site data durability
requirements.
In these stretched deployments, the S3 Metadata service is also distributed across servers on the participating sites. Because the Metadata service is run as a special cluster of five servers, the metadata distribution scheme is non-uniform across the sites. The 2-site and 3-site inventory template files (see Defining the S3 Cluster Inventory) can serve as the basis of deploying these stretched models, but will require editing and customization for specific site requirements. Please consult a Scality Customer Support Engineer (CSE) for further details and assistance in editing the template files.

Note: The stretched deployment must have a high-speed, low-latency (< 10 ms) network connection between the sites. See Multi-Site Deployments.

With five servers across the two or three sites, the Metadata service has the resilience to
automatically recover and maintain availability from one or two simultaneous server fail-
ures. The service automatically reassigns a Metadata server and the Metadata leader
during nominal and failover operations and maintains consistency across all active
servers. The stretched configurations also ensure that the service can be continued
either automatically or through administrator initiated recovery procedures after a site
failure, as described in S3 Connector Operation.
The sites in a stretched deployment can be data centers, labs, or co-location facilities
connected via a fast, low-latency (below 10 ms preferred) network. Typical stretched
model deployments are multiple labs or data center rooms within a campus or two dif-
ferent physical sites in close proximity, such as New York and northern New Jersey, San
Francisco and San Jose, or different cities in a small country (where network latencies
can be categorized as “metro area,” typically well under 10 ms).

Note: Currently, multi-site stretched deployment models do not support intercontinental data centers or data centers separated by wide area geographies (for example, between the US east and west coasts).

In any stretched deployment model, the physical machines that host the data RING and
the S3 Connector containers are deployed across the two or three sites of the deployment
environment. By having a single stretched data RING, such models offer an advantage
over traditional replication by managing a single logical copy of the data, which can re-
duce the recovery point objective (RPO).
When site failover is required, RPO is the standard way of referring to how current the
data is with respect to the last updates (writes) before the failure. In a stretched model,
because data is written synchronously from all sites, the system provides a zero RPO.
Data is protected using a distributed erasure-coding schema with Scality’s best practice recommendations to deploy 7 + 5 erasure codes for large objects and Class of Service
(CoS) = 2 (3 copies, one per site) for small objects. This provides data protection in the
event that one data center fails and an additional server or disk fails on the surviving
sites.
Multi-site deployments are designed to:
• Provide higher data durability than a single site, by distributing data and metadata
across two or more sites.
• Maintain availability of the service in the event of multiple types of failure, includ-
ing site failure or outage, even if service availability is not preserved in all failure
conditions.
• Provide a disaster recovery failover capability if one of the sites becomes inacces-
sible due to network errors, power outages, or site failure.
Multi-site deployment models for S3 Connector include:

Two-Site, Active-Passive Asynchronous Replication

S3 Connector supports CRR between two deployments. These sites can be data centers,
labs, or co-location facilities, connected via a wide-area network. The same overall goals
of data durability, service availability after a site failure, and site disaster recovery ability
are maintained in the two-site model. The two-site model does, however, present several
key differences in its behaviors and tolerance of site failures:
• The Metadata servers are not distributed in this setup, unlike those in stretched
deployments.
• Data and metadata are replicated asynchronously from one site to the other.
• The system does not automatically fail over to the other site. The administrator
must manually change the application endpoint and credentials to use the other
site.
The preferred use case for this topology is robust disaster recovery where RTO (recovery time objective) is not critical.

Two-Site, Six-Server Models

S3 Connector supports two-site stretched metro area deployments. As with the three-
site model, sites can be data centers, labs, or co-location facilities, connected via a fast,
low-latency (< 10 ms) metro area network. The same overall goals of providing higher
data durability, service availability after a site failure, and a site disaster recovery ability
are maintained in the two-site model. The two-site model does, however, present several
key differences in its behavior and tolerance of site failures:



• Active Metadata servers are distributed non-uniformly as 3 + 2; therefore three ac-
tive servers are on one site (the majority site) and two active Metadata servers are
on the other site (the minority site).
• To ensure cross-site consistency of data and metadata, the Metadata leader elec-
tion is enforced on the minority site. This forces the system to maintain consis-
tency of at least three Metadata servers across the two sites, to optimize RPO.
• The system does not automatically continue service availability. In the event either
site fails, the administrator must first initiate a change-in-leader assignment (on the
majority site) or a recovery to activate the WSBs (on the minority site).

Important: Though six-server configurations, like twelve-server configurations, provide service continuance in the event of a site failure (with manual administrator action required for site failover), there is a key difference. Because a six-server system has fewer servers available to host warm standbys, it has less redundancy on a site failure. After a site fails, a six-server configuration has limited tolerance for additional disk or server failures.

Two-site, six-server metro-area stretched RING deployments can be configured with or without a witness server (or servers). Witness server configurations offer the best performance, but may require manual intervention.

Two-Site Stretched RING with Witness

In a two-site, six-server deployment, the preferred topology is with an offsite witness server. Data is synchronously written to both sites. If a server fails, the witness server steps in to replace the defective server. This provides uninterrupted service, with zero RTO and zero RPO even on a full failure of either primary site, and no manual intervention required on site failure.



This topology offers some of the advantages of a three-site stretched RING deployment
by maintaining a third site (the witness server) with full realtime metadata replication.
This requires much less bandwidth than a third site with full replication would otherwise
consume. The witness may be implemented in hardware or virtualized (hardware is recommended), but it must be deployed separately from the RING it witnesses. While it is possible to colocate it with other servers, this is not advisable: loss of the site with the witness destroys the quorum and brings down the system, requiring manual intervention.
This topology requires a 10 ms round-trip time (RTT) or lower latency.

Two-Site Stretched RING without Witness

In a two-site, six-server deployment without witness server(s), the Metadata service is distributed across the participating data center sites in a 3 + 2 topology.

Site 1: 3 active Metadata servers
Site 2: 2 active Metadata servers

The six-server model deploys only one warm standby (WSB) Metadata server.

Site 1: 0 WSBs
Site 2: 1 WSB

The WSB shadows any single active Metadata server on Site 1.


The following diagram depicts the S3 Connector containers’ layout, including the active
and WSB Metadata servers, in a two-site, six-server deployment.



This configuration offers advantages in cost and possibly deployment time, but offers automatic protection only against server failures, not site outages, and its automatic failover can tolerate only one lost server on the minority site.

Two-Site, Twelve-Server Model

The physical machines used to host both the data RING and the S3 Connector containers
are provisioned across the two sites of a stretched deployment. Data is protected using
a 5 + 7 erasure coding schema for large objects and CoS = 3 (four copies, two per site)
for small objects. This provides data protection if one data center fails and an additional
single server or disk fails on the surviving site.
In the two-site stretched model, the Metadata service is also distributed across the par-
ticipating data center sites in a 3 + 2 topology.

Site 1: 3 active Metadata servers
Site 2: 2 active Metadata servers

In addition, this stretched model deploys five warm-standby (WSB) Metadata servers,
each of which is assigned as a follower of a specific active Metadata server. The WSBs
are activated after a site failure, in which case they join the pool of active Metadata
servers, replacing the servers lost or rendered inaccessible by the site failure.
The standard two-site model for deploying the WSBs:

Site 1: 2 WSBs
Site 2: 3 WSBs

The diagram depicts the layout of the S3 Connector containers, including the active and
WSB Metadata servers, in a twelve-server deployment with the sites connected over a
high-speed, low-latency (< 10 ms) network.



[Diagram: Two-site, twelve-node deployment connected by a 10 GbE metro area network with < 10 ms latency. Site 1: Supervisor (management console), 3 x storage node & connector (active), 2 x storage node & connector (warm standby), and a stateless node. Site 2: 3 x storage node & connector (warm standby), 2 x storage node & connector (active), and a stateless node.]

Three-Site, Six-Server Model

Important: Though six-server configurations, like twelve-server configurations, provide service continuance in the event of a site failure (with manual administrator action required for site failover), there is a key difference. Because there are fewer servers available to host WSBs, a six-server system has less redundancy when a site failure occurs. This means that after a site fails, a six-server configuration has limited tolerance for additional disk or server failures.

In a three-site, six-server deployment, the Metadata service is distributed across the par-
ticipating data center sites in a 2 + 2 + 1 topology.

Site 1: 2 active Metadata servers
Site 2: 2 active Metadata servers
Site 3: 1 active Metadata server

The six-server model deploys only one warm standby (WSB) Metadata server.

Site 1: 0 WSBs
Site 2: 0 WSBs
Site 3: 1 WSB

The WSB can shadow any Metadata server. The “shadow” instruction determines which
node the WSB replaces in a failover.
Example: Server 6 is on Site 3 and shadows Server 1 on Site 1. If Site 1 is lost, Server 6 can
be activated. If Site 2 is lost, inventory can be updated and Server 6 can be assigned to
shadow, for instance, Server 3 on Site 2. Then the procedure can be launched to activate
the WSB alias on Server 6.



The following diagram depicts the S3 Connector containers’ layout, including the active
and WSB Metadata servers, in a three-site, six-server deployment.

Three-Site, Twelve-Server Model

S3 Connector supports three-site stretched MAN deployments. The sites can be data
centers, labs, or co-location facilities connected via a fast, low-latency network, as de-
scribed in “Multi-Site Deployments”.
In the stretched model, the Metadata service is distributed across all participating dat-
acenter sites. In a three-site, twelve-server deployment, the S3 Metadata servers are
distributed in a 2 + 2 + 1 topology.

Site 1: 2 active Metadata servers
Site 2: 2 active Metadata servers
Site 3: 1 active Metadata server

In addition, this stretched model deploys five warm-standby (WSB) Metadata servers,
each of which is assigned as a follower of a specific active Metadata server. The WSBs
are activated after a site failure, in which case they join the pool of active Metadata
servers and replace the servers lost or rendered inaccessible by the site failure.
The standard three-site model for deploying the WSBs:

Site 1: 2 WSBs
Site 2: 1 WSB
Site 3: 2 WSBs

The following diagram shows the layout of the S3 Connector containers, including the
Active and WSB Metadata servers, on a 12-or-more server deployment, with the sites
connected over a high-speed, low-latency (< 10 ms) network.



[Diagram: Three-site stretched deployment. The Supervisor (management console), active and warm-standby storage nodes & connectors, and stateless nodes are distributed across Site 1, Site 2, and Site 3, connected over a low-latency metro area network.]



CHAPTER THREE: PREREQUISITES, DEPENDENCIES, AND SETUP

To install S3 Connector, you must assemble prerequisites, satisfy dependencies, and implement the setup steps described in this section.

3.1 Hardware Requirements

Each S3 Connector must comply with the following hardware requirements:


• 32 GB memory (RAM)
• 8 CPU cores
• At least one SSD for the S3 Metadata database (256 GB recommended minimum,
with real-world production sizing recommendations adjusted for expected number
of objects and average object sizes)
• Local SSD storage (1 TB recommended minimum)

3.2 Dependencies

Note: Installing S3 Connector requires root access.

Dependencies include:
• Ansible 2.9.16
Ansible is an open-source deployment and configuration manager. It must be in-
stalled on a single deployment server only.
S3 Connector uses Ansible as a federated deployment manager across multiple
servers. Installation is performed using pre-built container images. A single Ansible command can deploy separate container images for S3 Connector, S3 Metadata components, S3 Vault authentication components, and sproxyd components
for connecting to backend Scality RING storage, and for logging components that
install ElasticSearch, Logstash, and Kibana (ELK).
For a federated deployment, the number of servers and components to be deployed
must be listed in an inventory file called from Ansible.
• The S3 Connector Federation tar archive file (containing the installer and binaries)
is available to those with customer credentials from the current RING Downloads
page in the Confluence wiki.
• Target server operating systems and hardware
RHEL/CentOS 7.4 is the minimum supported operating system for S3 Connector
components. For performance reasons, the S3 Metadata service (which also man-
ages Vault metadata) requires an SSD on each machine to store its database.
• RING 6.x or later
S3 Connector can be installed on Scality RING 6 or later. The recommended erasure
coding schema is 9 + 3 with CoS3 (Scality Class of Service) replication for smaller
files. This enables S3 Connector to efficiently store both small and large object
payloads.
• Security certificates, but only when configuring S3 Connector for HTTPS. The de-
fault configuration uses HTTP.
• A load balancer to spread the load across all S3 Connector endpoint instances.
• Wildcard DNS entry to resolve bucket names: the hostname must resolve to the S3
API endpoint (which usually resolves to the load balancer in front of all S3 Connec-
tor endpoints).
• SSH keys on target servers, configured to allow Ansible to connect to every ma-
chine (using SSH).
• SSH users on the target servers with either the same password on all deployment
machines or a common means for gaining root access on the various target ma-
chines (for example, a common root password).
• NTP (Network Time Protocol) is required to synchronize the clocks on all servers, using Chrony, which is now the default across RHEL, CentOS, and Rocky Linux distributions (see the quick check after this list).
• Python 3.6
• openssl
• A non-root user, defined as a member of the docker system group (on RHEL 7 based
systems).
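
As a quick pre-installation check (a sketch only; the service name assumes the Chrony setup mentioned in the NTP item above), clock synchronization can be verified on each target server:

$ sudo systemctl enable --now chronyd
$ chronyc tracking
$ chronyc sources -v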



3.3 Operating System Requirements

• Red Hat Enterprise Linux or CentOS 7.4 or later


• SELinux kernel security module disabled for target servers (SELinux is not currently
compatible with S3 Connector components.)

Note: Disabling the SELinux kernel security module requires root permissions. To
determine whether SELinux is enabled, run sestatus from the target server com-
mand line. If SELinux is enabled, edit the SELINUX entry in /etc/selinux/config
to SELINUX=disabled and reboot (see the sketch at the end of this section).

Warning: If S3 Connector is installed in an SELinux-enabled environment, the security module should not be subsequently disabled, as doing so will prevent the containers from restarting (e.g., in the event of a reboot).

• “requiretty” must be disabled in sudo configuration (/etc/sudoers)
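
The following is a minimal sketch of the SELinux check and change described in the note above; run it on each target server and adjust to local policy:

$ sestatus
$ sudo sed -i 's/^SELINUX=.*/SELINUX=disabled/' /etc/selinux/config
$ sudo reboot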

3.4 Docker Images

Docker images are Docker’s build component. Using Docker images ensures that in-
stalled applications always run the same, regardless of the run environment.
S3 Connector components are installed in pre-built Docker images that Ansible can de-
ploy to the target environment from the Scality Docker Hub website (https://hub.docker.
com/u/scality/).
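
Ansible pulls the required images automatically during installation. As an optional sketch for manually verifying access to one of the images listed below (no tag is specified, so the registry default is used; on RHEL 8 use the ctrctl equivalent noted in the Introduction):

$ docker pull scality/s3
$ docker images | grep scality/s3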
Scality offers these pre-built Docker images for S3 Connector.



• scality/nfsd and scality/portmap: For NFS over S3.
• scality/kudzu: Monitoring software that allows administrators to visualize S3 cluster health in the RING administration GUI.
• scality/cosbench: Automate performance tests with Cosbench.
• scality/kafka: Message-queuing middleware used for CRR.
• scality/backbeat: Scality core engine for asynchronous replication.
• scality/elk: ELK-stack helpers. A logstash curator for elasticsearch index deletion, and kibana for displaying elasticsearch indexes.
• scality/identisee: Provides a graphical interface (console) for managing Identity and Access Management (IAM) resources.
• scality/metadata: Scality Metadata module for all services that require a persistent metadata model.
• scality/s3: Standalone S3 protocol server used for on-premise or cloud development, testing, and live deployments.
• scality/s3-kms: scality/s3 image with S3 server-side encryption.
• scality/sproxyd-tengine: Scality’s proprietary sproxyd software installation. This is a required proxy service for communicating with a Scality RING.
• scality/vault-md: Scality Vault service, which handles secure authentications.

The Scality archive file contains all third-party Docker images required to install S3 Con-
nector:



• redis:3.2.8-alpine: The Redis container is used for:
  • Policy caching for Vault-SAML session cookie caching to enable single sign-on feature.
  • Storing UTAPI (Utilization API) health metrics as provided by the health probes for load balancing.
• nginx:1.11.10-alpine: The Nginx image, acting as an S3 frontend, provides load balancing by default, ensuring availability of the S3 Connector service by using the component’s health probes to ensure that requests are routed to a working instance of the connector.
• logstash:6.8.23: Simultaneously ingests logs from S3C components, transforms log data, and then sends it to Elasticsearch.
• zookeeper:3.4.10: Process coordination middleware used for CRR.

3.5 SSH Connection

S3 Connector deployment uses the Ansible automation engine. Thus, SSH connections
from the deployment machine to the servers that will host the S3 cluster must be estab-
lished. Ideally:
• SSH as the root user on the remote hosts.
• SSH using a passphrase-less SSH key (a minimal setup sketch follows).
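
A minimal sketch of such a setup from the deployment machine, using placeholder addresses that must be replaced with the actual target servers:

$ ssh-keygen -t rsa -N "" -f ~/.ssh/id_rsa
$ for h in 10.10.10.1 10.10.10.2 10.10.10.3 10.10.10.4 10.10.10.5; do ssh-copy-id root@$h; done
$ ssh root@10.10.10.1 hostname    # should connect without a password prompt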

3.6 Preparing Deployment as a Non-Root User

If the deployment machine cannot directly log in as root on the remote servers:
1. Make sure the remote user is listed in the sudoers file. A passwordless sudoers entry is recommended. In /etc/sudoers on all S3 connectors:

user ALL=(ALL) NOPASSWD: ALL

2. Follow the Configuring the Inventory for Non-Root Ansible Operations procedure dur-
ing the cluster Inventory definition.
3. On the deployment machine, the federation directory must be writable by the user
running Ansible.



3.7 Installing S3 Clients

S3 Connector requires a client to be installed for access:

AWS CLI
• Amazon client for full Amazon Web Services (not just for S3)
• Required for Identity Access Management (IAM) operation

s3cmd
• Simple, S3-only command-line interface (CLI)
• No risk of accidentally modifying IAM entities
• Does not include full set of S3 protocol features

Cyberduck
• GUI for S3 on most common platforms (Mac, Windows, Linux)
• Drag and drop
• Limited features

3.7.1 Install the AWS CLI Tool

Note: The AWS CLI tool is bundled in the repo/venv/bin directory (with repo/ placed at the same directory level as federation/). It can be used directly.

1. Download and install the AWS CLI in an environment that can access the S3 con-
nectors.
• If pip is installed:

$ pip install awscli

• If pip is not installed:

$ curl "https://s3.amazonaws.com/aws-cli/awscli-bundle.zip" -o "awscli-


,→bundle.zip"

$ unzip awscli-bundle.zip
$ cd awscli-bundle
$ ./install

2. Configure the AWS CLI tool.

When prompted, enter the generated access key and secret key (the field id supplies the access key and the field value supplies the secret key). Confirm that the region name is set to us-east-1.



$ aws configure
AWS Access Key ID [None]: TJQMJBIDTP3AT8IFAVPR
AWS Secret Access Key [None]: wOwh=ELcG1pXSPfwbqhm6mnHLl2ITbJp0wQF311e
Default region name [None]: us-east-1
Default output format [None]:
[devops@ws2 ~]$

3.7.2 Install S3cmd

[root@ws2]# yum install -y epel-release
[root@ws2]# yum install -y s3cmd
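
As a sketch, s3cmd can then be pointed at the S3 Connector endpoint through its ~/.s3cfg file; the endpoint below reuses the s3.example.com name from this guide, and the keys are the placeholder credentials shown in the AWS CLI example above:

[default]
access_key = TJQMJBIDTP3AT8IFAVPR
secret_key = wOwh=ELcG1pXSPfwbqhm6mnHLl2ITbJp0wQF311e
host_base = s3.example.com
host_bucket = %(bucket)s.s3.example.com
use_https = False

Alternatively, run s3cmd --configure and answer the prompts interactively.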

3.7.3 Install Cyberduck (Optional)

1. Download Cyberduck:

http://www.cyberduck.io

2. Download the S3 HTTP Cyberduck profile:

https://goo.gl/jha7EX

3. Double-clicking or otherwise opening the downloaded Cyberduck profile opens Cyberduck with the S3.amazonaws.com - S3 profile pre-loaded.



CHAPTER FOUR: INSTALLING THE S3 CONNECTOR

4.1 Setting Up the Deployment Machine

4.1.1 Preparing the Installation

1. Install the Python 3.6 and openssl packages on the deployment machine and on every targeted host.

$ salt \* cmd.run "yum install --enablerepo=rhel* --enablerepo=base* -y python36 libselinux-python3 openssl"

2. Download the s3-offline-${VERSION}.tar.gz archive from our repository using the credentials provided with your support contract.
3. Copy the s3-offline-${VERSION}.tar.gz archive to the deployment machine. In a RING installation, store it in /srv/scality/s3.
4. Extract the tar archive file on the deployment server. Otherwise, extract the Federation tar archive file to a different machine, and then copy the inventory and variables templates that are to be modified to the deployment server.

$ tar xzf s3-offline-${VERSION}.tar.gz

5. In a RING installation, the /srv/scality/s3/s3-offline symlink can now point to the newly created directory:

$ ln -s s3-offline-${VERSION} /srv/scality/s3/s3-offline

6. Copy the client-template directory.

$ ENV_DIR=s3config
$ cp -r s3-offline-${VERSION}/federation/env/client-template s3-offline-${VERSION}/federation/env/${ENV_DIR}



${ENV_DIR} contains the inventory and configuration of the future deployment.
Begin the deployment configuration by following the Configuring the S3C Cluster proce-
dure.

Note: If you use RHEL, contact Scality Support for assistance.

4.2 Defining the S3 Cluster Inventory

The Ansible Inventory file describes the S3 Connector deployment configuration, includ-
ing specifying the target servers used for the deployment and indicating where the var-
ious component services are to be hosted. Ansible orchestrates the federated deploy-
ment of services across all target servers according to the inventory file.
A set of default standard inventory file templates is available to be edited for specific
requirements, depending on the number of data centers (sites) in the deployment:
• Single-site
• Two-site stretched
• Three-site stretched
To declare the machines to which S3 Connector components are to be deployed, use the
appropriate inventory file previously copied to the deployment server under federation/
env/${ENV_DIR}.



4.2.1 Host Groups

Hosts are assigned to groups. Some groups are mandatory and have a specific role in the S3 cluster. This list describes the groups Federation uses:

• [runners_metadata]: stateful connectors, also known as metadata connectors, where the authentication and bucket databases are hosted. Also contains Backbeat components for CRR architectures. Cannot be empty.
• [runners_s3]: stateless connectors, responsible for HTTP/HTTPS connections with the S3 clients. Cannot be empty.
• [loggers]: servers hosting an Elasticsearch node, responsible for log centralization. Can be empty.
• [filers]: NFS-over-S3 service hosts. Can be empty.
• [cosbench_controller]: host that controls the CosBench drivers for load tests. This should stay empty in a production environment. Can be empty.
• [cosbench_drivers]: hosts for CosBench drivers for load tests. This should stay empty in a production environment. Can be empty.

Hosts can belong to one or both of the [runners_metadata] and [runners_s3] groups.
Either of the following configurations is supported:
• Co-located architectures, where hosts belong to both the [runners_metadata] and [runners_s3] groups (as sketched below).
• Decoupled architectures, where some hosts belong to the [runners_metadata] group and other hosts belong to the [runners_s3] group.
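
For illustration, a minimal co-located inventory sketch, reusing the stateful aliases and ansible_host addressing described later in this chapter (all addresses are placeholders):

[runners_metadata]
md1-cluster1 ansible_host=10.10.10.1
md2-cluster1 ansible_host=10.10.10.2
md3-cluster1 ansible_host=10.10.10.3
md4-cluster1 ansible_host=10.10.10.4
md5-cluster1 ansible_host=10.10.10.5

[runners_s3]
md1-cluster1
md2-cluster1
md3-cluster1
md4-cluster1
md5-cluster1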



4.2.2 Frontend Zones

Since versions 7.4.9 and 7.9.0, a new host grouping option divides stateless hosts (from
the [runners_s3] group) into different logical zones.
Zones are independent groups of hosts that can, for example, be used for specific use
cases.
For each of these groups, the frontend_zone variable must be set, and its value must be
the name of the group itself.
The following example shows the creation of two zones:

[zone1]
md1-cluster1
md2-cluster1
md3-cluster1

[zone1:vars]
frontend_zone=zone1

[zone2]
md4-cluster1
md5-cluster1

[zone2:vars]
frontend_zone=zone2

[runners_s3]
md1-cluster1
md2-cluster1
md3-cluster1
md4-cluster1
md5-cluster1

Note: Zone names can be chosen at your convenience.



4.2.3 Hosts Declaration

Beginning with version 7.4.0, Federation uses Ansible aliases to refer to stateful hosts
using a simple naming convention.

Stateful Connectors Aliases

The stateful connectors are the hosts that store the databases of the cluster. Their name
follows a particular convention:
mdX-clusterY
wsbX-clusterY

With:
• The name beginning with “md” for the five active connectors,
• The name beginning with “wsb” for the warm standby connectors used in multi-site
architectures,
• X, from 1 to 5, indicating the place of the connector in the cluster,
• Y, from 1 to 10, indicating the ID of the cluster. It must begin at 1.
All stateful connectors must be included in the [runners_metadata] group.

Stateless Connectors Aliases

The stateless connectors are the hosts that directly handle the requests from the S3
clients, and use the stateful connectors as a backend.
You can choose any alias you wish for stateless connectors. For example:
stateless01
stateless02

All stateless connectors must be included in the [runners_s3] group.

Addressing the Connectors

Provide the IP address of the S3 connectors for Ansible to connect through SSH using
the ansible_host variable.
md1-cluster1 ansible_host=10.10.10.1
md2-cluster1 ansible_host=10.10.10.2
md3-cluster1 ansible_host=10.10.10.3
md4-cluster1 ansible_host=10.10.10.4
md5-cluster1 ansible_host=10.10.10.5

In a dual-network architecture where there is a need to separate the data network from
the management network, configure the data_host variable with the IP address belong-
ing to the data (also known as “backend” or “production”) network. The IP addresses de-
fined by the data_host variable do not have to be accessible from the supervisor. They
belong to the network with the most redundancy and bandwidth available.
md1-cluster1 ansible_host=10.10.10.1 data_host=192.168.1.1
md2-cluster1 ansible_host=10.10.10.2 data_host=192.168.1.2
md3-cluster1 ansible_host=10.10.10.3 data_host=192.168.1.3
md4-cluster1 ansible_host=10.10.10.4 data_host=192.168.1.4
md5-cluster1 ansible_host=10.10.10.5 data_host=192.168.1.5

4.2.4 Installation on a Single Site

A minimum of three servers is required for an S3 Connector installation on a single-site RING. The S3 Metadata service establishes a special cluster with a quorum that works most efficiently with an odd number of servers.
An example of a single-site inventory file is provided in federation/env/${ENV_DIR}/
inventory.1site. Copy this file to your own inventory file, and customize it here:
$ ENV_DIR=s3config
$ cp federation/env/${ENV_DIR}/inventory.1site federation/env/${ENV_DIR}/inventory

Once done, follow the Configuring the S3C Cluster procedure.

4.2.5 Defining Site Groups in Multi-Site Decoupled Deployments

For a multi-site deployment, each site defined in the deployment must have a defined group with:
• A correct Metadata alias set in the site definition. For example:
[majority]
md1-cluster1
md2-cluster1
md4-cluster1

[minority]
md3-cluster1
md5-cluster1
wsb1-cluster1

• A variable {{site}} set to the name of the group (i.e.: if the group’s
name is “minority,” the variable’s value must be set to minority), using the
following ini section in the inventory file:
[majority:vars]
site=majority

[minority:vars]
site=minority

Examples of two- and three-site inventory files are provided in federation/env/${ENV_DIR}/inventory.2site and federation/env/${ENV_DIR}/inventory.3site. Copy the appropriate one to your own inventory file, and customize it here:

$ ENV_DIR=s3config
$ cp federation/env/${ENV_DIR}/inventory.2site federation/env/${ENV_DIR}/inventory

Once done, follow the Configuring the S3C Cluster procedure.

4.2.6 Configuring the Inventory for Non-Root Ansible Operations

If direct and passwordless root access to the S3 connectors is not possible, add the
following Ansible variables in the inventory (add ansible_sudo_pass only if sudo requires
a password):
[all:vars]
ansible_user=user
ansible_pass=connection_password
ansible_sudo_pass=sudo_password

4.2.7 Template Changes to Enable Elasticsearch and Kibana

The ELK stack is disabled by default. To enable the ELK stack on S3 Connector:
1. Uncomment the sites under [loggers:children] (the exact file depending on the
number of sites).
• If the deployment used the federation/env/${ENV_DIR}/inventory.1site file,
uncomment the line #stateful in the file.



[loggers:children]

# Define here the ELK server groups

stateful

• If the deployment used the federation/env/${ENV_DIR}/inventory.2site file, uncomment the line #active in the file.

[loggers:children]

active

• If the deployment used the federation/env/${ENV_DIR}/inventory.3site file, uncomment the line #all_sites in the file.

[loggers:children]

all_sites

2. Uncomment the env_log_centralization section (explained in Send Unfiltered Logs to an Elasticsearch Cluster), without specifying the env_log_centralization.endpoints array.
To use Kibana for analysis and visualization of the log files, connect to port 15601 of one of the deployed S3 connectors belonging to the [loggers] group.

4.2.8 Template Changes for COSBench

Optionally, to measure cloud storage performance, install the COSBench controller on a separate logger machine by adding the cosbench_controller section in the inventory file. Ensure the correct host alias is in use:

[cosbench_controller]
controller ansible_host=10.10.10.10

Next, open the controller Web UI at http://10.10.10.10:19088/controller/index.html.


To set up endpoints for the COSBench workload, add the cosbench_drivers section in
the inventory file:

[cosbench_drivers]
server1.example.com ansible_host=10.10.10.21
server2.example.com ansible_host=10.10.10.22
server3.example.com ansible_host=10.10.10.23
server4.example.com ansible_host=10.10.10.24
server5.example.com ansible_host=10.10.10.25

By placing cosbench drivers on the same machines as the S3 runners, localhost can
serve as the endpoint for the cosbench workload (instead of mapped DNS names or
IP addresses). This is possible because Ansible links the runners_s3 section with the
cosbench_drivers section of the inventory file.

Warning: The COSBench tool itself consumes significant CPU and memory re-
sources, which may alter test results if the COSBench drivers are also S3 Connectors.

4.3 Configuring the S3C Cluster

4.3.1 Pre-Installation Configuration

Before installing S3 Connector and components, modify the federation/env/${ENV_DIR}/group_vars/all file as described in Configurations.
The YAML-formatted group_vars/all file sets a list of Ansible variables that control how
S3 components are deployed and configured. This section lists the Ansible variables
that must be configured to install S3 Connector.

Warning: Empty lines and character spacing are critical to the functioning of YAML files. Do not change non-variable parts of group_vars/all.

Note: Replace all example.com names in federation/env/${ENV_DIR}/group_vars/all with the mapped DNS names or IP addresses of the servers in the target environment.
Additional lines can be uncommented and set appropriately for optional environment
operation.



4.3.2 Post-Installation Configuration

After installation, you can modify the federation/env/${ENV_DIR}/group_vars/all file and apply your changes using Ansible.

ENV_DIR=s3config
./ansible-playbook -i env/${ENV_DIR}/inventory run.yml

The run.yml playbook run can be limited to:


• A limited number of hosts, using the --limit option. See Patterns: targeting hosts
and groups.
• Pre-determined tags, using the --tags option. Use the --list-tags option to list
the available tags:

./ansible-playbook -i env/${ENV_DIR}/inventory --list-tags run.yml
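
For example, a sketch of a limited run that targets a single host and a single tag (the host alias and the s3 tag reuse values shown in the reconfiguration procedures later in this chapter):

./ansible-playbook -i env/${ENV_DIR}/inventory run.yml --limit md1-cluster1 --tags s3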

Warning: Under some circumstances, running run.yml can result in a few minutes
of production outage. The configuration topics listed below provide procedures to
prevent most such service interruptions. If these do not apply to your cluster, contact
Scality Support.

4.3.3 Configurations

Mandatory S3C Core Component Configuration

Modify or set the following variables in the federation/env/${ENV_DIR}/group_vars/all file. It is best not to modify these variables after installation.

Configuring the Runtime User

Warning: Do not use root:root, which breaks Redis.

An S3 installation must designate the Unix user/group under which the S3 cluster will
operate. Configure the UID and GID of the run-time user:

env_user:
  uid: 1000
  gid: 1000



Or provide the name of the user and group:

env_user:
  user: 'scality'
  group: 'scality'

Configuring the Data Directory

Modify the env_host_data value to indicate the local directory in which S3 Connector
nodes will store authentication databases and S3 Component configuration files.

env_host_data: /opt/scality/s3/data

Note: Starting with version 7.2, env_host_data is no longer used to specify the storage
mount point for S3’s bucket storage (Metadata). The env_metadata_bucket_datapaths variable must be defined instead.

Configuring Metadata Storage

Defining the SSDs List for Bucket Metadata Storage

Modify env_metadata_bucket_datapaths to list the mount points (each must match a different SSD) for storing the bucket-storage (S3 metadata) database.

env_metadata_bucket_datapaths:
- /scality/ssd2/s3
- /scality/ssd3/s3
- /scality/ssd4/s3

The mount point configured in env_host_data can be included in this list, but this shares
its storage space with the bucket metadata storage.

Important: Using SSDs to store metadata is strongly advised. Make sure SSDs are
ext4-formatted and mounted before installation.
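
For illustration only, a sketch of preparing one such mount point (the device name /dev/sdb is a placeholder; adapt it to the actual SSD layout and local standards):

$ sudo mkfs.ext4 /dev/sdb
$ sudo mkdir -p /scality/ssd2/s3
$ sudo mount /dev/sdb /scality/ssd2/s3
$ echo '/dev/sdb /scality/ssd2/s3 ext4 defaults 0 0' | sudo tee -a /etc/fstab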



Setting Metadata Installation ID

The env_metadata_installID variable must be incremented on an S3 cluster reinstallation. Leave it commented out when installing the S3 cluster for the first time.
Incrementing this variable prevents restoration of previous Metadata databases:

env_metadata_installID: 1

Note: The env_metadata_installID variable value must be between 0 and 255 (inclu-
sive).

Setting Metadata Backup RING Location

While backing up metadata into the RING, the Metadata processes can use a custom
location constraint:

env_metadata_backup_to_location: us-west-2

Note: The specified location must be defined in the location_constraints list. When in
doubt, leave this variable commented out.

Configuring Metadata Pruning

Set metadata journal pruning to true to gain disk space in the servers storing metadata:

env_metadata:
  prune: true

Pruning can be throttled, in bytes per second, with:

env_metadata:
  prune: true
  prune_throttling_rate: 100000

By default, throttling is disabled (0).



Configuring Metadata Backup Scrubber

The Metadata components verify the integrity of their backups on the RING using the
Metadata Backup Scrubber feature. This can be throttled with:

env_metadata:
  scrubbing_throttling_rate: 200000

Configuring Metadata Inter-Process Latency

The env_network_latency_ms variable defines the maximum time in milliseconds after a Metadata leader failure that Metadata processes wait before electing a new Metadata leader.
leader.
In rare cases it is necessary to raise this number, but modifying it except under the di-
rection of Scality Support staff is inadvisable.
The value of this variable must never be less than 10.
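
For illustration, the variable is set in group_vars/all as a single value; the number shown here is purely illustrative and not a recommendation (keep the default unless directed by Scality Support):

env_network_latency_ms: 60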

Internal Configuration File Storage

Edit env_host_conf to indicate the location of the following files:


• The ID of the stateful members
• The key used to encrypt the Authentication key

env_host_conf: /etc/scality/s3

Log File Storage

Edit env_host_logs to indicate the hosted location of the S3 Connector log files.

env_host_logs: /var/log/s3



Configuring Storage Locations

Scality’s S3 Connector can store an object payload in multiple Scality RINGs, each de-
scribed as a location constraint.
The key for each location constraint is its name, which the user can choose. Each location_constraint entry must have:
• A type
• legacy_aws_behavior
• Either a connector declaration or a details declaration
The Boolean legacy_aws_behavior key indicates whether the user wants requests to this
endpoint to exhibit legacy AWS behavior.
If the location constraint type is scality, the connector key contains sproxyd and its
configuration options, including the node and port and any other sproxyd configuration
options.

Important: For multi-site stretched RINGs, each sproxyd-backed location must have the
lazy_get_enabled field set to false.

location_constraints:
  [...]
    connector:
      sproxyd:
        [...]
        lazy_get_enabled: false
[...]

location_constraints:
  legacy:
    type: scality
    legacy_aws_behavior: yes
    connector:
      sproxyd:
        data_parts: 7
        coding_parts: 5
        erasure_cos: 3
        bootstrap_list:
          - node1.example.com:4243
          - node2.example.com:4243
          - node3.example.com:4243
          - node4.example.com:4243
          - node5.example.com:4243
          - node6.example.com:4243
  us-east-1:
    type: scality
    legacy_aws_behavior: no
    connector:
      sproxyd:
        chord_cos: 2
        bootstrap_list:
          - node7.example.com:4243
          - node8.example.com:4243
          - node9.example.com:4243
          - node10.example.com:4243
          - node11.example.com:4243
          - node12.example.com:4243

Important: The “us-east-1” location constraint must be included. This is the default loca-
tion assigned to a bucket when no location is explicitly provided on a PUT bucket request,
and it is the endpoint used if the request endpoint is not contained in the rest_endpoints
configuration.

Note: The sproxyd connector installed by Ansible does not use the sagentd component
to connect to the Scality RING Supervisor.

To configure sproxyd, see Sproxyd Configuration Options.

Location Constraint Aliases

Declare a location constraint that will be an alias to another location constraint:

location_constraints:
  site1:
    type: scality
    legacy_aws_behavior: yes
    connector:
      sproxyd:
        data_parts: 7
        coding_parts: 5
        erasure_cos: 3
        bootstrap_list:
          - node1.example.com:4243
          - node2.example.com:4243
          - node3.example.com:4243
          - node4.example.com:4243
          - node5.example.com:4243
          - node6.example.com:4243
  us-west-2: {type: alias, location: site1}

In this example, the location us-west-2 is an alias for site1. If used by the S3 clients,
us-west-2 will use site1’s sproxyd configuration.

Configuring the S3 REST Endpoints

Warning: env_s3.website_endpoints and env_s3.rest_endpoints must not contain the same domain name. If they do, the S3 API will no longer be accessible.

Change env_s3.rest_endpoints to define the S3 Connector API endpoint.

Warning: At release 7.0, the format was changed.

S3 Connector supports multiple data backend locations. This is not contained in the
AWS specification.
To set up the configuration, a rest_endpoints section must specify each endpoint as a
key and the default location for that endpoint as a value.

env_s3:
  rest_endpoints:
    s3.example.com: us-east-1
    127.0.0.2: us-east-1
    localhost: legacy
    node1.amazonaws.com: us-east-1

Regarding the endpoint itself (s3.example.com in the preceding code sample), the host-
name must resolve to the S3 API endpoint (which usually resolves to the load balancer
in front of all endpoints). The hostname DNS entry must be a wildcard entry to resolve
bucket names (for example, bucketname.s3.example.com).
The default location (us-east-1 in the sample) is the default location to put data for re-
quests to the given endpoint, unless a given bucket is tied to a location constraint.
Thus, as illustrated in the example, if a user hits s3.example.com, the default location to put data is treated as us-east-1. As such, the location us-east-1 must map to a location
contained in the list of location constraints (see Configuring Storage Locations).

Post-install reconfiguration procedure

1. Go to the s3-offline installer’s federation/ folder.


2. List the hosts to be reconfigured.
$ ENV_DIR=s3config
$ ../repo/venv/bin/ansible -i env/${ENV_DIR}/inventory --list-hosts runners_s3

hosts (5):
md1-cluster1
md2-cluster1
md3-cluster1
md4-cluster1
md5-cluster1

3. Reconfigure each S3 service, one host at a time. If the load is balanced between
hosts, first disable the host temporarily from the load balancer configuration.
$ ./ansible-playbook -i env/${ENV_DIR}/inventory run.yml --skip-tags "requirements,run::images" -l md1-cluster1 -t s3

$ ./ansible-playbook -i env/${ENV_DIR}/inventory run.yml --skip-tags "requirements,run::images" -l md2-cluster1 -t s3

$ etc

Sending Filtered Logs and Inventory to the RING’s Elasticsearch Cluster

In a RING 7.10.8 environment, use the address of the servers hosting the RING Elastic-
search cluster to fill the env_elasticsearch.endpoints array:

env_elasticsearch:
endpoints:
- "10.100.132.51:9200"
- "10.100.132.53:9200"
- "10.100.132.52:9200"

Authentication is handled by env_elasticsearch.user and env_elasticsearch.password:

env_elasticsearch:
user: USER
password: PASSWORD
endpoints: [...]



SSL communication is handled by env_elasticsearch.ssl and env_elasticsearch.cacert:

env_elasticsearch:
ssl: yes
cacert: cacert-file.crt
endpoints: [...]

The cacert file must be stored in federation/env/${ENV_DIR}/.

Post-install reconfiguration procedure

1. Go to the s3-offline installer’s federation/ folder.


2. Reconfigure the log-shipper component
$ ENV_DIR=s3config
$ ./ansible-playbook -i env/${ENV_DIR}/inventory run.yml --skip-tags "requirements,run::images" -t logger

Setting the Leader Site for a Stretched Two-Site RING

If the minority of the Metadata servers is listed in the inventory file as being on site_a,
set the env_metadata_force_leader_on_site variable:

env_metadata_force_leader_on_site: site_a

If the minority is on site_b, uncomment this line and change it to site_b.

Configuring the Offline Installation Files Destination Folder

The env_offline_dest variable allows customizing the path to the folder that stores RPM
and container image files on the S3 connectors.
Leave this variable set to its default value, unless the /opt partition lacks enough avail-
able disk space:

env_offline_dest: '/opt/scality/s3-offline-{{ env_deploy_version }}'



Configuring the Yum Repositories Used During the Installation

The enabled_repositories variable lists the Yum repositories used to install the various
RPM packages that S3 requires.
During installation, Federation disables all Yum repositories not listed in this variable.
On a non-trivial repository setup, add the custom Yum repositories to the list, retaining
the scality-offline-s3 repository (except for RHEL):

Note: If you are using a light archive or the archive shipped with the RING, do not use
scality-offline-s3:
1. Open the env/${ENV_DIR}/group_vars/all file.
2. Ensure the enabled_repositories for Federation do not use scality-offline-s3:

enabled_repositories:
- scality-internal
- scality-offline

CentOS

enabled_repositories:
- scality-offline-s3
- custom-repo-1
- custom-repo-2

RHEL

enabled_repositories:
- scality-offline
- custom-repo-1
- custom-repo-2

Configuring S3 Repositories

Ensure the enabled_repositories match the archive and operating system used:

Light archive:

CentOS

enabled_repositories:
- scality-internal
- scality-offline

RHEL

enabled_repositories:
- scality-offline

Non-light archive:

CentOS

enabled_repositories:
- scality-internal
- scality-offline
- scality-offline-s3

RHEL

enabled_repositories:
- scality-offline
- scality-offline-s3

Important: Ensure the repositories listed under enabled_repositories are the same as
the ones used during the installation process.

Launching the Installation

Continue with Optional S3C Core Component Configuration or launch the Offline Installation.

Optional S3C Core Component Configuration

Core components in Scality’s S3 cluster can be fine-tuned using optional Ansible vari-
ables.
To enable or configure optional S3 core component options, modify or set the following
variables in the federation/env/{{EnvDir}}/group_vars/all file.

Setting the IP Address and Port Used by the S3 Component

By default, the S3 component listens on port 8000 on all available IP addresses. To alter
this behavior, modify the env_s3_listen_on variable:

env_s3_listen_on:
- '[::]:{{ env_s3_port|default(8000) }}'

Note: If an IPv4 address is set as host, the server listens only on IPv4. If an IPv6 address
is set as host, the server listens on both IPv4 and IPv6. Multiple values can be used to
launch multiple servers. However, if the IP addresses overlap or are the same, different
ports must be assigned.
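
For example, a minimal sketch that binds the S3 component to a single IPv4 address and adds a second listener on another port (the address and extra port below are placeholders, not values from this guide):

env_s3_listen_on:
  - '10.100.1.10:{{ env_s3_port|default(8000) }}'
  - '10.100.1.10:8001'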



Post-install reconfiguration procedure

1. Go to the s3-offline installer’s federation/ folder.


2. List the hosts to be reconfigured.
$ ../repo/venv/bin/ansible -i env/{{EnvDir}}/inventory --list-hosts␣
,→runners_s3

hosts (5):
md1-cluster1
md2-cluster1
md3-cluster1
md4-cluster1
md5-cluster1

3. Reconfigure each S3 service, one host at a time. If the load is balanced between
hosts, disable the host temporarily from the load balancer configuration.
$ ./ansible-playbook -i env/{{EnvDir}}/inventory run.yml --skip-tags "requirements,run::images" -l md1-cluster1 -t s3

$ ./ansible-playbook -i env/{{EnvDir}}/inventory run.yml --skip-tags "requirements,run::images" -l md2-cluster1 -t s3

Setting the IP Address Used by the bucketd Component

Important: On RING 8.x versions, bucketd listens on all interfaces by default.

To set bucketd to listen on a specific interface, use the env_metadata_bucket_dbd_host variable.

Listing 1: Example with the localhost interface


env_metadata_bucket_dbd_host: 'localhost'

Post-install reconfiguration procedure

1. Go to the s3-offline installer’s federation/ folder.


2. List the hosts to be reconfigured.
$ ../repo/venv/bin/ansible -i env/{{EnvDir}}/inventory --list-hosts runners_s3

hosts (5):
md1-cluster1
md2-cluster1
md3-cluster1
md4-cluster1
md5-cluster1

3. Reconfigure one host at a time.


$ ./ansible-playbook -i env/{{EnvDir}}/inventory run.yml --skip-tags "requirements,run::images" -l md1-cluster1 -t stateless,metadata

If UTAPI V2 is enabled, run the following procedure to reconfigure bucketd on stateful hosts, in addition to the procedure above.

Post-install reconfiguration procedure

1. Go to the s3-offline installer’s federation/ folder.


2. List the hosts to be reconfigured.
$ ../repo/venv/bin/ansible -i env/{{EnvDir}}/inventory --list-hosts runners_metadata

hosts (5):
md1-cluster1
md2-cluster1
md3-cluster1
md4-cluster1
md5-cluster1

3. Reconfigure one host at a time.


$ ./ansible-playbook -i env/{{EnvDir}}/inventory run.yml --skip-tags "requirements,run::images" -l md1-cluster1 -t stateful,utapi

Declaring External Load Balancers

Important: This operation is only required when the load balancer is configured to for-
ward requests directly to Cloudserver (port 8000 by default). When the load balancer
is configured to forward requests to nginx (ports 80/443 by default), this operation is
not necessary, as nginx headers are always taken into account. Note that forwarding
requests directly to Cloudserver is a deprecated architecture.



Introduction

The RING benefits from having the client access the Scality S3 endpoints using a load
balancer. It enables the distribution of network traffic across multiple S3 connectors and
guarantees application uptime and performance in single and multi-site scenarios.
Scality S3 connectors enable the S3 service using HTTP and HTTPS protocols. These
connectors can be resident on the storage servers or deployed on dedicated servers.
Scality S3 connectors are stateless. They do not store information (which is stored in the
RING) and the S3 client can connect to any S3 connector at any time, without affecting
the user experience.
To maintain high availability and performance of the S3 service, a method is needed to
distribute HTTP/HTTPS requests across the different S3 endpoints.
If the service is balanced between hosts by a load balancer (BigIp F5, haproxy, etc.) acting
as a reverse-proxy, the S3 client’s IP addresses are replaced by the load balancer’s IP
address.
This behavior can alter the way IP-based IAM conditions check the source of requests.

Declare Load Balancer

To access the correct IP address for checking, declare the reverse proxies from which
requests are sent and the HTTP header that contains the S3 client's IP address.

env_s3_requests_via_proxy: true
env_s3_trusted_proxy_cidrs:
- '1.2.3.4/24'
- '5.6.7.8'
env_s3_client_ip_header: x-real-ip

The following list describes each configuration variable.

• env_s3_requests_via_proxy: Set to true if reverse proxies are used.
• env_s3_trusted_proxy_cidrs: List of trusted reverse-proxy CIDRs.
• env_s3_client_ip_header: Request header set by the reverse proxy with the S3 client's IP address.

Post-install reconfiguration procedure

1. Go to the s3-offline installer’s federation/ folder.


2. List the hosts to be reconfigured.

$ ../repo/venv/bin/ansible -i env/{{EnvDir}}/inventory --list-hosts runners_s3

hosts (5):
md1-cluster1
md2-cluster1
md3-cluster1
md4-cluster1
md5-cluster1

3. For each listed host:


1. To achieve 100% availability, temporarily disable the host from the load-
balancer configuration.
2. Reconfigure the S3 service of the host.
$ ./ansible-playbook -i env/{{EnvDir}}/inventory run.yml --skip-tags "requirements,run::images" -l md1-cluster1 -t s3

Adding Healthcheck Host IP Addresses

The S3 service’s healthcheck HTTP service can be used to check overall S3 cluster
health. It consists of a GET action sent to any S3 server.

curl 127.0.0.1/_/healthcheck/deep

Only IP addresses included in the env_healthcheck_allow_ips configuration variable are
allowed to send this request. Add the IP addresses of the hosts responsible for the
healthcheck of the S3 cluster.
In the example below, all 254 hosts in the 10.11.12.0 subnet and the host 10.10.10.1
are allowed to access the healthcheck service.

Listing 2: Example
env_healthcheck_allow_ips:
- '{{ ansible_host }}'
- '10.11.12.0/24'
- '10.10.10.1'

Post-install reconfiguration procedure

1. Go to the s3-offline installer’s federation/ folder.


2. List the hosts to be reconfigured.

$ ../repo/venv/bin/ansible -i env/{{EnvDir}}/inventory --list-hosts runners_s3

hosts (5):
md1-cluster1
md2-cluster1
md3-cluster1
md4-cluster1
md5-cluster1

3. Reconfigure each S3 service, one host at a time. If the load is balanced between
hosts, disable the host temporarily from the load balancer configuration.
$ ./ansible-playbook -i env/{{EnvDir}}/inventory run.yml --skip-tags "requirements,run::images" -l md1-cluster1

$ ./ansible-playbook -i env/{{EnvDir}}/inventory run.yml --skip-tags "requirements,run::images" -l md2-cluster1

Setting Up the S3 Frontend Service

Either software load balancers (Nginx containers, for example) or hardware load bal-
ancers can be set as SSL termination points for end-user applications. They can also
be used to check the health of the installed S3 Connector instances and to blacklist any
connectors that are not responding properly to the health check.
Scality bundles a frontend service in the scality-frontend-s3 Docker container, which
hosts an Nginx process that can:
• Listen on port 80, the standard HTTP port,
• Terminate SSL/TLS on port 443,
• Check its local S3 Connector’s health,
• Direct requests to its local S3 Connector if it is healthy,
• Direct requests to other S3 Connectors if the local one is not healthy.

Configure the IP and Port to Listen On

To configure the S3 frontend service, uncomment the following variables and assign val-
ues appropriate for the environment:
1. Configure the port the HTTP service listens on.

env_s3_frontend_port: 80



2. Configure the IP address(es) or network interface(s) the frontend service listens to.

Listing 3: Example on eth2 and eth3 interfaces


env_s3_frontend_listen_on:
  - '{{ ansible_eth2.ipv4.address }}'
  - '{{ ansible_eth3.ipv4.address }}'

The S3 frontend service listens to all IP addresses by default.

Note: If the interface name contains a dot (for example, bond0.221) use the fol-
lowing syntax instead.

env_s3_frontend_listen_on:
- '{{ ansible_facts["bond0.221"].ipv4.address }}'

Post-install reconfiguration procedure

1. Go to the s3-offline installer’s federation/ folder.


2. List the hosts to be reconfigured.
$ ../repo/venv/bin/ansible -i env/{{EnvDir}}/inventory --list-hosts runners_s3

hosts (5):
md1-cluster1
md2-cluster1
md3-cluster1
md4-cluster1
md5-cluster1

3. Reconfigure each S3 frontend service, one host at a time. If the load is balanced
between hosts, disable the host temporarily from the load balancer configura-
tion.
$ ./ansible-playbook -i env/{{EnvDir}}/inventory run.yml --skip-tags "requirements,run::images" -l md1-cluster1 -t s3

$ ./ansible-playbook -i env/{{EnvDir}}/inventory run.yml --skip-tags "requirements,run::images" -l md2-cluster1 -t s3



Configure S3 Frontend Buffer Size

Configure the proxy buffer size.

s3_frontend_nginx_proxy_buffer_size: 8k

Note: The nginx configuration will have the proxy_buffer_size and proxy_busy_buffers_size parameters set to the value of this variable. The default value of these configuration variables is 8k.

Configuring SSL/TLS Certificates for S3 Frontend Service

Certificate Requirements

• Self-signed certificates require a valid CA certificate/bundle to be provided.


• Required x509v3 Key Usage Extensions: TLS Web Server Authentication (also
known as “serverAuth”).
• Subject Alternative Name (SAN): Must include each of the entries contained in
env_s3.rest_endpoints in the env/{{EnvDir}}/group_vars/all file, for instance s3.
company.com.
• Optionally:
– A *.s3.company.com wildcard SAN entry in the certificate (for virtual host-style
access).
– A DNS SAN entry for each endpoint server hostname.
– An IP SAN entry for each endpoint's IP address.
• The certificate must match *.s3.company.com for virtual host-style access to work
for rest_endpoint service URL. When multiple rest_endpoint URLs exist, for example
svc1.company.com and svc2.company.com, one wildcard SAN must exist for each
rest_endpoint (for example *.svc1.company.com, and *.svc2.company.com).
• Any rest_endpoint without its own wildcard will not support virtual host-style ac-
cess unless the certificate contains a SAN entry for each bucket requiring virtual
host-style access for the application (for example bucket1.s3.company.com and
bucket2.s3.company.com).
• If you do not configure access using a SAN wildcard or a unique entry, the bucket
can only be accessed using path-style access, such as s3.company.com/bucket.
Scality recommends using host-style access. A quick way to verify a certificate's
SAN entries is sketched after this list.
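
As sketched below, the SAN entries of a certificate can be inspected with openssl before uploading it (the file name s3.crt is a placeholder):

$ openssl x509 -in s3.crt -noout -text | grep -A1 "Subject Alternative Name"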



Upload Certificate Files to the Supervisor

1. Log in as root on the supervisor.


2. Go to the federation folder.

cd /srv/scality/s3/s3-offline/federation

3. Create the env/{{EnvDir}}/ssl directory.

mkdir env/{{EnvDir}}/ssl

4. Upload the certificate (.crt file), key (.key file) and intermediate authority certificate
(CA bundle file) into the env/{{EnvDir}}/ssl directory.

Enable SSL/TLS in S3 Frontend Service

To enable SSL/TLS, uncomment the following variables and assign appropriate values
for the environment:
1. Configure the port the SSL service listens on.

env_s3_frontend_ssl_port: 443

2. Define whether the frontend service should only host SSL service (true) or listen
on both HTTP and HTTPS (false).

Listing 4: Example of a frontend service only hosting SSL service

env_s3_frontend_ssl_only: true

3. Set the SSL/TLS certificate, secret key, and CA bundle file names matching the
ones previously uploaded to env/{{EnvDir}}/ssl.

env_s3_frontend_ssl_cert: s3.crt # Mandatory
env_s3_frontend_ssl_key: s3.key # Mandatory
env_s3_frontend_ssl_ca_bundle: ca-bundle.crt # Optional unless self-signed certificate

4. Set the server name.

env_s3_frontend_server_name: s3.example.com # must match SSL cert CN

5. Set the TLS ciphers accepted by the frontend service.



env_s3_frontend_ssl_protocols:
- TLSv1
- TLSv1.1
- TLSv1.2
- TLSv1.3

Post-install reconfiguration procedure

1. Go to the s3-offline installer’s federation/ folder.


2. List the hosts to be reconfigured.
$ ../repo/venv/bin/ansible -i env/{{EnvDir}}/inventory --list-hosts runners_s3

hosts (5):
md1-cluster1
md2-cluster1
md3-cluster1
md4-cluster1
md5-cluster1

3. Reconfigure each S3 frontend service, one host at a time. If the load is bal-
anced between hosts, first disable the host temporarily from the load balancer
configuration.
$ ./ansible-playbook -i env/{{EnvDir}}/inventory run.yml --skip-tags "requirements,run::images" -l md1-cluster1 -t s3

$ ./ansible-playbook -i env/{{EnvDir}}/inventory run.yml --skip-tags "requirements,run::images" -l md2-cluster1 -t s3

4. Reconfigure the kudzu container to point to the modified ports.


$ ./ansible-playbook -i env/{{EnvDir}}/inventory run.yml --skip-tags "requirements,run::images" -t kudzu

Note: Make sure your load balancer is reconfigured if you are enabling
SSL/TLS.



Configuring SSL/TLS Certificates for Vault and S3 Backend Service

The following S3 components can be configured to be HTTPS-accessible only.


• scality-s3
• scality-bucketd
• scality-vault
• scality-metadata-bucket-repd
• scality-metadata-bucket-map
• scality-metadata-vault
The components these S3 components connect to must also be accessible via HTTPS.
This can be set independently of the HTTPS offloading capability of the scality-frontend-s3 container.

Important: The entire S3 cluster must be reconfigured and restarted at the same time.
This requires a few minutes of downtime.

Prerequisites

• TLS private keys (env_tls_key: s3.key) must not be protected by a password.


• You must provide SSL certificate(s) that correctly certify the ansible_host variable provided in the env/{{EnvDir}}/inventory file in the Federation folder containing the S3 cluster configuration.

Certificate Requirements

• Self-signed certificates require a valid CA certificate/bundle to be provided.


• Required x509v3 Key Usage Extensions:
– TLS Web Server Authentication (also known as “serverAuth”).
– TLS Web Client Authentication (also known as “clientAuth”).
• Subject Alternative Name (SAN)
– Must have SAN entries for each server in the runners_metadata group con-
taining the value from global_data_host variable.
– If the inventory specifies ansible_host as a fully qualified name:



md1-cluster1 ansible_host=server1.s3.company.com
md2-cluster1 ansible_host=server2.s3.company.com
[...]

The servers must then be certified with either:


* A single certificate with *.s3.company.com wildcard.
* A single certificate with SAN entries for server1.s3.company.com,
server2.s3.company.com, and so on in the certificate’s Subject Alterna-
tive Name field.
* Per-server certificates, certifying each server’s identity and IP address
from global_data_host.

Distribute Certificates

Distribute the certificates in the same folder and with the same name in all servers.
For example:
• /opt/scality/ssl/s3.cert for the mandatory certificate file.
• /opt/scality/ssl/s3.key for the mandatory key protecting the certificate file.
• /opt/scality/ssl/ca.cert for the optional signing tiers’ CA.

Important: The key file cannot be protected by a passphrase.
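
To confirm that a key file is not passphrase-protected, or to strip an existing passphrase, openssl can be used as sketched below (file names are placeholders):

# Prompts for a passphrase only if the key is encrypted
$ openssl rsa -in s3.key -check -noout

# Write an unprotected copy of a passphrase-protected key
$ openssl rsa -in protected.key -out s3.key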

Configure Paths to Certificate Files

Once these files are present in all S3 connectors, uncomment and set the following vari-
ables:

env_tls_cert: /opt/scality/ssl/s3.cert # Mandatory
env_tls_key: /opt/scality/ssl/s3.key # Mandatory
env_tls_ca: /opt/scality/ssl/ca.cert # Optional unless self-signed certificate

Note: When internal TLS is enabled with these variables, env_tls_cert must point to a
generated certificate that can authenticate clients as well as servers.



Important: When using self-signed certificates, make sure to provide env_tls_ca with a
valid CA certificate.

Configure Vault TLS Protocols/Ciphers

Uncomment and set the TLS protocols to be accepted by the Vault component.

env_vault_ssl_protocols:
- TLSv1
- TLSv1.1
- TLSv1.2

Post-install reconfiguration procedure

1. Go to the s3-offline installer’s federation/ folder.


2. List the hosts to be reconfigured.
$ ../repo/venv/bin/ansible -i env/{{EnvDir}}/inventory --list-hosts all
hosts (5):
md1-cluster1
md2-cluster1
md3-cluster1
md4-cluster1
md5-cluster1

3. Reconfigure each host, one by one. If the load is balanced between hosts, dis-
able the host temporarily from the load balancer configuration.
$ ./ansible-playbook -i env/{{EnvDir}}/inventory run.yml --skip-tags "requirements,run::images" -l md1-cluster1

$ ./ansible-playbook -i env/{{EnvDir}}/inventory run.yml --skip-tags "requirements,run::images" -l md2-cluster1



Configuring S3 Website Endpoints

Warning: env_s3.website_endpoints and env_s3.rest_endpoints must not contain the same domain name. If they do, the S3 API will no longer be accessible.

S3 Connector supports website endpoints that allow direct HTTP access to buckets as
if they were websites.
For example, if a user has a bucket named “foo”, a bucket website request to S3 Connec-
tor is made to http://foo.s3-website.scality.example.com. This is made possible by
the following configuration:

env_s3:
[...]
website_endpoints:
- s3-website.scality.example.com

Declare as many website endpoints as needed.

Warning: The env_s3.website_endpoints and env_s3.rest_endpoints lists are mutually exclusive.

Note: Once the S3 connectors are configured, set up the bucket’s ACL and website
settings by following the Configuring a Bucket for Website Hosting procedure.

Post-install reconfiguration procedure

1. Go to s3-offline installer’s federation/ folder.


2. List the hosts to be reconfigured.
$ ../repo/venv/bin/ansible -i env/{{EnvDir}}/inventory --list-hosts runners_s3

hosts (5):
md1-cluster1
md2-cluster1
md3-cluster1
md4-cluster1
md5-cluster1



3. Reconfigure each S3 service, one host at a time. If the load is balanced between
hosts, disable the host temporarily from the load balancer configuration.
$ ./ansible-playbook -i env/{{EnvDir}}/inventory run.yml --skip-tags "requirements,run::images" -l md1-cluster1 -t s3

$ ./ansible-playbook -i env/{{EnvDir}}/inventory run.yml --skip-tags "requirements,run::images" -l md2-cluster1 -t s3

Configuring Vault

You can customize Vault role caching and configure a custom Vault sproxyd location by
editing federation/env/{{EnvDir}}/group_vars/all.
1. Open federation/env/{{EnvDir}}/group_vars/all.
2. Find env_vault:.
env_vault:
# Disabling cache_roles means that roles will not be cached in a distributed cache
cache_roles: yes
# sproxyd:
# # 'location': name of location defined in location_constraints section
# location: 'us-east-1'

Tip: Set role caching in the Vault component whenever possible.

3. To change the sproxyd location from its default us-east-1, uncomment the # sproxyd: and # location: lines (a sketch of the result follows this list).
4. Enter your custom location in the location line.
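
For illustration, assuming a location named legacy is declared in location_constraints (an assumption, use your own location name), the uncommented block might look like the following sketch:

env_vault:
  cache_roles: yes
  sproxyd:
    # 'location': name of location defined in location_constraints section
    location: 'legacy'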

Post-install reconfiguration procedure

1. Go to the s3-offline installer’s federation/ folder.


2. List the hosts to be reconfigured.
$ ../repo/venv/bin/ansible -i env/{{EnvDir}}/inventory --list-hosts runners_s3

hosts (5):
md1-cluster1
md2-cluster1
md3-cluster1
md4-cluster1
md5-cluster1

3. Reconfigure each S3 service, one host at a time. If the load is balanced between
hosts, disable the host temporarily from the load balancer configuration.
$ ./ansible-playbook -i env/{{EnvDir}}/inventory run.yml --skip-tags "requirements,run::images" -l md1-cluster1 -t vault

$ ./ansible-playbook -i env/{{EnvDir}}/inventory run.yml --skip-tags "requirements,run::images" -l md2-cluster1 -t vault

Configuring the Bit Rot Checker

Configure the Bit Rot Checker

Configure the S3 bit rot checker using Ansible inside the S3 installer’s federation/ folder.
S3 bit rot components are configured in the env_bitrot: section inside the env/
{{EnvDir}}/group_vars/all file.
1. Edit group_vars/all.
2. If the env_bitrot: section is commented out, uncomment it.
3. Set the env_bitrot.enabled variable to yes.

env_bitrot:
enabled: yes

Install the Bit Rot Checker

1. Apply the bit rot configuration and run the ingesters by applying the run.yml Ansi-
ble playbook with the bitrot tag.

$ cd /srv/scality/s3/s3-offline/federation

$ ./ansible-playbook -i env/{{EnvDir}}/inventory run.yml -t bitrot

2. Install the Grafana dashboards.



Configuring Log Retention Policies

The log retention policy for some S3 components can be configured using the following
Ansible variables:

Variable Description
stdout_logfile_maxbytes The maximum size of the active log file
stdout_logfile_backups The number of log archives to keep
stderr_logfile_maxbytes The maximum size of the active error log file
stderr_logfile_backups The number of error log archives to keep

Set those variables for each of the following S3 components:

• env_sproxyd, env_vault, env_s3:
  stdout_logfile_maxbytes: 300 MB
  stdout_logfile_backups: 2
  stderr_logfile_maxbytes: 100 MB
  stderr_logfile_backups: 7
• env_utapi, env_kudzu, env_bucket, env_backbeat, env_cdmi, env_metadata, env_identisee, env_bucket_notifications:
  stdout_logfile_maxbytes: 100 MB
  stdout_logfile_backups: 7
  stderr_logfile_maxbytes: 100 MB
  stderr_logfile_backups: 7
• env_logger:
  stdout_logfile_maxbytes: 3 GB
  stdout_logfile_backups: 1
  stderr_logfile_maxbytes: 100 MB
  stderr_logfile_backups: 7
• env_frontend:
  stdout_logfile_maxbytes: 10M
  stdout_logfile_backups: 10

In the env/{{ EnvDir }}/group_vars/all file, these variables are configured under the
chosen env name heading.
Each env_* prefix modifies the corresponding component’s configuration:
• env_sproxyd modifies the scality-sproxyd component,
• env_vault modifies the scality-vault component,



• etc.
If the file already contains the env_* variable, re-use it instead of declaring a new one.

Listing 5: For example


env_s3:
stdout_logfile_maxbytes: 200MB
stdout_logfile_backups: 10
[...]
env_vault:
stdout_logfile_maxbytes: 400MB
stdout_logfile_backups: 3
[...]
env_sproxyd:
stdout_logfile_maxbytes: 700MB
stdout_logfile_backups: 20
[...]

Post-install reconfiguration procedure

1. Go to the s3-offline installer’s federation/ folder.


2. List the hosts to be reconfigured.
$ ../repo/venv/bin/ansible -i env/{{EnvDir}}/inventory --list-hosts runners_s3

hosts (5):
md1-cluster1
md2-cluster1
md3-cluster1
md4-cluster1
md5-cluster1

3. Reconfigure each S3 service, one host at a time. If the load is balanced between
hosts, disable the host temporarily from the load balancer configuration.
$ ./ansible-playbook -i env/{{EnvDir}}/inventory run.yml --skip-tags "requirements,run::images" -l md1-cluster1 -t s3,vault,sproxyd

$ ./ansible-playbook -i env/{{EnvDir}}/inventory run.yml --skip-tags "requirements,run::images" -l md2-cluster1 -t s3,vault,sproxyd



Sproxyd Configuration Options

For general sproxyd configuration parameters, see the sproxyd section in the RING Drivers
section of the RING Reference.
The following list describes the supported sproxyd configuration options for sproxyd-type
connectors. For each location constraint, each of these options can be specified by
adding the information under the sproxyd: heading.

• chord_cos: Makes sproxydclient go through the chord driver rather than the arc driver. Mutually exclusive with erasure_cos. Defines the number of additional copies to keep in replication.
• lazy_get_enabled: For geo-stretched RINGs with significant inter-site latency. It affects only the replication (chord) driver in sproxyd. Set lazy_get_enabled to 0 to get fast mode. The default value is 1 (no fast get), that is lazy_get_enabled: true. See the RING Reference for more information.
• data_parts: Maximum number of ARC data chunks (k).
• coding_parts: Number of ARC coding chunks (m).
• erasure_cos: Controls the number of copies for data that is too small (< 60 KB) to be stored in erasure coding (that is, in ARC). Mutually exclusive with chord_cos.
• min_redundant_parts_put_ok: Controls the number of coding keys that sproxyd writes before returning OK to S3 on a PUT operation. The default -1 means all coding parts are written before a 200 OK is returned. The safest value is -1. You do not need to change this value except in a multi-site environment.

Important: Replication with chord_cos is only supported with RING and S3C version >=
7.4.6. Upgrade the RING if the version is < 7.4.6.

Note: Defining chord_cos causes S3 traffic to be stored in replication (chord) with chord_cos additional copies (plus the original data copy). The metadata backup is still stored via erasure coding.



Tune sproxyd Processes

Optional tuning parameters are provided to change sproxyd performance from group_vars/all. These variables are shared among all location constraints. They cannot be configured per location constraint.

Warning: Do not adjust these parameters unless instructed by Scality.

Default values:
• split_memory_limit: 67108864
• split_n_get_workers: 20
• split_n_put_workers: 20
• split_n_io_workers: 20
• n_workers: 500
• n_responders: 500
In the env/{{ EnvDir }}/group_vars/all file, these variables are configured under the
env_sproxyd: heading.

Listing 6: For Example


env_sproxyd:
split_memory_limit: 671088640
n_workers: 200

Sproxyd Additional Parameters

• sproxyd_traces_enabled: Enables tracing for troubleshooting certain performance issues. See Playbook for Sproxyd Tracing.
• sproxyd_traces_log_dir: Indicates the directory where tracing files are written. A sketch of both settings follows this list.
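
A sketch of both settings in group_vars/all; the directory shown is a hypothetical example, not a documented default:

sproxyd_traces_enabled: true
sproxyd_traces_log_dir: '/var/log/sproxyd-traces'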

Post-install reconfiguration procedure

1. Go to the s3-offline installer’s federation/ folder.


2. List the hosts to be reconfigured.
$ ../repo/venv/bin/ansible -i env/${ENV_DIR}/inventory --list-hosts runners_s3

hosts (5):
md1-cluster1
md2-cluster1
md3-cluster1
md4-cluster1
md5-cluster1

3. Reconfigure each S3 service, one host at a time. If the load is balanced between
hosts, disable the host temporarily from the load balancer configuration.
$ ./ansible-playbook -i env/${ENV_DIR}/inventory run.yml --skip-tags "requirements,run::images" -l md1-cluster1 -t sproxyd

Configuring the Redis Components

Redis is used as an in-memory database backend for various S3C components, such as
UTAPI, Vault, and the NFS servers.

Set Up Redis Authentication

Activate the Redis authentication mechanism with the env_redis_password variable:

env_redis_password: 'myuniqueandultrasecretpassword'

Configure the Redis Cluster Name

Warning: Do not change the Redis cluster name after the cluster has been initialized.

To configure the cluster name, set the env_redis_sentinel_cluster_name variable:

env_redis_sentinel_cluster_name: cluster-1

The default value is scality-s3.

Post-install reconfiguration procedure

Modifying the Redis configuration triggers the reconfiguration and restarting of all
Redis components (client, sentinel and server) at once. This requires a 5-minute ser-
vice interruption.



1. Go to the s3-offline installer’s federation/ folder.
2. Reconfigure all hosts at once.
$ ./ansible-playbook -i env/{{EnvDir}}/inventory run.yml --skip-tags "requirements,run::images"

Continue with Configuring Additional S3C Features or launch the Offline Installation.

Enabling Short Version IDs

When enabled, the short version ID feature causes the S3 Connector to return 32-character
version IDs for objects. When disabled, the version IDs returned are 42 characters or
longer.

Note: Existing objects always keep their version ID in the original format used when
they were created.

By default, short version IDs are set to False. To enable short version IDs, set the follow-
ing configurable in the env/s3config/group_vars/all file stored in the /srv/scality/
s3/s3-offline/federation folder of the supervisor.
1. Set the feature to True in the env/s3config/group_vars/all file.

env_short_version_id: True

2. Change directories to /srv/scality/s3/s3-offline/federation.

$ cd /srv/scality/s3/s3-offline/federation

3. Run the ansible-playbook command.

$ ./ansible-playbook -i env/s3config/inventory run.yml -t s3

Configuring Additional S3C Features

Scality’s S3 Connector comes with various additional features that can be configured
and activated depending on the user’s needs.
Modify or set the following variables in the federation/env/${ENV_DIR}/group_vars/
all file, to enable or configure optional S3 features.
When you have finished editing group_vars/all, upgrade the cluster as described in Up-
grading towards 7.10.8.



Configuring Utilization Reporting

S3 Connector provides an API for resource utilization tracking and metrics reporting.
Starting with S3C connector version 7.7.0 and onwards, a second version of the API is
offered, moving database duties from Redis to Warp 10. Because of the architectural
change introduced in version 7.7.0, only one version of UTAPI can be enabled.

Important: UTAPI V2 supports bucket and account-level metrics, but does not support
service-level metrics in its current implementation.

Utilization API V1 (UTAPI V1)

Enable UTAPI V1

In the env_s3 section, set enable_utilization_api to true.

env_s3:
[...]
enable_utilization_api: true

Configure UTAPI V1

Set variables in the env_s3 section as follows. All configuration items below are optional.
• Choose one or more levels of metrics: valid values are account, bucket, user and
service.

env_s3:
utilizaton_api_metrics:
- buckets
- accounts

When UTAPI is enabled and this configuration is not set, S3 pushes metrics for all levels
by default. In the example above, bucket and account metrics are collected, but
user and service metrics are not.
• Expire bucket metrics following bucket deletion.

env_s3:
utapi_expire_metrics: true
utapi_expire_metrics_ttl: 0

The TTL sets the duration (in seconds) for metrics to be retained after bucket deletion.
When set to 0 and utapi_expire_metrics is set to true, metrics expire immediately.

Note: By default, S3 does not expire any metrics, even if S3 buckets are deleted.

Tip: In deployments where buckets are expected to be created and deleted fre-
quently, set utapi_expire_metrics to true to avoid excessive memory consump-
tion.

• Modify the utapi-replay frequency to retry failed metric updates.


env_s3:
utapi_replay_schedule: "*/5 * * * *"

Note: By default, utapi-replay is run every five minutes. The utapi-replay fea-
ture caches metrics if the Redis service is unavailable. This feature brings stability
when Redis is unavailable to receive requests to store the metrics.

Post-install reconfiguration procedure

1. Go to the federation/ directory of the s3-offline installer.


2. List the hosts to be reconfigured.
$ ENV_DIR=s3config
$ ../repo/venv/bin/ansible -i env/${ENV_DIR}/inventory --list-hosts runners_s3

hosts (5):
md1-cluster1
md2-cluster1
md3-cluster1
md4-cluster1
md5-cluster1

3. Reconfigure each UTAPI service, one host at a time. If the load is balanced
between hosts, first temporarily disable the host from the load balancer config-
uration.
$ ./ansible-playbook -i env/${ENV_DIR}/inventory run.yml --skip-tags "requirements,run::images" -l md1-cluster1 -t s3

$ ./ansible-playbook -i env/${ENV_DIR}/inventory run.yml --skip-tags "requirements,run::images" -l md2-cluster1 -t s3

$ etc



Utilization API V2 (UTAPI V2)

Enable UTAPI V2

Edit the following configuration parameter in group_vars/all:

# Configuration block for Utapi V2


#
# This configuration block enables Utapi V2 which uses Warp 10 as it's database.
# The V2 architecture no longer uses Redis Sentinel to store metrics long term,
# it only uses redis-local for interim cache that gets written to Warp10
# env_utapi:
# enabled: False
# service_port: 8100
# ingestion_schedule: "*/5 * * * * *"
# checkpoint_schedule: "*/30 * * * * *"
# snapshot_schedule: "5 0 * * * *"
# max_snapshot_size: "P6H"
# repair_schedule: "0 */5 * * * *"
# reindex_schedule: "0 0 0 * * Sun"
# disk_usage_schedule: "0 */5 * * * *"
# expiration_schedule: "15 * * * *"
# expiration_enabled: True
# metric_retention_days: 45
# hard_limit: 100G
# warp10_maxops: 1000000000
# warp10_maxfetch: 1000000000
# snapshot_enabled: True
# sensision_enabled: False
# warp10_java_heap_max: '4g'
# warp10_java_heap_initial: '1g'
# warp10_java_extra_opts: ''

• env_utapi: Uncomment this parameter and change the value of enabled to True.
• service_port: Leave this parameter at 8100 unless you have a port conflict.
• ingestion_schedule, checkpoint_schedule, snapshot_schedule, repair_schedule, disk_usage_schedule, reindex_schedule: These are crontab-style parameters.

  Important: The reindex_schedule parameter controls when the reindex background task runs. It is recommended to schedule these tasks during periods with low traffic and minimal processor demands.

• expiration_enabled: This parameter is enabled (True) by default.
• metric_retention_days: This parameter controls the retention period for UTAPI metrics.
• expiration_schedule: This parameter controls the time at which UTAPI polls for data that is metric_retention_days old. It operates on a crontab-like schedule.
• hard_limit: This parameter sets a byte threshold above which Warp 10 refuses writes. When the size falls below this limit, Warp 10 automatically unlocks and begins accepting writes. To enable this parameter, set the limit to an appropriate value, expressed as a binary-dimensioned whole number (B, K, G, T, P). To disable this parameter, set the value to null. This limit is checked on the schedule set by disk_usage_schedule.
• warp10_*: The parameters whose keys start with warp10_ are for performance and tuning purposes only.

  Important: Do not edit these parameters without Scality's confirmation.

• sensision_enabled: This parameter is used for internal testing. Do not edit it.
• snapshot_enabled: This parameter is for data migration during the UTAPI upgrade.
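
Putting this together, a minimal sketch of the uncommented block enabling UTAPI V2 might look as follows; the values shown are simply the defaults from the commented listing above and should be adjusted to the environment:

env_utapi:
  enabled: True
  service_port: 8100
  metric_retention_days: 45
  hard_limit: 100G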

Configure Reindex Metrics

Trigger the Reindex Task

You can manually or automatically trigger the reindex task to ensure the accuracy of the
UTAPIv2 metrics. This is useful for billing purposes. The reindex task scans all buckets
and validates the storage consumption and the object count.

Note: Reindex must run on *-cluster1, not on a single machine.

The reindex task is scheduled by default to run every Sunday. To change this scheduling,
modify the reindex_schedule parameter in group_vars/all.

env_utapi:
reindex_schedule: "0 0 0 * * Sun"

To manually trigger the reindex task, use the following command on each *-cluster1
node.

Warning: The reindex task consumes between 5% and 20% of the resources on stateful
nodes. Be sure to trigger it when there is low traffic on the platform.

On EL7 systems

$ docker exec -u root -i scality-utapi yarn start_v2:task:reindex -- --now

On EL8 systems

$ ctrctl exec -tty --user 0 scality-utapi yarn start_v2:task:reindex -- --now

Set Log Levels

The UTAPI V2 reindex task logs its operations. These logs are named reindex-<number>.
log and are found in /var/log/s3/scality-utapi/logs/ on stateful servers. For exam-
ple /var/log/s3/scality-utapi/logs/reindex-0.log.
By default, logging tracks info-level events. You can configure logging to record the fol-
lowing levels of detail (from least to most verbose):
• error
• warn
• info
• debug
• trace
You can configure the log level with a setting in group_vars/all or by triggering a reindex
task and passing the log level as an argument.
• Using the group_vars/all setting:
Add the following UTAPI_LOG_LEVEL parameter and set the desired logging level.

env_utapi_extraenv2: "UTAPI_LOG_LEVEL=debug"

• Triggering the reindex task:


From the command line, run the following command on all servers in metadata
cluster1 from the federation/ directory:

On EL7 systems

ENV_DIR=s3config
../repo/venv/bin/ansible -i env/${ENV_DIR}/inventory -b -m shell -a "docker exec -u root -i scality-utapi yarn start_v2:task:reindex -- --now" *cluster1

On EL8 systems

ENV_DIR=s3config
../repo/venv/bin/ansible -i env/${ENV_DIR}/inventory -b -m shell -a "ctrctl exec --tty --user 0 scality-utapi yarn start_v2:task:reindex -- --now" *cluster1

The “info” log level (default) reports the following messages:


• started reindex task at the start of reindexing.
• started bucket reindex at the beginning of each bucket’s reindex operation.
• finished bucket reindex when a bucket reindex is complete.
• discrepancy detected in metrics, writing corrective record on detection
of a correctable error.
• failed bucket reindex, associated account skipped on detection of a non-
correctable error.
• finished reindex task on completion.
The program executes these in the following order:
1. Once at program start.

message: 'started reindex task'
level: 'info'

2. Per bucket.



message: 'started bucket reindex'
level: 'info'
bucket: '<bucket_name>'

...

message: 'finished bucket reindex',
level: 'info'
bucket: '<bucket_name>'

3. As needed.

message: 'discrepancy detected in metrics, writing corrective record',
level: 'info'
objectDelta: <number_of_objects>
sizeDelta: <number_of_bytes>

With one of:

bck: '<bucket_name>'
acc: '<account_canonical_id>'

...

message: 'failed bucket reindex, associated account skipped',
level: 'error'
bck: '<bucket_name>'

4. Once at program end.

message: 'finished reindex task',
level: 'info'

Log entries consist of the following possible fields:



Field Description
message The event being reported
level The log level
name Utapi
time Unix epoch time
module Reindex task
bucket The bucket name
acc The account canonical ID
bck The bucket name
objectDelta Discrepancy in object count
sizeDelta Size discrepancy (in bytes)
hostname Host name
pid Process ID
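
For illustration only, a hypothetical complete entry for a per-bucket start message could combine these fields as follows (all values are placeholders, not actual output):

message: 'started bucket reindex'
level: 'info'
name: 'Utapi'
module: '<module_name>'
bucket: '<bucket_name>'
time: <unix_epoch_time>
hostname: '<host_name>'
pid: <process_id>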

Stop the Reindex Task

The reindex task can be lengthy and processor-intensive. To stop a reindex task in
progress, restart scality-utapi from the metadata cluster.

On EL7 systems

ENV_DIR=s3config
../repo/venv/bin/ansible -i env/${ENV_DIR}/inventory -b -m shell -a "docker restart scality-utapi" *-cluster1

On EL8 systems

ENV_DIR=s3config
../repo/venv/bin/ansible -i env/${ENV_DIR}/inventory -b -m shell -a "ctrctl restart scality-utapi" *-cluster1

Exclude a Bucket’s Metrics

You can exclude a list of buckets from any UTAPI metric collection.
1. Edit the env_utapi section in env/${ENV_DIR}/group_vars/all by adding the
buckets you want to exclude.



env_utapi:
[...]
event_filter_enabled: True
event_filter:
deny:
bucket: [demo1, demo2, demo3]

In this example, the demo1, demo2, and demo3 buckets are excluded from UTAPI met-
ric collection.
2. Run the following command to apply the changes.

$ ansible-playbook -i env/s3config/inventory run.yml -t s3,utapi --skip-tags requirements,run::images,cleanup

Enable Service User

The service user is created under the scality-internal account, and can query all met-
rics regardless of who owns them. It is disabled by default.
1. Enable the service user under env_utapi in group_vars/all.

env_utapi:
service_user_enabled: True

2. List the hosts to be reconfigured.

../repo/venv/bin/ansible -i env/EnvDir/inventory --list-hosts runners_s3

3. Reconfigure each S3 frontend service, one host at a time. If the load is balanced
between hosts, first disable the host temporarily from the load balancer configura-
tion.

./ansible-playbook -i env/EnvDir/inventory run.yml --skip-tags requirements,run::images -l Host -t s3,utapi

4. Find the access and secret keys in env/s3config/vault/service-utapi-user-keys.json.
Refer to Service Utilization API for more information on UTAPI metrics.



Upgrade UTAPI V1 to V2

You can choose between UTAPI V1 and UTAPI V2 during an upgrade. UTAPI V1 is retained
by default and continues to operate after an upgrade if no reconfigurations are entered.
1. Enable UTAPI V2 by changing env_utapi.enabled to true and adjusting settings
as described in Enable UTAPI V2.
2. Disable UTAPI V1 by setting env_s3.enable_utilization_api to false.

Note: If both are set, a pre-check assertion is raised.

In addition to modifying the env/s3config/group_vars/all configuration file, you must


migrate UTAPI V1 data to UTAPI V2. Data migration scripts are provided for this purpose.
To migrate indexes without bringing a complete history forward and ensure UTAPI v2
provides correct metrics, use the Simple Data Migration procedure.
To migrate bringing a complete history forward and maintaining UTAPI v2 interoperability
with historic data, use the Full Data Migration procedure.
For either migration, first configure env_utapi and change env_s3.
enable_utilization_api to false in group_vars/all.

Note: Old UTAPI metrics persist in Redis until the migration is complete.

Simple Data Migration

1. Perform a rolling upgrade on stateful nodes as described in Upgrade Stateful Components. On a live setup, each node will receive information in the format it is expecting until all are upgraded to V2.
2. Upgrade the stateless components as described in Upgrade Stateless Components.
3. Trigger a reindex for UTAPI V2.



On EL7 systems

$ docker exec -it -e 'ENABLE_UTAPI_V2=1' scality-utapi node bin/reindex.js --now

On EL8 systems

$ ctrctl exec --tty --user 0 scality-utapi 'ENABLE_UTAPI_V2=1 node bin/reindex.js --now'

Full Data Migration

1. Trigger a reindex on the running UTAPI V1 instance.


From a node with CloudServer running on it, enter:

On EL7 systems

$ docker exec -it scality-s3 python3.4 /home/scality/s3/node_modules/utapi/lib/reindex/s3_bucketd.py

On EL8 systems

$ ctrctl exec --tty --user 0 scality-s3 python3.4 /home/scality/s3/node_modules/utapi/lib/reindex/s3_bucketd.py

2. In env/s3config/group_vars/all, set env_utapi.snapshot_enabled to False.


3. Perform a rolling upgrade on stateful nodes as described in Upgrade Stateful Compo-
nents. On a live setup, each node receives information in the format it is expecting
until all are upgraded to V2.
4. Upgrade the stateless components as described in Upgrade Stateless Components.
5. Run the following command to start the migration.



On EL7 systems

$ docker exec -u root -it scality-utapi yarn start_v2:task:migrate

On EL8 systems

$ ctrctl exec --tty --user 0 scality-utapi yarn start_v2:task:migrate

This yarn command is idempotent. If it fails, run it again.


6. In env/s3config/group_vars/all, set env_utapi.snapshot_enabled to True.
7. Perform another rolling upgrade on the stateful nodes.
8. Trigger a reindex for UTAPI V2.

On EL7 systems

$ docker exec -it -e 'ENABLE_UTAPI_V2=1' scality-utapi node bin/reindex.js --now

On EL8 systems

$ ctrctl exec --tty --user 0 scality-utapi 'ENABLE_UTAPI_V2=1 node bin/reindex.js --now'

Elasticsearch Configuration

Send Unfiltered Logs to an Elasticsearch Cluster

The env_log_centralization variable enables sending complete logs to the designated Elasticsearch cluster:

env_log_centralization:
endpoints:
- "10.100.132.51:9200"
- "10.100.132.53:9200"
- "10.100.132.52:9200"



Note: The Elasticsearch cluster to which log centralization can send logs can be of
versions 2.3, 2.4, or 5.

If env_log_centralization.endpoints is defined, a host must also be defined in the [loggers] inventory group to have a running curator process that can delete old indexes. If env_log_centralization is defined and env_log_centralization.endpoints is not, Federation:
1. Spawns an embedded Elasticsearch cluster in the hosts belonging to the [loggers]
group in the federation/env/${ENV_DIR}/inventory file, inside containers called
scality-elasticsearch.
2. Sends unfiltered logs to this Elasticsearch cluster.
Use env_log_centralization.components to choose the components from which to gather
and centralize logs. For example, under the following configuration:
env_log_centralization:
components: ['bucketd','vault']

Federation reads logs only from the scality-bucketd and scality-vault containers. The
default value is to gather logs from all containers:
components: ['bucketd', 'metadata-bucket', 'metadata-vault', 's3', 'vault']

Authentication and SSL configuration are the same as for env_elasticsearch:


env_elasticsearch:
user: USER
password: PASSWORD
ssl: yes
cacert: cacert-file.crt
endpoints: [...]

Configuring Elasticsearch Instances Hosted in Loggers Group

When env_log_centralization.endpoints is not defined, Federation starts an Elasticsearch cluster in the hosts listed in the [loggers] group of the S3 cluster's inventory.
Configure this cluster’s Elasticsearch nodes using the env_logger variable.
Listening ports can be tuned with:
env_logger:
elasticsearch_port: 19200
elasticsearch_cluster_port: 19300



RAM consumption (in GB) can be tuned with:

env_logger:
es_heap_size_gb: 8

The env_logger.elasticsearch_data_path variable designates where Elasticsearch indexes shall be stored. If this variable is not set, Federation uses the env_host_data value.

env_logger:
elasticsearch_data_path: /some/directory/in/the/host

Configure Log Centralization Index Retention Policy

Elasticsearch index retention is handled by the curator process, run every night within
the scality-elk container. Fine-tune the maximum size of data hosted within the log cen-
tralization Elasticsearch cluster with:

env_logger:
max_index_size_gb: 250

Configure Kibana for Log Centralization Visualization

Set the port Kibana listens on:

env_logger:
kibana_port: 15601

Post-install reconfiguration procedure

1. Go to the federation/ folder of the s3-offline installer.


2. Reconfigure the logger:
$ ENV_DIR=s3config
$ ./ansible-playbook -i env/${ENV_DIR}/inventory run.yml --skip-tags requirements -t logger



Enabling Cross-Region Replication (CRR) Between Two Sites

Warning: When configuring site level replication (echo mode), do not create any
accounts or buckets until the deployment is complete on both sites.

Important: This feature does not support the inter-component TLS protocol. Do not
enable it on deployments with TLS.

Tip: Both sites can be the replication source in a bi-directional replication schema.

Note: Cross Region Replication (CRR) supports buckets with object lock enabled in
bucket level replication mode. Site level replication (echo mode) does not support object
lock.

Important: Make sure the CRR target site is up and accessible before you start config-
uring CRR.

CRR Configuration on the Source S3 Cluster

1. Enable Cross-Region Replication (CRR) by activating Backbeat in file env/${ENV_DIR}/group_vars/all:

env_enable_backbeat: yes

2. In a bi-directional replication schema, choose a different replication group ID on each site.

env_metadata_replication_group_id: RG001

3. To set up SSL client certificate verification (optional) use the following variables:
• env_backbeat_ssl_cert: Name of the certificate PEM file
• env_backbeat_ssl_ca_bundle: Name of the CA certificates PEM file
• env_backbeat_ssl_key: Name of the private key PEM file



4. You can also set up internal Backbeat communication TCP ports. This is required
if another ZooKeeper cluster is running on the same servers:
• env_backbeat_quorum_port
• env_backbeat_quorum_follower_port
• env_backbeat_quorum_election_port

Note: This setup is only required if SVSD is installed.

5. Tune replication according to your needs and the inter-site network bandwidth. A combined sketch of these settings follows this list.
• env_backbeat_partition_count configures the overall site replication paral-
lelization. While the default value (5) fits for a majority of use cases, set it
to:
– Between 10 and 40 for 2 Gb/sec inter-site link bandwidth.
– Between 1 and 5 for 10 Mb/sec to 100 Mb/sec inter-site link bandwidth.

Note: Federation only applies this setting if Backbeat has not been installed
yet.

• env_backbeat_processor_concurrency configures the number of replications processed in parallel for each aforementioned partition. Default: 10.
• env_backbeat_mpu_parts_concurrency configures how many parts from a
multiple-part object are processed in parallel. Default: 10.
• env_backbeat_processor_retry_timeout_s controls (in seconds) the time
window during which Backbeat retries a replication, in case of failure. Default:
300.
• env_backbeat_replication_failure_expiry_time_s controls the failed CRR
log retention time. Default: 86400 seconds (24 hours). This value can be
increased in group_vars/all to retain the logs for a longer time period.
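
A combined sketch of these tuning variables in group_vars/all; the partition count shown is only an illustrative choice for a roughly 2 Gb/sec inter-site link, the other values are the documented defaults:

env_backbeat_partition_count: 20
env_backbeat_processor_concurrency: 10
env_backbeat_mpu_parts_concurrency: 10
env_backbeat_processor_retry_timeout_s: 300
env_backbeat_replication_failure_expiry_time_s: 86400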
6. Declare target site information in env_replication_endpoints:

env_replication_endpoints:
- site: {{targetSiteName}}
default: true
servers:
- {{ipAddress1}}:9080
- {{ipAddress2}}:9080
- {{ipAddress3}}:9080
- {{ipAddress4}}:9080
- {{ipAddress5}}:9080
echo: {{true/false}}

• Where {{targetSiteName}} is the name of the target/remote site to which CRR will send S3 objects, and
• The {{ipAddressX}} settings identify target cluster addresses. Port 9080 is the value of env_backbeat_replication_port (default 9080), defined at the target site.

Note: Multiple targets can be configured, but a given source bucket can only be
replicated to a single target bucket.

7. Set the echo variable to true or false, depending on the kind of replication needed:
• true for site-level replication - Replicate all buckets from all accounts.
• false for bucket-level replication - Replicate only selected buckets.

Post-install reconfiguration procedure

1. Go to the s3-offline installer’s federation/ folder.


2. Backup the env/${ENV_DIR} directory.
ENV_DIR=s3config
cp -r env/${ENV_DIR} env/${ENV_DIR}.old

3. If echo was set to true, copy the env/${ENV_DIR}/vault/admin-clientprofile/admin1.json file from the target site into env/${ENV_DIR}/vault/admin-clientprofile/.
4. Install the Backbeat component.
$ ./ansible-playbook -i env/${ENV_DIR}/inventory run.yml -t backbeat,logger

5. List stateless hosts.


$ ../repo/venv/bin/ansible -i env/${ENV_DIR}/inventory --list-hosts runners_s3

hosts (7):
md1-cluster1
md2-cluster1
md3-cluster1
md4-cluster1
md5-cluster1
stateless01
stateless02
6. For each listed host, reconfigure s3. If the host is behind a load balancer, deac-
tivate the server before reconfiguring it.
$ ./ansible-playbook -i env/${ENV_DIR}/inventory run.yml --skip-tags "requirements,run::images" -t s3,vault --limit stateless01

7. If env_metadata_replication_group_id was modified, reconfigure metadata.


List stateful hosts.
$ ../repo/venv/bin/ansible -i env/${ENV_DIR}/inventory --list-hosts runners_metadata

hosts (5):
md1-cluster1
md2-cluster1
md3-cluster1
md4-cluster1
md5-cluster1

8. Reconfigure each listed server, one server at a time.


$ ./ansible-playbook -i env/${ENV_DIR}/inventory run.yml --skip-tags "requirements,run::images" -t metadata --limit md1-cluster1

9. Run the kafka-rebalance tooling playbook to ensure all partitions are rack-
aware.
$ ./ansible-playbook -i env/${ENV_DIR}/inventory tooling-playbooks/kafka-rebalance.yml

Note: If echo was set to false, follow the procedure in the “Enabling a Bucket’s Replica-
tion” section of S3 Connector Operation to enable a bucket’s replication.

CRR Configuration on the Target S3 Cluster (Optional)

1. Define the env_backbeat_replication_port variable:


env_backbeat_replication_port: 9080

2. Set up encrypted inter-site communication with the following variables:


• env_backbeat_use_ssl: Set to true to activate encrypted inter-site communi-
cation
• env_backbeat_ssl_cert: Name of the PEM certificate file
• env_backbeat_ssl_ca_bundle: Name of the CA’s PEM certificate file
• env_backbeat_ssl_key: Name of the PEM private key file



3. Define client SSL authentication. When set, only the source S3 cluster can connect to the target S3 cluster (a combined sketch of steps 1 to 3 follows this list):
• env_backbeat_replication_verify_client: set to true to enable client SSL
certificate verification
• env_backbeat_ssl_client_cert: Path to the client PEM certificate file to be
verified. Set up the same certificate on the source S3 cluster.
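
A combined sketch of the target-side variables described in steps 1 to 3; the certificate and key file names are placeholders:

env_backbeat_replication_port: 9080
env_backbeat_use_ssl: true
env_backbeat_ssl_cert: backbeat.crt
env_backbeat_ssl_ca_bundle: ca-bundle.crt
env_backbeat_ssl_key: backbeat.key
env_backbeat_replication_verify_client: true
env_backbeat_ssl_client_cert: client.crt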

Post-install reconfiguration procedure

1. Go to the federation/ folder of the s3-offline installer.


2. List stateless hosts.
$ ENV_DIR=s3config
$ ../repo/venv/bin/ansible -i env/${ENV_DIR}/inventory --list-hosts runners_s3

hosts (7):
md1-cluster1
md2-cluster1
md3-cluster1
md4-cluster1
md5-cluster1
stateless01
stateless02

3. For each listed host, reconfigure s3. If the host is behind a load balancer, deactivate the server before reconfiguring it.
$ ./ansible-playbook -i env/${ENV_DIR}/inventory run.yml --skip-tags "requirements,run::images" -t s3 --limit stateless01

$ ./ansible-playbook -i env/${ENV_DIR}/inventory run.yml --skip-tags "requirements,run::images" -t s3 --limit stateless02

$ etc.

4. If env_metadata_replication_group_id was modified, reconfigure metadata. List stateful hosts.
$ ../repo/venv/bin/ansible -i env/${ENV_DIR}/inventory --list-hosts runners_metadata

hosts (5):
md1-cluster1
md2-cluster1
md3-cluster1
md4-cluster1
md5-cluster1

5. Reconfigure each listed server, one server at a time.
$ ./ansible-playbook -i env/${ENV_DIR}/inventory run.yml --skip-tags "requirements,run::images" -t metadata --limit md1-cluster1

6. Run the kafka-rebalance tooling playbook to ensure all partitions are rack-aware.
$ ./ansible-playbook -i env/${ENV_DIR}/inventory tooling-playbooks/kafka-rebalance.yml

Enabling Lifecycle Expiration Policies

Important: This feature does not support enabling internal encryption for the cluster.

Enable Lifecycle Expiration Policies by Activating Backbeat

Check that env_enable_backbeat: is set to “yes” in env/s3config/group_vars/all.

env_enable_backbeat: yes

Scale Lifecycle Expiration

To apply any of the following configurables, change the values in env/s3config/group_vars/all and run the post-install reconfiguration procedure.

Change Lifecycle Conductor Concurrency

env_backbeat_lifecycle_conductor_concurrency
Default: 1000
This configurable controls the number of buckets that are processed concurrently to
check if they have a Lifecycle Expiration policy set.



Change Lifecycle Bucket Processor Concurrency

env_backbeat_lifecycle_bucket_processor_concurrency
Default: 1
This configurable controls the number of Lifecycle Bucket Processors that concurrently
process buckets with Lifecycle Policies enabled and identify the objects that match the
Lifecycle policy rules. Objects that are eligible to be lifecycled are queued in scality-
backbeat-queue (Kafka) to be processed by the Lifecycle Object processor.

Change Lifecycle Object Processor Concurrency

env_backbeat_lifecycle_object_processor_concurrency
Default: 20
This configurable controls the number of object processors which concurrently do the
job of expiring objects that were identified and queued in scality-backbeat-queue (Kafka)
by the Lifecycle Bucket Processor.
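
As an illustration, the concurrency settings described here might be tuned together in env/s3config/group_vars/all. The values below are examples only, not sizing recommendations:

env_backbeat_lifecycle_conductor_concurrency: 2000
env_backbeat_lifecycle_bucket_processor_concurrency: 2
env_backbeat_lifecycle_object_processor_concurrency: 40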

Increase Partitions in scality-backbeat-queue (Kafka)

env_backbeat_lifecycle_partition_count
Default: 5
This configurable increases the number of partitions in scality-backbeat-queue (Kafka) to raise the rate at which objects are lifecycled. Before changing it, analyze the deployment to confirm that additional partitions are actually needed to reach the desired lifecycle throughput.

Warning: The number of partitions for a topic in scality-backbeat-queue (Kafka) cannot be decreased once it is increased. Increasing the number of partitions can impact system resources (CPU, RAM, network I/O, etc.) and the quality of service of inbound S3 traffic. Contact Scality Support before increasing the number of partitions.

Post-install reconfiguration procedure

1. Go to the s3-offline installer’s federation/ folder.


2. Change the configurables in env/s3config/group_vars/all.
3. Apply the configurables to Backbeat component.



$ ENV_DIR=s3config
$ ./ansible-playbook -i env/${ENV_DIR}/inventory run.yml -t backbeat,logger

Setting Up Server-Side Bucket Encryption

To encrypt S3 buckets at the S3 Connector level, you can use any of these Key Management Systems (KMS) to manage keys through the NAE protocol:
• Gemalto SafeNet KeySecure
• KMIP HashiCorp Vault
• HyTrust KMIP Server

Important:
• To set up Gemalto SafeNet KeySecure, refer to the official Scality documentation.
• To set up KMIP HashiCorp Vault or HyTrust KMIP Server, contact Scality Technical Services.

When an encrypted bucket is created, the KMS generates an encryption key and sends it to the S3 Connector to encrypt and decrypt the objects in that bucket. A key must be generated for every created bucket.

KeySecure Requirements

Installing KeySecure KMS requires:


• A single Gemalto KeySecure server or a cluster, supporting the NAE protocol
• Port 9000 open
• Either admin credentials for KeySecure or S3 credentials for a KeySecure user that
is part of the Key Users group.

Note: For more information on configuring KeySecure, contact Scality Support.



Install KeySecure

To configure KMS:
1. Log in as root on the supervisor.
2. Go to the federation folder.

cd /srv/scality/s3/s3-offline/federation

3. Set the ENV_DIR environment variable to point to the folder containing the cluster’s
configuration. This must be a directory stored in the env/ directory.

ENV_DIR=s3config

4. Create the kms/ configuration directory.

# mkdir env/${ENV_DIR}/kms

5. Copy Gemalto’s ProtectAppICAPI.properties file to the env/${ENV_DIR}/kms/ directory.
6. To enable TLS, update the following parameters in ProtectAppICAPI.properties:
• NAE_IP= (Enter the IP address for the Network Attached Encryption server.)
• Protocol= (Enter either ssl or tcp.)
• CA_File= (Enter the file name of the certificate authority.)
• Cert_File= (Enter the file name of the client certificate.)
• Key_File= (Enter the file name of the private key associated with the client certificate specified in Cert_File.)
7. Copy the CA, certificate and key files, corresponding to the CA_File, Cert_File and
Key_File settings respectively, to the env/${ENV_DIR}/kms/ directory.
8. Edit the env/${ENV_DIR}/group_vars/all file.
9. Uncomment the env_s3.kms part of the configuration and specify the KMS user-
name and password and the name of the files to be copied from the env/
${ENV_DIR}/kms/ directory.

env_s3:
[...]
kms:
username: scality-s3
password: Scality123!
files:
- Ca_File.cert
- Cert_File.cert
- Key_File.secret

Post-install reconfiguration procedure

1. Go to the federation/ folder of the s3-offline installer.


2. List stateless hosts.
$ ENV_DIR=s3config
$ ../repo/venv/bin/ansible -i env/${ENV_DIR}/inventory --list-hosts runners_s3

hosts (7):
md1-cluster1
md2-cluster1
md3-cluster1
md4-cluster1
md5-cluster1
stateless01
stateless02

3. For each listed host, reconfigure s3. If the host is behind a load balancer, deactivate the server before reconfiguring it.
$ ./ansible-playbook -i env/${ENV_DIR}/inventory run.yml --skip-tags requirements -t s3 --limit stateless01
$ ./ansible-playbook -i env/${ENV_DIR}/inventory run.yml --skip-tags requirements -t s3 --limit stateless02
$ etc.

Set Up Encryption with a KMIP Appliance

KMIP is a standard protocol for accessing third-party key management systems. Scal-
ity’s KMIP driver for S3 Connector requires a version 1.2 (or later) KMIP server. The server
must also support the following KMIP profiles:
• The Baseline Server profile (with TLS transport encoding and TTLV message encoding)
• The Symmetric Key Lifecycle Server profile
• The Basic Cryptographic Server profile
To configure KMIP:



1. Generate client and server certificates.
2. Log in as root on the supervisor.
3. Go to the federation folder.

cd /srv/scality/s3/s3-offline/federation

4. Set the ENV_DIR environment variable to point to the folder containing the cluster’s
configuration. This must be a directory stored in the env/ directory.

ENV_DIR=s3config

5. Create the kmip/ configuration directory.

# mkdir env/${ENV_DIR}/kmip

6. Upload the certificates and key files into env/${ENV_DIR}/kmip.


7. Edit the env/${ENV_DIR}/group_vars/all file.
8. Uncomment the env_s3.kmip part of the configuration.

env_s3:
[...]
kmip:
port: 5696
host: <KMIP server address>
compoundCreate: false
bucketAttributeName: x-zenko-bucket
pipelineDepth: 8
key: <client key file name>
cert: <client cert file name>
ca:
- <server cert file name>
- <ca cert file name>

The following fields make up the KMIP configuration. Adapt them according to your needs.

port
Default: 5696
TCP port the KMIP server listens to.

host
Default: undefined
The KMIP server’s host name or IP address.

key, cert, ca
Default: none
Federation must find the required key/cert/ca files in the kmip directory in the deployment inventory folder (relative to /group_vars/all, this is ../kmip/).

compoundCreate
Default: false
Set this option to true if the KMIP server supports Create and Activate in one operation. For two-step creation, leave it false to prevent clock desynchronization issues. (Two-step creation uses the server’s “now” instead of the client-specified activation date, which also targets the present instant.)

bucketAttributeName
Default: none
Set the bucket name attribute here if the KMIP server supports storing custom attributes along with the keys. KMIP appliances reference managed objects using an unfriendly identifier that is not related to the bucket to which the key belongs; this option lets you specify an attribute name that stores that bucket name. Because different KMIP appliances have different rules and attribute-naming constraints, this field has no valid default value. Leaving this option unset has no functional effect, but setting it helps debugging and administration.

pipelineDepth
Default: 8
Depth of the request pipeline. If the server sends replies out of order and confuses the client, setting this value to 1 is a convenient workaround for a server-side bug. The default value of 8 works well, and there is little performance improvement to be gained by tuning this value.
Note: Zero is not an appropriate value. Given a 0, the server falls back to 1.

Post-install reconfiguration procedure

1. Go to the federation/ folder of the s3-offline installer.



2. List stateless hosts.
$ ../repo/venv/bin/ansible -i env/${ENV_DIR}/inventory --list-hosts runners_s3

hosts (7):
md1-cluster1
md2-cluster1
md3-cluster1
md4-cluster1
md5-cluster1
stateless01
stateless02

3. For each listed host, reconfigure s3. If the host is behind a load balancer, deactivate the server before reconfiguring it.
$ ./ansible-playbook -i env/${ENV_DIR}/inventory run.yml --skip-tags requirements -t s3 --limit stateless01
$ ./ansible-playbook -i env/${ENV_DIR}/inventory run.yml --skip-tags requirements -t s3 --limit stateless02
$ etc.

Enabling SAML Support

The S3 Connector Vault component enables integration of Single Sign-On (SSO) services using SAML-based authentication or Active Directory Federation Services (ADFS). This avoids replicating the entire user base into S3 Vault at deployment time. Instead, the first time an S3 Connector user authenticates via SAML, an IAM user is created in S3 Vault and an access key is issued for that user.
Scality can help with third-party SAML integrations. Most customers who deploy SAML authentication use Microsoft’s ADFS solution, with which Scality is best positioned to assist. See Integrating Keycloak and Active Directory to S3 Connector in the Technical Services Knowledge Base.
Enable SAML support in Vault by uncommenting and setting the env_vault_saml section variables.
The following variables can be set for SAML support in the configuration file:



entry_point: ID provider’s URL. No default.
entry_point_logout (Optional): ID provider’s logout URL. Default (if uncommented): ADFS.
issuer.host: IP address of the Vault interface to single-sign-on users. Default: https://127.0.0.1
issuer.port: Port for the Vault interface to single-sign-on users. Default: 3030
secretkey_expiration_hours: Number of hours the retrieved accessKey and secretKey are valid. Default: 1
session_expiration_hours: Number of hours a session is valid. Default: 6
tls_cert: Path to the HTTPS certificate file on the target machines. No default.
tls_key: Path to the HTTPS certificate private key on the target machines. No default.
idp_cert *: Path to the X509 certificate retrieved from the SAML identity provider. No default.
account_name_claim: Field containing the account name in ADFS or in SAML 2.0 implementations. Default: accountname
account_address_claim: Field containing the account email address in ADFS or in SAML 2.0 implementations. Default: emailaddress
primary_group_sid_claim: Field containing the ID for the user in ADFS or in SAML 2.0 implementations. Default: primarygroupsid

* The identity provider’s X509 certificate is used to verify the signature of SAML responses. Refer to your SAML identity provider’s documentation to retrieve this certificate. For ADFS, extract this certificate from the Metadata XML file at https://<your-ADFS-server>/FederationMetadata/2007-06/FederationMetadata.xml
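
As a hedged sketch, an uncommented env_vault_saml section built from the variables above might look like the following. The nesting and defaults shipped in the group_vars/all template are authoritative; URLs, paths, and file names below are placeholders:

env_vault_saml:
  entry_point: https://adfs.example.com/adfs/ls/
  issuer:
    host: https://127.0.0.1
    port: 3030
  secretkey_expiration_hours: 1
  session_expiration_hours: 6
  tls_cert: /path/to/vault-saml-cert.pem
  tls_key: /path/to/vault-saml-key.pem
  idp_cert: /path/to/idp-cert.pem
  account_name_claim: accountname
  account_address_claim: emailaddress
  primary_group_sid_claim: primarygroupsid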



Configuring S3 Bit Rot

You can set up the S3 bit rot feature using Ansible from the S3 Installer’s federation/
folder.

Enabling the Bit Rot Feature

S3 bit rot feature components are configured in the env_bitrot section in env/MyEnv/group_vars/all.
Enable S3 bit rot with the env_bitrot.enabled variable:

env_bitrot:

enabled: yes

Bucket Lists to Check

You can set which S3 buckets to check by setting either:
• A blacklist, env_bitrot.buckets_blacklist, to check all S3 buckets except the specified ones, or
• A whitelist, env_bitrot.buckets_whitelist, to check only the specified S3 buckets.
In the following example, the bucket1 and bucket2 S3 buckets are blacklisted and will not be checked. All other buckets are checked.

env_bitrot:

buckets_blacklist: ["bucket1","bucket2"]

In the next example, bucket3 and bucket4 are listed as the only buckets to check:

env_bitrot:

buckets_whitelist: ["bucket3","bucket4"]

The buckets_blacklist and buckets_whitelist settings are mutually exclusive. If both are set, Federation throws an error.
Default value: empty



Elasticsearch Index Prefix

The S3 bit rot ingester creates an Elasticsearch index for each checked S3 bucket. The index name starts with a prefix you can override:

env_bitrot:

es_index_prefix: 'new-prefix'

In this configuration, the Elasticsearch index for the S3 bucket “bucket1” is “new-prefix-bucket1”.
Default value: scality-bitrot

Maximum Object Age for Checks

The S3 bit rot checker regularly checks the integrity of S3 objects that have:
• An entry (document) in the bit rot database, and
• A check age older than the value set in the env_bitrot.object_max_check_age variable.
The env_bitrot.object_max_check_age variable follows the Date Math syntax. In the following example, the bit rot checker checks objects not checked in the past month:

env_bitrot:

object_max_check_age: 'now-1M'

Default value: now-1M

Warn When an S3 Object Is Empty

S3 objects can be empty. Use env_bitrot.warn_empty_objects to treat empty S3 objects as a bit rot check error:

env_bitrot:

warn_empty_objects: True

Default value: True



Catchup Script Pause Timer

In order to prevent the bitrot-catchup script from slowing Metadata down, a 1 second
pause is inserted between each listing call sent to Metadata. This pause can be tuned
with env_bitrot.catchup_pause_s.

env_bitrot:

catchup_pause_s: 1

Default value: 1 second
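
Putting the preceding settings together, a complete env_bitrot block might look like the following sketch (bucket names and values are illustrative):

env_bitrot:
  enabled: yes
  buckets_whitelist: ["bucket3", "bucket4"]
  es_index_prefix: 'scality-bitrot'
  object_max_check_age: 'now-1M'
  warn_empty_objects: True
  catchup_pause_s: 1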

Elasticsearch Configuration

S3 bit rot requires Elasticsearch to be configured:

# Elasticsearch configuration. Set IP:PORT to the ES nodes. These can be the RING main ES cluster.
env_elasticsearch:
  endpoints:
    - "IP:PORT"
    - "IP:PORT"

If Elasticsearch is not correctly configured, Federation throws an error.
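
For example, pointing the bit rot feature at a two-node Elasticsearch cluster might look like the following sketch (the addresses are placeholders, and the port depends on the Elasticsearch deployment in use):

env_elasticsearch:
  endpoints:
    - "10.200.1.10:9200"
    - "10.200.1.11:9200"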

Setting Up Bucket Notification

The bucket notification feature enables an external application to receive notifications when certain events happen in a bucket. S3 Connector acts as a Kafka producer, and Kafka brokers the produced information to consumers as configured.

Limitations

Object Delete Event Limitation

S3:ObjectRemoved:Delete event notification messages are produced with a null timestamp.



AWS Compliance Limitations

The following fields are required for AWS conformity, but are not supported by the S3
Connector backend. These fields are returned with a null value.
• userIdentity.principalId
• requestParameters.sourceIPAddress
• responseElement.x-amz-request-id
• responseElement.x-amz-id-2
• s3.bucket.ownerIdentity.principalId
• s3.bucket.arn
• s3.object.eTag
• s3.object.sequencer

Target Cluster Limitations

The RING Bucket Notification Kafka producer is configured to:
• Retry indefinitely when messages are not delivered.
• Abort the request (and its potential retries) after 5000 ms overall.
• Expect at least one in-sync replica acknowledgement of the message for the request to be successful.

Configure Bucket Notification

To configure bucket notification, edit the following block in federation/env/${ENV_DIR}/group_vars/all, uncommenting, enabling, and configuring features as required. Configuration examples follow.
# env_bucket_notifications:
#   enabled: no
#   destinations:
#     # set a unique resource name to the destination
#     - resource: 'destination1'
#       # supported values for type - 'kafka'
#       type: 'kafka'
#       # comma separated url/brokers
#       host: 'kafka-host1:9092,kafka-host2:9092'
#       # port is not necessary for kafka as destination, (optional)
#       port: '9092'
#       # topic name
#       topic: 'topic1'
#       # auth section can be skipped if the destination does not use any
#       # authentication (optional).
#       auth:
#         # supported auth type - 'kerberos'
#         type: 'kerberos'
#         # set ssl to true if required
#         ssl: false
#         # CA certificate file for verifying the broker's certificate,
#         # (optional) if ssl is not required
#         ca: 'ca certificate file name'
#         # Client's certificate, (optional) if ssl is not required
#         client: 'client certificate file name'
#         # Client's key, (optional) if ssl is not required
#         key: 'key file name'
#         # Key password, if any, (optional) if ssl is not required
#         keyPassword: 'key-password'
#         # use 'SASL_SSL' as protocol for SSL, otherwise use 'SASL_PLAINTEXT'
#         protocol: 'SASL_PLAINTEXT'
#         # Kerberos service name configured in kafka server
#         serviceName: 'kafka'
#         # Principal to authenticate in kerberos
#         principal: 'user/[email protected]'
#         # Keytab file of the principal
#         keytab: 'keytab file name'

Note: The required key/cert/ca/keytab files must be placed in the backbeat directory
in the deployment inventory folder env/client-template/backbeat.
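
For example, assuming an inventory folder named client-template and illustrative file names, the authentication material could be staged from the federation/ folder with:

cp ca.pem client.pem client.key kafka.keytab env/client-template/backbeat/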

Notifications to Single Broker, No Authentication

The following sample configuration delivers notifications to a single Kafka broker (10.200.2.92:9092) without authentication.

env_bucket_notifications:
  enabled: yes
  # set a unique resource name to the destination
  destinations:
    - resource: 'destination1'
      type: 'kafka'
      host: '10.200.2.92:9092'
      topic: 'topic1'

Notifications to Multiple Brokers, No Authentication

The following sample configuration delivers notifications to multiple Kafka brokers (10.200.2.92:9091, 10.200.2.92:9092, 10.200.2.92:9093) without authentication.

env_bucket_notifications:
  enabled: yes
  destinations:
    - resource: 'destination2'
      type: 'kafka'
      host: '10.200.2.92:9091,10.200.2.92:9092,10.200.2.92:9093'
      topic: 'topic2'

Notifications to Multiple Destinations, No Authentication

The following sample configuration delivers notifications to multiple destinations without authentication.

env_bucket_notifications:
  enabled: yes
  destinations:
    - resource: 'destination1'
      type: 'kafka'
      host: '10.200.2.92:9091'
      topic: 'topic1'
    - resource: 'destination2'
      type: 'kafka'
      host: '10.200.2.92:9091,10.200.2.92:9092,10.200.2.92:9093'
      topic: 'topic2'



Notifications to Stretched Deployment

To deploy bucket notifications for a stretched deployment:

env_bucket_notifications:
  enabled: yes

This deploys the bucket notification containers, backbeat-queue (Kafka), and quorum (ZooKeeper), and runs the run-backbeat-worker-base role to create the ZooKeeper paths and topics needed for bucket notification.

Notifications to Single Broker with Authentication

The following sample configuration delivers notifications to a single Kafka broker (10.200.2.92:9092) with Kerberos authentication.

env_bucket_notifications:
  enabled: yes
  destinations:
    # set a unique resource name to the destination
    - resource: 'destination1'
      type: 'kafka'
      host: '10.200.2.92:9092'
      topic: 'topic1'
      auth:
        # This section is optional, can be skipped if the destination does not
        # use any auth. supported auth type - 'kerberos'
        type: 'kerberos'
        # set ssl to true if required
        ssl: false
        # CA certificate file for verifying the broker's certificate
        ca: 'ca certificate file name'
        # Client's certificate
        client: 'client certificate file name'
        # Client's key
        key: 'key file name'
        # Key password, if any
        keyPassword: 'key-password'
        # use 'SASL_SSL' as protocol for SSL, otherwise use 'SASL_PLAINTEXT'
        protocol: 'SASL_PLAINTEXT'
        # Kerberos service name configured in kafka server
        serviceName: 'kafka'
        # Principal to authenticate in kerberos
        principal: 'user/[email protected]'
        # Keytab file of the principal
        keytab: 'keytab file name'

Notifications to Multiple Brokers No SSL

The following sample configures authenticated delivery to multiple brokers with user/password in plaintext (SASL_PLAINTEXT) mode.

env_bucket_notifications:
  enabled: yes
  destinations:
    - resource: 'destination1'
      type: 'kafka'
      host: '10.200.2.92:9091,10.200.2.92:9092,10.200.2.92:9093'
      topic: 'topic1'
      ssl: false
      protocol: 'SASL_PLAINTEXT'

Notifications to Multiple Brokers with SSL

The following sample configures authenticated delivery to multiple brokers with user/password over SSL (SASL_SSL).

env_bucket_notifications:
  enabled: yes
  destinations:
    - resource: 'destination1'
      type: 'kafka'
      host: '10.200.2.92:9091,10.200.2.92:9092,10.200.2.92:9093'
      topic: 'topic1'
      ssl: true
      protocol: 'SASL_SSL'

Activating Your Bucket Notification Configuration

Use the following post-install procedure:

Post-install reconfiguration procedure

1. Go to the federation/ directory of the s3-offline installer.



2. Enable bucket notification:

Important: If the RING already has a ZooKeeper cluster running for SOFS/SVSD, reassign the ports used by Backbeat in env/${ENV_DIR}/group_vars/all:
• env_quorum_port: 22181
• env_quorum_follower_port: 22888
• env_quorum_election_port: 23888

Run:
$ ENV_DIR=s3config
$ ./ansible-playbook -i env/${ENV_DIR}/inventory run.yml -t bucket-notifications

3. List the hosts to be reconfigured:
$ ../repo/venv/bin/ansible -i env/${ENV_DIR}/inventory --list-hosts runners_s3

hosts (5):
md1-cluster1
md2-cluster1
md3-cluster1
md4-cluster1
md5-cluster1

4. Reconfigure each S3 service, one host at a time. If the load is balanced between hosts, first temporarily disable the host from the load balancer configuration.
$ ./ansible-playbook -i env/${ENV_DIR}/inventory run.yml --skip-tags "requirements,run::images" -l md1-cluster1 -t s3

$ ./ansible-playbook -i env/${ENV_DIR}/inventory run.yml --skip-tags "requirements,run::images" -l md2-cluster1 -t s3

... etc.

Once you finish installing the bucket notification feature, refer to Configuring Bucket Notification.



NFSd Feature Support

The NFS daemon supports the features listed in the following table.

Protocol/Process | Supported/Compliant | Not Supported | Comments
NFSv4 X Except as mentioned
here, NFSv4 is supported,
though not fully tested.
NFSd compatibility is
discussed in the next
table.
NFSv4.1 X
pNFS X
Hardlinks X
POSIX X
ACLs X
Concurrent directories X Works well, with a high
(creating, deleting files number of ops/s because
and folders) it communicates directly
with the metadata engine.
Concurrent Connector File RO RW
access
File Rename X Renaming files is instant
and implemented as pure
metadata operations
Directory Rename System call Renaming folders
unsupported renames each contained
file. mv recursively moves
files efficiently, but must
also create/destroy
directories. Slow
compared to a traditional
file system, but it only
performs metadata
operations, no direct data
copy.
Quotas X
Bucket discovery X
Bucket/Object X
permissions dynamic
mapping
Versioning X
CRR X
Bucket Encryption X
Lifecycle X
UTAPI X
UI to manage the NFS X
configuration

From To Compatible?
frontend TLS backend TLS Yes
NFSd backend TLS No
NFSd frontend TLS Yes
NFSd bucket encryption No
frontend TLS bucket encryption Yes
backend TLS bucket encryption Untested

Enabling NFS Access to S3 Buckets

Warning: NFS access is not compatible with internal SSL communication.

The nfsd filesystem sharing service can be used to expose data contained in buckets, making it possible to access data from S3 as well as from the client on the locally-mounted NFS filesystem.

Important: NFS must be configured after the first deployment, and all buckets listed in
the exports dictionary must exist before the NFS service is deployed.

NFS service is configured in the env_nfsd section.
• The enabled statement must be uncommented and set, and an exports dictionary must be set with bucket information and filesystem parameters.
• The dictionary item key is the export ID. This must be a stable integer.

Important: Do not modify the export ID once it is published.



• location_constraint is the name of the REST endpoint to be used for newly created files.
• bucket is the name of the bucket.
• path is the path of the exported filesystem, as seen from the client.
• The prefix corresponds to the subset of the bucket to export, delimited by the provided commonPrefix.

Note: The prefix is optional.

• The commonPrefix is a single element which results in the grouping of all the keys that contain the same string between the prefix (if specified) and the first occurrence of the delimiter after the prefix.
• uid and gid are the owner and group attributed to exposed files and directories (optional).
• Use umask to unset bits from the POSIX-default permissions for directories and files (0677 and 0666, respectively) (optional).
• access_type defines the global write permission on the export, which may be either RO or RW (optional).
• protocols is a comma-separated list of all protocols supported, 3, 4 and 9 (optional).
Example:

env_nfsd:
  enabled: yes
  exports:
    76:
      location_constraint: us-east-1
      bucket: foo
      path: /foo
      prefix: subset/of/this/bucket/
      uid: 1000
      gid: 1000
      umask: '02'
      access_type: RW
      protocols: 3, 4
    74:
      location_constraint: us-east-1
      bucket: bar
      path: /bar
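
For instance, with the umask: '02' value shown above, exported directories are presented with mode 0675 and files with mode 0664, obtained by clearing the 02 bit from the POSIX defaults of 0677 and 0666 mentioned earlier.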



Note: As stated in Defining the S3 Cluster Inventory, the nfsd filesystem sharing service can be used to expose data contained in buckets, making it possible to access data from S3 as well as from the client on the locally-mounted NFS filesystem.
This service is only enabled on the hosts present in the [filers] section of the inventory.

Post-install reconfiguration procedure

1. Go to the federation/ folder of the s3-offline installer.
2. Reconfigure nfsd as needed.
3. Activate nfsd.
$ ENV_DIR=s3config
$ ./ansible-playbook -i env/${ENV_DIR}/inventory run.yml --skip-tags "requirements" -t nfsd

Audit Logs

Audit Logs Message (LEEF)

LEEF:1.0|Scality|S3C|7.10.8 |FRONTEND AUDIT| devTime=1601549196 dst=new-3nodes-storage-2 sev=warn message=scality kms unavailable. Using file kms backend unless mem specified. pid=271

‘LEEF:1.0|Scality|S3C|7.10.8 |FRONTEND AUDIT|’ is the LEEF header, followed by fields in tag=value format.

Audit Logs Configuration (LEEF)

Enable the Feature

1. Log in as root on the supervisor.
2. Go to the federation directory (by default /srv/scality/s3/s3-offline/federation).
3. Set env_enable_leef to yes in env/${ENV_DIR}/group_vars/all.
4. Run the following command to activate LEEF.



ENV_DIR=s3config
./ansible-playbook -i env/${ENV_DIR}/inventory ./run.yml -t leef

This creates three containers:
• scality-leef-s3 translates the S3 log file to LEEF.
• scality-leef-vault translates the Vault log file to LEEF.
• scality-leef-logger-1 rotates the LEEF log files.
By default, the translation job runs every 5 minutes to parse Audit Logs and write them to a LEEF file located in {{ env_host_logs }}/scality-leef, usually /var/log/s3/scality-leef. To change the default schedule, set env_leef_run_every_s in env/${ENV_DIR}/group_vars/all to the new value in seconds.
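
For example, to run the translation job every 10 minutes instead of the default 5, set the value in seconds:

env_leef_run_every_s: 600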

LEEF fields setup

The LEEF translation is done following rules given by the JSON file roles/run-leef/files/auditlog2leef.json.
Depending on the service(s) in use, add the following fields in the file.

CloudServer

• IAMdisplayName
• accountDisplayName

Listing 7: Output Example


IAMdisplayName: IAMdisplayName

Vault

• IAMdisplayName
• accountDisplayName
• STSsessionName
• roleArn
• action
• signatureVersion



Old Logs

• accountname
• username

Disable the Feature

1. Set env_enable_leef to no in env/${ENV_DIR}/group_vars/all.
2. Run the following command to stop LEEF.

ENV_DIR=s3config
./ansible-playbook -i env/${ENV_DIR}/inventory ./tooling-playbooks/leef-disable.yml

Launching the Installation

Refer to Offline Installation.

Configuring NFS Clients to Access the S3 Connectors

Configuring NFS Linux Clients

Exported buckets are mounted as regular NFS exports.

$ mount <host>:/export /mount/point ## not recommended, see next chapters

The -o noac mount option enables clients to more rapidly detect changes in directories and file attributes made from another location, be it another NFS client or an S3 API client. This option disables the attributes cache and directory listing cache in the client. Navigating through the file system may negate the benefits of caching the directory listing and file attributes.
Altering the mount options can increase the write throughput to NFS. The write-size parameter (wsize) is preferably set to the internally used part size of 5 MiB (wsize = 5242880). Likewise, relaxing the Linux client’s default behavior by adding an async flag to the mount options can also increase NFS throughput. By default, Linux requires the server to perform stable writes, which means that the server must flush its dirty buffers before returning the write procedure result. Relaxing the stability of writes dramatically increases write performance while still honoring and preserving fsync’s system-call semantics.



These optimizations may result in a delay between different clients in detecting modifications to data and directory structures.

$ mount -o nfsvers=4,rw,async,wsize=5242880 <host>:/export /mount/point

Configuring NFS Windows Clients

An NFS filesystem can be mounted on a Windows host using Windows Services for UNIX
(on systems that support this service). To do this, add/remove Windows components,
and then install a Services for NFS client and a Subsystem for UNIX-based Applications
(SUA).
From the command line:

mount server:/share z:

Refer to https://technet.microsoft.com/en-us/library/bb463214.aspx.

Installing Seamless Ingest

The Seamless Ingest feature eliminates file ingest disruption and maintains performance
during storage node failovers. The feature provides faster detection of a node failure
through the use of an external heartbeat mechanism that monitors the availability of all
the storage nodes.

Important: Even though this feature is supported, the S3 Connector may return a 500
error in case of server loss.

1. Build a ZooKeeper environment and set up sagentd RING storage servers. Refer to
“Installing Seamless Ingest” in RING Installation.
2. Prepare sagentd.yaml for each S3 Connector and move it to {{S3DATAdir}}/
scality-sproxyd/conf/sagentd.yaml.

daemons:
  HOSTNAME-sproxyd0:
    address: IPADDR
    path: /run0/sproxyd
    port: 20000
    type: sproxyd
  HOSTNAME-sproxyd1:
    address: IPADDR
    path: /run1/sproxyd
    port: 20001
    type: sproxyd
  HOSTNAME-sproxyd2:
    address: IPADDR
    path: /run2/sproxyd
    port: 20002
    type: sproxyd
  HOSTNAME-sproxyd3:
    address: IPADDR
    path: /run3/sproxyd
    port: 20003
    type: sproxyd
monitoring_heartbeat: false
monitoring_heartbeat_hosts:
  - ZooKeeperIP1
  - ZooKeeperIP2
  - ZooKeeperIP3
  - ZooKeeperIP4
  - ZooKeeperIP5
monitoring_heartbeat_timeout: 5
monitoring_watcher: true

• HOSTNAME and IPADDR must be replaced by each S3 Connector’s hostname and IP address.
• ZooKeeperIP1 - ZooKeeperIP5 must be replaced by the ZooKeeper servers’ IP addresses.
3. Add the following to federation/roles/run-sproxyd/templates/supervisord.
conf.j2.

[program:sagentd]
command = /usr/bin/sagentd --no-daemon -c %(ENV_CONF_DIR)s/sagentd.yaml -p /var/run/sagentd.pid -l

stdout_logfile = %(ENV_LOG_DIR)s/%(program_name)s.log
stderr_logfile = %(ENV_LOG_DIR)s/%(program_name)s-stderr.log
stdout_logfile_maxbytes=300MB
stdout_logfile_backups=2
stderr_logfile_maxbytes=300MB
stderr_logfile_backups=2
redirect_stderr=true
autorestart = true
autostart = true
user=root

4. Run the Ansible playbook to apply the changes.



# ENV_DIR=s3config
# cd federation
# ./ansible-playbook -i env/${ENV_DIR}/inventory run.yml -t sproxyd

Configuring S3 Connector Network Ports

The listening ports for the S3 Connector are defined on a per-host basis, through either the inventory or an additional host_vars or group_vars file in Ansible. Common variables are often defined in group_vars/all.

Important: All custom ports defined in the kernel local port range must be reserved to avoid any port conflict with client applications. Refer to Custom Ports for more information.
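
As an illustration, a couple of the tuning parameters listed in the tables below could be overridden in env/${ENV_DIR}/group_vars/all as follows (the values are examples only):

env_s3_frontend_port: 8080
env_s3_frontend_ssl_port: 8443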

External Service Ports

(INGRESS from a production zone)


The external service ports should be opened to the outside of the storage network (to
receive traffic from the customer/production network).

Default Ports Service Tuning Parameter


80 S3 Frontend env_s3_frontend_port
111 nfsd (portmap) N/A
443 S3 Frontend with SSL env_s3_frontend_ssl_port
2049 nfsd N/A
3030 Vault SAML (optional) env_vault_saml.issuer.port
8000 S3 env_s3_port
8100 UTAPI env_utapi_port
8600 Vault (IAM) Administration env_vault_admin_port
8650 Vault STS env_vault_sts_port
9080 backbeat env_backbeat_replication_port



External Administration Ports

(INGRESS from an administration zone)


The external administration ports must be open to the administration network.

Default Ports Service Tuning Parameter


8100 UTAPI env_utapi_port
8800 Identisee (IAM Web UI) env_identisee_ui_port
15601 Kibana (Logs Web UI) env_logger.kibana_port
19088 Cosbench Controller WebUI cosbench_port

Internal Storage Ports

(EGRESS and INGRESS from the storage zone, or between DC in stretched deployments)
The internal storage ports must be open to the storage network to receive internal traffic.

Default Ports | Service | Tuning Parameter

22 | SSH default port, used by Ansible to connect to hosts (deployment only) | N/A
22181 | quorum | env_quorum_port
22888 | quorum follower | env_quorum_follower_port
23888 | quorum election | env_quorum_election_port
2888 | quorum follower | env_quorum_follower_port
3888 | quorum election | env_quorum_election_port
4200 | Metadata (storage/map) | env_metadata_bucket_map_port (formerly env_metadata_map_port)
4300 | Metadata-Vault (storage/map) | env_metadata_vault_map_port (formerly env_vault_map_port)
4301 | Metadata-Vault (storage/map), for a 3-server model | env_metadata_vault_map_port (formerly env_vault_map_port)
4501 | Metadata Administration (repd) | env_metadata_bucket_repd_admin_port
4802 | utapi/warp10 | env_utapi.service_port
5200 | Metadata Administration (map) | env_metadata_bucket_map_admin_port
5300 | Metadata-Vault Administration | env_metadata_vault_map_admin_port
6379 | Redis (cluster/server) | env_redis_server_port
6479 | Redis (local) | env_local_redis_port
[7600; 7600 + [n_backbeat_containers - 1] * n_replication_endpoints * n_topics] | S3 Replication Replay Processor probe | env_s3_replication_replay_processor_port
[7700; 7700 + n_backbeat_containers - 1] | S3 Replication Status Processor (n_backbeat_containers defaults to 2) | env_s3_replication_status_processor_port
[7800; 7800 + [n_backbeat_containers - 1] * n_replication_endpoints] | S3 Replication Queue Processor (n_backbeat_containers defaults to 2) | env_s3_replication_queue_processor_port
[7900; 7900 + n_backbeat_containers - 1] | S3 Replication Queue Populator (n_backbeat_containers defaults to 2) | env_s3_replication_queue_populator_port
8100 | UTAPI | env_utapi_port
8181 | Sproxyd Nginx | env_sproxyd_port
8500 | Vaultd (S3 service) | env_vault_port
8700 | Identisee (API) | env_identisee_api_port
8800 | Identisee (IAM Web UI) | env_identisee_ui_port
8900 | backbeat metrics (API) | N/A
9000 | Bucketd | env_metadata_bucket_dbd_port (formerly env_metadata_port)
9010 | Bucketd, for a 3-server model | env_metadata_bucket_dbd_port (formerly env_metadata_port)
9043 -> 9051 * (9043 + repd_number) | Metadata (storage/repds) | env_metadata_bucket_repd_port (formerly env_metadata_repd_port)
9053 -> 9061 * (9053 + repd_number) | Metadata (storage/repds), for a 3-server model | env_metadata_bucket_repd_port (formerly env_metadata_repd_port)
9092 | backbeat-queue | env_backbeat_queue_port
9143 -> 9151 * | Metadata Administration (repd) | env_metadata_bucket_repd_admin_port
9153 -> 9161 * | Metadata Administration (repd), for a 3-server model | env_metadata_bucket_repd_admin_port
16379 (formerly 26379) | Redis (cluster/sentinel) | env_redis_sentinel_port
18088 | Cosbench Driver | env_cos_driver_port
19200 | ElasticSearch | env_logger.elasticsearch_port
19300 | ElasticSearch (cluster) | env_logger.elasticsearch_cluster_port
[20000; 20000 + n_sproxyd - 1] | Sproxyd (n_sproxyd defaults to 4) | env_sproxyd_fcgi_port

Custom Ports

When a TCP client application connects to a service, the Linux kernel allocates to this application a random port from the local port range. If a TCP server application is configured to listen on a port in the local port range, it can face a port conflict at service restart, with an error such as “Address already in use”.
The kernel TCP local port range is customized by the RING to the range 20480-65001. Scality services default ports which are included in the local port range are already reserved to avoid any port conflict. If custom ports are set up for Scality services and are part of the local port range, they must be added to the local reserved ports list.

Note: The configuration only affects new connections or port allocations.

1. To add more ports or ranges to the local reserved ports list, configure the variable scality.platform.kernel.sysctl['net.ipv4.ip_local_reserved_ports_custom'] in the appropriate pillar in /srv/scality/pillar/.

Tip: Multiple ports or port ranges can be configured as a comma separated list.
Editing the pillar /srv/scality/pillar/scality-common.sls applies the changes
on all servers.

Listing 8: Configuration example to reserve the port 45067 and the port range 24000-24100.

scality:
  [...]
  platform:
    kernel:
      sysctl:
        net.ipv4.ip_local_reserved_ports_custom: 45067,24000-24100

2. Refresh the Salt pillar configuration.

{{supervisorServerName}}# salt '*' saltutil.refresh_pillar

3. Perform a dry-run to ensure that all the future custom ports and port ranges are
included in the local reserved ports list.

{{supervisorServerName}}# salt '*' state.sls scality.req.platform.kernel test=True

4. Apply the new configuration.

{{supervisorServerName}}# salt '*' state.sls scality.req.platform.kernel
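
One way to confirm the result, assuming the standard Salt sysctl module is available, is to read the kernel parameter on all servers:

{{supervisorServerName}}# salt '*' sysctl.get net.ipv4.ip_local_reserved_ports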

4.4 Offline Installation

S3 Connector can be deployed in an offline installation without internet connectivity. This is the most common deployment for customers and data centers where network access is restricted. Offline installation and deployment uses Ansible and is driven from the Inventory files (described in Defining the S3 Cluster Inventory).
1. Go to the directory designated on the deployment server and run the following command:

$ cd s3-offline-${VERSION}/federation

2. Modify the Inventory File template as described in Defining the S3 Cluster Inventory.



3. Modify the Global Variables template as described in Configuring the S3C Cluster.
4. Configure the S3 Vault environment.

$ ENV_DIR=s3config
$ ./ansible-playbook -i env/${ENV_DIR}/inventory tooling-playbooks/generate-vault-env-config.yml

5. On a Site-level Cross-Region Replication setup, copy the admin credentials files from the source S3 cluster installer to the target S3 cluster installer:

$ scp /path/to/federation/env/${ENV_DIR}/vault/admin-clientprofile/admin1.json \
    target_supervisor:/path/to/federation/env/${ENV_DIR}/vault/admin-clientprofile

6. Run the following ansible-playbook command to install Docker/containerd, copy images, and deploy the required services.

$ ./ansible-playbook -i env/${ENV_DIR}/inventory run.yml

7. Follow S3 Connector Operation to create accounts, users, groups and buckets.

4.5 Extending an Installation to Multiple Metadata Clusters

A metadata cluster is a set of machines cooperating to replicate operations safely. It is used to safely store:
• S3 object metadata,
• IAM authentication and authorization data.
A cluster consists of at least five active processes (ideally hosted on five servers), and from zero to five passive servers (also called “warm stand-bys”).
When at least ten servers are available, the S3 cluster can use more than one Metadata cluster.

Warning: Three-server installs must be upgraded to a five-server layout before multi-clustering can be implemented.

The following procedure explains how to add a second Metadata cluster to an already installed single Metadata cluster S3 Connector, as described in Installing the S3 Connector.
1. Run the following Ansible playbook command:



$ ENV_DIR=s3config
$ ./ansible-playbook -i env/${ENV_DIR}/inventory tooling-playbooks/check-status-metadata.yml

This command dumps to /tmp/results.txt and lists every Raft session state.
2. Read the results file:

$ cat /tmp/results.txt

These results provide an overview of existing metadata servers, their IP addresses, and their states.

0 4300 md1-cluster1 10.200.1.215 {"lastPrune": 0, "aseq": 29, "bseq": 0, "term": 1, "vseq": 29}
0 4300 md2-cluster1 10.200.3.73 {"lastPrune": 0, "aseq": 29, "bseq": 0, "term": 1, "vseq": 29}
0 4300 md3-cluster1 10.200.3.79 {"lastPrune": 0, "aseq": 29, "bseq": 0, "term": 1, "vseq": 29}
0 4300 md4-cluster1 10.200.3.82 {"lastPrune": 0, "aseq": 29, "bseq": 0, "term": 1, "vseq": 29}
0 4300 md5-cluster1 10.200.3.85 {"lastPrune": 0, "aseq": 29, "bseq": 0, "term": 1, "vseq": 29}
0 9043 md1-cluster1 10.200.1.215 {"lastPrune": 0, "aseq": 8, "bseq": 0, "term": 1, "vseq": 8}
0 9043 md2-cluster1 10.200.3.73 {"lastPrune": 0, "aseq": 8, "bseq": 0, "term": 1, "vseq": 8}
0 9043 md3-cluster1 10.200.3.79 {"lastPrune": 0, "aseq": 8, "bseq": 0, "term": 1, "vseq": 8}
0 9043 md4-cluster1 10.200.3.82 {"lastPrune": 0, "aseq": 8, "bseq": 0, "term": 1, "vseq": 8}
0 9043 md5-cluster1 10.200.3.85 {"lastPrune": 0, "aseq": 8, "bseq": 0, "term": 1, "vseq": 8}
1 9044 md1-cluster1 10.200.1.215 {"lastPrune": 20000, "aseq": 23336, "bseq": 2, "term": 1, "vseq": 23336}
1 9044 md2-cluster1 10.200.3.73 {"lastPrune": 10000, "aseq": 23336, "bseq": 2, "term": 1, "vseq": 23336}
1 9044 md3-cluster1 10.200.3.79 {"lastPrune": 20000, "aseq": 23336, "bseq": 2, "term": 1, "vseq": 23336}
1 9044 md4-cluster1 10.200.3.82 {"lastPrune": 10000, "aseq": 23336, "bseq": 2, "term": 1, "vseq": 23336}
1 9044 md5-cluster1 10.200.3.85 {"lastPrune": 10000, "aseq": 23336, "bseq": 2, "term": 1, "vseq": 23336}
2 9045 md1-cluster1 10.200.1.215 {"lastPrune": 20000, "aseq": 23496, "bseq": 2, "term": 1, "vseq": 23496}
2 9045 md2-cluster1 10.200.3.73 {"lastPrune": 20000, "aseq": 23496, "bseq": 2, "term": 1, "vseq": 23496}
2 9045 md3-cluster1 10.200.3.79 {"lastPrune": 20000, "aseq": 23496, "bseq": 2, "term": 1, "vseq": 23496}
2 9045 md4-cluster1 10.200.3.82 {"lastPrune": 20000, "aseq": 23496, "bseq": 2, "term": 1, "vseq": 23496}
2 9045 md5-cluster1 10.200.3.85 {"lastPrune": 20000, "aseq": 23496, "bseq": 2, "term": 1, "vseq": 23496}
3 9046 md1-cluster1 10.200.1.215 {"lastPrune": 20000, "aseq": 23185, "bseq": 2, "term": 1, "vseq": 23185}
3 9046 md2-cluster1 10.200.3.73 {"lastPrune": 10000, "aseq": 23185, "bseq": 2, "term": 1, "vseq": 23185}
3 9046 md3-cluster1 10.200.3.79 {"lastPrune": 10000, "aseq": 23185, "bseq": 2, "term": 1, "vseq": 23185}
3 9046 md4-cluster1 10.200.3.82 {"lastPrune": 10000, "aseq": 23185, "bseq": 2, "term": 1, "vseq": 23185}
3 9046 md5-cluster1 10.200.3.85 {"lastPrune": 10000, "aseq": 23185, "bseq": 2, "term": 1, "vseq": 23185}
4 9047 md1-cluster1 10.200.1.215 {"lastPrune": 10000, "aseq": 11761, "bseq": 1, "term": 1, "vseq": 11761}
4 9047 md2-cluster1 10.200.3.73 {"lastPrune": 10000, "aseq": 11761, "bseq": 1, "term": 1, "vseq": 11761}
4 9047 md3-cluster1 10.200.3.79 {"lastPrune": 10000, "aseq": 11761, "bseq": 1, "term": 1, "vseq": 11761}
4 9047 md4-cluster1 10.200.3.82 {"lastPrune": 10000, "aseq": 11761, "bseq": 1, "term": 1, "vseq": 11761}
4 9047 md5-cluster1 10.200.3.85 {"lastPrune": 10000, "aseq": 11761, "bseq": 1, "term": 1, "vseq": 11761}
5 9048 md1-cluster1 10.200.1.215 {"lastPrune": 20000, "aseq": 23603, "bseq": 2, "term": 1, "vseq": 23603}
5 9048 md2-cluster1 10.200.3.73 {"lastPrune": 20000, "aseq": 23603, "bseq": 2, "term": 1, "vseq": 23603}
5 9048 md3-cluster1 10.200.3.79 {"lastPrune": 20000, "aseq": 23603, "bseq": 2, "term": 1, "vseq": 23603}
5 9048 md4-cluster1 10.200.3.82 {"lastPrune": 20000, "aseq": 23603, "bseq": 2, "term": 1, "vseq": 23603}
5 9048 md5-cluster1 10.200.3.85 {"lastPrune": 20000, "aseq": 23603, "bseq": 2, "term": 1, "vseq": 23603}
6 9049 md1-cluster1 10.200.1.215 {"lastPrune": 10000, "aseq": 11731, "bseq": 1, "term": 1, "vseq": 11731}
6 9049 md2-cluster1 10.200.3.73 {"lastPrune": 10000, "aseq": 11731, "bseq": 1, "term": 1, "vseq": 11731}
6 9049 md3-cluster1 10.200.3.79 {"lastPrune": 10000, "aseq": 11731, "bseq": 1, "term": 1, "vseq": 11731}
6 9049 md4-cluster1 10.200.3.82 {"lastPrune": 0, "aseq": 11731, "bseq": 1, "term": 1, "vseq": 11731}
6 9049 md5-cluster1 10.200.3.85 {"lastPrune": 0, "aseq": 11731, "bseq": 1, "term": 1, "vseq": 11731}
7 9050 md1-cluster1 10.200.1.215 {"lastPrune": 10000, "aseq": 11963, "bseq": 1, "term": 1, "vseq": 11963}
7 9050 md2-cluster1 10.200.3.73 {"lastPrune": 10000, "aseq": 11963, "bseq": 1, "term": 1, "vseq": 11963}
7 9050 md3-cluster1 10.200.3.79 {"lastPrune": 10000, "aseq": 11963, "bseq": 1, "term": 1, "vseq": 11963}
7 9050 md4-cluster1 10.200.3.82 {"lastPrune": 10000, "aseq": 11963, "bseq": 1, "term": 1, "vseq": 11963}
7 9050 md5-cluster1 10.200.3.85 {"lastPrune": 10000, "aseq": 11963, "bseq": 1, "term": 1, "vseq": 11963}
8 9051 md1-cluster1 10.200.1.215 {"lastPrune": 20000, "aseq": 22884, "bseq": 2, "term": 1, "vseq": 22884}
8 9051 md2-cluster1 10.200.3.73 {"lastPrune": 10000, "aseq": 22884, "bseq": 1, "term": 1, "vseq": 22884}
8 9051 md3-cluster1 10.200.3.79 {"lastPrune": 20000, "aseq": 22884, "bseq": 2, "term": 1, "vseq": 22884}
8 9051 md4-cluster1 10.200.3.82 {"lastPrune": 0, "aseq": 22884, "bseq": 0, "term": 1, "vseq": 22884}
8 9051 md5-cluster1 10.200.3.85 {"lastPrune": 10000, "aseq": 22884, "bseq": 2, "term": 1, "vseq": 22884}

3. Add a cluster of five hosts by copying five instances of the definition block and instantiating each cluster’s IP address and port assignments from group_vars/all. The first token (mdX-clusterY) is an alias, followed by an ansible_host variable definition that is the key to the alias mechanism. Both of these must be present for each definition. The rest of the information is tuning that can be changed according to the desired configuration.
• For example, the following is a definition block from a current cluster’s inven-
tory file:

md1-cluster1 ansible_host=10.200.1.215
md2-cluster1 ansible_host=10.200.3.73
md3-cluster1 ansible_host=10.200.3.79
md4-cluster1 ansible_host=10.200.3.82
md5-cluster1 ansible_host=10.200.3.85

• For a new cluster, add the following after the alias definition block in the in-
ventory file:

md1-cluster2 ansible_host=10.200.4.141
md2-cluster2 ansible_host=10.200.3.227
md3-cluster2 ansible_host=10.200.3.171
md4-cluster2 ansible_host=10.200.1.39
md5-cluster2 ansible_host=10.200.4.176

The new hosts must belong to the [runners_metadata] group.


4. Install the new cluster by running the run.yml playbook against the new machines:

$ ./ansible-playbook -i env/${ENV_DIR}/inventory run.yml -l '*-cluster2'

5. Update the topology on all servers: one host per cluster, and one machine at a time. Stateless servers must also be updated:

Warning: This procedure restarts the metadata, bucketd, and vault containers, causing S3 clients to suffer a transient service interruption with HTTP error 500. To reduce errors, deactivate each S3 frontend IP address from the load balancers’ backend configuration before starting this procedure.

$ ENV_DIR=s3config

$ ./ansible-playbook -i "env/${ENV_DIR}/inventory" run.yml --skip-tags "requirements,run::images" --tags metadata -l md1-cluster1
[verbose output]
status: DONE

$ ./ansible-playbook -i "env/${ENV_DIR}/inventory" run.yml --skip-tags "requirements,run::images" --tags metadata -l md2-cluster1
[verbose output]
status: DONE

$ ./ansible-playbook -i "env/${ENV_DIR}/inventory" run.yml --skip-tags "requirements,run::images" --tags metadata -l md3-cluster1
[verbose output]
status: DONE

$ ./ansible-playbook -i "env/${ENV_DIR}/inventory" run.yml --skip-tags "requirements,run::images" --tags metadata -l md4-cluster1
[verbose output]
status: DONE

$ ./ansible-playbook -i "env/${ENV_DIR}/inventory" run.yml --skip-tags "requirements,run::images" --tags metadata -l md5-cluster1
[verbose output]
status: DONE

$ ./ansible-playbook -i "env/${ENV_DIR}/inventory" run.yml --skip-tags "requirements,run::images" --tags metadata -l stateless01
[verbose output]
status: DONE

$ ./ansible-playbook -i "env/${ENV_DIR}/inventory" run.yml --skip-tags "requirements,run::images" --tags metadata -l stateless02
[verbose output]
status: DONE

6. Check the states of all Raft sessions, including that of the freshly added cluster:

$ ./ansible-playbook -i env/${ENV_DIR}/inventory tooling-playbooks/check-status-metadata.yml

Then:

$ cat /tmp/results.txt

This returns results resembling:

0 4300 md1-cluster1 10.200.1.215 {"lastPrune": 0, "aseq": 31, "bseq": 0, "term": 3, "vseq": 31}
0 4300 md2-cluster1 10.200.3.73 {"lastPrune": 0, "aseq": 31, "bseq": 0, "term": 3, "vseq": 31}
[ . . . ]
8 9050 md4-cluster1 10.200.3.82 {"lastPrune": 20000, "aseq": 22884, "bseq": 2, "term": 3, "vseq": 22884}
8 9050 md5-cluster1 10.200.3.85 {"lastPrune": 20000, "aseq": 22884, "bseq": 2, "term": 3, "vseq": 22884}
9 9051 md1-cluster2 10.200.4.141 {"lastPrune": 0, "aseq": 0, "bseq": 0, "term": 2, "vseq": 0}
9 9051 md1-cluster2 10.200.4.141 {"lastPrune": 0, "aseq": 0, "bseq": 0, "term": 2, "vseq": 0}
9 9051 md2-cluster2 10.200.3.227 {"lastPrune": 0, "aseq": 0, "bseq": 0, "term": 2, "vseq": 0}
9 9051 md2-cluster2 10.200.3.227 {"lastPrune": 0, "aseq": 0, "bseq": 0, "term": 2, "vseq": 0}
9 9051 md3-cluster2 10.200.3.171 {"lastPrune": 0, "aseq": 0, "bseq": 0, "term": 2, "vseq": 0}
9 9051 md3-cluster2 10.200.3.171 {"lastPrune": 0, "aseq": 0, "bseq": 0, "term": 2, "vseq": 0}
9 9051 md4-cluster2 10.200.1.39 {"lastPrune": 0, "aseq": 0, "bseq": 0, "term": 2, "vseq": 0}
9 9051 md4-cluster2 10.200.1.39 {"lastPrune": 0, "aseq": 0, "bseq": 0, "term": 2, "vseq": 0}
9 9051 md5-cluster2 10.200.4.176 {"lastPrune": 0, "aseq": 0, "bseq": 0, "term": 2, "vseq": 0}
9 9051 md5-cluster2 10.200.4.176 {"lastPrune": 0, "aseq": 0, "bseq": 0, "term": 2, "vseq": 0}

7. Update the healthchecks with:

$ ./ansible-playbook -i env/${ENV_DIR}/inventory run.yml -t publish

8. Update the kudzu inventory by running the following Ansible playbook command:

$ ./ansible-playbook -i env/s3config/inventory run.yml -t kudzu

4.6 Setting Up a Failover Deployment Machine in Multiple Site Architectures

In a stretched multiple-site S3C deployment, the loss of a site (datacenter) must be anticipated. That is why the installer folder, along with the installation configuration, must be copied from the main deployment server to a server hosted on the second site, called the failover deployment machine.

4.6.1 Prerequisites

The failover deployment machine must comply with the SSH Connection prerequisites. It may also comply with Preparing Deployment as a Non-Root User if the root user is not available.
Make sure the main deployment machine is also able to connect to the failover deployment machine using SSH.

4.6.2 Automate Copying the Federation Directory

The following procedure uses the standard S3C installer path:

/srv/scality/s3/s3-offline

Important: If the S3 installer is not installed on the standard path (/srv/scality/s3/s3-offline/), a new file must be added to the /etc/scality-backup.conf.d/ directory to configure the backup of the S3 installer. For example, if the installer is installed in /root/s3-offline, the new backup configuration file must contain:

root/
root/s3-offline/
root/s3-offline/federation/
root/s3-offline/federation/env/
root/s3-offline/federation/env/**

This example procedure employs the root user throughout.


1. Log in as root on the main deployment server.
2. Create the S3C installer folder.

ssh <failover_server> mkdir -p /srv/scality/s3/s3-offline

3. Launch the first copy.

rsync -ar /srv/scality/s3/s3-offline/* <failover_server>:/srv/scality/s3/s3-offline

4. Automate a daily synchronization.

cat << EOF > /etc/cron.daily/scality-sync-s3-installer
#!/bin/sh
# Scality S3 installer sync to failover server

rsync -ar /srv/scality/s3/s3-offline/* <failover_server>:/srv/scality/s3/s3-offline

EOF

chmod 755 /etc/cron.daily/scality-sync-s3-installer

chmod 755 /etc/cron.daily/scality-sync-s3-installer



CHAPTER

FIVE

UPGRADING TOWARDS 7.10.8

This topic explains how to upgrade the S3C cluster to the latest 7.10.8 version.
Scality recommends upgrading to S3C version 7.10.8 after the RING Supervisor has been upgraded, to ensure dependency and Supervisor dashboard compliance.
S3C upgrade uses Federation, a folder built with all the operation tools for S3 cluster. The
machine where Federation is installed is also referred to as the deployment machine.
For more information about the Federation suite, refer to S3 Operation with Federation.

Warning: You must be at least in version 7.4.X to upgrade towards 7.10.8.

Important: By default, Federation attempts to remove the netfilter and connection-tracking modules. When you have firewall rules, some of the modules either may not be removable, or removing them may break the configuration. To continue using your firewall with S3C and leave your netfilter and connection-tracking modules unchanged during Federation, use the following option:

-e "env_docker_disable_nat=false"
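
For instance, a sketch of passing this option to the main playbook run described later in this chapter:

$ ./ansible-playbook -i env/${ENV_DIR}/inventory run.yml -e "env_docker_disable_nat=false"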

5.1 Install Dependencies

Install python-3.6, libselinux-python3, and openssl on the deployment machine and every targeted host.



Enterprise Linux 7

RHEL 7

# yum install --enablerepo=rhel-7-server-rpms python3 python3-libselinux openssl

CentOS 7

# yum install --enablerepo=base --enablerepo=updates python3 libselinux-python3 openssl

Enterprise Linux 8

RHEL 8

# yum install --enablerepo=rhel-8-for-x86_64-baseos-rpms --enablerepo=rhel-8-for-x86_64-appstream-rpms python36 python3-libselinux openssl

RockyLinux 8

# yum install --enablerepo=baseos --enablerepo=appstream python36 python3-libselinux openssl

When upgrading from S3C 7.4.x, install python36-docker on all connectors.

# yum install python36-docker
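
As an optional sanity check (a sketch), confirm on each connector that the Docker SDK is
importable by the system Python 3:

# python3 -c "import docker; print(docker.__version__)"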

5.2 Download and Extract the Offline Installer

Note: When the RING is upgraded prior to version , the S3C 7.10.8 software may already
be available in /srv/scality/s3/. In that case, jump to step 4, “Extract the S3C
tarball”.

Scality RING software, including S3C, can be found at RING Product Documentation,
under DELIVERABLES.



Note: If you cannot download the software, contact the Scality account team for assis-
tance.

Enterprise Linux 7

CentOS 7

1. Download the scality-ring-with-s3-offline-.x_centos_7.run installer.


2. Add the execution flag:
# chmod +x scality-ring-with-s3-offline-.x_centos_7.run
3. Extract the installer using the --extract-only option:
# ./scality-ring-with-s3-offline-.x_centos_7.run --extract-only
When prompted to overwrite contents, type “y”.
4. Extract the S3C tarball:
# cd /srv/scality/s3
# tar xvzf s3-offline-light-centos7-7.10.8.x.tar.gz

RHEL 7

1. Contact Scality Support to retrieve the Scality RING and S3C installer for RHEL 7.
2. Add the execution flag:
# chmod +x scality-ring-with-s3-offline-.x_rhel7.run
3. Extract the installer using the --extract-only option:
# ./scality-ring-with-s3-offline-.x_rhel7.run --extract-only
When prompted to overwrite contents, type “y”.
4. Extract the S3C tarball:
# cd /srv/scality/s3
# tar xvzf s3-offline-light-centos7-7.10.8.x.tar.gz



Enterprise Linux 8

RHEL 8

1. Contact Scality Support to retrieve the Scality RING and S3C installer for RHEL 8.
2. Add the execution flag:
# chmod +x scality-ring-with-s3-offline-.x_rhel8.run
3. Extract the installer using the --extract-only option:
# ./scality-ring-with-s3-offline-.x_rhel8.run --extract-only
When prompted to overwrite contents, type “y”.
4. Extract the S3C tarball:
# cd /srv/scality/s3
# tar xvzf s3-offline-light-redhat8-7.10.8.x.tar.gz

RockyLinux 8

1. Download the scality-ring-with-s3-offline-.x_rocky8.run installer.


2. Add the execution flag:
# chmod +x scality-ring-with-s3-offline-.x_rocky8.run
3. Extract the installer using the --extract-only option:
# ./scality-ring-with-s3-offline-.x_rocky8.run --extract-only
When prompted to overwrite contents, type “y”.
4. Extract the S3C tarball:
# cd /srv/scality/s3
# tar xvzf s3-offline-light-rocky8-7.10.8.x.tar.gz

5.3 Set ENV_DIR

${ENV_DIR} represents the inventory and configuration folder.
In a standard installation, this folder’s name is s3config.
Set ${ENV_DIR} to the name of your environment directory, for example with the default
s3config:



# ENV_DIR=s3config

Note: If you use an alternative name for this folder, replace s3config with your own
folder name before running the commands. If you are unsure about the name to use,
contact Scality Support.
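
To confirm the folder name before setting the variable, the following check can be used
(a sketch, assuming the current directory is /srv/scality/s3 and the old installer is
still reachable as s3-offline):

# ls s3-offline/federation/env/

The listing shows the environment directory name (s3config in a standard installation).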

5.4 Copy the Old Environment Files

Copy the old environment folder to the newly extracted 7.10.8 installer folder.

Note: Modify the target s3-offline-x.x.x.x/federation/env in the command.

# cp -r s3-offline/federation/env/${ENV_DIR} s3-offline-7.10.8.x/federation/env/

5.5 Set the Path to the New Installer

1. Replace the existing s3-offline folder or symlink to target the new 7.10.8 installer.

Note: Modify the s3-offline-7.10.8.x target in the commands to match the extracted installer version.

$ [ -d s3-offline ] && mv s3-offline s3-offline.old.$(date +%d%m%Y)
$ [ -h s3-offline ] && rm -f s3-offline
$ ln -s s3-offline-7.10.8.x s3-offline
2. Navigate to the federation folder.

$ cd s3-offline/federation

Note: All the commands written in the rest of this page are run from the federation
folder.
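
As an optional check (a sketch, assuming the standard installer path), confirm that the
symlink now resolves to the new installer; the exact name depends on the extracted
version:

$ readlink /srv/scality/s3/s3-offline
s3-offline-7.10.8.x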



5.6 Modify the group_vars/all File

Upgrading from 7.10.6 or higher

Note: To upgrade from a version earlier than 7.10.6, you must complete the “Upgrading
from before 7.10.6” tab.

When upgrading to S3C version 7.10.8 from S3C version 7.10.6 or higher, this step is
skipped, as no change is needed. Go to Update the Backbeat Variables to continue.

Upgrading from before 7.10.6

1. Open the env/${ENV_DIR}/group_vars/all file. If there is an env_deploy_version
variable, remove it.
2. For 3-node installations, remove the container_name_suffix variable.
3. Ensure the enabled_repositories match the archive and operating system used:

• CentOS, light archive:
    enabled_repositories:
      - scality-internal
      - scality-offline
• CentOS, non-light archive:
    enabled_repositories:
      - scality-internal
      - scality-offline
      - scality-offline-s3
• RHEL, light archive:
    enabled_repositories:
      - scality-offline
• RHEL, non-light archive:
    enabled_repositories:
      - scality-offline
      - scality-offline-s3

Important: Ensure the repositories listed under enabled_repositories are the same as
the ones used during the installation process.

4. The parameters env_s3_trusted_proxy_cidrs and env_s3_client_ip_header have been
deprecated. If they are present in the group_vars/all file, replace them as follows
(a grep sketch to locate these deprecated variables follows this list):
• Change env_s3_trusted_proxy_cidrs to env_trusted_proxy_cidrs.
• Change env_s3_client_ip_header to env_client_ip_header.



5. Search for the env_backbeat_allow_multi_site: yes option. If it is in the file,
delete it.
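
A quick way to locate the variables mentioned in steps 4 and 5 (a sketch, run from the
federation folder):

$ grep -nE 'env_s3_trusted_proxy_cidrs|env_s3_client_ip_header|env_backbeat_allow_multi_site' env/${ENV_DIR}/group_vars/all

Any matching lines must be renamed or removed as described above.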

5.7 Update the Backbeat Variables

If Backbeat is in use and you are upgrading from a cluster with a version below 7.4.6.0,
follow this procedure. Otherwise, go directly to the next section.
1. Determine if Backbeat is in use:

$ grep -r env_enable_backbeat env/${ENV_DIR}

Listing 1: Expected Output


env_enable_backbeat= yes (in inventory) or
env_enable_backbeat: yes (in group_vars/all)

If env_enable_backbeat is not set, or set to “no”, skip this section and go to the next
one.
2. Replace the run_backbeat_quorum and run_backbeat_queue variables with the single
variable run_backbeat (a sed sketch for automating these renames follows this list):

$ grep -r -e run_backbeat env/${ENV_DIR}/


env/${ENV_DIR}/inventory:run_backbeat_queue=no
env/${ENV_DIR}/inventory:run_backbeat_quorum=no

Both variables take the same value and are merged into run_backbeat.
3. Replace the variables with run_backbeat:

# Use ': ' in YAML file env/${ENV_DIR}/group_vars/all
run_backbeat: yes/no
# Use '=' in INI file env/${ENV_DIR}/inventory
run_backbeat=yes/no

4. Replace the env_backbeat_quorum_xx variables with their equivalents starting with
env_quorum_xx:

$ grep -r -e env_backbeat_quorum_ env/${ENV_DIR}/


env/${ENV_DIR}/inventory:env_backbeat_quorum_port=2181
env/${ENV_DIR}/inventory:env_backbeat_quorum_follower_port=2181
env/${ENV_DIR}/inventory:env_backbeat_quorum_election_port=2181

5. Replace the variables with their equivalents starting with env_quorum_xx:



# Use ': ' in YAML file env/${ENV_DIR}/group_vars/all
env_quorum_port: 2181
# Use '=' in INI file env/${ENV_DIR}/inventory
env_quorum_port= 2181
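
The sed sketch below automates the renames for an INI inventory. It assumes
run_backbeat_queue and run_backbeat_quorum carry the same value (as stated above), keeps
a backup copy, and the result should still be reviewed by hand; a YAML group_vars/all
file should be edited manually instead.

$ cp env/${ENV_DIR}/inventory env/${ENV_DIR}/inventory.bak
$ sed -i -e 's/^run_backbeat_queue=/run_backbeat=/' \
         -e '/^run_backbeat_quorum=/d' \
         -e 's/env_backbeat_quorum_/env_quorum_/g' env/${ENV_DIR}/inventory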

5.8 Import the SAML Identity Provider Certificate

Note: Only apply this section if you are using SAML. If not, skip this section and go to
the Create Management Account Credentials section.

S3C verifies SAML response messages signed by the IdP by validating the signature
against an imported certificate.
Version 7.4.6.0 introduced a SAML setting, idp_cert, to locate this certificate.
Skip this step and go to Update Requirements if either:
• env_vault_saml: is commented out in env/${ENV_DIR}/group_vars/all, or
• idp_cert is already present in env_vault_saml.
Take the following steps to import the SAML Identity Provider (IdP) certificate.

5.8.1 Download the Certificate from the SAML IdP

The IdP server is targeted by the env_vault_saml.entry_point setting.
If the IdP server is Microsoft ADFS:
1. Download the server’s XML-formatted metadata file from
https://adfs-server/FederationMetadata/2007-06/FederationMetadata.xml.

Note: If this request is forbidden, ask your ADFS administrator for the
FederationMetadata.xml file.

2. Extract the certificate.

echo "-----BEGIN CERTIFICATE-----" > env/${ENV_DIR}/vault/saml_idp.cert


(xmllint --shell FederationMetadata.xml | grep -v '^/ >' | fold -w 64) <<␣
,→EndOfScript | tee -a env/${ENV_DIR}/vault/saml_idp.cert

setns a=urn:oasis:names:tc:SAML:2.0:metadata
setns b=http://www.w3.org/2000/09/xmldsig#
(continues on next page)

2023, Scality, Inc 128


(continued from previous page)
cat /a:EntityDescriptor/b:Signature/b:KeyInfo/b:X509Data/b:X509Certificate/
,→text()

EndOfScript
echo "-----END CERTIFICATE-----" >> env/${ENV_DIR}/vault/saml_idp.cert

For other IdPs, reach out to the provider’s support site to gather the SAML-response-
signing public certificate.

5.8.2 Import the Certificate

1. Upload the certificate to all S3 connectors.

../repo/venv/bin/ansible -i env/${ENV_DIR}/inventory -m copy -a \
  "src=env/${ENV_DIR}/vault/saml_idp.cert dest={{ env_vault_saml.tls_key | dirname }}" \
  runners_s3

2. Open env/${ENV_DIR}/group_vars/all and add the idp_cert setting under
env_vault_saml.

env_vault_saml:
[...]
tls_key: /etc/linssl/s3.key
tls_cert: /etc/linssl/s3.crt
idp_cert: /etc/linssl/saml_idp.crt

Note: The paths and file names for tls_key and tls_cert are examples. idp_cert must
reside in the same directory as tls_key and tls_cert.

5.9 Create Management Account Credentials

$ ./ansible-playbook -i env/${ENV_DIR}/inventory tooling-playbooks/generate-vault-env-config.yml

This command has no impact on production. It creates the
env/${ENV_DIR}/vault/management-account-keys.json file to be used by the management
account (ID 000000000000) that will be created during the upgrade.
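
As a quick check (a sketch), confirm that the file was generated and contains valid JSON:

$ python3 -m json.tool env/${ENV_DIR}/vault/management-account-keys.json > /dev/null && echo "management-account-keys.json OK"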



5.10 Update Requirements

$ ./ansible-playbook -i env/${ENV_DIR}/inventory run.yml -t requirements --skip-tags run::images,publish,cleanup

Note: This command installs the new Python packages and needs to be run during a
maintenance window.

5.11 Upload Binaries

$ ./ansible-playbook -i env/${ENV_DIR}/inventory run.yml -t run::images --skip-tags requirements,publish,cleanup

Note: This command takes time to run, but it has no impact on production. This com-
mand can be performed outside of a maintenance window.

5.12 Back up Backbeat Credentials

Note: Only apply this section if you enabled a bucket’s replication using internal tooling
and if you determined that Backbeat is in use at a previous step.

The Backbeat tooling installed in the scality-backbeat containers (bin/replication.js)
uses AWS CLI-ready credentials stored in the container’s ~/.aws/credentials file to
execute configuration operations (for example, to activate CRR on buckets).

Warning: During an upgrade, the AWS credentials are overwritten and lost. You must
back up these credentials to restore them after the upgrade.



Enterprise Linux 7

1. From the Supervisor, access the backbeat container names:

$ salt -G roles:ROLE_S3 cmd.run "docker ps | egrep scality-backbeat | grep -v scality-backbeat-queue | awk '{print \$NF}'"

2. Back up the credentials. For each backbeat-container name displayed in the former
command, update the BACKBEAT_CTR variable value and run the following command:

$ BACKBEAT_CTR=scality-backbeat-1
$ salt -G roles:ROLE_S3 cmd.run "docker exec -t -u scality $BACKBEAT_CTR bash -c 'cat ~/.aws/credentials'" | tee -a /root/aws_credentials_save_me

This command displays the output on screen and appends it to a file that holds a copy
of .aws/credentials. The credentials are kept safe until they are restored.

Enterprise Linux 8

1. From the Supervisor, access the backbeat container names:

$ salt -G roles:ROLE_S3 cmd.run "ctrctl ps | egrep scality-backbeat | grep -v scality-backbeat-queue | awk '{print \$(NF-2)}'"

2. Back up the credentials. For each backbeat-container name displayed in the former
command, update the BACKBEAT_CTR variable value and run the following command:

$ BACKBEAT_CTR=scality-backbeat-1
$ salt -G roles:ROLE_S3 cmd.run "ctrctl exec --user 0 $BACKBEAT_CTR bash -c 'cat /home/scality/.aws/credentials'" | tee -a /root/aws_credentials_save_me

This command displays the output on screen and appends it to a file that holds a copy
of .aws/credentials. The credentials are kept safe until they are restored.
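
If several backbeat containers were listed in step 1, a small loop can back them all up
in one pass. The sketch below assumes Enterprise Linux 7 and container names
scality-backbeat-1 and scality-backbeat-2; adjust the list to the names actually
returned, and substitute the ctrctl form of the command on Enterprise Linux 8.

$ for BACKBEAT_CTR in scality-backbeat-1 scality-backbeat-2; do
      salt -G roles:ROLE_S3 cmd.run "docker exec -t -u scality $BACKBEAT_CTR bash -c 'cat ~/.aws/credentials'" | tee -a /root/aws_credentials_save_me
  done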

5.13 Upgrade Stateful Components

Stateful components involve services such as metadata management tools. Those com-
ponents run on several servers called stateful hosts or colocated hosts, depending on
the running architecture.



5.13.1 Upgrade Stateful Components on Stateful-only Hosts

1. List the hosts containing only stateful components:

$ ../repo/venv/bin/ansible --list-hosts -i env/${ENV_DIR}/inventory 'runners_metadata:!runners_s3'

Listing 2: Output Example


hosts (3):
md1-cluster1
md2-cluster1
md3-cluster1

Note: If this command returns no hosts, it means that it is a colocated architecture
where stateful and stateless components are deployed on the same servers. In this case,
continue to Upgrade the Stateful Components on Colocated Hosts.

2. Upgrade the first listed host:

$ ./ansible-playbook -i env/${ENV_DIR}/inventory run.yml --skip-tags run::images,cleanup,requirements -t stateful --limit <host>

If any of these commands fails, stop. Contact Scality support.

Note: The -t stateful tag is needed even for stateful-only hosts.

3. Check the containers:



Enterprise Linux 7

$ ../repo/venv/bin/ansible -i env/${ENV_DIR}/inventory -b -m shell -a 'docker ps -a' <host>

Enterprise Linux 8

$ ../repo/venv/bin/ansible -i env/${ENV_DIR}/inventory -b -m shell -a 'ctrctl ps -a' <host>

Note: 3-node architectures may show some hosts running both the source and
target versions until all stateful components complete the update.

If no container is in the Running state, stop. Contact Scality support.


4. Check the Metadata cluster’s health:

$ ./ansible-playbook -i env/${ENV_DIR}/inventory tooling-playbooks/gather-metadata-status.yml

If the status is not OK, stop and contact Scality support.


5. Repeat these operations for the remaining hosts, as generated by the first step of
this procedure.

5.13.2 Upgrade Stateful Components on Colocated Hosts

1. List the name of the connectors that host the stateful and stateless components:

$ ../repo/venv/bin/ansible --list-hosts -i env/${ENV_DIR}/inventory 'runners_metadata:&runners_s3'

Listing 3: Output Example


hosts (2):
md4-cluster1
md5-cluster1

Note: If this command returns no hosts, the upgrade of stateful components is finished.
Continue to Upgrade the Backbeat Queue Protocol.



2. Upgrade the first listed host:

$ ./ansible-playbook -i env/${ENV_DIR}/inventory run.yml -t stateful --skip-tags run::images,cleanup,requirements --limit <host>

If this command fails, stop. Contact Scality support.


3. Check the containers:

Enterprise Linux 7

$ ../repo/venv/bin/ansible -i env/${ENV_DIR}/inventory -b -m shell -a 'docker ps -a' <host>

Enterprise Linux 8

$ ../repo/venv/bin/ansible -i env/${ENV_DIR}/inventory -b -m shell -a 'ctrctl ps -a' <host>

Note: 3-node architectures may show some hosts running both the source and
target versions until all stateful components complete the update.

If no container is in the Running state, stop. Contact Scality support.


4. Check the Metadata cluster’s health:

$ ./ansible-playbook -i env/${ENV_DIR}/inventory tooling-playbooks/gather-metadata-status.yml

If the status is not OK, stop and contact Scality support.


5. Repeat these operations for the remaining hosts, as generated by the first step of
this procedure.

5.14 Upgrade the Backbeat Queue Protocol

S3C 7.10.1 brings changes to the backbeat-queue service used by the CRR, Bucket
Notifications, and Lifecycle Expiration features. Backbeat queue uses Kafka, whose
protocol has changed. Upgrade to the new format using this procedure.



Note: This step is required only if advanced data management features such as CRR,
Bucket Notifications and Lifecycle Expiration are enabled.

Important: You must complete the stateful components upgrade prior to this procedure.

1. Open the env/${ENV_DIR}/group_vars/all file and set the Kafka protocol version.
If it was already set, continue to Rebalance the Kafka Topic Partitions.
env_kafka_protocol_version: 3.2.0

2. List the name of the connectors that only host the stateful components.
$ ../repo/venv/bin/ansible --list-hosts -i env/${ENV_DIR}/inventory 'runners_metadata:!runners_s3'

Output Example:
hosts (3):
md1-cluster1
md2-cluster1
md3-cluster1

Note: If no hosts are returned, continue to step 4.

3. Upgrade one host at a time from the above list.

$ ./ansible-playbook -i env/${ENV_DIR}/inventory run.yml --skip-tags requirements,run::images,cleanup -t backbeat,vault --limit <host>

4. List the name of the connectors that host stateful and stateless components.
$ ../repo/venv/bin/ansible --list-hosts -i env/${ENV_DIR}/inventory 'runners_metadata:&runners_s3'

Output Example:
hosts (2):
md4-cluster1
md5-cluster1

Note: If no hosts are returned, continue to Rebalance the Kafka Topic Partitions.



5. Upgrade one host at a time from the above list.
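
The playbook invocation is presumably the same as in step 3, limited to each colocated
host in turn (a sketch):

$ ./ansible-playbook -i env/${ENV_DIR}/inventory run.yml --skip-tags requirements,run::images,cleanup -t backbeat,vault --limit <host>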

5.15 Rebalance the Kafka Topic Partitions

Starting with version 7.10.3, S3C supports running Backbeat on a stretched cluster,
which also requires stretching the stateful backbeat-queue (Kafka) components across
sites. The backbeat-queue topic partition replicas must be distributed according to a
rack-aware configuration:
• To ensure proper distribution across sites.
• To benefit from high availability and failure tolerance of the backbeat-queue service
even during site failure scenarios.
While this process is automatically enforced for initial installations from version 7.10.3
and onward, earlier versions do not enforce those constraints. When upgrading a multi-
site stretched cluster to 7.10.3 or onward from an earlier version, a special tooling play-
book is provided to automatically rebalance existing topic partition replicas and dis-
tribute them in a rack-aware fashion.

Note: This step to rebalance Kafka topic partitions is required only if all of the following
are true:
• The deployment runs on a multi-site stretched cluster.
• Advanced data management features such as CRR, bucket notifications, or Lifecy-
cle Expiration are enabled, which all rely on the backbeat-queue service.
• The upgrade is executed from a version earlier than 7.10.3, or the kafka-rebalance
tooling playbook was not executed in prior upgrades.

Important: You must complete the stateful components upgrade and Backbeat Queue
Protocol upgrade prior to this procedure.

1. Execute the kafka-rebalance tooling playbook, without extra environment variables,
and wait for it to complete execution.

$ ./ansible-playbook -i env/${ENV_DIR}/inventory tooling-playbooks/kafka-rebalance.yml



5.16 Upgrade Stateless Components

Stateless components run on several servers called stateless hosts or colocated hosts,
depending on the running architecture.

5.16.1 Upgrade Stateless Components in Active/Active Load-Balancing Architectures

If S3 connections are handled by a load balancer configured in active/active mode, follow
this procedure:
1. List the name of the hosts containing the stateless components:

$ ../repo/venv/bin/ansible --list-hosts -i env/${ENV_DIR}/inventory runners_s3

Listing 4: Output Example


hosts (4):
stateless01
stateless02
md1-cluster1
md2-cluster1

2. Deactivate the first host in the load balancer’s backend configuration (for one possible way to do this with HAProxy, see the sketch after this list).


3. Upgrade the first host:

$ ./ansible-playbook -i env/${ENV_DIR}/inventory run.yml -t stateless --skip-tags run::images,cleanup,requirements --limit <host>

If it fails, stop. Contact Scality support.


4. Check the containers:

Enterprise Linux 7

$ ../repo/venv/bin/ansible -i env/${ENV_DIR}/inventory -b -m shell -a 'docker ps -a' <host>



Enterprise Linux 8

$ ../repo/venv/bin/ansible -i env/${ENV_DIR}/inventory -b -m shell -a 'ctrctl ps -a' <host>

Note: 3-node architectures may show some hosts running both the source and
target versions until all stateful components complete the update.

If no container is in the Running state, stop. Contact Scality support.


5. Re-enable the first host in the load balancer’s backend configuration.
6. Repeat these operations for the remaining hosts, as generated by the first step of
this procedure.
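
The exact way to deactivate and re-enable a host depends on the load balancer in use.
As one hedged example, if the load balancer is HAProxy with an admin-level runtime
socket (the backend name s3_backend, the server name stateless01, and the socket path
are assumptions, not part of the S3C installation), the host could be drained and later
restored as follows:

$ echo "disable server s3_backend/stateless01" | socat stdio unix-connect:/var/run/haproxy.sock
$ echo "enable server s3_backend/stateless01" | socat stdio unix-connect:/var/run/haproxy.sock

Adapt or ignore this sketch for other load balancers.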

5.16.2 Upgrade Stateless Components Directly Used by S3 Clients

If the S3 clients directly use the S3 connectors:

Important: Without a load balancer, applications using S3 connectors will face tempo-
rary service interruptions during the upgrade.

1. Upgrade all the stateless components:

$ ./ansible-playbook -i env/${ENV_DIR}/inventory run.yml -t stateless --skip-tags run::images,cleanup,requirements

If this command fails, stop. Contact Scality support.


2. Check all the containers:

Enterprise Linux 7

$ ../repo/venv/bin/ansible -i env/${ENV_DIR}/inventory -b -m shell -a 'docker ps -a' <host>



Enterprise Linux 8

$ ../repo/venv/bin/ansible -i env/${ENV_DIR}/inventory -b -m shell -a 'ctrctl ps -a' <host>

Note: 3-node architectures may show some hosts running both the source and
target versions until all stateful components complete the update.

If no container is in the Running state, stop. Contact Scality support.

5.17 Cleanup

Once all hosts are upgraded, clean up unused docker images and previous versions on
all hosts:

$ ./ansible-playbook -i env/${ENV_DIR}/inventory run.yml -t cleanup --skip-tags run::images,requirements

5.18 Restore Backbeat Credentials

Note: Only apply this section if you enabled a bucket’s replication using internal tooling.

Restore AWS credentials by following the bin/replication.js procedure described in
Enabling a Bucket’s Replication during the initial CRR setup.

