Site Reliability Engineering

Uploaded by

Adnan Pervaiz

Site reliability engineering (SRE)

Site reliability engineering (SRE) is a software engineering approach to IT
operations. SRE teams use software as a tool to manage systems, solve problems,
and automate operations tasks.
SRE takes the tasks that have historically been done by operations teams, often
manually, and instead gives them to engineers or operations teams who use
software and automation to solve problems and manage production systems.
SRE is a valuable practice when creating scalable and highly reliable software
systems. It helps manage large systems through code, which is more scalable and
sustainable for system administrators (sysadmins) managing thousands or
hundreds of thousands of machines.
SRE can also reduce or remove much of the natural friction between development
teams, who want to continually release new or updated software into production,
and operations teams, who don't want to release any update or new software
without being sure it won't cause outages or other operations problems. As a result,
while not strictly required for DevOps, SRE aligns closely with DevOps principles
and can play an important role in DevOps success.

What is site reliability engineering?

Site reliability engineering (SRE) uses software engineering to automate IT
operations tasks - for example production system management, change
management, incident response, even emergency response - that would otherwise
be performed manually by systems administrators (sysadmins).
The principle behind SRE is that using software code to automate oversight of large
software systems is a more scalable and sustainable strategy than manual
intervention - especially as those systems extend or migrate to the cloud.
SRE can also reduce or remove much of the natural friction between development
teams who want to continually release new or updated software into production,
and operations teams who don't want to release any type of update or new software
without being sure it won't cause outages or other operations problems. As a result,
while not strictly required for DevOps, SRE aligns closely with DevOps principles and
can play an important role in DevOps success.
The concept of SRE is credited to Ben Treynor Sloss, VP of engineering at Google,
who famously wrote that "SRE is what happens when you ask a software engineer to
design an operations team."

What does a site reliability engineer do?

Site reliability engineer is a unique role that calls for a background as a
sysadmin, as a software developer with additional operations experience, or in an
IT operations role combined with software development skills.
SRE teams are responsible for how code is deployed, configured, and monitored, as
well as the availability, latency, change management, emergency response, and
capacity management of services in production.
SRE teams decide when new features can launch by using service-level agreements
(SLAs) to define the required reliability of the system, expressed through
service-level indicators (SLIs) and service-level objectives (SLOs).
An SLI measures a specific aspect of the service level provided. Key SLIs include
request latency, availability, error rate, and system throughput. An SLO is the
target value or range for a given service level, as measured by an SLI.
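As a rough illustration of how SLIs and SLOs relate, the sketch below computes three of the SLIs named above from a list of request records and compares each to a target. The `Request` type, the function names, and the sample targets are assumptions made for this example; they do not come from any particular monitoring tool.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    """One served request (hypothetical record shape for this sketch)."""
    latency_ms: float
    ok: bool  # True if the request succeeded


def availability(requests):
    """Availability SLI: fraction of requests that succeeded."""
    return sum(r.ok for r in requests) / len(requests)


def error_rate(requests):
    """Error-rate SLI: fraction of requests that failed."""
    return 1.0 - availability(requests)


def latency_percentile(requests, pct):
    """Latency SLI: nearest-rank percentile of latency in milliseconds."""
    ordered = sorted(r.latency_ms for r in requests)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


def meets_slo(sli_value, target, higher_is_better=True):
    """Compare a measured SLI with its SLO target."""
    if higher_is_better:
        return sli_value >= target
    return sli_value <= target


if __name__ == "__main__":
    # Illustrative data: 98 fast successes and 2 slow failures.
    sample = [Request(100.0, True)] * 98 + [Request(900.0, False)] * 2
    print("availability:", availability(sample))
    print("p95 latency (ms):", latency_percentile(sample, 95))
    # Assumed targets: 99% availability, p95 latency of at most 300 ms.
    print("availability SLO met:", meets_slo(availability(sample), 0.99))
    print("latency SLO met:",
          meets_slo(latency_percentile(sample, 95), 300.0, higher_is_better=False))
```

In practice these values would be computed over a rolling window by a monitoring system rather than over an in-memory list; the point here is only that an SLI is a measurement and an SLO is the threshold it is checked against.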
The SLO for required system reliability in turn reflects the amount of downtime
determined to be acceptable. This downtime level is referred to as the error budget:
the maximum allowable threshold for errors and outages.
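The arithmetic behind an error budget is simple enough to sketch. Assuming an availability SLO expressed as a fraction (for example 0.999) over a fixed window, the budget is the window multiplied by the allowed failure fraction. The function names and the release gate below are hypothetical, meant only to show one way a team might tie launches to the remaining budget.

```python
from datetime import timedelta


def error_budget(slo_target, window):
    """Allowed downtime within `window` for an availability SLO such as 0.999."""
    if not 0.0 < slo_target <= 1.0:
        raise ValueError("slo_target must be in the range (0, 1]")
    return window * (1.0 - slo_target)


def budget_remaining(slo_target, window, downtime_so_far):
    """Error budget left after subtracting the downtime already incurred."""
    return error_budget(slo_target, window) - downtime_so_far


def can_release(slo_target, window, downtime_so_far):
    """Hypothetical release gate: allow launches only while budget remains."""
    return budget_remaining(slo_target, window, downtime_so_far) > timedelta(0)


if __name__ == "__main__":
    window = timedelta(days=30)
    # A 99.9% SLO over 30 days allows roughly 43 minutes of downtime.
    print("budget:", error_budget(0.999, window))
    print("release ok after 20 min down:",
          can_release(0.999, window, timedelta(minutes=20)))
    print("release ok after 50 min down:",
          can_release(0.999, window, timedelta(minutes=50)))
```

A gate like this is one possible policy, not the only one; some teams slow releases gradually as the budget shrinks rather than stopping outright when it reaches zero.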

SRE and DevOps

DevOps is a modern way to deliver higher quality applications faster - by
automating the software delivery lifecycle, and by giving development and
operations teams more shared responsibility and more input into each other’s work.
Like SRE, DevOps makes a business more agile by balancing the need to deliver
more applications and changes faster with the need to avoid 'breaking' the
production environment. And like SRE, DevOps aims to achieve this balance by
establishing an acceptable risk of errors. In fact, SRE and DevOps seem so similar
that some experts say they're the same thing—but most see SRE practices as
excellent ways to implement DevOps principles.

SREs can code

In a traditional operations model, you throw people at a reliability problem and
keep pushing (sometimes for a year or more) until the problem either goes away or
blows up in your face.
Not so in SRE. The development and SRE teams share a single staffing pool, so
for every SRE who is hired, one fewer developer position is available (and vice
versa). This ends the never-ending headcount battle between Dev and Ops, and
creates a self-policing system in which developers are rewarded with more teammates
for writing better-performing code (i.e., code that needs less support from fewer
SREs).

SRE benefits

• Gain greater visibility into service health
• Quantify the cost of downtime
• Optimize incident response
• Build a modern network operations center

Migration from traditional IT and on-premises data centers to hybrid
cloud environments is one of the chief reasons that the average enterprise
generates two to three times more operations data every year. Increasingly, SRE is
seen as being critical for leveraging this data to automate systems administration,
operations and incident response, and to improve enterprise reliability even as the
IT environment becomes more complex.
