aster.cloud
  • DevOps
  • Engineering
  • Software

(Almost) Everything You Need To Know About SRE

  • aster.cloud
  • December 26, 2020
  • 3 minute read

Site Reliability Engineering (SRE) is a hot topic, but what exactly does it entail? And do you have to follow its principles to a T to benefit from it? If you’re searching for answers to these common questions, look no further.

In this episode of the Cloud & Culture podcast, VMware Tanzu’s Hannah Foxwell explains the what, why, and how of SRE, from key principles (such as SLIs, SLOs, and error budgets) to real-life examples of enterprise adoption. Importantly, she also explains that while SRE has clear benefits for uptime and the efficient use of resources and energy, it can also be a boon to employees’ quality of life.



Shooting for 99.999% might be overkill

“Some of these [attempts to achieve maximum availability are] driven by some of the more legacy practices that we always had in operations—like ‘every outage is a disaster’ and ‘there’s no such thing as an acceptable outage.’ And when you get into that mindset . . . then you start to really over-engineer your solutions to try and achieve the impossible, which is 100 percent.

“And I think these are tendencies that exist within all engineering teams—preparing for the absolute worst case scenario, over-engineering solutions. And if you’re achieving a very high level of availability that really your users don’t need, it means that you have probably invested too much in it, whether that be through engineering time [or] whether that be through redundant resources. Maybe you’ve built a lot of resilience into your system that really wasn’t needed. Maybe it was through automation that you’d then need to maintain over the long term and creates toil for your team. 

“There are lots of ways that this excessive amount of reliability actually costs you, not [least in] the fact that you actually could be potentially shipping features faster to your users and taking a few more risks in the application software development life cycle.”
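To put numbers on the cost of each extra "nine", here is a minimal sketch of the arithmetic behind an availability target and its error budget. The function names, the request counts, and the choice of a one-year window are illustrative assumptions, not something described in the podcast.

```python
# Illustrative sketch only: names, windows, and numbers are assumptions.
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, ignoring leap years


def allowed_downtime_minutes(slo_percent: float,
                             period_minutes: float = MINUTES_PER_YEAR) -> float:
    """Downtime an availability SLO permits over the given period."""
    return period_minutes * (1 - slo_percent / 100)


def error_budget_remaining(slo_percent: float,
                           total_requests: int,
                           failed_requests: int) -> float:
    """Fraction of the error budget still unspent (negative if overspent)."""
    budget = total_requests * (1 - slo_percent / 100)
    if budget <= 0:
        # A 100% target leaves nothing to spend on risk or change.
        raise ValueError("a 100% SLO leaves no error budget")
    return 1 - failed_requests / budget


if __name__ == "__main__":
    for slo in (99.0, 99.9, 99.99, 99.999):
        minutes = allowed_downtime_minutes(slo)
        print(f"{slo}% availability allows {minutes:.1f} minutes of downtime per year")

    # Hypothetical month: 1,000,000 requests, 250 failures, 99.9% SLO.
    remaining = error_budget_remaining(99.9, 1_000_000, 250)
    print(f"Error budget remaining: {remaining:.0%}")
```

Under these assumptions, a 99.9% target allows about 525.6 minutes (roughly 8.8 hours) of downtime per year, while 99.999% allows only about 5.3 minutes. The unspent part of the budget is what a team could put toward shipping features faster and taking a few more risks.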

Start with what works for you

“You’re not going to learn all of this new stuff and implement everything overnight. I think as long as you have an intent to start and continuously improve, then you’re doing alright. I can talk about teams who don’t use SLIs and SLOs, but they do do blameless postmortems on their incidents. They do create that blameless space where not every outage is a disaster and it’s an opportunity to learn and improve. That in isolation delivers an amount of value. 

“And also, when we talk about eliminating toil and using automation to do that (to build a software solution to what would be a manual or human repetitive task), that’s again something that has value as a standalone practice. You can reproduce more consistent environments using infrastructure as code and configuration management systems. You can rebuild your pre-production environments overnight if you script it in the right way.

“These things improve the consistency and reliability of those things in isolation, but you’re not necessarily going to get all of the benefits that all of the other SRE practices bring you.”

An audio excerpt of Foxwell laying out some concrete first steps.


Why she got into SRE (and why you should look into it)

“I got involved in the DevOps community to start with because I saw that good engineering practices made the environment for the humans working in software development so much better. It was about the health and wellbeing of my own team to start with—like how can we get out of this cycle of rushing towards three monthly release dates, having that enormous crunch of testing and fixing at the end. I started to research continuous delivery, and that’s how I discovered DevOps, that’s how I got interested in automation tooling and cloud. All of these things come together to actually make the life of the average software engineer better. 

“And that’s what really matters to me, because I’ve seen the impact of bad practices on people. I’ve seen burnouts, I’ve had engineers on call having relentless sleepless nights because of fragile systems in production. And that hurts. That hurt me as a manager, but it hurt my team more—it hurt their families, it hurt their relationships. It’s a very human benefit to getting these things right.

“And that’s why my career took this direction. That’s why I’m here doing this stuff and teaching our customers today, because I really do think that the teams who adopt these practices are going to be happier and healthier and more sustainable.”

