Episodes

  • You've Just Been Paged. Now What?
    2024/10/23

    We cover the challenges of managing a self-hosted observability stack and Dan’s unique on-call experiences, like troubleshooting an incident from an airplane! Later we dive into the Production Engineering initiative at HashiCorp, designed to immerse software engineers in proactively exploring observability data to gain a deeper understanding of system behavior. We also cover the significance of understanding roles in incident management, like incident leads and scribes, and how to identify your personal style of leading incidents.

    Check out Dan's course Leading Incidents and catch up on his Monitorama talk "No Observability Without Theory" or read the blog version. Connect with him on Mastodon.

    Follow Off-Call

    • Mastodon
    • X (Twitter)
    • https://offcall.simplecast.com/

    Check out Chronosphere at www.chronosphere.io

    44 min
  • Data, data everywhere
    2024/10/01
    • [00:30] Matthew's First Monitoring Tool was.....
    • [02:08] 11 years (and counting) on-call!
    • [07:40] That Time Storing A Password In Cleartext Saved The Day
    • [12:29] Unwinding Off Call & Signs of Burnout
    • [17:13] Responding To A Customer's Malicious Insider Incident
    • [19:32] What app devs need to know about monitoring databases
    • [28:32] Has database monitoring gotten better over time?
    • [32:15] Operating DBs in the time of Kubernetes
    • [34:44] Wrap Up

    Paige chats with Matthew Sanabria, Staff SRE at Cockroach Labs, about memorable moments from his 11 years on-call, the warning signs of burnout, and what developers and operators can do for each other to improve database observability for everyone!


    Find Matthew

    • Personal Site https://matthewsanabria.dev/
    • Twitter https://twitter.com/sudomateo
    • Mastodon https://mastodon.online/@sudomateo@mastodon.online
    • Cockroach Labs https://www.cockroachlabs.com/
    • Platform Engineering New York (PENY) Meetup https://www.meetup.com/platform-engineering-new-york/

    37 min
  • It's The Network (part 3)
    2024/09/13

    Overview

    • 00:45 | Network Observability FTW!
    • 07:09 | How has networking changed in the last 5-10 years?
    • 10:28 | Has Kubernetes made network engineers' lives easier?

    Paige and Leon cover need-to-know networking concepts for developers and answer the very important questions “How much has networking really changed in the last 5-10 years?” AND “Has Kubernetes made any of this stuff easier?!”

    Leon’s Links

    • Adatosystems.com
    • LinkedIn
    • Mastodon
    • Bluesky


    Recommended Resources

    • Alerts don't suck YOUR alerts suck!
    • Technically Religious
    • Imperfect Genius - Rachel Foster
    • Screaming In The Cloud - Corey Quinn
    • www.paigerduty.com
    • Telemetry Now
    • What's New at Kentik
    • Kentik Close-Up

    17 min
  • It's The Network with Leon Adato (part 2)
    2024/09/06

    It's here! Part 2 (of 3) of my conversation with Leon Adato!

    We chat about how to relate monitoring to the only 3 things businesses care about, how the medical field's "See One, Teach One, Do One" can be adapted for onboarding devs, the real-life give and take of on-call, and our requirements for paging alerts!

    • 03:33 On-Call Advice for Newbies
    • 09:25 Leon's Off-Call Life
    • 13:32 Tempering On-Call With A Dose of Reality
    • 15:27 The Only Experience That Matters Is The Users'
    • 18:14 Wrap Up

    Find Leon around the web:

    • Alerts don't suck YOUR alerts suck!
    • Adatosystems.com
    • LinkedIn
    • Mastodon
    • Bluesky
    • Telemetry Now
    • What's New at Kentik

    21 min
  • It's The Network with Leon Adato (part 1)
    2024/07/31

    About Leon Adato

    In my sordid career, I have been an actor, bug exterminator and wild-animal remover (nothing crazy like pumas or wildebeests. Just skunks, snakes, and raccoons.), electrician, carpenter, stage-combat instructor, ASL interpreter, and Sunday school teacher. Oh, yeah, I’ve also worked with computers.

    While my first keyboard was an IBM Selectric, and my first digital experience was on an Atari 400, my professional work in tech started in 1989 (when you got Windows 286 for free on twelve 5¼” floppies when you bought Excel 1.0). Since then I’ve worked as a classroom instructor, courseware designer, helpdesk operator, desktop support staff, sysadmin, network engineer, and software distribution technician.

    Then, about 25 years ago, I got involved with monitoring. I’ve worked with a wide range of tools: Tivoli, BMC, OpenView, janky perl scripts, Nagios, SolarWinds, DOS batch files, Zabbix, Grafana, New Relic, and other assorted nightmare fuel. I’ve designed solutions for companies that were modest (~10 systems), significant (5,000 systems), and ludicrous (250,000 systems). In that time, I’ve learned a lot about monitoring and observability in all its many and splendid forms.

    Find Leon around the web:

    • Adatosystems.com
    • LinkedIn
    • Mastodon
    • Bluesky
    • Telemetry Now
    • What's New at Kentik

    26 min
  • Introducing Off-Call
    2024/06/14
    Introducing Off-Call, a podcast that explores the ways in which software and systems break, the people behind the pagers, and the data they use to put everything back together. Brought to you by Paige Cruz and Chronosphere.
    1 min