Chaos Engineering Meets AIOps - Michele Dodič & Francesco Sbaraglia | Chaos Carnival 2023

Chaos Engineering Meets AIOps - Michele Dodič & Francesco Sbaraglia - SRE tech leads in ASG, Accenture | Chaos Carnival 2023

Speakers: Michele Dodič & Francesco Sbaraglia

About the talk:
Imagine this: you’re a Site Reliability Engineer (SRE) at a tech giant, responsible for the overall health of a system running in production. Numerous alerts, server crashes, Jira tickets, incidents and an avalanche of responsibilities can sometimes feel like a ticking time bomb. These are just some of the daily struggles an average SRE goes through. But why should it be like that? Well, it shouldn’t - thanks to a term coined by Gartner in 2016. AIOps, meet audience. Audience, meet AIOps.

Let’s extend this scenario. On top of all the above-mentioned issues, our poor SRE needs to watch out for potential security breaches and make sure nothing ever slips through the cracks. However, through proactive experimentation, continuous verification and improvement, our SRE makes sure that the system can withstand the turbulent and malicious times we’re living in. Do these notions ring any bells? They sure do! Chaos Engineering (CE), meet audience. Audience, meet Chaos Engineering.

What’s our angle, you’re wondering? AIOps and CE are two concepts that are often kept separate. In this talk, we will discuss (and show you!) how combining both practices can significantly increase cyber resiliency while maintaining full end-to-end (E2E) transparency and observability of your entire system.

For this session, we have prepared and analyzed several use cases, followed the main principles, summarized best practices and prepared a live demo that combines CE and AIOps tools.

Above all, we are SREs. As such, during this session, we will stay close to the SRE principles and best practices that we used to achieve our goals, e.g. reduce organizational silos, measure everything, learn from failures and analyze changes holistically. As we proceed with our talk, the audience will be able to identify how these relate to AIOps, as well as CE, and finally, how it all ties together.

About the speakers:
Michele Dodič - Michele is an SRE DevOps Specialist at Accenture. Drawing on his previous experience as a Software Engineer in the field of AI and Industrial Automation, Michele now helps large customers bring Observability and AIOps into their systems. Aside from implementing novel technical solutions and POCs, Michele is a lead member of several DevOps communities, where he focuses predominantly on SRE-related offerings, with the goal of promoting SRE principles, practices and culture. Michele is also a public speaker (Conf42: Chaos Engineering 2021, Splunk .conf22 Las Vegas, DevOpsCon 2022 Berlin, data2day 2022, Conf42: DevOps 2023).
Connect with Michele: https://www.linkedin.com/in/michele-dodic/

Francesco Sbaraglia - Francesco is a Site Reliability Engineering and DevSecOps Coach and SME. He has over 20 years of experience solving production problems for corporations, startups, and governments. He has deep expertise in SRE, automation, security, observability, multi-cloud, and chaos engineering. He is currently growing the SRE & AIOps Capability at Accenture Germany.

Connect with Francesco: https://francescosbaraglia.com/