How I solved production issues with OpenTelemetry (and how you can too)

OpenValue 260 5 months ago

In today's fast-paced world, ensuring the reliability of your Java applications is critical. But how do you identify and resolve production issues before they escalate? With cloud-native applications this can be even harder, because you often can't log into the system to get the data you need. The answer lies in observability, and more specifically in OpenTelemetry.

In this session, I'll take you behind the scenes of several production problems I've solved using OpenTelemetry. You'll learn how I uncovered critical problems that were invisible without the right telemetry data, and how you can do the same. From tracking down elusive bugs with traces to uncovering system bottlenecks with metrics, OpenTelemetry provides the tools you need to understand what's happening in your application in real time.

Traces are a key concept in all this, especially in microservices landscapes, because architecture diagrams often don't tell the whole story. I'll show you how traces can help you build a service graph and save you hours in a crisis. A service graph gives you the overview and helps you pinpoint where to look for problems.

Whether you're new to observability or an experienced professional, this session will give you practical insights and tools to dramatically improve the observability of your application and change the way you handle production issues. Solving problems is much easier when you have the right data at your fingertips.

About Cees

Observability enthusiast and Grafana Champion, always looking for ways to improve software through better observability and for the new insights it can bring. Delivering reliable software is always my priority. Combining existing tools often results in new ways to get an even better view. I work for OpenValue as a software & observability engineer and SRE.
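
To make the service-graph idea from the abstract a little more concrete, here is a minimal sketch of how parent/child relationships between spans could be folded into service-to-service edges. It is not the approach shown in the talk and does not use the OpenTelemetry SDK: the `Span` record, the `buildServiceGraph` method, and the service names are all hypothetical, and a real span carries far more data (trace ID, timestamps, attributes).

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class ServiceGraphSketch {

    // Hypothetical, simplified span model (an assumption for illustration).
    // parentSpanId is null for the root span of a trace.
    public record Span(String spanId, String parentSpanId, String service) {}

    // Counts calls between services: an edge "caller -> callee" is recorded
    // whenever a span's parent span belongs to a different service.
    public static Map<String, Integer> buildServiceGraph(List<Span> spans) {
        Map<String, Span> byId = new HashMap<>();
        for (Span span : spans) {
            byId.put(span.spanId(), span);
        }

        // TreeMap keeps the edges sorted so the output order is stable.
        Map<String, Integer> edges = new TreeMap<>();
        for (Span span : spans) {
            if (span.parentSpanId() == null) {
                continue; // root span: nothing called it
            }
            Span parent = byId.get(span.parentSpanId());
            if (parent == null || parent.service().equals(span.service())) {
                continue; // missing parent or an in-process child span
            }
            edges.merge(parent.service() + " -> " + span.service(), 1, Integer::sum);
        }
        return edges;
    }

    public static void main(String[] args) {
        // One made-up trace: frontend calls orders, which calls two services.
        List<Span> trace = List.of(
            new Span("a", null, "frontend"),
            new Span("b", "a", "orders"),
            new Span("c", "b", "orders"),
            new Span("d", "b", "payments"),
            new Span("e", "b", "inventory"));

        buildServiceGraph(trace)
            .forEach((edge, calls) -> System.out.println(edge + " (" + calls + ")"));
    }
}
```

Because the edges come from what actually ran rather than from a diagram, a dependency nobody documented would still show up here, which is the point the abstract makes about architecture diagrams not telling the whole story.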
