AI-powered insights for continuous profiling: introducing Flame graph AI in Grafana Cloud

15 May, 2024 5 min

Like many in the observability space, we see a lot of potential in harnessing AI to enhance the developer experience. As we continue to evolve and expand our observability platform, we strive to develop features that not only solve complex problems, but also make it easier to access and derive value from tools like Grafana Pyroscope.

Today, after several months of dedicated development and refinement, we are thrilled to announce Flame graph AI, our AI-powered flame graph analysis feature. Available now in Grafana Cloud Profiles, our hosted continuous profiling tool powered by Grafana Pyroscope, Flame graph AI uses a large language model (LLM) to help interpret flame graph data so you can identify bottlenecks, root causes, and suggested fixes faster.

A gif showing the Flame graph AI feature.

The challenge of turning profiling data into actionable decisions

We often find ourselves in conversations with customers or community members who understand the high-level value of profiling, in terms of cutting costs, resolving incidents faster, and decreasing latency. Once added to your toolbelt, profiling is the most effective tool to both proactively and reactively understand the root cause of costs, issues, or latency in your code.  

However, many get stuck when it comes to enabling their development teams to turn flame graphs and profiling data into the actionable code changes that result in these positive outcomes.

A popular meme of a woman concentrating.

This is because instrumenting applications to collect profiling data is typically simple, but the analysis and interpretation of this data, once ingested, tends to be challenging. Flame graphs present a valuable and dense visualization for profiling data, yet even seasoned developers can find them difficult to understand. The learning curve is compounded in a real-world setting where developers are simultaneously trying to derive value out of different observability signals.

Bots vs. brains: Who’s better at flame graph interpretation?

Recognizing these challenges, the Grafana Pyroscope team saw an opportunity to use AI to simplify flame graph interpretation. 

In a recent Grafana Hackathon, we put OpenAI’s LLM to a real-world test. We sent the same flame graph (pictured below) to a diverse group of people.

A screenshot of a flame graph.

These people were categorized by their expertise in flame graph analysis: beginner, advanced, or expert.

A pie chart showing flame graph experience levels.

Then, we gave this prompt to both the AI system and the users, asking them to interpret the flame graph and answer three specific questions related to identifying and addressing a performance bottleneck:

interpret this flamegraph for me and answer the following three questions:
- **Performance Bottleneck**: What's slowing things down?
- **Root Cause**: Why is this happening?
- **Recommended Fix**: How can we resolve it?
[ ... specially compressed flamegraph data ]
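The post does not say how the flame graph data was compressed before being sent to the model. As a rough sketch only, one plausible approach is to flatten the flame graph into the widely used folded-stack text format and keep just the heaviest stacks, so the prompt stays small. The `Frame` type, the `fold` and `build_prompt` functions, and the `top_n` cutoff below are all hypothetical names chosen for illustration, not part of Grafana Pyroscope or Flame graph AI.

```python
from __future__ import annotations

from dataclasses import dataclass, field

# The three questions quoted in the post, used here as the prompt header.
QUESTIONS = (
    "interpret this flamegraph for me and answer the following three questions:\n"
    "- **Performance Bottleneck**: What's slowing things down?\n"
    "- **Root Cause**: Why is this happening?\n"
    "- **Recommended Fix**: How can we resolve it?\n"
)


@dataclass
class Frame:
    """Hypothetical flame graph node: a function and the samples spent in it directly."""
    name: str
    self_samples: int = 0
    children: list[Frame] = field(default_factory=list)


def fold(frame: Frame, prefix: tuple[str, ...] = ()) -> dict[str, int]:
    """Flatten the tree into folded stacks: 'root;parent;leaf' -> self samples."""
    path = prefix + (frame.name,)
    stacks: dict[str, int] = {}
    if frame.self_samples > 0:
        stacks[";".join(path)] = frame.self_samples
    for child in frame.children:
        # Add rather than overwrite, in case two siblings share a name.
        for stack, samples in fold(child, path).items():
            stacks[stack] = stacks.get(stack, 0) + samples
    return stacks


def build_prompt(root: Frame, top_n: int = 20) -> str:
    """Assumed compression step: keep only the top_n stacks by self samples."""
    stacks = fold(root)
    total = sum(stacks.values()) or 1  # avoid dividing by zero on an empty profile
    heaviest = sorted(stacks.items(), key=lambda kv: (-kv[1], kv[0]))[:top_n]
    lines = [f"{stack} {samples} ({samples / total:.1%})" for stack, samples in heaviest]
    return QUESTIONS + "\n".join(lines)


if __name__ == "__main__":
    # Made-up profile, for illustration only.
    profile = Frame("main", 0, [
        Frame("handleRequest", 5, [
            Frame("json.Marshal", 70),
            Frame("db.Query", 25),
        ]),
    ])
    print(build_prompt(profile, top_n=2))
```

Ranking by self samples rather than total samples is a deliberate choice in this sketch: it surfaces the frames where time is actually spent instead of the callers that merely contain them.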

When we compared the responses from the AI system to those from the users, we found that AI is better than (most) humans at interpreting flame graphs. Here’s a closer look at some of those findings:

  • Flame graph expert users: 83% passed. They demonstrated high accuracy and detailed understanding, quickly pinpointing issues and interpreting them correctly.
  • Flame graph advanced users: 70% passed. Their responses varied; some were spot on, while others didn’t dig far enough into the flame graph to identify the root cause.
  • Flame graph beginners (i.e., non-technical professionals): 23% passed. This group most frequently selected the “I don’t know” response, especially when asked about the root cause and recommended fix, though some of their guesses were very entertaining!
  • AI interpreter: 100% passed (based on 10 iterations with the same prompt). The AI interpreter consistently outperformed beginners and advanced users. Its interpretations were accurate, though less detailed and nuanced than those of the experts.
A bar chart showing the quiz results.

Flame graph AI: from concept to core feature

Following the hackathon, and recognizing the value of AI in helping with flame graph analysis, we accelerated the development of the AI-powered flame graph tool, refining it based on extensive user feedback.

Flame graph AI relies on the LLM plugin for Grafana, which provides access to an LLM through the OpenAI API. The feature has graduated from experimental to generally available within Grafana Cloud, and exemplifies our commitment to enhancing the usability and effectiveness of developer tools through innovation.

Flame graph AI acts as an intuitive guide through the dense data of flame graphs, highlighting key information and suggesting optimizations automatically. This not only makes flame graphs more accessible, but significantly speeds up the problem-solving and troubleshooting process.

A gif showing Flame graph AI.

Taking AI profiling further: integration with GitHub for line-level insights

While our initial AI-assisted flame graph interpretations are extremely valuable on their own, we’ve taken our AI capabilities one step further. Our new GitHub integration allows the AI system to access the source code directly linked to specific nodes in your flame graphs. This integration enables a more precise analysis by combining line-level performance data with the actual code, offering targeted recommendations and identifying anti-patterns.

Here’s what a typical workflow looks like:

  1. Identify the bottleneck: Use Flame graph AI to pinpoint performance issues in your application.
  2. Inspect via the AI-GitHub integration: The identified node can then be inspected in detail using the related source code via GitHub.
  3. Receive AI-driven recommendations: Based on the evaluation of the surrounding code, AI provides specific suggestions and advice on code optimization directly within the Grafana interface.
A gif demonstrating the GitHub integration.
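The post does not describe how the integration combines line-level performance data with source code before asking the LLM for recommendations. As a minimal sketch under that caveat, the idea can be pictured as annotating each source line of the selected function with its share of samples and placing the result in a prompt. Everything here is an assumption made for illustration: the function names `annotate_source`, `hottest_line`, and `build_review_prompt`, the prompt wording, and the sample data.

```python
from __future__ import annotations


def annotate_source(source: str, first_line: int, samples_by_line: dict[int, int]) -> str:
    """Prefix each source line with its line number and its share of samples."""
    total = sum(samples_by_line.values()) or 1  # avoid dividing by zero
    rows = []
    for offset, text in enumerate(source.splitlines()):
        line_no = first_line + offset
        samples = samples_by_line.get(line_no, 0)
        # Leave the column blank for lines that recorded no samples.
        share = f"{samples / total:6.1%}" if samples else " " * 6
        rows.append(f"{line_no:>5} {share} | {text}")
    return "\n".join(rows)


def hottest_line(samples_by_line: dict[int, int]) -> int | None:
    """Return the line number with the most samples, or None if there are none."""
    if not samples_by_line:
        return None
    return max(samples_by_line, key=samples_by_line.get)


def build_review_prompt(function_name: str, annotated: str) -> str:
    """Hypothetical prompt wording; the real feature's prompt is not published here."""
    return (
        f"The function `{function_name}` was identified as a bottleneck.\n"
        "Each line below shows its line number and share of profiling samples.\n"
        "Suggest specific optimizations and point out any anti-patterns.\n\n"
        f"{annotated}"
    )


if __name__ == "__main__":
    # Made-up Go function and sample counts, for illustration only.
    source = (
        "func encode(items []Item) string {\n"
        "\tout := \"\"\n"
        "\tfor _, it := range items {\n"
        "\t\tout += it.String()\n"
        "\t}\n"
        "\treturn out\n"
        "}"
    )
    samples = {44: 10, 45: 90}
    print(build_review_prompt("encode", annotate_source(source, 42, samples)))
```

In this made-up example, most samples land on the string concatenation inside the loop, which is the kind of line-level detail that lets a recommendation target a specific statement rather than a whole function.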

The GitHub integration doesn’t just show you the issues in your code — it guides you straight to the source, enabling immediate and impactful corrections right where they are needed. Profiling is the only tool that is capable of using AI to transform insights into action with this level of efficiency and accuracy.

Try Flame graph AI now in Grafana Cloud

Experience firsthand how our AI-powered flame graph analysis can transform your approach to profiling. Sign up for Grafana Cloud today and start turning complex data into actionable insights. We have a generous forever-free tier and plans for every use case. To get started with Flame graph AI, you can also reference our technical documentation.