AI Engineering Podcast




This show is your guidebook to building scalable and maintainable AI systems. You will learn how to architect AI applications, apply AI to your work, and the considerations involved in building or customizing new models. It covers everything you need to know to deliver real impact and value with machine learning and artificial intelligence.


04 September 2025

Revolutionizing Production Systems: The Resolve AI Approach - E59




Summary
In this episode of the AI Engineering Podcast, Spiros Xanthos, CEO of Resolve AI, shares his insights on building agentic capabilities for operational systems. He discusses the limitations of traditional observability tools and the need for AI agents that can reason through complex systems to provide actionable insights and solutions. The conversation highlights the architecture of Resolve AI, which integrates with existing tools to build a comprehensive understanding of production environments, and emphasizes the importance of context and memory in AI systems. Spiros also touches on the evolving role of AI in production systems, the potential for AI to augment human operators, and the need for continuous learning and adaptation to fully leverage these advancements.

Announcements
  • Hello and welcome to the AI Engineering Podcast, your guide to the fast-moving world of building scalable and maintainable AI systems
  • Your host is Tobias Macey and today I'm interviewing Spiros Xanthos about architecting agentic capabilities for the operational challenges of managing production systems.
Interview
  • Introduction
  • How did you get involved in machine learning?
  • Can you describe what Resolve AI is and the story behind it?
  • We have decades of experience as an industry in managing operational complexity. What are the critical failures in capabilities that you are addressing with the application of AI?
    • Given the existing capabilities of dedicated platforms (e.g. Grafana, PagerDuty, Splunk), what is your reasoning for building a new system vs. a new feature of an existing operational product?
  • Over the past couple of years the industry has developed a growing number of agent patterns. What was your approach in evaluating and selecting a particular approach for your product?
  • One of the complications of building any platform that supports operational needs of engineering teams is the complexity of integrating with their technology stack. This is doubly true when building an AI system that needs rich context. What are the core primitives that you are relying on to build a robust offering?
  • How are you managing the learning process for your systems to allow for iterative discovery and improvement?
    • What are your strategies for personalizing those discoveries to a given customer and operating environment?
  • One of the interesting challenges in agentic systems is managing the user experience for human-in-the-loop and machine-to-human handoffs in each direction. How are you thinking about that, especially given the criticality of the systems that you are interacting with?
  • As more of the code that is running in production environments is co-developed with AI, what impact do you anticipate on the overall operational resilience of the systems being monitored?
  • One of the challenges of working with LLMs is the cold start problem where every conversation starts from scratch. How are you approaching the overall problem of context engineering and ensuring that you are consistently providing the necessary information for the model to be effective in its role?
  • What are the most interesting, innovative, or unexpected ways that you have seen Resolve AI used?
  • What are the most interesting, unexpected, or challenging lessons that you have learned while working on Resolve AI?
  • When is Resolve AI the wrong choice?
  • What do you have planned for the future of Resolve AI?
Contact Info
Parting Question
  • From your perspective, what are the biggest gaps in tooling, technology, or training for AI systems today?
Closing Announcements
  • Thank you for listening! Don't forget to check out our other shows. The Data Engineering Podcast covers the latest on modern data management. Podcast.__init__ covers the Python language, its community, and the innovative ways it is being used.
  • Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.
  • If you've learned something or tried out a project from the show then tell us about it! Email hosts@aiengineeringpodcast.com with your story.
  • To help other people find the show please leave a review on iTunes and tell your friends and co-workers.
Links
The intro and outro music is from Hitman's Lovesong feat. Paola Graziano by The Freak Fandango Orchestra/CC BY-SA 3.0
