Developersummit

It Works in the Demo. Will It Work in Production? Evaluating and Debugging AI Agents

Fri, April 24 at 2:00 PM - 3:00 PM GMT+5:30
DeepTech Architecture, TechLead

An agent that performs well in a demo still faces a harder test in production, where real users, changing prompts, and unstable tools expose hidden weaknesses. This session focuses on turning a working agent into a system you can trust. Using a single concrete agent as the running example, it defines what reliability means for multi-step, tool-using behavior, covering success, partial success, and failure modes. It then shows how to design evaluations that reflect real usage by building golden datasets grounded in actual user intent and scenario-based tests that cover full action paths. You will also learn how to structure evaluation runs, score outcomes to surface brittleness and silent failures, and set up regression tests that detect breakage as prompts, tools, or APIs change over time.
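
The success, partial success, and failure classification mentioned above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the speaker's method: `GoldenCase`, `AgentRun`, `classify`, the tool names, and the substring-based answer check are all hypothetical, and a real suite would likely use a richer scoring rule.

```python
from dataclasses import dataclass
from enum import Enum


class Outcome(Enum):
    SUCCESS = "success"
    PARTIAL = "partial_success"
    FAILURE = "failure"


@dataclass(frozen=True)
class GoldenCase:
    """Hypothetical golden-dataset entry: a user intent plus what we expect."""
    intent: str
    expected_tools: tuple[str, ...]  # expected action path, in call order
    expected_answer: str             # text the final answer should contain


@dataclass(frozen=True)
class AgentRun:
    """Hypothetical record of what the agent actually did for one case."""
    tools_called: tuple[str, ...]
    answer: str


def classify(case: GoldenCase, run: AgentRun) -> Outcome:
    """Score one run on two axes: the action path and the final answer."""
    path_ok = run.tools_called == case.expected_tools
    answer_ok = case.expected_answer.lower() in run.answer.lower()
    if path_ok and answer_ok:
        return Outcome.SUCCESS
    # Exactly one axis is right. A correct-looking answer reached through the
    # wrong path is the kind of silent failure an answer-only check would miss.
    if path_ok or answer_ok:
        return Outcome.PARTIAL
    return Outcome.FAILURE


case = GoldenCase(
    intent="refund order 42",
    expected_tools=("lookup_order", "issue_refund"),
    expected_answer="refund issued",
)
run = AgentRun(tools_called=("lookup_order",), answer="Refund issued.")
print(classify(case, run).value)  # partial_success
```

Scoring the action path separately from the final answer is one way to make partial success visible instead of folding it into a single pass/fail flag.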

What You Will Learn

  • How to define and classify reliability for multi-step agents, including partial success and failure modes

  • How to build production-relevant evaluation suites using golden datasets and scenario-based action-path tests

  • How to operationalize evaluation through scoring, regression testing, and monitoring for prompt, tool, and API drift
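
As a rough illustration of the regression-testing idea in the last point, the sketch below compares per-case results from a baseline evaluation run against a candidate run (for example, after a prompt or tool change). The function names, case ids, and the pass/fail dictionaries are assumptions made for this example; the session itself does not prescribe a format.

```python
def pass_rate(results: dict[str, bool]) -> float:
    """Fraction of cases that passed; 0.0 for an empty run."""
    return sum(results.values()) / len(results) if results else 0.0


def find_regressions(baseline: dict[str, bool],
                     candidate: dict[str, bool]) -> list[str]:
    """Case ids that passed in the baseline but fail or are missing now."""
    return sorted(
        case_id
        for case_id, passed in baseline.items()
        if passed and not candidate.get(case_id, False)
    )


# Hypothetical per-case results from two evaluation runs.
baseline = {"refund_order": True, "track_parcel": True, "cancel_order": False}
candidate = {"refund_order": True, "track_parcel": False, "cancel_order": True}

# Both runs pass 2 of 3 cases, so the aggregate score alone hides the breakage.
print(pass_rate(baseline) == pass_rate(candidate))  # True
print(find_regressions(baseline, candidate))        # ['track_parcel']
```

Comparing case by case rather than only by aggregate pass rate is one way to catch a change that fixes one scenario while quietly breaking another.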

Who Should Attend

  • Developers building or maintaining AI agents

  • Engineers responsible for testing and reliability of AI systems

  • AI and ML practitioners deploying agents into production environments

  • Technical leads overseeing quality and long-term robustness of agent-based systems

About the speaker

Apurva Misra

AI Consultant, Sentick

Apurva Misra is an AI Consultant at Sentick, focusing on assisting startups with their AI strategy and building solutions. Apurva helps startups and mid-size companies start integrating AI and develop tailored solutions that align with their business goals. She is also a speaker, regularly presenting at conferences and online events about AI applications, strategy, and innovation. Passionate about democratizing education, Apurva shares her knowledge widely to make AI more accessible to everyone. Outside of work, she is learning Spanish and loves discovering hidden gem eateries, always open to new recommendations.

Related Talks

  • Building LLM-Powered Agents with Real-Time Reasoning Loops, Apurva Misra (Thu, April 23)

  • GraphRAG and Explainable AI: Building Trustworthy LLM Outputs, Rohit Bhardwaj (Thu, April 23)

  • Agentic RAG in Production: Orchestration, Evaluation, and ROI, Rohit Bhardwaj (Wed, April 22)
