Agents and Actions
Episode 11
Evaluating Agents: What Actually Matters
January 4, 2025 · 51:30 · 990 views
Most eval frameworks measure what's easy to measure. This episode walks through building evals that actually catch the failure modes that matter in production.