AI development has two loops. Meta-evaluation lives in the inner loop.
We also walked through a live demo of this loop in practice, iteratively improving the judge and showing measurable gains at each step.
Watch the session below 👇
Covered the basics of observability + evals, and showed via a Mastra agent how to set up tracing, run evals, & start your iteration cycle.
Check it out here 🚀
www.youtube.com/watch?v=qQGQ...
This lets tools like Claude query prompts, datasets, and experiment results directly from a Phoenix instance (cloud or self-hosted).
Check out our docs to learn more about how to spin it up 👇
Here’s how it works:
www.youtube.com/watch?v=mHeZ...
Our newest blog post on @hf.co has you covered!
This post shows you how to use @arize-phoenix.bsky.social to trace and evaluate your smolagents.
Credit to @srichavali.bsky.social and @aymeric-roucher.bsky.social