Arize AI
@arize.bsky.social
Arize is an AI engineering platform focused on evaluation and observability. It helps engineers develop, evaluate, and observe AI applications and agents.
Reposted by Arize AI
In our latest Evals Series webinar, we covered how to evaluate your evaluator.

AI development has two loops, and meta-evaluation lives in the inner loop.

We also walked through a live demo of this loop in practice, iteratively improving the judge and showing measurable gains at each step.
December 12, 2025 at 7:08 PM
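A note on the meta-evaluation post above: one common way to "evaluate your evaluator" is to compare the judge's labels against human labels on the same examples. The sketch below is not taken from the webinar. The function name `agreement` and the label values are hypothetical, and it assumes simple categorical labels.

```python
from collections import Counter


def agreement(judge_labels, human_labels):
    """Return (accuracy, Cohen's kappa) of judge labels against human labels.

    Hypothetical helper for illustration; not an Arize or Phoenix API.
    """
    if len(judge_labels) != len(human_labels) or not human_labels:
        raise ValueError("label lists must be non-empty and the same length")

    n = len(human_labels)
    # Observed agreement: fraction of examples where judge and human match.
    observed = sum(j == h for j, h in zip(judge_labels, human_labels)) / n

    # Chance agreement: probability both pick the same label independently.
    judge_counts = Counter(judge_labels)
    human_counts = Counter(human_labels)
    expected = sum(judge_counts[c] * human_counts[c] for c in judge_counts) / (n * n)

    if expected == 1.0:
        # Both raters used a single identical label; kappa is undefined,
        # so report 1.0 by convention.
        return observed, 1.0

    kappa = (observed - expected) / (1 - expected)
    return observed, kappa
```

Tracking both numbers after each change to the judge prompt gives a simple way to check whether an iteration actually improved the judge, since kappa corrects raw accuracy for agreement that would occur by chance.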
ICYMI: we gathered with builders in NYC last night to dive into what really moves the needle after an agent is deployed.

A lot of progress starts in production. This means learning from how your agent behaves in the wild and feeding those findings back into your development.

Full video ⬇️
December 11, 2025 at 5:31 PM
AI is woven into customer journeys at TheFork, one of Europe’s leading restaurant discovery and booking platforms.

Arize AX is layered across TheFork's stack, with tracing helping drive tangible wins: lower latency, clearer cost signals, and faster iteration.
December 10, 2025 at 1:30 PM
🗽NYC: join us for an evening of talks at betaworks on building, scaling, and improving AI agents in production. RSVP: luma.com/xx88ixvb
NYC AI Builders Meetup: Building and Improving Agents with Arize AI & CrewAI · Luma
Join Arize AI for an evening of technical talks and networking focused on building, scaling, and improving AI agents in production. Hear from industry experts…
luma.com
December 8, 2025 at 6:38 PM
Metals and mining giant Rio Tinto now relies on Arize AX as it evaluates and deploys new gen-AI use cases.
December 5, 2025 at 7:28 PM
Get certified 🎓 in AI Agent Mastery in our new, free course led by Srilakshmi Chavali: courses.arize.com/l/pdp/ai-ag...

This course covers the latest in:
🟣 Agent architectures and frameworks
🟣 Tools & MCP
🟣 Agentic RAG
🟣 Agent evaluation
🟣 Post-deployment and monitoring

Each module has a lab.
December 5, 2025 at 2:30 PM
Reposted by Arize AI
Spoke at @arize.bsky.social’s AI Builder Meetup a few weeks back & the talk is now live!

Covered the basics of observability + evals, and showed via a Mastra agent how to set up tracing, run evals, & start your iteration cycle.

Check it out here 🚀
www.youtube.com/watch?v=qQGQ...
TypeScript Agents: How To Build and Evaluate
YouTube video by Arize AI
www.youtube.com
December 4, 2025 at 7:23 PM
Reposted by Arize AI
Check out this walkthrough on bringing observability and evals into LLM workflows, plus a Phoenix demo with helpful context for anyone building agents in TypeScript.

Watch the session below 👇
www.youtube.com/watch?v=qQGQ...
TypeScript Agents: How To Build and Evaluate
YouTube video by Arize AI
www.youtube.com
December 4, 2025 at 7:24 PM
Our LLM-as-a-Judge 101 virtual workshop was so popular, we're returning for LLM-as-a-Judge 102. 🎓 RSVP: luma.com/ab78cmgo

In this session focused on meta-evaluation, you’ll learn advanced techniques -- like using high-temperature stress tests to detect prompt ambiguity or unstable reasoning.
December 2, 2025 at 3:10 PM
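A note on the high-temperature stress test mentioned above: the idea can be sketched as sampling the judge several times on the same input at a high temperature and measuring how often it agrees with its own majority verdict. This is an illustration under assumptions, not the workshop's code. The `judge` callable, its `(example, temperature)` signature, and the 0.8 threshold are all hypothetical.

```python
from collections import Counter
from typing import Callable, Iterable, List


def stability(judge: Callable[[str, float], str], example: str,
              temperature: float = 1.0, samples: int = 10) -> float:
    """Fraction of sampled verdicts that match the majority verdict.

    `judge` is a hypothetical callable wrapping an LLM call; it takes an
    example and a sampling temperature and returns a label string.
    """
    labels = [judge(example, temperature) for _ in range(samples)]
    majority_count = Counter(labels).most_common(1)[0][1]
    return majority_count / samples


def flag_unstable(judge: Callable[[str, float], str], examples: Iterable[str],
                  threshold: float = 0.8, **kwargs) -> List[str]:
    """Return the examples whose verdicts flip too often under sampling."""
    return [ex for ex in examples if stability(judge, ex, **kwargs) < threshold]
```

Examples that come back flagged are candidates for review: the instability may point to an ambiguous judge prompt or to a genuinely borderline input.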
In case you missed it last week: we released day 0 support for Claude Opus 4.5 in Phoenix! Try it out in the prompt playground today!

Learn about the prompt playground:
arize.com/docs/phoeni...

Sign up for Phoenix Cloud:
app.phoenix.arize.com/

Release notes:
github.com/Arize-ai/ph...
December 1, 2025 at 9:56 PM
Arize AX + AWS Bedrock AgentCore = a complete production system where you can deploy agents with confidence and improve them continuously based on real data.

From the floor of #reinvent, a new notebook + blog runs through a travel planning agent example.

Dive in: arize.com/blog/aws-be...
AWS Bedrock AgentCore Observability with Arize AX: Operationalizing AI Agents At Scale
Building an AI agent in a notebook is straightforward. Getting that agent to run reliably at scale is a different challenge entirely. Most teams hit the same production walls: agents...
arize.com
December 1, 2025 at 7:20 PM
If you've been debugging agents by scrolling through spans, Arize AX can help you do better!

Agent Graph gives you a node-based visual map of your agent workflows, so you can instantly see execution paths, identify failure points, spot self-looping behavior, and more!

arize.com/docs/ax/obs...
November 28, 2025 at 5:00 PM
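A note on spotting self-looping behavior, as mentioned in the Agent Graph post above: if agent execution is reduced to a list of node-to-node transitions, repeated self-transitions can be counted directly. This is a minimal sketch, not how Agent Graph is implemented. The `(source, target)` tuple format and the node names are assumptions made for illustration.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


def self_loops(transitions: Iterable[Tuple[str, str]],
               min_repeats: int = 2) -> Dict[str, int]:
    """Count transitions where a node hands control back to itself.

    `transitions` is a hypothetical list of (source, target) node names
    derived from trace spans. Nodes with at least `min_repeats`
    self-transitions are returned with their counts.
    """
    counts = Counter(src for src, dst in transitions if src == dst)
    return {node: n for node, n in counts.items() if n >= min_repeats}


# Usage: a "search" node that calls itself twice before answering.
trace = [
    ("planner", "search"),
    ("search", "search"),
    ("search", "search"),
    ("search", "answer"),
]
looping_nodes = self_loops(trace)
```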
Arize AX Monitors: set threshold-based alerts for what matters in your LLM apps—latency, hallucination rates, eval failures, token usage, errors. One-click setup for common metrics, or fully custom. Get notified before your users notice something's wrong.

Learn more:
arize.com/docs/ax/obs...
November 27, 2025 at 5:00 PM
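A note on threshold-based alerts, as described in the Monitors post above: the general pattern is a metric name, a threshold, and a direction, checked against the latest window of aggregated values. The sketch below illustrates that pattern only. It is not the Arize AX API, and the `Monitor` class, metric names, and threshold values are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Monitor:
    """Hypothetical threshold monitor; not an Arize AX object."""
    metric: str
    threshold: float
    direction: str = "above"  # alert when the value is "above" or "below" the threshold

    def breached(self, value: float) -> bool:
        if self.direction == "above":
            return value > self.threshold
        if self.direction == "below":
            return value < self.threshold
        raise ValueError(f"unknown direction: {self.direction!r}")


def check(monitors: List[Monitor], window: Dict[str, float]) -> List[str]:
    """Return one alert message per monitor breached in this window."""
    alerts = []
    for m in monitors:
        # Metrics missing from the window are skipped rather than alerted on.
        if m.metric in window and m.breached(window[m.metric]):
            alerts.append(
                f"{m.metric}={window[m.metric]} is {m.direction} threshold {m.threshold}"
            )
    return alerts


# Usage: latency alerts when too high, eval pass rate alerts when too low.
monitors = [
    Monitor("p95_latency_ms", 2000),
    Monitor("eval_pass_rate", 0.9, direction="below"),
]
alerts = check(monitors, {"p95_latency_ms": 2500, "eval_pass_rate": 0.95})
```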
Just dropped: a new deep-dive on debugging your Google ADK Agent with Arize Phoenix!

Watch the full tutorial to learn how to:
▲ Implement tracing.
▲ Debug complex agent logic and tool usage.
▲ Run automated evaluations.

Full video: www.youtube.com/watch?v=fBs...
How to build reliable AI Agents?
Join Ivan and Aparna Dhinakaran, CPO of Arize AI, as they discuss the challenges of moving AI agents from prototype to production. This tutorial demonstrates...
www.youtube.com
November 26, 2025 at 5:04 PM
The team is gearing up for an epic @AWSreInvent! We have fun happenings all week (chocolate tastings, dinners, happy hours) — join us! arize.com/aws-reinven...
November 24, 2025 at 8:01 PM
Arize AX is listed as an Emerging Leader in the "Emerging Market Quadrant for Generative AI Engineering" in Gartner's latest "Innovation Guide for Generative AI Engineering" report (13 November).
November 20, 2025 at 2:00 PM
Microsoft's red teaming agent in Microsoft Foundry generates sophisticated prompts designed to simulate adversarial attacks. Arize AX can help make these vulnerabilities visible and actionable.

New blog + notebook outlines how to create self-improving agent security: arize.com/blog/how-to...
November 19, 2025 at 7:47 PM
Microsoft Foundry + Arize AX = everything you need for self-improving agents.

From the floor of #MSIgnite, a new notebook + blog walks through a concrete content safety evaluation example.

📓 Explore: arize.com/blog/evalua...
November 18, 2025 at 8:58 PM
Learn how to build better AI applications with this introduction to LLM-as-a-judge evaluation!

Check out the full video:
www.youtube.com/watch?v=pnl...
LLM-as-a-Judge 101
Curious about AI evals, but not sure where to start? In this hands-on, beginner-friendly session, we walk you through the core building blocks of LLM-as-a-ju...
www.youtube.com
November 18, 2025 at 5:00 PM
We benchmarked Prompt Learning (our prompt optimizer) against GEPA and saw similar or better results in a fraction of the time.

Since we launched Prompt Learning in July, the most common question we get is:
“Prompt Learning or GEPA — which should I use?”
We break down the results below.
November 17, 2025 at 9:22 PM
Our MCP Server Tracing Assistant's integration with Google Gemini CLI enables you to ask natural-language questions like:

🟪 “Instrument this app using Arize AX.”

🟪 “How can I redact sensitive information from my spans?”

🔍 Find: Go to the extensions tab of geminicli.com/ and type in 'arize'.
Build, debug & deploy with AI
geminicli.com
November 14, 2025 at 11:21 PM
Google ADK + Arize AX = a unified experience for building, deploying, and refining multiagent systems.

𓂅 Try both on for size with this new how-to + code that walks through a ✈️ travel concierge example: arize.com/blog/tracin...
Tracing, Evaluation, and Observability for Google ADK (How To)
Walks through a concrete example with code on setting up an AI agent travel concierge with Google ADK and evals, tracing, and observability.
arize.com
November 14, 2025 at 5:17 PM
Our thanks to Google Cloud for hosting in Sunnyvale and speaking alongside Meta AI and the Arize crew at "Agents In Action" last night!
November 13, 2025 at 8:00 PM
Quickly add valuable human insights to your sessions without breaking your flow!

With our new Session Annotations, you can now add notes directly from the Session Page, eliminating the need to switch between views or lose context.
November 13, 2025 at 5:00 PM