As companies move beyond prototypes and pilots and launch AI agents in production, evaluation becomes one of the hardest unsolved problems. How do you measure reliability, catch regressions, and build trust in systems that are non-deterministic and increasingly autonomous? How do you prevent embarrassing drift? How do you stop your customer support agent from offering discounts on products that don't exist, or going completely off-script? That's the core tension we want to explore.

Speakers:
Kilian Lieret, PhD - AI Research Scientist, Meta Superintelligence
Zhou Yu - Co-Founder & CEO of Arklex.AI & CS Professor at Columbia University
(Full speaker lineup to be announced)

Agenda:
5 to 10 min: Welcome and framing
30 min: Moderated panel with live audience Q&A
30 min: Networking

Who should attend:
Whether you're an engineer building agent systems, a product leader deciding where to deploy AI, or an executive navigating the risks of putting AI agents in front of customers, this event is for you.

Food and beverages provided.
Space is limited to 50 attendees.

Agents Behaving Badly: The Perils of Pushing AI Agents into Production

About This Event
