
Reading Group (+🧋): Senior SWE-Bench
About This Event
Join the Snorkel AI Reading Group, a recurring forum to explore the latest frontier developments in AI while building meaningful connections within the community. In this session, lead researcher Henry Ehrenberg will present Senior SWE-Bench, Snorkel's new open-source benchmark for evaluating coding agents on the work we actually give them.

Agenda:
4:00 pm - doors open
4:30 pm - talk begins

🧋🧋🧋 Boba tea and other refreshments will be provided! 🧋🧋🧋

Among other things, you'll learn:
- Why most coding benchmarks treat agents like junior engineers (over-specified requirements, graded mainly on whether the code runs) when most of us already treat agents like senior engineers who work from a Slack message, not a spec.
- How Senior SWE-Bench's validation agent breaks the trade-off between realistic instructions and reliable grading: it writes behavioral tests adapted to each agent's actual solution, using expert-designed recipes the solving agent never sees.
- Why the benchmark's bug and performance tasks are sourced from real PRs with evidence of significant runtime investigation (logs,…
See the rest of the description and register on Luma.
Date & Time
Wednesday, July 15, 2026
4:00 PM - 6:30 PM
Location
101 Second Street, San Francisco, CA 94105, USA