Beyond the Benchmarking Paradigm: Audits & Evaluation in the Age of Artificial Intelligence
Despite great potential, there is a growing gap between what AI systems promise and what they deliver, with real human costs.
AI auditing is the practice of independently evaluating deployed AI systems to determine how they behave, what risks they pose, and whether they meet their intended objectives. This interdisciplinary endeavor requires both a technical expansion of our current AI evaluation paradigm and a framework for ensuring that audit investigations are sufficiently material for downstream legal actions and normative debates. At the intersection of law and public policy, applied economics, and computer science, we can advance AI auditing policy & practice in material ways: by anchoring notions of engineering responsibility in AI development, expanding our vocabulary of AI evaluation methods, and pushing to connect AI audit outcomes to organizational and legal consequences. Through case studies of AI use in healthcare and government, we demonstrate how novel evaluation methods such as incident reporting, workflow simulations, and pilot experiments can supplement standard practices like data benchmarking to more adequately inform AI governance, shaping a range of outcomes from documentation and procurement to regulatory enforcement and product safety compliance. As auditing makes its way into key policy proposals as a primary mechanism for AI accountability, we must think critically about the technical and institutional infrastructure required for this form of oversight to successfully enable safe, widespread AI adoption.
Speaker: Inioluwa Deborah Raji, UC Berkeley
Thursday, 02/26/26
Cost: Free
