Understanding and Steering Generative AI Systems
Large language models, vision-language models, and other generative AI systems are rapidly permeating society: at its release, ChatGPT was the fastest-growing app in history. As this technology proliferates, society needs tools to understand and steer its effects.
One route to understanding is measuring the growth in overall capabilities of AI systems. I'll discuss my work on the MATH and MMLU datasets, which have been used to track the capabilities of large language models and which revealed that experts were systematically underestimating the rate of progress in the field.
However, benchmarks alone tell only a limited story, since the promise (and complexity) of generative AI lies in its open-ended behavior. To tackle this complexity, we need tools that can adaptively query an AI model to find unexpected behaviors, then categorize them into human-interpretable patterns. I'll describe systems that we built for this task, and how we can leverage AI itself as part of this pipeline.
Finally, it is not enough to understand AI models; we also need to steer them based on that understanding. I will show how, by understanding the structure of neural representations, we can steer AI models to be more accurate and truthful.
Speaker: Jacob Steinhardt, UC Berkeley
Wednesday, 09/04/24
Cost: Free
