Interactive Language Agents: Training, Evaluation, and Interface

Alane Suhr

The increasing capability of Large Language Models (LLMs) makes them appealing for adoption in labor-intensive human tasks. For example, significant recent effort has focused on developing agents (systems that map observations and instructions to executable actions) and on benchmarking them in real-world tasks such as web navigation. In this talk, I will discuss recent work on training and improving such models through interactions with human users, and on developing better evaluations for these agents, which in turn can be used to automatically improve agent performance without requiring any demonstration data or human annotation. However, in developing such systems, and in applying LLMs and other large pre-trained models to real-world problems, we should be aware of their fundamental limitations, such as their sensitivity to design considerations like prompt formatting. I will detail recent work in which we find that LLMs can be highly sensitive to arbitrary design decisions, such as the choice of separators or multiple-choice labels.

Speaker: Alane Suhr, UC Berkeley

Thursday, 03/06/25

04:00 PM - 04:50 PM

Cost: Free

Sonoma State Dept. of Engineering Science

1801 East Cotati Ave
Cerent Engineering Science Complex, Salazar Hall Room #2009A
Rohnert Park, CA 94928

Phone: (707) 664-2030