Unscripted Grounded Visual Learning - Rescheduled
Computer vision has made remarkable advances through data-driven learning of image-text associations. Large-scale vision and language models like CLIP, SAM, and ChatGPT can generate compelling descriptions of images. However, these models, trained with scripted data and limited grounding, often struggle to provide detailed visual evidence and to generalize across a diverse range of infrequent visual concepts during testing. In contrast, human infants develop robust visual understanding from limited experiences, even before acquiring language. This contrast raises crucial questions: What are we missing? Do we not see without naming our visual experiences? Can vision be developed entirely from visual data without predefined labels and semantic knowledge? I will present our research progress on how we can computationally learn to abstract and generalize visual concepts directly from images and videos.
Speaker: Stella Yu, University of Michigan
Editor's Note: This talk has been rescheduled for May 27, 2025.
Thursday, 03/20/25
Cost: Free
Sonoma State Dept. of Engineering Science
Cerent Engineering Science Complex, Salazar Hall Room #2009A
Rohnert Park, CA 94928
Phone: (707) 664-2030