Statistical methods for assessing the factual accuracy of large language models
We develop new conformal inference methods for obtaining validity guarantees on the output of large language models (LLMs). Prior work in language modeling (Mohri & Hashimoto, 2024) identifies a subset of the text that satisfies a high-probability guarantee of correctness. These methods work by filtering a claim from the LLM's original response if a scoring function evaluated on the claim fails to exceed some estimated threshold. Existing methods in this area suffer from two deficiencies. First, the guarantee is not conditionally valid: the trustworthiness of the filtering step may vary based on the topic of the response. Second, because the scoring function is imperfect, the filtering step can remove many valuable and accurate claims. Our work addresses both of these challenges via two new conformal prediction methods. First, we show how to issue an error guarantee that is both valid and adaptive: the guarantee remains well-calibrated even though it can depend on the prompt (e.g., so that the final output retains most claims). Second, we show how to optimize the accuracy of the scoring function used in this procedure, e.g., by ensembling multiple scoring approaches. We explain how this methodology works and demonstrate its performance on several real-world examples.
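To make the filtering step concrete, below is a minimal sketch of a split-conformal claim-filtering procedure in the style described above (calibrating a threshold on a claim-level scoring function, then removing claims that fall below it). The function names, the data layout, and the scoring interface are illustrative assumptions, not the speaker's implementation.

import numpy as np

def calibrate_threshold(cal_responses, alpha=0.1):
    """Split-conformal calibration of a claim-filtering threshold.

    cal_responses: list of calibration responses, each a list of
        (score, is_correct) pairs -- one pair per factual claim.
    alpha: target error rate; the aim is that, with probability
        >= 1 - alpha, a filtered response retains no incorrect claim.
    """
    # Conformity score per response: the highest score attained by any
    # incorrect claim (the threshold must exceed this to remove all errors).
    scores = []
    for claims in cal_responses:
        bad = [s for s, ok in claims if not ok]
        scores.append(max(bad) if bad else -np.inf)

    n = len(scores)
    # Conservative (n + 1)-adjusted quantile, as in standard split conformal.
    k = int(np.ceil((n + 1) * (1 - alpha)))
    if k > n:
        return np.inf  # not enough calibration data for this alpha
    return np.sort(scores)[k - 1]

def filter_claims(claims, scores, threshold):
    """Keep only claims whose confidence score exceeds the threshold."""
    return [c for c, s in zip(claims, scores) if s > threshold]

In use, each claim in a new response would be scored by whatever confidence measure (or ensemble of measures) the procedure was calibrated with, and only claims scoring above the calibrated threshold would be retained; the adaptive and ensembling refinements in the talk build on this basic template.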
Speaker: John Cherian, Stanford University
Monday, 12/02/24
Cost: Free