Predicting Uncertainty of Generative LLMs with MARS: Meaning-Aware Response Scoring

Yavuz Faruk Bakman, Duygu Nur Yaldiz, Baturalp Buyukates, Salman Avestimehr, Chenyang Tao, Dimitrios Dimitriadis

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Generative Large Language Models (LLMs) have recently been widely utilized for their unprecedented capabilities across many tasks. Because they are used in high-stakes environments and mission-critical applications, their tendency to generate inaccurate or misleading results can be harmful, which motivates us to study the correctness of generative LLM outputs. Uncertainty Estimation (UE) in generative LLMs is a developing area, with state-of-the-art probability-based techniques frequently using length-normalized scoring. In this work, we propose Meaning-Aware Response Scoring (MARS) as an alternative to length-normalized scoring in UE. The key idea of MARS is to consider the semantic contribution of each token of the generated sequence in the context of the question during UE. Through extensive experiments on three question-answering datasets across five pretrained LLMs, we show that utilizing MARS during UE results in a universal and significant improvement in UE performance.
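
To make the contrast in the abstract concrete, the following is a minimal sketch, not the paper's implementation. It assumes a response score is an aggregate of per-token probabilities: length-normalized scoring gives every token the same weight 1/L, while a meaning-aware variant reweights tokens by how much they matter to the answer. The function names, the example probabilities, and the importance weights are all hypothetical; the abstract does not state how MARS actually computes token contributions.

```python
import math


def length_normalized_score(token_probs):
    """Geometric mean of token probabilities: each token has weight 1/L."""
    log_probs = [math.log(p) for p in token_probs]
    return math.exp(sum(log_probs) / len(log_probs))


def weighted_score(token_probs, weights):
    """Weighted geometric mean of token probabilities.

    The weights are normalized to sum to 1, so uniform weights reproduce
    length-normalized scoring. How the weights are obtained is an assumption
    of this sketch, not something taken from the abstract.
    """
    if len(token_probs) != len(weights):
        raise ValueError("token_probs and weights must have the same length")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return math.exp(
        sum((w / total) * math.log(p) for p, w in zip(token_probs, weights))
    )


# Hypothetical 4-token answer: the third token carries the actual answer and
# is the one the model is unsure about; the others are filler words.
token_probs = [0.9, 0.9, 0.2, 0.9]
importance = [0.1, 0.1, 0.7, 0.1]  # made-up semantic importance weights

uniform = length_normalized_score(token_probs)
meaning_aware = weighted_score(token_probs, importance)
print(f"length-normalized: {uniform:.3f}, weighted: {meaning_aware:.3f}")
```

In this toy case the weighted score is lower than the length-normalized one, because the low-probability token is the one given the most weight; confident filler tokens no longer mask the uncertainty in the answer-bearing token.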
Original language: English
Title of host publication: 2024 IEEE International Symposium on Information Theory (ISIT)
Publisher: IEEE
Pages: 2033-2037
Number of pages: 5
ISBN (Electronic): 9798350382846
ISBN (Print): 9798350382853
Publication status: Published - 19 Aug 2024
Externally published: Yes
Event: 2024 IEEE International Symposium on Information Theory (ISIT) - Athens, Greece
Duration: 7 Jul 2024 to 12 Jul 2024
https://2024.ieee-isit.org/home

Publication series

Name: IEEE International Symposium on Information Theory
Publisher: IEEE
ISSN (Print): 2157-8095
ISSN (Electronic): 2157-8117

Conference

Conference: 2024 IEEE International Symposium on Information Theory (ISIT)
Abbreviated title: IEEE ISIT2024
Country/Territory: Greece
City: Athens
Period: 7/07/24 to 12/07/24

Keywords

  • Uncertainty
  • Large language models
  • Semantics
  • Mission critical systems
  • Estimation
  • Task analysis
  • Information theory
