NeurIPS 2024
Two papers were accepted to NeurIPS 2024! "LACIE: Listener-Aware Finetuning for Confidence Calibration in Large Language Models" uses pragmatics to calibrate the confidence LLMs express, and "GTBench: Uncovering the Strategic Reasoning Limitations of LLMs via Game-Theoretic Evaluations" introduces a new game-theoretic benchmark for evaluating strategic reasoning.