publications | Elias Stengel-Eskin

2025

COLM

Retrieval-Augmented Generation with Conflicting Evidence

Han Wang, Archiki Prasad, Elias Stengel-Eskin, and Mohit Bansal

arXiv preprint arXiv:2504.13079 2025
COLM

Task-Circuit Quantization: Leveraging Knowledge Localization and Interpretability for Compression

Hanqi Xiao, Yi-Lin Sung, Elias Stengel-Eskin, and Mohit Bansal

arXiv preprint arXiv:2504.07389 2025
COLM

Learning to Generate Unit Tests for Automated Debugging

Archiki Prasad*, Elias Stengel-Eskin*, Justin Chih-Yao Chen, Zaid Khan, and Mohit Bansal

arXiv preprint arXiv:2502.01619 2025
COLM

GenerationPrograms: Fine-grained Attribution with Executable Programs

David Wan, Eran Hirsch, Elias Stengel-Eskin, Ido Dagan, and Mohit Bansal

arXiv preprint arXiv:2506.14580 2025
ICCV

CAPTURe: Evaluating Spatial Reasoning in Vision Language Models via Occluded Object Counting

Atin Pothiraj, Elias Stengel-Eskin, Jaemin Cho, and Mohit Bansal

arXiv preprint arXiv:2504.15485 2025
ACL

Multi-Attribute Steering of Language Models via Targeted Intervention

Duy Nguyen, Archiki Prasad, Elias Stengel-Eskin, and Mohit Bansal

arXiv preprint arXiv:2502.12446 2025
ACL

LAQuer: Localized Attribution Queries in Content-grounded Generation

Eran Hirsch, Aviv Slobodkin, David Wan, Elias Stengel-Eskin, Mohit Bansal, and Ido Dagan

arXiv preprint arXiv:2506.01187 2025
arxiv

Context-Informed Grounding Supervision

Hyunji Lee, Seunghyun Yoon, Yunjae Won, Hanseok Oh, Geewook Kim, Trung Bui, Franck Dernoncourt, Elias Stengel-Eskin, and 2 more authors

arXiv preprint arXiv:2506.15480 2025
arxiv

CLaMR: Contextualized Late-Interaction for Multimodal Content Retrieval

David Wan, Han Wang, Elias Stengel-Eskin, Jaemin Cho, and Mohit Bansal

arXiv preprint arXiv:2506.06144 2025
arxiv

Movie Facts and Fibs (MF²): A Benchmark for Long Movie Understanding

Emmanouil Zaranis, António Farinhas, Saul Santos, Beatriz Canaverde, Miguel Moura Ramos, Aditya K Surikuchi, André Viveiros, Baohao Liao, and 23 more authors

arXiv preprint arXiv:2506.06275 2025
arxiv

CLATTER: Comprehensive Entailment Reasoning for Hallucination Detection

Ron Eliav, Arie Cattan, Eran Hirsch, Shahaf Bassan, Elias Stengel-Eskin, Mohit Bansal, and Ido Dagan

arXiv preprint arXiv:2506.05243 2025
arxiv

Executable Functional Abstractions: Inferring Generative Programs for Advanced Math Problems

Zaid Khan, Elias Stengel-Eskin, Archiki Prasad, Jaemin Cho, and Mohit Bansal

arXiv preprint arXiv:2504.09763 2025
arxiv

Symbolic Mixture-of-Experts: Adaptive Skill-based Routing for Heterogeneous Reasoning

Justin Chih-Yao Chen*, Sukwon Yun*, Elias Stengel-Eskin*, Tianlong Chen, and Mohit Bansal

arXiv preprint arXiv:2503.05641 2025
arxiv

UPCORE: Utility-Preserving Coreset Selection for Balanced Unlearning

Vaidehi Patil, Elias Stengel-Eskin, and Mohit Bansal

arXiv preprint arXiv:2502.15082 2025
arxiv

On the Trustworthiness of Generative Foundation Models: Guideline, Assessment, and Perspective

Yue Huang, Chujie Gao, Siyuan Wu, Haoran Wang, Xiangqi Wang, Yujun Zhou, Yanbo Wang, Jiayi Ye, and 57 more authors

arXiv preprint arXiv:2502.14296 2025
CVPR

VideoTree: Adaptive Tree-based Video Representation for LLM Reasoning on Long Videos

Ziyang Wang*, Shoubin Yu*, Elias Stengel-Eskin*, Jaehong Yoon, Feng Cheng, Gedas Bertasius, and Mohit Bansal

CVPR 2025
ICLR Spotlight

DataEnvGym: Data Generation Agents in Teacher Environments with Student Feedback

Zaid Khan, Elias Stengel-Eskin, Jaemin Cho, and Mohit Bansal

ICLR (Spotlight) 2025
ICLR

System-1.x: Learning to balance fast and slow planning with language models

Swarnadeep Saha, Archiki Prasad, Justin Chih-Yao Chen, Peter Hase, Elias Stengel-Eskin, and Mohit Bansal

ICLR 2025
ICLR

See It from My Perspective: Diagnosing the Western Cultural Bias of Large Vision-Language Models in Image Understanding

Amith Ananthram, Elias Stengel-Eskin, Carl Vondrick, Mohit Bansal, and Kathleen McKeown

ICLR 2025
NAACL

Teaching Models to Balance Resisting and Accepting Persuasion

Elias Stengel-Eskin, Peter Hase, and Mohit Bansal

NAACL 2025
NAACL

MAMM-Refine: A Recipe for Improving Faithfulness in Generation with Multi-Agent Collaboration

David Wan, Justin Chih-Yao Chen, Elias Stengel-Eskin, and Mohit Bansal

NAACL 2025
NAACL

AdaCAD: Adaptively Decoding to Balance Conflicts between Contextual and Parametric Knowledge

Han Wang, Archiki Prasad, Elias Stengel-Eskin, and Mohit Bansal

NAACL 2025

2024

NeurIPS

LACIE: Listener-Aware Finetuning for Confidence Calibration in Large Language Models

Elias Stengel-Eskin, Peter Hase, and Mohit Bansal

NeurIPS 2024
TMLR

Fundamental Problems With Model Editing: How Should Rational Belief Revision Work in LLMs?

Peter Hase, Thomas Hofweber, Xiang Zhou, Elias Stengel-Eskin, and Mohit Bansal

TMLR 2024
ACM

MIRACLE: An Online, Explainable Multimodal Interactive Concept Learning System

Ansel Blume, Khanh Duy Nguyen, Zhenhailong Wang, Yangyi Chen, Michal Shlapentokh-Rothman, Xiaomeng Jin, Jeonghwan Kim, Zhen Zhu, and 22 more authors

ACM Conference on Multimedia 2024
ECCV

Contrastive Region Guidance: Improving Grounding in Vision-Language Models without Training

David Wan, Jaemin Cho, Elias Stengel-Eskin, and Mohit Bansal

ECCV 2024
ICML

Language-guided Skill Learning with Temporal Variational Inference

Haotian Fu, Pratyusha Sharma, Elias Stengel-Eskin, George Konidaris, Nicolas Le Roux, Marc-Alexandre Côté, and Xingdi Yuan

ICML 2024
CoLLAs

Sub-goal Distillation: A Method to Improve Small Language Agents

Maryam Hashemzadeh, Elias Stengel-Eskin, Sarath Chandar, and Marc-Alexandre Côté

Third Conference on Lifelong Learning Agents 2024
ACL

Soft Self-Consistency Improves Language Model Agents

Han Wang*, Archiki Prasad*, Elias Stengel-Eskin*, and Mohit Bansal

ACL 2024
NeurIPS

GTBench: Uncovering the Strategic Reasoning Limitations of LLMs via Game-Theoretic Evaluations

Jinhao Duan, Renming Zhang, James Diffenderfer, Bhavya Kailkhura, Lichao Sun, Elias Stengel-Eskin, Mohit Bansal, Tianlong Chen, and 1 more author

NeurIPS 2024
ICML

ReGAL: Refactoring Programs to Discover Generalizable Abstractions

Elias Stengel-Eskin*, Archiki Prasad*, and Mohit Bansal

ICML 2024
ICML

MAGDi: Structured Distillation of Multi-Agent Interaction Graphs Improves Reasoning in Smaller Language Models

Justin Chih-Yao Chen, Swarnadeep Saha, Elias Stengel-Eskin, and Mohit Bansal

ICML 2024
ICLR

Zero and Few-shot Semantic Parsing with Ambiguous Inputs

Elias Stengel-Eskin, Kyle Rawlins, and Benjamin Van Durme

ICLR 2024
ICLR

Rephrase, Augment, Reason: Visual Grounding of Questions for Vision-Language Models

Archiki Prasad, Elias Stengel-Eskin, and Mohit Bansal

ICLR 2024
arxiv

LASeR: Learning to Adaptively Select Reward Models with Multi-Armed Bandits

Duy Nguyen, Archiki Prasad, Elias Stengel-Eskin, and Mohit Bansal

arXiv 2024
arxiv

MAgICoRe: Multi-Agent, Iterative, Coarse-to-Fine Refinement for Reasoning

Justin Chih-Yao Chen, Archiki Prasad, Swarnadeep Saha, Elias Stengel-Eskin, and Mohit Bansal

arXiv 2024
arxiv

Are language models rational? The case of coherence norms and belief revision

Thomas Hofweber, Peter Hase, Elias Stengel-Eskin, and Mohit Bansal

arXiv 2024

2023

EMNLP

Did You Mean...? Confidence-based Trade-offs in Semantic Parsing

Elias Stengel-Eskin and Benjamin Van Durme

EMNLP 2023
TACL

Calibrated Interpretation: Confidence Estimation in Semantic Parsing

Elias Stengel-Eskin and Benjamin Van Durme

TACL 2023
ACL

Why Did the Chicken Cross the Road? Rephrasing and Analyzing Ambiguous Questions in VQA

Elias Stengel-Eskin, Jimena Guallar-Blasco, Yi Zhou, and Benjamin Van Durme

ACL 2023
CVPR

Super-CLEVR: A Virtual Benchmark to Diagnose Domain Robustness in Visual Reasoning (CVPR Highlight)

Zhuowan Li, Xingrui Wang, Elias Stengel-Eskin, Adam Kortylewski, Wufei Ma, Benjamin Van Durme, and Alan Yuille

CVPR 2023

2022

MASCSLL

Automatic Evaluation of Chit-chat via Semantic Parsing

Shalaka Vaidya, Elias Stengel-Eskin, and João Sedoc

Mid-Atlantic Student Colloquium on Speech, Language and Learning 2022
EMNLP

The Curious Case of Control

Elias Stengel-Eskin and Benjamin Van Durme

Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing 2022
EMNLP

When More Data Hurts: A Troubling Quirk in Developing Broad-Coverage Natural Language Understanding Systems

Elias Stengel-Eskin, Emmanouil Antonios Platanios, Adam Pauls, Sam Thomson, Hao Fang, Benjamin Van Durme, Jason Eisner, and Yu Su

Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing 2022
NAACL

Visual Commonsense in Pretrained Unimodal and Multimodal Models

Chenyu Zhang, Benjamin Van Durme, Zhuowan Li, and Elias Stengel-Eskin

In Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers) Jul

2021

CoRL

Guiding Multi-Step Rearrangement Tasks with Natural Language Instructions

Elias Stengel-Eskin*, Andrew Hundt*, Zhuohong He, Aditya Murali, Nakul Gopalan, Matthew Gombolay, and Gregory D. Hager

In 5th Annual Conference on Robot Learning Jul
ICCV

Calibrating Concepts and Operations: Towards Symbolic Reasoning on Real Images

Zhuowan Li, Elias Stengel-Eskin, Yixiao Zhang, Cihang Xie, Quan Tran, Benjamin Van Durme, and Alan Yuille

In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) Oct
UnImplicit

Human-Model Divergence in the Handling of Vagueness

Elias Stengel-Eskin, Jimena Guallar-Blasco, and Benjamin Van Durme

In Proceedings of the 1st Workshop on Understanding Implicit and Underspecified Language Aug
TACL

Joint Universal Syntactic and Semantic Parsing

Elias Stengel-Eskin, Kenton Murray, Sheng Zhang, Aaron Steven White, and Benjamin Van Durme

Transactions of the Association for Computational Linguistics 2021 Aug
SCiL

Exploring Human-Model Divergence Through Vagueness

Elias Stengel-Eskin, Jimena Guallar-Blasco, and Benjamin Van Durme

Proceedings of the Society for Computation in Linguistics 2021 Feb
TACL

Iterative Paraphrastic Augmentation with Discriminative Span Alignment

Ryan Culkin, J. Edward Hu, Elias Stengel-Eskin, Guanghui Qin, and Benjamin Van Durme

Transactions of the Association for Computational Linguistics 2021 May

2020

ACL

Universal Decompositional Semantic Parsing

Elias Stengel-Eskin, Aaron Steven White, Sheng Zhang, and Benjamin Van Durme

In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics May
LREC

The Universal Decompositional Semantics Dataset and Decomp Toolkit

Aaron Steven White, Elias Stengel-Eskin, Siddharth Vashishtha, Venkata Subrahmanyan Govindarajan, Dee Ann Reisinger, Tim Vieira, Keisuke Sakaguchi, Sheng Zhang, and 3 more authors

In Proceedings of The 12th Language Resources and Evaluation Conference May

2019

EMNLP

A Discriminative Neural Model for Cross-Lingual Word Alignment

Elias Stengel-Eskin, Tzu-Ray Su, Matt Post, and Benjamin Van Durme

In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP) May

2017

Interspeech

Polyglot and Speech Corpus Tools: A System for Representing, Integrating, and Querying Speech Corpora

Michael McAuliffe, Elias Stengel-Eskin, Michaela Socolof, and Morgan Sonderegger

In INTERSPEECH May