Elias Stengel-Eskin

LASeR

October 3, 2024


New preprint! LASeR: Learning to Adaptively Select Reward Models with Multi-Armed Bandits, led by Duy Nguyen and Archiki Prasad with Mohit Bansal. LASeR uses multi-armed bandit methods to select, at the instance level, the reward model (RM) best suited for optimization, improving LLM performance on reasoning, instruction-following, and long-context understanding.
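The core idea of treating each reward model as a bandit arm and choosing one per instance can be illustrated with a classic UCB1 selection rule. This is only a hedged sketch of the general bandit mechanism, not LASeR's actual algorithm: the reward-model qualities, reward signal, and exploration constant below are all invented for the toy example.

```python
import math
import random

def ucb1_select(counts, values, c=2.0):
    """Pick an arm by the UCB1 rule: empirical mean reward plus an
    exploration bonus that shrinks as an arm is pulled more often."""
    # Pull every arm once before the UCB formula is well-defined.
    for i, n in enumerate(counts):
        if n == 0:
            return i
    total = sum(counts)
    scores = [
        values[i] / counts[i] + math.sqrt(c * math.log(total) / counts[i])
        for i in range(len(counts))
    ]
    return max(range(len(counts)), key=lambda i: scores[i])

# Toy loop: three hypothetical reward models with different (hidden)
# usefulness; the bandit learns which one to route instances to.
random.seed(0)
true_quality = [0.2, 0.8, 0.5]   # made-up per-RM payoff probabilities
counts = [0, 0, 0]               # pulls per reward model
values = [0.0, 0.0, 0.0]         # accumulated reward per reward model

for _ in range(500):
    arm = ucb1_select(counts, values)
    # Binary reward standing in for "this RM's feedback helped here".
    reward = 1.0 if random.random() < true_quality[arm] else 0.0
    counts[arm] += 1
    values[arm] += reward

print(counts)  # the highest-quality RM should dominate the pulls
```

In the paper's setting the "reward" for an arm would come from how much training on that RM's feedback improves the model on the current instance, rather than from a fixed Bernoulli payoff as in this toy.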

© Copyright 2025 Elias Stengel-Eskin. Powered by Jekyll with al-folio theme.