Talk Title: Reasoning with Inference-Time Compute
Abstract: One of the most striking findings in modern research on large language models (LLMs) is that scaling up compute at training time leads to better final results. However, there is another, less-discussed scaling phenomenon: adopting more sophisticated methods and/or scaling compute at inference time can yield significantly better outputs from LLMs. In this talk, I will present our lab's recent work on using inference-time strategies to enable better reasoning. This includes training models to think before each step of formal mathematical proving, leveraging strong evaluation models to enable easy-to-hard generalization, and inference scaling laws that optimally balance cost and performance. Together, these advances point to a new paradigm of scaling compute at inference time.
To check out other talks in our full NLP Seminar Series, please visit: https://www.youtube.com/playlist?list=PLcToZXRYv-6bEJziZtz7gQO_vivVDmMt_