Abstract: We will discuss how vLLM combines continuous batching with speculative decoding, with a focus on enabling external contributors. Topics include the proposer/scorer/verifier framework, proposal methods, lookahead scheduling, dynamic speculative decoding, and ideas for future contributions.

Speaker: Cade Daniel

Slides: https://docs.google.com/presentation/d/1p1xE-EbSAnXpTSiSI0gmy_wdwxN5XaULO3AnCWWoRe4/edit