Structured Output from LLMs: Grammars, Regex, and State Machines

Efficient NLP · 3,047 views · 4 months ago

Try Voice Writer - speak your thoughts and let AI handle the grammar: https://voicewriter.io

Structured outputs are essential for applications that integrate LLMs to make decisions in downstream tasks. In this video, I explain how structured output generation works - a topic that is both practically important and an active area of research.

First, we look at the OpenAI API's built-in support for structured outputs, using schemas defined with Pydantic or Zod. As an open-source alternative, I cover the Outlines library, which uses regular expressions and finite state machines under the hood.
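
As a rough sketch (not shown in the video), the two approaches look roughly like this in Python. The OpenAI snippet assumes the openai SDK's beta parse helper with a Pydantic schema; the Outlines snippet assumes the pre-1.0 Outlines API, and the model names are only placeholders.

from pydantic import BaseModel
from openai import OpenAI

class Sentiment(BaseModel):
    label: str          # e.g. "positive" or "negative"
    confidence: float

client = OpenAI()
completion = client.beta.chat.completions.parse(
    model="gpt-4o-2024-08-06",
    messages=[{"role": "user", "content": "Classify: 'I loved this movie!'"}],
    response_format=Sentiment,   # server-side enforcement of the schema
)
print(completion.choices[0].message.parsed)

And the same idea with Outlines, constrained by a regex instead of a JSON schema:

import outlines

model = outlines.models.transformers("microsoft/Phi-3-mini-4k-instruct")
# The regex is compiled into a finite state machine that masks invalid
# tokens at every decoding step.
generator = outlines.generate.regex(model, r"(positive|negative)")
print(generator("Classify the sentiment of 'I loved this movie!' Answer: "))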

However, in many cases we need to generate outputs that conform to a context-free grammar (CFG), which calls for pushdown automata rather than finite state machines. We then look at why grammar terminals often do not line up with the LLM's tokenization, why this mismatch is a problem, and some creative solutions from recent research papers.
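
To see where the token-terminal mismatch comes from, you can inspect how a BPE tokenizer splits strings that a grammar treats as several separate terminals. A minimal sketch, assuming the Hugging Face transformers tokenizer for GPT-2 (used here purely as an example):

from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")

# Strings that a JSON or expression grammar parses as multiple terminals
# may come back as a single BPE token, or be split at points that do not
# coincide with terminal boundaries.
for text in ['{"key": 1}', '"]}', '>=']:
    pieces = [tok.decode([t]) for t in tok.encode(text)]
    print(repr(text), "->", pieces)

# When one token spans several terminals (or one terminal spans several
# tokens), the constrained decoder cannot mask tokens terminal-by-terminal;
# it has to track how token strings align with the grammar, which is what
# the papers below address.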

0:00 - Introduction
1:06 - OpenAI API example
3:02 - Outlines library example
4:07 - Pydantic to regex conversion
4:57 - Finite state machines and regex
5:58 - Regex matching with LLMs
8:41 - Context-free grammars
9:40 - Incremental parsing of CFGs
11:22 - Pushdown automata
12:18 - Token-terminal mismatch problem
14:26 - Vocabulary-aligned subgrammars
15:12 - State machine composition
16:06 - Format restriction and LLM performance

OpenAI Structured Outputs API: https://platform.openai.com/docs/guides/structured-outputs

Outlines library: https://github.com/dottxt-ai/outlines

References

Willard, Brandon T., and Rémi Louf. "Efficient guided generation for large language models." arXiv preprint arXiv:2307.09702 (2023). https://arxiv.org/abs/2307.09702

Geng, Saibo, et al. "Grammar-Constrained Decoding for Structured NLP Tasks without Finetuning." EMNLP 2023. https://arxiv.org/abs/2305.13971

Beurer-Kellner, Luca, Marc Fischer, and Martin Vechev. "Guiding LLMs The Right Way: Fast, Non-Invasive Constrained Generation." ICML 2024. https://arxiv.org/abs/2403.06988

Koo, Terry, Frederick Liu, and Luheng He. "Automata-based constraints for language model decoding." COLM 2024. https://arxiv.org/abs/2407.08103

Tam, Zhi Rui, et al. "Let Me Speak Freely? A Study on the Impact of Format Restrictions on Performance of Large Language Models." EMNLP 2024. https://arxiv.org/abs/2408.02442
