How does new knowledge get integrated into the network weights of a transformer architecture? Strange phenomena occur along the way. We explore them through the papers below. A reading list on data-efficient learning for AI.
All rights belong to the respective authors:
How new data permeates LLM knowledge and how to dilute it
Chen Sun, Renat Aksitov, Andrey Zhmoginov, Nolan Andrew Miller, Max Vladymyrov, Ulrich Rueckert, Been Kim, Mark Sandler (Google DeepMind)
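To make the question above concrete, here is a minimal, hypothetical probe; it is not the method of the paper above, just a sketch under stated assumptions. It teaches a small causal LM one new fact with a few gradient steps, then checks how far the update spills into an unrelated prompt. The model name ("gpt2"), the invented fact about "Veltria", and the probe prompt are illustrative placeholders.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # placeholder; any small causal LM works
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

new_fact = "The capital of the fictional country of Veltria is Morvane."
probe = "She packed her bags and flew to"  # deliberately unrelated context

def next_token_logprobs(prompt):
    # Log-probabilities of the next token given the prompt.
    ids = tok(prompt, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(input_ids=ids).logits[0, -1]
    return torch.log_softmax(logits, dim=-1)

before = next_token_logprobs(probe)

# A few plain language-modeling steps on the single new fact.
batch = tok(new_fact, return_tensors="pt")
opt = torch.optim.AdamW(model.parameters(), lr=1e-5)
model.train()
for _ in range(10):
    loss = model(**batch, labels=batch["input_ids"]).loss
    loss.backward()
    opt.step()
    opt.zero_grad()
model.eval()

after = next_token_logprobs(probe)

# Did the injected token become more likely even where it does not belong?
target = tok(" Morvane", add_special_tokens=False).input_ids[0]
print(f"log P(' Morvane' | probe): {before[target].item():.3f} -> {after[target].item():.3f}")

The paper above studies exactly this kind of spillover of newly learned facts into unrelated contexts, and how to dilute it.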
Identifying and Mitigating the Influence of the Prior Distribution in Large Language Models
Liyi Zhang (Department of Computer Science, Princeton University) [email protected]
Veniamin Veselovsky (Department of Computer Science, Princeton University) [email protected]
R. Thomas McCoy (Department of Linguistics, Yale University) [email protected]
Thomas L. Griffiths (Department of Psychology and Computer Science, Princeton University) [email protected]
Memorization vs. Reasoning: Updating LLMs with New Knowledge
Aochong Oliver Li (Computer Science, Cornell University) [email protected]
Tanya Goyal (Computer Science, Cornell University) [email protected]
Recommended:
-------------------------
Synthetic Continued Pretraining
Zitong Yang* (Department of Statistics), Neil Band* (Department of Computer Science), Shuangping Li (Department of Statistics), Emmanuel Candès (Department of Statistics), Tatsunori Hashimoto (Department of Computer Science), Stanford University
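To illustrate what "synthetic continued pretraining" can mean in practice, here is a schematic sketch under stated assumptions: a small domain corpus is expanded into a larger synthetic corpus by writing passages about pairs of entities mentioned in each document, and that corpus would then be used for ordinary continued pretraining. The helpers extract_entities and synthesize_passage are hypothetical stand-ins for LLM-backed steps, and the entity-pair framing is one possible instantiation, not necessarily the authors' exact recipe.

import itertools
import random

def extract_entities(document: str) -> list[str]:
    # Hypothetical stand-in: naive "entity" extraction via capitalized tokens.
    return sorted({w.strip(".,") for w in document.split() if w[:1].isupper()})

def synthesize_passage(doc: str, e1: str, e2: str) -> str:
    # Hypothetical stand-in for an LLM call such as:
    # "Using only the source document, explain how {e1} relates to {e2}."
    return f"Relation between {e1} and {e2}, grounded in the source: {doc[:80]}..."

def synthetic_corpus(documents: list[str], passages_per_doc: int = 5) -> list[str]:
    # Expand each small document into several synthetic passages about entity pairs.
    corpus = []
    for doc in documents:
        entities = extract_entities(doc)
        pairs = list(itertools.combinations(entities, 2))
        for e1, e2 in random.sample(pairs, min(passages_per_doc, len(pairs))):
            corpus.append(synthesize_passage(doc, e1, e2))
    return corpus

docs = ["Veltria neighbors Ardonne. Morvane is Veltria's capital city."]
for line in synthetic_corpus(docs):
    print(line)

In a real pipeline the synthesized passages would be added to the training mix and the model trained with the usual language-modeling objective.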
Multi‑Agent Reinforcement Learning
MCP - Model Context Protocol
A2A - Agent2Agent
Latency Drift
MARL Tutorial
RL Failure Modes
AI Research
Distributed Systems
Agent Coordination
AI data pipeline