Organised by Evolution AI - AI data extraction from financial documents - https://www.evolution.ai/
Abstract: For more than a decade there has been an extensive research effort on how to effectively utilize recurrent models and attention. While recurrent models aim to compress the data into a fixed-size memory (called the hidden state), attention can attend to the entire context window, capturing the direct dependencies between all tokens. This more accurate modeling of dependencies, however, comes with a quadratic cost, limiting the model to a fixed-length context. We present a neural long-term memory module that learns to memorize historical context and helps attention attend to the current context while utilizing long-past information. We show that this neural memory has the advantage of fast, parallelizable training. From a memory perspective, we argue that attention, due to its limited context but accurate dependency modeling, performs as a short-term memory, while the neural memory, due to its ability to memorize the data, acts as a long-term, more persistent memory. Based on these two modules, we introduce a new family of architectures, called Titans, and present three variants that address how one can effectively incorporate memory into this architecture. Our experimental results on language modeling, common-sense reasoning, and time-series tasks show that Titans are effective compared to baselines, and they can scale effectively to context windows larger than 2M tokens on needle-in-a-haystack tasks.
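To make the short-term/long-term memory split concrete, here is a minimal, illustrative sketch of the general idea described in the abstract: causal attention inside a local chunk plays the role of short-term memory, while a small neural module is updated at test time with gradient steps on an associative memorization loss so later chunks can read back long-past information. This is not the authors' implementation; the names (NeuralMemory, memorize, local_attention, process_long_sequence), the chunk size, the learning rate, and the exact update rule are all assumptions made for illustration.

```python
# Illustrative sketch only: attention as short-term memory plus a test-time-updated
# neural memory module. All hyperparameters and names here are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F


class NeuralMemory(nn.Module):
    """A small MLP whose weights serve as the long-term memory: it maps keys to values."""

    def __init__(self, dim: int, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim, hidden), nn.SiLU(), nn.Linear(hidden, dim))

    def forward(self, k: torch.Tensor) -> torch.Tensor:
        return self.net(k)

    @torch.enable_grad()
    def memorize(self, k: torch.Tensor, v: torch.Tensor, lr: float = 0.1) -> None:
        """One test-time gradient step on the reconstruction loss ||M(k) - v||^2 (an assumed update rule)."""
        loss = F.mse_loss(self.forward(k), v)
        grads = torch.autograd.grad(loss, list(self.parameters()))
        with torch.no_grad():
            for p, g in zip(self.parameters(), grads):
                p -= lr * g


def local_attention(x: torch.Tensor) -> torch.Tensor:
    """Plain causal self-attention within one chunk: the short-term memory."""
    scores = (x @ x.transpose(-2, -1)) / x.shape[-1] ** 0.5
    mask = torch.triu(torch.ones(x.shape[-2], x.shape[-2], dtype=torch.bool), diagonal=1)
    scores = scores.masked_fill(mask, float("-inf"))
    return torch.softmax(scores, dim=-1) @ x


def process_long_sequence(x: torch.Tensor, memory: NeuralMemory, chunk: int = 16) -> torch.Tensor:
    """Process a long sequence chunk by chunk: read from long-term memory,
    attend locally, then write the current chunk back into the memory."""
    outputs = []
    for start in range(0, x.shape[0], chunk):
        c = x[start:start + chunk]
        retrieved = memory(c)                       # read long-past information
        attended = local_attention(c + retrieved)   # model in-context dependencies
        memory.memorize(c, attended.detach())       # memorize the current context
        outputs.append(attended)
    return torch.cat(outputs, dim=0)


if __name__ == "__main__":
    torch.manual_seed(0)
    seq = torch.randn(128, 32)                      # (sequence length, model dim)
    out = process_long_sequence(seq, NeuralMemory(dim=32))
    print(out.shape)                                # torch.Size([128, 32])
```

Because each chunk only attends within itself, the quadratic cost of attention is confined to the chunk length, while information from earlier chunks is carried forward solely through the memory module's weights.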
Speaker bio: Ali Behrouz is a Ph.D. student in the Computer Science Department at Cornell University and a research intern at Google Research. His research spans topics from deep learning architectures to their applications in healthcare and neuroscience, and has appeared at conferences including NeurIPS, ICML, KDD, WWW, CHIL, and VLDB, among others. His work has been recognized with two Best Paper awards, a Best Paper Honorable Mention award, a Best Paper Award candidacy, and oral and spotlight presentations.