Figure: attention module. The standard attention module in Transformers, which has quadratic complexity, is approximated in Performers by a linear attention module obtained by decoupling the matrices in a low-rank decomposition.

The recent paper “Rethinking Attention with Performers” introduced the Performer, a new model that approximates Transformer architectures and significantly improves their space and time complexity. A new blog post by our Sepp Hochreiter and his team, “Looking at the Performer from a Hopfield point of view”, explains the model in detail and discusses the connection between Performers and classical Hopfield networks.

Transformers are powerful neural network architectures that have achieved impressive results in several areas of machine learning, including natural language processing (NLP), conversation, image and music generation, and bioinformatics. Transformers rely on a trainable attention module that identifies complex dependencies between the elements of each input sequence. The cost of this module scales quadratically with the length of the input sequence, which makes Transformers computationally expensive for long inputs. Performers overcome this problem by constructing attention mechanisms that scale linearly, a major step forward for Transformer models.
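The difference between quadratic and linear scaling can be written out in a few lines. The notation below is an assumption introduced for illustration and does not come from the post itself: L is the sequence length, d the dimension of queries and keys, m the number of random features, and Q' and K' the feature-mapped queries and keys.

```latex
\begin{align}
\mathrm{Att}(Q, K, V) &= D^{-1} A V, &
A &= \exp\!\left( Q K^\top / \sqrt{d} \right), &
D &= \mathrm{diag}\!\left( A \mathbf{1}_L \right) \\
A &\approx Q' (K')^\top, &
Q', K' &\in \mathbb{R}^{L \times m} \\
\widehat{\mathrm{Att}}(Q, K, V) &= \widehat{D}^{-1} \left( Q' \left( (K')^\top V \right) \right), &
\widehat{D} &= \mathrm{diag}\!\left( Q' \left( (K')^\top \mathbf{1}_L \right) \right)
\end{align}
```

The first line forms the full L × L matrix A, so time and memory grow with L². In the last line the product (K')ᵀV is computed first and has size m × d, so the cost grows as L · m · d, which is linear in L when m and d are fixed.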

Performers provide an accurate and unbiased estimate of the softmax-based attention in Transformers. The linear attention module in Performers is implemented using the Fast Attention Via Positive Orthogonal Random Features (FAVOR+) algorithm. This method is of broad interest beyond Transformers as a more scalable replacement for regular attention. Just as the attention module of Transformers parallels the update rule of continuous modern Hopfield networks, the attention module of Performers resembles the update rule of classical Hopfield networks.
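To make the idea of positive orthogonal random features concrete, here is a minimal NumPy sketch. It is an illustration written under stated assumptions, not the reference FAVOR+ implementation: the function names, the feature map `phi(x) = exp(Wx - |x|^2 / 2) / sqrt(m)`, the block-wise orthogonalization, and the toy sizes are all choices made for this example.

```python
import numpy as np


def softmax_attention(Q, K, V):
    """Exact softmax attention; materializes an L x L matrix (quadratic in L)."""
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)
    scores = scores - scores.max(axis=-1, keepdims=True)  # numerical stability
    A = np.exp(scores)
    return (A / A.sum(axis=-1, keepdims=True)) @ V


def orthogonal_gaussian(m, d, rng):
    """m x d projection whose rows are orthogonal within each block of d rows."""
    blocks = []
    for _ in range(-(-m // d)):  # ceil(m / d) blocks
        q, r = np.linalg.qr(rng.standard_normal((d, d)))
        q = q * np.sign(np.diag(r))  # sign fix so directions are uniformly distributed
        blocks.append(q.T)
    W = np.concatenate(blocks)[:m]
    # Rescale rows so their lengths are distributed like those of Gaussian vectors.
    norms = np.linalg.norm(rng.standard_normal((m, d)), axis=1)
    return W * norms[:, None]


def positive_features(X, W):
    """phi(x) = exp(Wx - |x|^2 / 2) / sqrt(m); every entry is strictly positive."""
    m = W.shape[0]
    half_sq_norm = 0.5 * np.sum(X ** 2, axis=-1, keepdims=True)
    return np.exp(X @ W.T - half_sq_norm) / np.sqrt(m)


def linear_attention(Q, K, V, W):
    """Approximate softmax attention without forming the L x L matrix."""
    d = Q.shape[-1]
    # Fold the 1/sqrt(d) softmax scaling into queries and keys.
    Qp = positive_features(Q / d ** 0.25, W)  # (L, m)
    Kp = positive_features(K / d ** 0.25, W)  # (L, m)
    KV = Kp.T @ V                      # (m, d_v): keys and values summarized once
    normalizer = Qp @ Kp.sum(axis=0)   # (L,): row sums of the implicit attention matrix
    return (Qp @ KV) / normalizer[:, None]


# Toy comparison on hypothetical random inputs.
rng = np.random.default_rng(0)
L, d, m = 8, 4, 512
Q, K, V = (0.5 * rng.standard_normal((L, d)) for _ in range(3))
W = orthogonal_gaussian(m, d, rng)

exact = softmax_attention(Q, K, V)
approx = linear_attention(Q, K, V, W)
print("max abs error:", np.max(np.abs(exact - approx)))
```

Because each feature is an exponential, the implicit attention weights stay positive and each output row remains a weighted average of the value vectors, which is one reason positive features are preferred over trigonometric ones in this setting. The error of the estimate shrinks as `m` grows, while the cost stays linear in the sequence length `L`.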


©2020 IARAI - INSTITUTE OF ADVANCED RESEARCH IN ARTIFICIAL INTELLIGENCE
