Modern Hopfield Networks and Attention for Immune Repertoire Classification
Michael Widrich, Bernhard Schäfl, Hubert Ramsauer, Milena Pavlović, Lukas Gruber, Markus Holzleitner, Johannes Brandstetter, Geir Kjetil Sandve, Victor Greiff, Sepp Hochreiter, and Günter Klambauer

An immune repertoire X is represented by a large set of immune receptor sequences s_i. A neural network h recognizes patterns in each sequence and maps it to a sequence-representation z_i. A pooling function f combines the z_i into a repertoire-representation z for the input object. Finally, an output network o predicts the class label y.
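The pipeline described above (h, then f, then o) can be sketched in a few lines of NumPy. This is only an illustrative sketch under stated assumptions, not the DeepRC implementation: the convolution-plus-max embedding used for h, the fixed query vector in the attention pooling f, the logistic output o, all weights (random, untrained), and the three example sequences are hypothetical choices made for the illustration.

```python
import numpy as np

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids


def one_hot(seq):
    """Encode an amino acid string as a (length, 20) one-hot matrix."""
    out = np.zeros((len(seq), len(AMINO_ACIDS)))
    out[np.arange(len(seq)), [AMINO_ACIDS.index(a) for a in seq]] = 1.0
    return out


def softmax(scores):
    e = np.exp(scores - scores.max())  # subtract max for numerical stability
    return e / e.sum()


def h(seq, kernels):
    """Hypothetical sequence embedding: 1D convolution, then max over positions."""
    x = one_hot(seq)
    k = kernels.shape[0]  # kernel width; kernels has shape (k, 20, d)
    windows = np.stack([x[i:i + k] for i in range(len(seq) - k + 1)])
    activations = np.einsum("pka,kad->pd", windows, kernels)
    return activations.max(axis=0)  # sequence-representation z_i, shape (d,)


def f(Z, W_k, query, beta=1.0):
    """Attention pooling: softmax-weighted average of the z_i."""
    weights = softmax(beta * (Z @ W_k) @ query)  # one weight per sequence
    return weights @ Z, weights  # repertoire-representation z, shape (d,)


def o(z, w_out, b_out):
    """Hypothetical output network: a single logistic unit."""
    return 1.0 / (1.0 + np.exp(-(z @ w_out + b_out)))


rng = np.random.default_rng(0)
d, d_k, k = 8, 4, 4
kernels = 0.1 * rng.normal(size=(k, len(AMINO_ACIDS), d))
W_k = 0.1 * rng.normal(size=(d, d_k))
query = rng.normal(size=d_k)
w_out, b_out = 0.1 * rng.normal(size=d), 0.0

# Toy "repertoire" of three made-up receptor sequences (real sets are far larger).
repertoire = ["CASSLGTDTQYF", "CASSPGQGNTEAFF", "CAWSVGGYEQYF"]
Z = np.stack([h(s, kernels) for s in repertoire])
z, weights = f(Z, W_k, query)
y = o(z, w_out, b_out)
```

In this sketch the attention weights play the role of a per-sequence relevance score, so after training a sequence with a large weight would be one the model considers informative for the class label.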
In our companion paper, we introduced a new Hopfield network with continuous states that can store exponentially many patterns and converges very fast. We also showed that the update rule of the new Hopfield network is equivalent to the attention mechanism of the Transformer architecture. Here we exploit the high storage capacity of the proposed model.
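To make the link between the Hopfield update and attention concrete, the following sketch applies one update step of the form xi_new = softmax(beta * X xi) X, where the rows of X are the stored patterns and xi is the query state. This has the same shape as attention with xi as the query and the stored patterns as both keys and values. The pattern count, dimension, beta, and the way the query is corrupted are assumptions chosen for illustration only.

```python
import numpy as np


def softmax(scores):
    e = np.exp(scores - scores.max())  # subtract max for numerical stability
    return e / e.sum()


def hopfield_update(X, xi, beta=1.0):
    """One update step: attention of the state xi over the stored patterns X."""
    attention = softmax(beta * X @ xi)  # similarity of xi to each stored pattern
    return attention @ X                # weighted average of the stored patterns


rng = np.random.default_rng(0)
n_patterns, dim = 10, 128
X = rng.choice([-1.0, 1.0], size=(n_patterns, dim))  # stored patterns (rows)

# Corrupt the first stored pattern by flipping four of its entries.
xi = X[0].copy()
xi[:4] *= -1.0

retrieved = hopfield_update(X, xi)
max_error = np.abs(retrieved - X[0]).max()  # distance to the uncorrupted pattern
```

For well-separated patterns such as these, a single update step already returns a state very close to the stored pattern, which illustrates the fast convergence mentioned above.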
We present a novel method, DeepRC, that integrates Transformer-like attention into deep learning architectures. We use this method to solve a multiple instance learning problem in computational biology: immune repertoire classification. This task consists of classifying the immune status of an individual based on a very large set of immune receptors, represented as amino acid sequences. The problem is challenging because only a small fraction of specific receptors determines the immune status with respect to a particular disease. We tackle immune repertoire classification for hundreds of thousands of instances per set, with a very low rate of discriminating instances. We demonstrate that DeepRC outperforms all other methods in predictive performance on large-scale experiments, including simulated and real-world virus infection data, and enables the extraction of sequence motifs related to a particular disease. Accurate and interpretable machine learning methods for immune repertoire classification can provide important biological insights and pave the way towards new vaccines and therapies.
Download source code and datasets.
arXiv:2007.13505, 2020-07-16.