MC-LSTM: Mass-Conserving LSTM
Pieter-Jan Hoedt, Frederik Kratzert, Daniel Klotz, Christina Halmich, Markus Holzleitner, Grey Nearing, Sepp Hochreiter, and Günter Klambauer

MC-LSTM architecture
Many real-world systems are governed by conservation of mass, energy, number of particles, or other properties. For modeling such systems, specialized mechanisms should be used to conserve and redistribute certain inputs across the system storage. We will refer to any type of conserved quantity as “mass”.
In this work, we introduce the Mass-Conserving Long Short-Term Memory (MC-LSTM) network, which enforces the conservation of mass. The original LSTM model incorporated memory cells into recurrent neural networks (RNNs), which alleviated the vanishing gradient problem. Here, the memory cells are used as mass accumulators, or mass storage. The sum of the memory cells in the network represents the current mass of the system and is conserved over time. The MC-LSTM gates operate as control units on mass flux. The inputs are divided into mass inputs, which are conserved, and auxiliary inputs, which are used to control the gates. We show that MC-LSTM provides a powerful neural arithmetic unit. We apply MC-LSTM to traffic forecasting, modeling a pendulum with friction, and modeling hydrological processes, and demonstrate that MC-LSTM excels at tasks requiring conservation of mass.
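To make the idea of gates acting as control units on mass flux concrete, the following is a minimal sketch of one mass-conserving recurrent step in plain Python. It is an illustration under stated assumptions, not the authors' implementation: the summary above does not give the gate equations, so the sketch assumes a softmax-normalized input gate, a softmax-normalized redistribution between cells, and a sigmoid output gate, all driven by the auxiliary inputs only. The names `init_params`, `mass_conserving_step`, `w_in`, `w_out`, and `r_logits` are hypothetical.

```python
import math
import random


def softmax(values):
    """Stable softmax: non-negative outputs that sum to 1."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(value):
    return 1.0 / (1.0 + math.exp(-value))


def affine(weights, bias, aux):
    """Gate pre-activations computed from the auxiliary inputs only (assumption)."""
    return [sum(w * a for w, a in zip(row, aux)) + b
            for row, b in zip(weights, bias)]


def init_params(n_cells, n_aux, seed=0):
    """Random illustrative parameters; all names here are hypothetical."""
    rng = random.Random(seed)

    def matrix(rows, cols):
        return [[rng.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]

    return {
        "w_in": matrix(n_cells, n_aux), "b_in": [0.0] * n_cells,
        "w_out": matrix(n_cells, n_aux), "b_out": [0.0] * n_cells,
        # r_logits[j][k] scores moving mass from cell j to cell k.
        "r_logits": matrix(n_cells, n_cells),
    }


def mass_conserving_step(cells, mass_in, aux, params):
    """Advance the cell states by one step; return (new_cells, outflow)."""
    n = len(cells)
    # Input gate: fractions of the incoming mass per cell, summing to 1.
    in_frac = softmax(affine(params["w_in"], params["b_in"], aux))
    # Redistribution: each source cell j splits its mass with fractions summing to 1.
    shares = [softmax(row) for row in params["r_logits"]]
    total = [in_frac[k] * mass_in + sum(shares[j][k] * cells[j] for j in range(n))
             for k in range(n)]
    # Output gate: fraction of each cell's mass that leaves the storage this step.
    out_frac = [sigmoid(z) for z in affine(params["w_out"], params["b_out"], aux)]
    outflow = [o * m for o, m in zip(out_frac, total)]
    new_cells = [(1.0 - o) * m for o, m in zip(out_frac, total)]
    return new_cells, outflow


params = init_params(n_cells=4, n_aux=2)
cells = [0.0] * 4
mass_seen = 0.0
mass_released = 0.0
for mass_in, aux in [(1.5, [0.2, -1.0]), (0.0, [1.0, 0.3]), (2.5, [-0.7, 0.1])]:
    cells, outflow = mass_conserving_step(cells, mass_in, aux, params)
    mass_seen += mass_in
    mass_released += sum(outflow)

# Stored mass plus released mass should match the total mass fed in.
print(abs(sum(cells) + mass_released - mass_seen) < 1e-9)
```

Because every gate in this sketch outputs fractions that sum to one (or a fraction paired with its complement), mass is only moved between cells or released as output. It is never created or destroyed, whatever values the parameters take.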
arXiv:2101.05186, 2021-01-13.