Cost Optimization at Early Stages of Design Using Deep Reinforcement Learning
Lorenzo Servadei, Jiapeng Zheng, José Arjona-Medina, Michael Werner, Volkan Esen, Sepp Hochreiter, Wolfgang Ecker, and Robert Wille

With the increasing complexity of modern systems on chip (SoCs), hardware/software interfaces (HSIs) play a key role in their functionality and security. HSIs serve a dedicated purpose and typically share a similar structure. The industrial design of sophisticated HSIs relies on automation and model-based approaches. In determining the best HSI model, cost is a main criterion, but it is not known until the configuration is realized. HSI design is a multi-objective optimization problem, as its cost depends on various hardware and software metrics.
Here, we introduce a new approach for HSI design optimization. Our approach is based on deep reinforcement learning, a combination of reinforcement learning (RL) and deep learning. As a backbone for implementing the deep RL algorithms, we use a Pointer Network, a neural network architecture designed for combinatorial problems. We adapt three deep RL methods from the literature to determine configurations optimized towards multiple objectives. To evaluate these methods, we generate three datasets of varying sequence length using an industrial design generation framework. Our results indicate that the deep RL methods yield significant improvements on the larger datasets. For the longest sequences, RUDDER outperforms the other methods.
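To illustrate the pointing mechanism behind a Pointer Network, the following is a minimal sketch of a generic pointer-style attention step, not the authors' implementation. It scores each input element against a decoder query with u_j = v · tanh(W1 e_j + W2 d), applies a masked softmax, and greedily selects one input at a time. All function names, the hand-picked weights, and the simplification of reusing the selected element's state as the next query (instead of a recurrent decoder) are assumptions made for illustration.

```python
import math


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * x for m, x in zip(row, vector)) for row in matrix]


def pointer_scores(encoder_states, decoder_state, W1, W2, v):
    """Compute u_j = v . tanh(W1 e_j + W2 d) for every encoder state e_j."""
    proj_d = matvec(W2, decoder_state)
    scores = []
    for e in encoder_states:
        proj_e = matvec(W1, e)
        hidden = [math.tanh(a + b) for a, b in zip(proj_e, proj_d)]
        scores.append(sum(vi * hi for vi, hi in zip(v, hidden)))
    return scores


def masked_softmax(scores, mask):
    """Softmax over positions where mask is True; masked positions get 0."""
    top = max(s for s, m in zip(scores, mask) if m)
    exps = [math.exp(s - top) if m else 0.0 for s, m in zip(scores, mask)]
    total = sum(exps)
    return [x / total for x in exps]


def greedy_pointer_decode(encoder_states, decoder_state, W1, W2, v):
    """Point at each input exactly once and return the selection order."""
    n = len(encoder_states)
    mask = [True] * n
    order = []
    query = decoder_state
    for _ in range(n):
        scores = pointer_scores(encoder_states, query, W1, W2, v)
        probs = masked_softmax(scores, mask)
        choice = max(range(n), key=lambda i: probs[i])
        order.append(choice)
        mask[choice] = False
        # Assumption: the chosen element's state becomes the next query;
        # a full Pointer Network would update a recurrent decoder here.
        query = encoder_states[choice]
    return order


# Hypothetical toy inputs: three 2-dimensional element embeddings.
enc = [[0.1, 0.2], [0.9, -0.3], [-0.5, 0.4]]
W1 = [[1.0, 0.0], [0.0, 1.0]]
W2 = [[0.5, 0.0], [0.0, 0.5]]
v = [1.0, -1.0]
order = greedy_pointer_decode(enc, [0.0, 0.0], W1, W2, v)
```

In a deep RL setting, the greedy `max` would typically be replaced by sampling from `probs`, so that the resulting selection can be scored by a cost-based reward and used to update the weights.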
Proceedings of the 2020 ACM/IEEE Workshop on Machine Learning for CAD, 37-42, 2020-11-16.