TransU-Net++: Rethinking Attention Gated TransU-Net for Deforestation Mapping
Ali Jamali, Swalpa Kumar Roy, Jonathan Li, and Pedram Ghamisi

An overview of the proposed architecture.
Deforestation has become a major cause of climate change, and as a result, both characterizing its drivers and estimating segmentation maps of deforestation have piqued the interest of researchers. In the computer vision domain, Vision Transformers (ViTs) have outperformed the widely used convolutional neural networks (CNNs) over the last couple of years. However, ViTs pose several challenges, specifically in remote sensing image processing, including their significant complexity, which increases computation costs, and their need for far more reference data than CNNs. As such, in this paper, we introduce an attention-gate-aided TransU-Net, called TransU-Net++, for semantic segmentation and apply it to deforestation mapping in two South American forest biomes, i.e., the Atlantic Forest and the Amazon Rainforest. The proposed TransU-Net++ combines heterogeneous kernel convolution (HetConv), U-Net, attention gates, and ViTs to exploit the strengths of each. TransU-Net++ significantly improved on the performance of TransU-Net over the Atlantic Forest dataset, by about 4%, 6%, and 16% in terms of overall accuracy, F1-score, and recall, respectively. Moreover, the results show that the developed TransU-Net++ model (0.921) achieves the highest Area under the ROC Curve value on the 3-band Amazon forest dataset compared to other segmentation models, including ICNet (0.667), ENet (0.69), SegNet (0.788), U-Net (0.871), Attention U-Net-2 (0.886), R2U-Net (0.888), TransU-Net (0.889), Swin U-Net (0.893), ResU-Net (0.896), U-Net+++ (0.9), and Attention U-Net (0.908). The code is available on GitHub.
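The reported metrics (overall accuracy, F1-score, recall, and Area under the ROC Curve) can be illustrated for a binary forest/deforestation mask with a minimal sketch. It is not the authors' evaluation code: the function names `binary_segmentation_metrics` and `roc_auc`, the flattened 0/1 label lists, and the toy values are all assumptions made for illustration.

```python
def binary_segmentation_metrics(pred, truth):
    """Overall accuracy, recall, and F1 for flattened 0/1 pixel labels.

    Hypothetical helper: 1 marks a deforested pixel, 0 marks forest.
    """
    if len(pred) != len(truth):
        raise ValueError("pred and truth must have the same length")
    pairs = list(zip(pred, truth))
    tp = sum(1 for p, t in pairs if p == 1 and t == 1)
    tn = sum(1 for p, t in pairs if p == 0 and t == 0)
    fp = sum(1 for p, t in pairs if p == 1 and t == 0)
    fn = sum(1 for p, t in pairs if p == 0 and t == 1)
    total = len(pairs)
    accuracy = (tp + tn) / total if total else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"overall_accuracy": accuracy, "recall": recall, "f1": f1}


def roc_auc(scores, truth):
    """AUC as the probability that a random positive pixel is scored
    above a random negative pixel (ties count as one half)."""
    pos = [s for s, t in zip(scores, truth) if t == 1]
    neg = [s for s, t in zip(scores, truth) if t == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example with eight pixels (illustrative values only).
truth = [1, 1, 1, 0, 0, 0, 0, 1]
pred = [1, 1, 0, 0, 0, 1, 0, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.2, 0.6, 0.1, 0.7]
metrics = binary_segmentation_metrics(pred, truth)
auc = roc_auc(scores, truth)
```

The pairwise AUC computation is quadratic in the number of pixels, so it suits only small examples; for full-size masks a rank-based routine from an established library would be the more practical choice.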
International Journal of Applied Earth Observation and Geoinformation, 120, 103332, 2023-06-01.